# T055.S3: BGP Integration Strategy Specification

**Author:** PeerA
**Date:** 2025-12-12
**Status:** DRAFT

## 1. Executive Summary

This document specifies the BGP Anycast integration strategy that lets FiberLB advertise VIPs (Virtual IPs) to upstream routers. The recommended approach is a **sidecar pattern**: FiberLB controls a co-located GoBGP daemon through its gRPC API.

## 2. Background

### 2.1 Current State
- FiberLB binds listeners to `0.0.0.0:{port}` on each node
- LoadBalancer resources have a `vip_address` field (currently unused for routing)
- No mechanism exists to advertise VIPs to the physical network infrastructure

### 2.2 Requirements (from PROJECT.md Item 7)
- "L2 load balancing via BGP Anycast" (BGP Anycast L2 LB)
- VIPs must be reachable from external networks
- Support for ECMP (Equal-Cost Multi-Path) across multiple FiberLB nodes
- Graceful withdrawal when a load balancer becomes unhealthy or is deleted

## 3. BGP Library Options Analysis

### 3.1 Option A: GoBGP Sidecar (RECOMMENDED)

**Description:** Run GoBGP as a sidecar container or process and control it through its gRPC API

| Aspect | Details |
|--------|---------|
| Language | Go |
| Maturity | Production-grade, widely deployed |
| API | gRPC with well-documented protobuf |
| Integration | FiberLB calls GoBGP gRPC to add/withdraw routes |
| Deployment | Separate process, co-located with FiberLB |

**Pros:**
- Battle-tested in production (Google, LINE, Yahoo Japan)
- Extensive BGP feature support (ECMP, BFD, RPKI)
- Clear separation of concerns
- Minimal code changes to FiberLB

**Cons:**
- External dependency (Go binary)
- Additional process management
- Minor network overhead for local gRPC calls

### 3.2 Option B: RustyBGP Sidecar

**Description:** Same sidecar pattern, but with the RustyBGP daemon

| Aspect | Details |
|--------|---------|
| Language | Rust |
| Maturity | Under active development, fewer production deployments |
| API | GoBGP-compatible gRPC |
| Performance | Higher than GoBGP (multicore optimized) |

**Pros:**
- Rust ecosystem alignment
- Drop-in replacement for GoBGP (same API)
- Better performance in benchmarks

**Cons:**
- Less production history
- Smaller community

### 3.3 Option C: Embedded zettabgp

**Description:** Build a custom BGP speaker on top of the zettabgp library

| Aspect | Details |
|--------|---------|
| Language | Rust |
| Type | Parsing/composing library only |
| Integration | Embedded directly in FiberLB |

**Pros:**
- No external dependencies
- Full control over BGP behavior
- Single binary deployment

**Cons:**
- Significant implementation effort (FSM, timers, peer state)
- Risk of BGP protocol bugs
- Months of additional development

### 3.4 Option D: OVN Gateway Integration

**Description:** Leverage OVN's built-in BGP capabilities via the OVN gateway router

| Aspect | Details |
|--------|---------|
| Dependency | Requires OVN deployment |
| Integration | FiberLB configures OVN via OVSDB |

**Pros:**
- No additional BGP daemon
- Integrated with SDN layer

**Cons:**
- Couples FiberLB tightly to OVN
- Limited BGP feature set
- OVN may not be deployed in all environments

## 4. Recommended Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                        FiberLB Node                         │
│                                                             │
│  ┌──────────────────┐        ┌──────────────────┐           │
│  │                  │  gRPC  │                  │           │
│  │  FiberLB         │───────>│  GoBGP           │──── BGP ──│──> ToR Router
│  │  Server          │        │  Daemon          │           │
│  │                  │        │                  │           │
│  └──────────────────┘        └──────────────────┘           │
│           │                                                 │
│           ▼                                                 │
│  ┌──────────────────┐                                       │
│  │  VIP Traffic     │                                       │
│  │  (Data Plane)    │                                       │
│  └──────────────────┘                                       │
└─────────────────────────────────────────────────────────────┘
```

### 4.1 Components

1. **FiberLB Server** - Existing service, adds BGP client module
2. **GoBGP Daemon** - BGP speaker process, controlled via gRPC
3. **BGP Client Module** - New Rust module using `gobgp-client` crate or raw gRPC

### 4.2 Communication Flow

1. LoadBalancer created with VIP address
2. FiberLB checks backend health
3. When healthy backends exist → `AddPath(VIP/32)`
4. When all backends fail → `DeletePath(VIP/32)`
5. LoadBalancer deleted → `DeletePath(VIP/32)`

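The flow above can be read as a mapping from lifecycle events to BGP actions. The following is a minimal sketch of that mapping, not the actual FiberLB code: the `LbEvent` and `BgpAction` enums and both function names are hypothetical, and the `/128` case for IPv6 is an assumption that goes beyond the `/32` host routes named in the flow.

```rust
use std::net::IpAddr;

/// Lifecycle events from the communication flow (names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LbEvent {
    BackendsHealthy,
    AllBackendsFailed,
    Deleted,
}

/// The GoBGP call FiberLB would issue in response to an event.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BgpAction {
    AddPath(String),
    DeletePath(String),
}

/// Render a VIP as a host route: /32 for IPv4, /128 for IPv6 (assumed).
pub fn host_prefix(vip: IpAddr) -> String {
    match vip {
        IpAddr::V4(v4) => format!("{v4}/32"),
        IpAddr::V6(v6) => format!("{v6}/128"),
    }
}

/// Map a lifecycle event to the corresponding BGP action.
pub fn action_for(event: LbEvent, vip: IpAddr) -> BgpAction {
    let prefix = host_prefix(vip);
    match event {
        LbEvent::BackendsHealthy => BgpAction::AddPath(prefix),
        // Both "all backends failed" and "deleted" withdraw the route.
        LbEvent::AllBackendsFailed | LbEvent::Deleted => BgpAction::DeletePath(prefix),
    }
}
```

Keeping this mapping as a pure function makes the withdraw-on-failure and withdraw-on-delete paths easy to unit test without a running GoBGP daemon.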
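## 5. Implementation Design

### 5.1 New Module: `fiberlb-bgp`

```rust
// fiberlb/crates/fiberlb-bgp/src/lib.rs

use std::collections::HashSet;
use std::net::IpAddr;

// `GobgpClient`, `BgpConfig`, and `Result` are assumed to be defined elsewhere
// in this crate; `add_path`/`delete_path` are placeholder client methods.
pub struct BgpManager {
    client: GobgpClient,
    config: BgpConfig,
    advertised_vips: HashSet<IpAddr>,
}

impl BgpManager {
    /// Advertise a VIP to BGP peers (no-op if already advertised)
    pub async fn advertise_vip(&mut self, vip: IpAddr) -> Result<()> {
        if !self.advertised_vips.contains(&vip) {
            self.client.add_path(vip).await?;
            self.advertised_vips.insert(vip);
        }
        Ok(())
    }

    /// Withdraw a VIP from BGP peers (no-op if not advertised)
    pub async fn withdraw_vip(&mut self, vip: IpAddr) -> Result<()> {
        if self.advertised_vips.contains(&vip) {
            self.client.delete_path(vip).await?;
            self.advertised_vips.remove(&vip);
        }
        Ok(())
    }

    /// Check if VIP is currently advertised
    pub fn is_advertised(&self, vip: &IpAddr) -> bool {
        self.advertised_vips.contains(vip)
    }
}
```
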
### 5.2 Configuration Schema

```yaml
# fiberlb-server config
bgp:
  enabled: true
  gobgp_address: "127.0.0.1:50051"  # GoBGP gRPC address
  local_as: 65001
  router_id: "10.0.0.1"
  neighbors:
    - address: "10.0.0.254"
      remote_as: 65000
      description: "ToR Router"
```

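On the Rust side, this schema could map onto plain structs with a validation step that runs before FiberLB connects to GoBGP. The sketch below is an assumption about how that might look: the field names mirror the YAML keys, but the `Neighbor` struct, the `validate` method, and its specific checks are hypothetical, and deserialization (for example via serde) is left out to keep the example dependency-free.

```rust
use std::net::{IpAddr, Ipv4Addr};

/// One BGP peer, mirroring an entry under `neighbors` in the YAML.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Neighbor {
    pub address: String,
    pub remote_as: u32,
    pub description: Option<String>,
}

/// Mirrors the `bgp` section of the fiberlb-server config.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BgpConfig {
    pub enabled: bool,
    pub gobgp_address: String,
    pub local_as: u32,
    pub router_id: String,
    pub neighbors: Vec<Neighbor>,
}

impl BgpConfig {
    /// Reject configurations that could never establish a session.
    pub fn validate(&self) -> Result<(), String> {
        if !self.enabled {
            return Ok(()); // nothing to check when BGP is disabled
        }
        if self.local_as == 0 {
            return Err("local_as must be non-zero (AS 0 is reserved)".into());
        }
        // Router IDs are 32-bit values written in dotted-quad form.
        if self.router_id.parse::<Ipv4Addr>().is_err() {
            return Err(format!("router_id {:?} is not an IPv4 address", self.router_id));
        }
        if self.neighbors.is_empty() {
            return Err("at least one neighbor is required".into());
        }
        for n in &self.neighbors {
            if n.address.parse::<IpAddr>().is_err() {
                return Err(format!("neighbor address {:?} is not an IP address", n.address));
            }
            if n.remote_as == 0 {
                return Err(format!("neighbor {} has remote_as 0", n.address));
            }
        }
        Ok(())
    }
}
```

Failing fast at startup is a design choice here: a malformed `router_id` or an empty neighbor list would otherwise only surface as a BGP session that never comes up.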
### 5.3 GoBGP Configuration (sidecar)

```yaml
# /etc/gobgp/gobgp.yaml
global:
  config:
    as: 65001
    router-id: 10.0.0.1
    port: 179

neighbors:
  - config:
      neighbor-address: 10.0.0.254
      peer-as: 65000
    afi-safis:
      - config:
          afi-safi-name: ipv4-unicast
        add-paths:
          config:
            send-max: 8
```

### 5.4 Integration Points in FiberLB

```rust
// In loadbalancer_service.rs
// Assumes `bgp_manager: Option<tokio::sync::Mutex<BgpManager>>`, since
// `advertise_vip`/`withdraw_vip` take `&mut self`.

impl LoadBalancerService {
    async fn on_loadbalancer_active(&self, lb: &LoadBalancer) -> Result<()> {
        if let Some(vip) = &lb.vip_address {
            if let Some(bgp) = &self.bgp_manager {
                bgp.lock().await.advertise_vip(vip.parse()?).await?;
            }
        }
        Ok(())
    }

    async fn on_loadbalancer_deleted(&self, lb: &LoadBalancer) -> Result<()> {
        if let Some(vip) = &lb.vip_address {
            if let Some(bgp) = &self.bgp_manager {
                bgp.lock().await.withdraw_vip(vip.parse()?).await?;
            }
        }
        Ok(())
    }
}
```

## 6. Deployment Patterns

### 6.1 NixOS Module

```nix
# modules/fiberlb-bgp.nix
{ config, lib, pkgs, ... }:

{
  services.fiberlb = {
    bgp = {
      enable = true;
      localAs = 65001;
      routerId = "10.0.0.1";
      neighbors = [
        { address = "10.0.0.254"; remoteAs = 65000; }
      ];
    };
  };

  # GoBGP sidecar; the nixpkgs gobgpd module takes the daemon
  # configuration as `settings`
  services.gobgpd = {
    enable = true;
    settings = {
      global.config = {
        as = 65001;
        router-id = "10.0.0.1";
      };
      neighbors = [
        { config = { neighbor-address = "10.0.0.254"; peer-as = 65000; }; }
      ];
    };
  };
}
```

### 6.2 Container/Pod Deployment

```yaml
# kubernetes deployment with sidecar
spec:
  containers:
    - name: fiberlb
      image: plasmacloud/fiberlb:latest
      env:
        - name: BGP_GOBGP_ADDRESS
          value: "localhost:50051"

    - name: gobgp
      image: osrg/gobgp:latest
      # gobgpd defaults to TOML, so the config type is set explicitly
      args: ["-f", "/etc/gobgp/gobgp.yaml", "-t", "yaml"]
      ports:
        - containerPort: 179    # BGP
        - containerPort: 50051  # gRPC
```

## 7. Health-Based VIP Withdrawal

### 7.1 Logic

```
┌─────────────────────────────────────────┐
│            Health Check Loop            │
│                                         │
│  FOR each LoadBalancer WITH vip_address │
│    healthy_backends = count_healthy()   │
│                                         │
│    IF healthy_backends > 0              │
│       AND NOT advertised(vip)           │
│    THEN                                 │
│       advertise(vip)                    │
│                                         │
│    IF healthy_backends == 0             │
│       AND advertised(vip)               │
│    THEN                                 │
│       withdraw(vip)                     │
│                                         │
└─────────────────────────────────────────┘
```

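The loop above translates fairly directly into Rust. The following sketch assumes each load balancer is reduced to a `(vip, healthy_backend_count)` pair; the `Decision` enum and the `decide` and `reconcile` functions are hypothetical names, and the actual GoBGP calls are left to the caller, which would act on the returned list of changes.

```rust
use std::collections::HashSet;
use std::net::IpAddr;

/// Outcome of one health evaluation for a single VIP.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Advertise,
    Withdraw,
    Keep,
}

/// The two IF branches from the loop, as a pure function.
pub fn decide(healthy_backends: usize, advertised: bool) -> Decision {
    match (healthy_backends > 0, advertised) {
        (true, false) => Decision::Advertise,
        (false, true) => Decision::Withdraw,
        _ => Decision::Keep, // state already matches health
    }
}

/// One pass over all load balancers that have a VIP.
/// Updates `advertised` in place and returns only the changes made.
pub fn reconcile(
    lbs: &[(IpAddr, usize)],
    advertised: &mut HashSet<IpAddr>,
) -> Vec<(IpAddr, Decision)> {
    let mut changes = Vec::new();
    for &(vip, healthy) in lbs {
        let decision = decide(healthy, advertised.contains(&vip));
        match decision {
            Decision::Advertise => {
                advertised.insert(vip);
            }
            Decision::Withdraw => {
                advertised.remove(&vip);
            }
            Decision::Keep => continue,
        }
        changes.push((vip, decision));
    }
    changes
}
```

Because `decide` compares desired state with current state, running the loop repeatedly is idempotent: a second pass with unchanged health produces no further changes.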
### 7.2 Graceful Shutdown

1. SIGTERM received
2. Withdraw all VIPs so that BGP can converge
3. Wait for a configurable grace period (default: 5s)
4. Shut down the data plane

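The ordering matters more than the mechanics: routes are withdrawn first, and the data plane keeps serving during the grace period so in-flight traffic is not dropped while routers converge. A minimal sketch of steps 2 to 4, assuming a hypothetical `ShutdownHooks` trait that the server would implement (signal handling for step 1 is omitted, and an async server would use its runtime's sleep rather than `thread::sleep`):

```rust
use std::thread;
use std::time::Duration;

/// Hooks the real server would provide; this trait is hypothetical.
pub trait ShutdownHooks {
    fn withdraw_all_vips(&mut self);
    fn stop_data_plane(&mut self);
}

/// Run steps 2 to 4 of the shutdown sequence in order.
pub fn graceful_shutdown<H: ShutdownHooks>(hooks: &mut H, grace: Duration) {
    // Step 2: stop attracting new traffic.
    hooks.withdraw_all_vips();
    // Step 3: keep forwarding while upstream routers converge.
    thread::sleep(grace);
    // Step 4: only now stop serving existing connections.
    hooks.stop_data_plane();
}
```

With the default from the list above, a caller would pass `Duration::from_secs(5)` as the grace period.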
## 8. ECMP Support

With multiple FiberLB nodes advertising the same VIP:

```
          ┌─────────────┐
          │ ToR Router  │
          │ (AS 65000)  │
          └──────┬──────┘
                 │ ECMP
     ┌───────────┼───────────┐
     ▼           ▼           ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
│FiberLB-1│ │FiberLB-2│ │FiberLB-3│
│ VIP: X  │ │ VIP: X  │ │ VIP: X  │
│AS 65001 │ │AS 65001 │ │AS 65001 │
└─────────┘ └─────────┘ └─────────┘
```

- All nodes advertise the same VIP with the same attributes
- Router distributes traffic via ECMP hashing
- Node failure = route withdrawal = automatic failover

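To illustrate what "ECMP hashing" means for FiberLB, the sketch below models a router that hashes a flow's 5-tuple and picks one of the equal-cost next hops. This is only an illustration: real routers use vendor-specific hash functions and fields, and the `Flow` struct and `pick_next_hop` function are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::net::IpAddr;

/// The classic 5-tuple that identifies a flow.
#[derive(Debug, Clone, Copy, Hash, PartialEq, Eq)]
pub struct Flow {
    pub src_ip: IpAddr,
    pub dst_ip: IpAddr,
    pub src_port: u16,
    pub dst_port: u16,
    pub protocol: u8,
}

/// Pick a next hop for a flow by hashing it modulo the number of paths.
/// Returns `None` when every node has withdrawn the VIP.
pub fn pick_next_hop<'a>(flow: &Flow, next_hops: &[&'a str]) -> Option<&'a str> {
    if next_hops.is_empty() {
        return None;
    }
    let mut hasher = DefaultHasher::new();
    flow.hash(&mut hasher);
    let idx = (hasher.finish() % next_hops.len() as u64) as usize;
    Some(next_hops[idx])
}
```

Two consequences follow from this model. Packets of one flow always reach the same FiberLB node while the set of next hops is stable, and with plain modulo hashing a withdrawal can remap flows that were not on the failed node, so connection state on the FiberLB nodes should tolerate flows moving between them.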
## 9. Future Enhancements

1. **BFD (Bidirectional Forwarding Detection)** - Faster failure detection
2. **BGP Communities** - Traffic engineering support
3. **Route Filtering** - Export policies per neighbor
4. **RustyBGP Migration** - Switch from GoBGP for performance
5. **Embedded Speaker** - Long-term: native Rust BGP using zettabgp

## 10. Implementation Phases

### Phase 1: Basic Integration
- GoBGP sidecar deployment
- Simple VIP advertise/withdraw API
- Manual configuration

### Phase 2: Health-Based Control
- Automatic VIP withdrawal on backend failure
- Graceful shutdown handling

### Phase 3: Production Hardening
- BFD support
- Metrics and observability
- Operator documentation

## 11. References

- [GoBGP](https://osrg.github.io/gobgp/) - Official documentation
- [RustyBGP](https://github.com/osrg/rustybgp) - Rust BGP daemon
- [zettabgp](https://github.com/wladwm/zettabgp) - Rust BGP library
- [kube-vip BGP Mode](https://kube-vip.io/docs/modes/bgp/) - Similar pattern
- [MetalLB BGP](https://metallb.io/concepts/bgp/) - Kubernetes LB BGP

## 12. Decision Summary

| Decision | Choice | Rationale |
|----------|--------|-----------|
| Integration Pattern | Sidecar | Clear separation, proven pattern |
| BGP Daemon | GoBGP | Production maturity, extensive features |
| API | gRPC | Native GoBGP interface, language-agnostic |
| Future Path | RustyBGP | Same API, better performance when stable |