# T055.S3: BGP Integration Strategy Specification
**Author:** PeerA
**Date:** 2025-12-12
**Status:** DRAFT
## 1. Executive Summary
This document specifies the BGP Anycast integration strategy for FiberLB to enable VIP (Virtual IP) advertisement to upstream routers. The recommended approach is a **sidecar pattern** using GoBGP with gRPC API integration.
## 2. Background
### 2.1 Current State
- FiberLB binds listeners to `0.0.0.0:{port}` on each node
- LoadBalancer resources have a `vip_address` field (currently unused for routing)
- No mechanism exists to advertise VIPs to physical network infrastructure
### 2.2 Requirements (from PROJECT.md Item 7)
- "L2 load balancing via BGP Anycast" (translated from the Japanese in PROJECT.md)
- VIPs must be reachable from external networks
- Support for ECMP (Equal-Cost Multi-Path) across multiple FiberLB nodes
- Graceful withdrawal when a load balancer is unhealthy or deleted
## 3. BGP Library Options Analysis
### 3.1 Option A: GoBGP Sidecar (RECOMMENDED)
**Description:** Run GoBGP as a sidecar container/process, control via gRPC API
| Aspect | Details |
|--------|---------|
| Language | Go |
| Maturity | Production-grade, widely deployed |
| API | gRPC with well-documented protobuf |
| Integration | FiberLB calls GoBGP gRPC to add/withdraw routes |
| Deployment | Separate process, co-located with FiberLB |
**Pros:**
- Battle-tested in production (Google, LINE, Yahoo Japan)
- Extensive BGP feature support (add-path, RPKI, route policy); note that BFD is not built into GoBGP and is treated as future work in Section 9
- Clear separation of concerns
- Minimal code changes to FiberLB
**Cons:**
- External dependency (Go binary)
- Additional process management
- Network overhead for gRPC calls (minimal)
### 3.2 Option B: RustyBGP Sidecar
**Description:** Same sidecar pattern but using RustyBGP daemon
| Aspect | Details |
|--------|---------|
| Language | Rust |
| Maturity | Active development, less production deployment |
| API | GoBGP-compatible gRPC |
| Performance | Higher than GoBGP (multicore optimized) |
**Pros:**
- Rust ecosystem alignment
- Drop-in replacement for GoBGP (same API)
- Better performance in benchmarks
**Cons:**
- Less production history
- Smaller community
### 3.3 Option C: Embedded zettabgp
**Description:** Build custom BGP speaker using zettabgp library
| Aspect | Details |
|--------|---------|
| Language | Rust |
| Type | Parsing/composing library only |
| Integration | Embedded directly in FiberLB |
**Pros:**
- No external dependencies
- Full control over BGP behavior
- Single binary deployment
**Cons:**
- Significant implementation effort (FSM, timers, peer state)
- Risk of BGP protocol bugs
- Months of additional development
### 3.4 Option D: OVN Gateway Integration
**Description:** Leverage OVN's built-in BGP capabilities via OVN gateway router
| Aspect | Details |
|--------|---------|
| Dependency | Requires OVN deployment |
| Integration | FiberLB configures OVN via OVSDB |
**Pros:**
- No additional BGP daemon
- Integrated with SDN layer
**Cons:**
- Tightly couples to OVN
- Limited BGP feature set
- May not be deployed in all environments
## 4. Recommended Architecture
```
┌─────────────────────────────────────────────────────────────┐
│                        FiberLB Node                         │
│                                                             │
│  ┌──────────────────┐        ┌──────────────────┐           │
│  │                  │  gRPC  │                  │           │
│  │     FiberLB      │───────>│      GoBGP       │──── BGP ──│──> ToR Router
│  │      Server      │        │      Daemon      │           │
│  │                  │        │                  │           │
│  └──────────────────┘        └──────────────────┘           │
│           │                                                 │
│           ▼                                                 │
│  ┌──────────────────┐                                       │
│  │   VIP Traffic    │                                       │
│  │   (Data Plane)   │                                       │
│  └──────────────────┘                                       │
└─────────────────────────────────────────────────────────────┘
```
### 4.1 Components
1. **FiberLB Server** - Existing service, adds BGP client module
2. **GoBGP Daemon** - BGP speaker process, controlled via gRPC
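3. **BGP Client Module** - New Rust module that talks to GoBGP through a client crate (the `gobgp-client` name is a placeholder to be confirmed) or through raw gRPC stubs generated from GoBGP's protobuf definitions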
### 4.2 Communication Flow
1. LoadBalancer created with VIP address
2. FiberLB checks backend health
3. When healthy backends exist → `AddPath(VIP/32)`
4. When all backends fail → `DeletePath(VIP/32)`
5. LoadBalancer deleted → `DeletePath(VIP/32)`
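The flow above can be sketched as a pure mapping from lifecycle events to route operations. This is a minimal illustration, not the FiberLB implementation: `LbEvent`, `RouteOp`, `host_prefix`, and `route_op_for` are hypothetical names, and the `/128` handling for IPv6 VIPs is an assumption that goes beyond the `VIP/32` case described in this spec.

```rust
use std::net::IpAddr;

/// Lifecycle events from section 4.2 that affect route advertisement.
/// (Hypothetical type, introduced only for this sketch.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LbEvent {
    /// At least one backend passed its health check.
    BackendsHealthy,
    /// Every backend failed its health check.
    AllBackendsFailed,
    /// The LoadBalancer resource was deleted.
    Deleted,
}

/// The GoBGP call an event should trigger, carrying the prefix in CIDR form.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RouteOp {
    AddPath(String),
    DeletePath(String),
}

/// Host-route prefix for a VIP: /32 for IPv4, /128 for IPv6 (assumed extension).
pub fn host_prefix(vip: IpAddr) -> String {
    match vip {
        IpAddr::V4(v4) => format!("{}/32", v4),
        IpAddr::V6(v6) => format!("{}/128", v6),
    }
}

/// Map a lifecycle event to the route operation described in steps 3 to 5.
pub fn route_op_for(event: LbEvent, vip: IpAddr) -> RouteOp {
    let prefix = host_prefix(vip);
    match event {
        LbEvent::BackendsHealthy => RouteOp::AddPath(prefix),
        LbEvent::AllBackendsFailed | LbEvent::Deleted => RouteOp::DeletePath(prefix),
    }
}
```

Keeping the decision separate from the gRPC call makes the policy testable without a running GoBGP daemon.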
## 5. Implementation Design
### 5.1 New Module: `fiberlb-bgp`
```rust
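// fiberlb/crates/fiberlb-bgp/src/lib.rs
use std::collections::HashSet;
use std::net::IpAddr;

/// Tracks the VIPs this node currently advertises through the GoBGP sidecar.
/// `GobgpClient`, `BgpConfig`, and `Result` are defined elsewhere in the crate.
pub struct BgpManager {
    client: GobgpClient,
    config: BgpConfig,
    advertised_vips: HashSet<IpAddr>,
}

impl BgpManager {
    /// Advertise a VIP to BGP peers
    pub async fn advertise_vip(&mut self, vip: IpAddr) -> Result<()> {
        todo!("send AddPath for the VIP, then record it in advertised_vips")
    }

    /// Withdraw a VIP from BGP peers
    pub async fn withdraw_vip(&mut self, vip: IpAddr) -> Result<()> {
        todo!("send DeletePath for the VIP, then remove it from advertised_vips")
    }

    /// Check if VIP is currently advertised
    pub fn is_advertised(&self, vip: &IpAddr) -> bool {
        self.advertised_vips.contains(vip)
    }
}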
```
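One way the bookkeeping behind this interface might work is sketched below in synchronous, standard-library-only Rust. The `RouteClient` trait, `VipTracker`, and `RecordingClient` are hypothetical stand-ins for the gRPC client and `BgpManager`; the real module would be async and would call GoBGP. The sketch makes advertise and withdraw idempotent, which is a design assumption rather than something this spec mandates.

```rust
use std::collections::HashSet;
use std::net::IpAddr;

/// Hypothetical abstraction over the GoBGP calls the manager needs.
pub trait RouteClient {
    fn add_path(&mut self, vip: IpAddr) -> Result<(), String>;
    fn delete_path(&mut self, vip: IpAddr) -> Result<(), String>;
}

/// Synchronous stand-in for `BgpManager`, generic over the route client.
pub struct VipTracker<C: RouteClient> {
    client: C,
    advertised: HashSet<IpAddr>,
}

impl<C: RouteClient> VipTracker<C> {
    pub fn new(client: C) -> Self {
        Self { client, advertised: HashSet::new() }
    }

    /// Advertise `vip`; returns Ok(false) when it was already advertised.
    pub fn advertise_vip(&mut self, vip: IpAddr) -> Result<bool, String> {
        if self.advertised.contains(&vip) {
            return Ok(false);
        }
        // Record the VIP only after the client accepted the route.
        self.client.add_path(vip)?;
        self.advertised.insert(vip);
        Ok(true)
    }

    /// Withdraw `vip`; returns Ok(false) when it was not advertised.
    pub fn withdraw_vip(&mut self, vip: IpAddr) -> Result<bool, String> {
        if !self.advertised.contains(&vip) {
            return Ok(false);
        }
        self.client.delete_path(vip)?;
        self.advertised.remove(&vip);
        Ok(true)
    }

    pub fn is_advertised(&self, vip: &IpAddr) -> bool {
        self.advertised.contains(vip)
    }

    /// Borrow the underlying client, e.g. to inspect recorded calls in tests.
    pub fn client(&self) -> &C {
        &self.client
    }
}

/// In-memory client that records calls instead of talking to a daemon.
#[derive(Default)]
pub struct RecordingClient {
    pub calls: Vec<String>,
}

impl RouteClient for RecordingClient {
    fn add_path(&mut self, vip: IpAddr) -> Result<(), String> {
        self.calls.push(format!("add {}", vip));
        Ok(())
    }

    fn delete_path(&mut self, vip: IpAddr) -> Result<(), String> {
        self.calls.push(format!("del {}", vip));
        Ok(())
    }
}
```

Because the tracker updates its set only after the client call succeeds, a failed `add_path` leaves the VIP marked as not advertised and a later retry will attempt it again.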
### 5.2 Configuration Schema
```yaml
# fiberlb-server config
bgp:
  enabled: true
  gobgp_address: "127.0.0.1:50051"  # GoBGP gRPC address
  local_as: 65001
  router_id: "10.0.0.1"
  neighbors:
    - address: "10.0.0.254"
      remote_as: 65000
      description: "ToR Router"
```
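The schema above could map onto typed Rust structures along the following lines. This is a sketch under assumptions: the field names mirror the YAML keys, but the `validate` rules (non-zero AS numbers, at least one neighbor) and the `example_config` helper are illustrative additions, and deserialization from YAML is deliberately left out so the example needs only the standard library.

```rust
use std::net::{IpAddr, Ipv4Addr, SocketAddr};

/// One entry of the `neighbors` list.
#[derive(Debug, Clone, PartialEq)]
pub struct Neighbor {
    pub address: IpAddr,
    pub remote_as: u32,
    pub description: String,
}

/// Typed view of the `bgp` section of the fiberlb-server config.
#[derive(Debug, Clone, PartialEq)]
pub struct BgpConfig {
    pub enabled: bool,
    pub gobgp_address: SocketAddr,
    pub local_as: u32,
    /// BGP router IDs are 32-bit values conventionally written as IPv4 addresses.
    pub router_id: Ipv4Addr,
    pub neighbors: Vec<Neighbor>,
}

impl BgpConfig {
    /// Reject configurations that could never establish a session.
    /// These checks are assumptions of this sketch, not rules from the spec.
    pub fn validate(&self) -> Result<(), String> {
        if !self.enabled {
            return Ok(()); // nothing to check when BGP is switched off
        }
        if self.local_as == 0 {
            return Err("local_as must be non-zero".to_string());
        }
        if self.neighbors.is_empty() {
            return Err("at least one neighbor is required".to_string());
        }
        for n in &self.neighbors {
            if n.remote_as == 0 {
                return Err(format!("neighbor {} has remote_as 0", n.address));
            }
        }
        Ok(())
    }
}

/// The example values from the YAML above, built in code.
pub fn example_config() -> BgpConfig {
    BgpConfig {
        enabled: true,
        gobgp_address: "127.0.0.1:50051".parse().expect("valid socket address"),
        local_as: 65001,
        router_id: Ipv4Addr::new(10, 0, 0, 1),
        neighbors: vec![Neighbor {
            address: "10.0.0.254".parse().expect("valid IP address"),
            remote_as: 65000,
            description: "ToR Router".to_string(),
        }],
    }
}
```

Parsing addresses into `SocketAddr` and `IpAddr` at load time surfaces typos in the config before any BGP session is attempted.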
### 5.3 GoBGP Configuration (sidecar)
```yaml
# /etc/gobgp/gobgp.yaml
# gobgpd parses TOML by default; pass `-t yaml` when loading this file.
global:
  config:
    as: 65001
    router-id: 10.0.0.1
    port: 179
neighbors:
  - config:
      neighbor-address: 10.0.0.254
      peer-as: 65000
    afi-safis:
      - config:
          afi-safi-name: ipv4-unicast
        add-paths:
          config:
            send-max: 8
```
### 5.4 Integration Points in FiberLB
```rust
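// In loadbalancer_service.rs
// Assumes `bgp_manager` is an `Option<tokio::sync::Mutex<BgpManager>>`: the
// BgpManager methods take `&mut self`, while these hooks only hold `&self`.
impl LoadBalancerService {
    async fn on_loadbalancer_active(&self, lb: &LoadBalancer) -> Result<()> {
        if let (Some(vip), Some(bgp)) = (&lb.vip_address, &self.bgp_manager) {
            // `?` on parse() assumes the error type converts from AddrParseError.
            bgp.lock().await.advertise_vip(vip.parse()?).await?;
        }
        Ok(())
    }

    async fn on_loadbalancer_deleted(&self, lb: &LoadBalancer) -> Result<()> {
        if let (Some(vip), Some(bgp)) = (&lb.vip_address, &self.bgp_manager) {
            bgp.lock().await.withdraw_vip(vip.parse()?).await?;
        }
        Ok(())
    }
}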
```
## 6. Deployment Patterns
### 6.1 NixOS Module
```nix
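# modules/fiberlb-bgp.nix
{ config, lib, pkgs, ... }:
{
  services.fiberlb = {
    bgp = {
      enable = true;
      localAs = 65001;
      routerId = "10.0.0.1";
      neighbors = [
        { address = "10.0.0.254"; remoteAs = 65000; }
      ];
    };
  };

  # GoBGP sidecar. The upstream NixOS module takes the daemon configuration
  # as an attribute set under `settings`; the values mirror section 5.3.
  services.gobgpd = {
    enable = true;
    settings = {
      global.config = { as = 65001; router-id = "10.0.0.1"; port = 179; };
      neighbors = [
        { config = { neighbor-address = "10.0.0.254"; peer-as = 65000; }; }
      ];
    };
  };
}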
```
### 6.2 Container/Pod Deployment
```yaml
# kubernetes deployment with sidecar
spec:
  containers:
    - name: fiberlb
      image: plasmacloud/fiberlb:latest
      env:
        - name: BGP_GOBGP_ADDRESS
          value: "localhost:50051"
    - name: gobgp
      image: osrg/gobgp:latest
      # gobgpd expects TOML unless the config type is given explicitly
      args: ["-t", "yaml", "-f", "/etc/gobgp/config.yaml"]
      ports:
        - containerPort: 179    # BGP
        - containerPort: 50051  # gRPC
```
## 7. Health-Based VIP Withdrawal
### 7.1 Logic
```
┌─────────────────────────────────────────┐
│            Health Check Loop            │
│                                         │
│  FOR each LoadBalancer WITH vip_address │
│    healthy_backends = count_healthy()   │
│                                         │
│    IF healthy_backends > 0              │
│       AND NOT advertised(vip)           │
│    THEN                                 │
│      advertise(vip)                     │
│                                         │
│    IF healthy_backends == 0             │
│       AND advertised(vip)               │
│    THEN                                 │
│      withdraw(vip)                      │
│                                         │
└─────────────────────────────────────────┘
```
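A single pass of this loop can be written as a small reconcile function. The sketch below is illustrative only: `LbHealth`, `Action`, and `reconcile` are hypothetical names, and the health count is passed in rather than computed, so the logic can be exercised without real backends or a BGP daemon.

```rust
use std::collections::HashSet;
use std::net::IpAddr;

/// Minimal view of a LoadBalancer for the health loop (hypothetical type).
pub struct LbHealth {
    /// `None` models a LoadBalancer without `vip_address`; it is skipped.
    pub vip: Option<IpAddr>,
    pub healthy_backends: usize,
}

/// A change the caller should apply through the BGP manager.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    Advertise(IpAddr),
    Withdraw(IpAddr),
}

/// Run one pass of the health check loop.
/// Updates `advertised` in place and returns only the transitions, so a VIP
/// that is already in the desired state produces no action.
pub fn reconcile(lbs: &[LbHealth], advertised: &mut HashSet<IpAddr>) -> Vec<Action> {
    let mut actions = Vec::new();
    for lb in lbs {
        let vip = match lb.vip {
            Some(v) => v,
            None => continue,
        };
        if lb.healthy_backends > 0 {
            // `insert` returns true only when the VIP was not yet advertised.
            if advertised.insert(vip) {
                actions.push(Action::Advertise(vip));
            }
        } else if advertised.remove(&vip) {
            // `remove` returns true only when the VIP was advertised.
            actions.push(Action::Withdraw(vip));
        }
    }
    actions
}
```

Returning transitions instead of performing the calls directly keeps the loop idempotent: running it twice with unchanged health yields an empty action list on the second pass.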
### 7.2 Graceful Shutdown
1. SIGTERM received
2. Withdraw all VIPs (allow BGP convergence)
3. Wait for configurable grace period (default: 5s)
4. Shutdown data plane
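Steps 2 to 4 of the sequence above might be structured as follows. This is a blocking, standard-library-only sketch; `graceful_shutdown` and its closure parameters are hypothetical, signal handling for SIGTERM is out of scope, and a real implementation would likely be async. Continuing past a failed withdrawal is an assumption of the sketch.

```rust
use std::net::IpAddr;
use std::thread;
use std::time::Duration;

/// Withdraw every VIP, wait for the grace period, then stop the data plane.
/// Returns the VIPs whose withdrawal failed so the caller can log them.
pub fn graceful_shutdown<W, S>(
    vips: &[IpAddr],
    grace: Duration,
    mut withdraw: W,
    stop_data_plane: S,
) -> Vec<IpAddr>
where
    W: FnMut(IpAddr) -> Result<(), String>,
    S: FnOnce(),
{
    let mut failed = Vec::new();
    for &vip in vips {
        // Keep going on errors: the remaining VIPs should still be withdrawn.
        if withdraw(vip).is_err() {
            failed.push(vip);
        }
    }
    // Give upstream routers time to converge before traffic handling stops.
    thread::sleep(grace);
    stop_data_plane();
    failed
}
```

A caller would pass `Duration::from_secs(5)` to match the default grace period stated above; tests can pass a zero duration to avoid waiting.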
## 8. ECMP Support
With multiple FiberLB nodes advertising the same VIP:
```
          ┌─────────────┐
          │  ToR Router │
          │  (AS 65000) │
          └──────┬──────┘
                 │ ECMP
     ┌───────────┼───────────┐
     ▼           ▼           ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
│FiberLB-1│ │FiberLB-2│ │FiberLB-3│
│ VIP: X  │ │ VIP: X  │ │ VIP: X  │
│AS 65001 │ │AS 65001 │ │AS 65001 │
└─────────┘ └─────────┘ └─────────┘
```
- All nodes advertise same VIP with same attributes
- Router distributes traffic via ECMP hashing
- Node failure removes the route (by explicit withdrawal, or by session teardown once the BGP hold timer expires), so traffic fails over automatically
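To illustrate how a router might spread flows across the advertised next hops, the sketch below hashes a flow's 5-tuple and takes it modulo the number of next hops. Real routers use vendor-specific hash functions and field selections; the `Flow` struct, `pick_next_hop`, and the use of Rust's `DefaultHasher` are assumptions made purely for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::net::IpAddr;

/// The 5-tuple commonly used as ECMP hash input (illustrative).
#[derive(Hash, Clone, Copy)]
pub struct Flow {
    pub src: IpAddr,
    pub dst: IpAddr,
    pub src_port: u16,
    pub dst_port: u16,
    pub protocol: u8,
}

/// Pick a next hop for `flow` among the nodes currently advertising the VIP.
/// Returns `None` when no node advertises it, i.e. the VIP is unreachable.
pub fn pick_next_hop<'a>(flow: &Flow, next_hops: &[&'a str]) -> Option<&'a str> {
    if next_hops.is_empty() {
        return None;
    }
    // DefaultHasher::new() uses fixed keys, so the same flow hashes the same
    // way on every call within this program.
    let mut hasher = DefaultHasher::new();
    flow.hash(&mut hasher);
    let index = (hasher.finish() % next_hops.len() as u64) as usize;
    Some(next_hops[index])
}
```

All packets of one flow land on the same node as long as the next-hop set is unchanged. With plain modulo hashing, a withdrawal can also remap flows that were on surviving nodes, so connection state on those nodes may be disrupted unless the router offers some form of resilient hashing.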
## 9. Future Enhancements
1. **BFD (Bidirectional Forwarding Detection)** - Faster failure detection
2. **BGP Communities** - Traffic engineering support
3. **Route Filtering** - Export policies per neighbor
4. **RustyBGP Migration** - Switch from GoBGP for performance
5. **Embedded Speaker** - Long-term: native Rust BGP using zettabgp
## 10. Implementation Phases
### Phase 1: Basic Integration
- GoBGP sidecar deployment
- Simple VIP advertise/withdraw API
- Manual configuration
### Phase 2: Health-Based Control
- Automatic VIP withdrawal on backend failure
- Graceful shutdown handling
### Phase 3: Production Hardening
- BFD support
- Metrics and observability
- Operator documentation
## 11. References
- [GoBGP](https://osrg.github.io/gobgp/) - Official documentation
- [RustyBGP](https://github.com/osrg/rustybgp) - Rust BGP daemon
- [zettabgp](https://github.com/wladwm/zettabgp) - Rust BGP library
- [kube-vip BGP Mode](https://kube-vip.io/docs/modes/bgp/) - Similar pattern
- [MetalLB BGP](https://metallb.io/concepts/bgp/) - Kubernetes LB BGP
## 12. Decision Summary
| Decision | Choice | Rationale |
|----------|--------|-----------|
| Integration Pattern | Sidecar | Clear separation, proven pattern |
| BGP Daemon | GoBGP | Production maturity, extensive features |
| API | gRPC | Native GoBGP interface, language-agnostic |
| Future Path | RustyBGP | Same API, better performance when stable |