id: T051 name: FiberLB Integration Testing goal: Validate FiberLB works correctly and integrates with other services for endpoint discovery status: planned priority: P1 owner: peerA created: 2025-12-12 depends_on: [] blocks: [T039] context: | **User Direction (2025-12-12):** "LBがちゃんと動くかも考えないといけませんね。これも重要な課題として(LBと他の結合試験)やる必要があります" "そもそもLBがちゃんと動かないならどのエンドポイントにアクセスしたら良いかわからない" **Rationale:** - LB is critical for service discovery - Without working LB, clients don't know which endpoint to access - Multiple instances of services need load balancing PROJECT.md Item 7: - MaglevによるL4ロードバランシング - BGP AnycastによるL2ロードバランシング - L7ロードバランシング acceptance: - FiberLB basic health check passes - L4 load balancing works (round-robin or Maglev) - Service registration/discovery works - Integration with k8shost Service objects - Integration with PlasmaVMC (VM endpoints) steps: - step: S1 name: FiberLB Current State Assessment done: Understand existing FiberLB implementation status: complete completed: 2025-12-12 01:50 JST owner: peerB priority: P0 notes: | **Architecture:** ~3100L Rust code, 3 crates - Control Plane: 5 gRPC services (LB, Pool, Backend, Listener, HealthCheck) - Data Plane: L4 TCP proxy (tokio bidirectional copy) - Metadata: ChainFire/FlareDB/InMemory backends - Integration: k8shost FiberLB controller (T028, 226L) **✓ IMPLEMENTED:** - L4 TCP load balancing (round-robin) - Health checks (TCP, HTTP, configurable intervals) - VIP allocation (203.0.113.0/24 TEST-NET-3) - Multi-tenant scoping (org_id/project_id) - k8shost Service integration (controller reconciles every 10s) - Graceful backend exclusion on health failure - NixOS packaging (systemd service) **✗ GAPS (Blocking Production):** CRITICAL: 1. Single Algorithm - Only round-robin works - Missing: Maglev (PROJECT.md requirement) - Missing: LeastConnections, IpHash, WeightedRR - No session persistence/affinity 2. No L7 HTTP Load Balancing - Only L4 TCP proxying - No path/host routing - No HTTP header inspection - No TLS termination 3. No BGP Anycast (PROJECT.md requirement) - Single-node data plane - No VIP advertisement - No ECMP support 4. Backend Discovery Gap - k8shost controller creates LB but doesn't register Pod endpoints - Need: Automatic backend registration from Service Endpoints HIGH: 5. MVP VIP Management - Sequential allocation, no reclamation 6. No HA/Failover - Single FiberLB instance 7. No Metrics - Missing request rate, latency, error metrics 8. No UDP Support - TCP only **Test Coverage:** - Control plane: 12 unit tests, 4 integration tests ✓ - Data plane: 1 ignored E2E test (requires real server) - k8shost integration: NO tests **Production Readiness:** LOW-MEDIUM - Works for basic L4 TCP - Needs: endpoint discovery, Maglev/IpHash, BGP, HA, metrics **Recommendation:** S2 Focus: E2E L4 test with 3 backends S3 Focus: Fix endpoint discovery gap, validate k8shost flow S4 Focus: Health check failover validation - step: S2 name: Basic LB Functionality Test done: Round-robin or Maglev L4 LB working status: pending owner: peerB priority: P0 notes: | Test: - Start multiple backend servers - Configure FiberLB - Verify requests are distributed - step: S3 name: k8shost Service Integration done: FiberLB provides VIP for k8shost Services with endpoint discovery status: complete completed: 2025-12-12 02:05 JST owner: peerB priority: P0 notes: | **Implementation (k8shost/crates/k8shost-server/src/fiberlb_controller.rs):** Enhanced FiberLB controller with complete endpoint discovery workflow: 1. Create LoadBalancer → receive VIP (existing) 2. Create Pool (RoundRobin, TCP) → NEW 3. Create Listener for each Service port → VIP:port → Pool → NEW 4. Query Pods matching Service.spec.selector → NEW 5. Create Backend for each Pod IP:targetPort → NEW **Changes:** - Added client connections: PoolService, ListenerService, BackendService - Store pool_id in Service annotations - Create Listener for each Service.spec.ports[] entry - Use storage.list_pods() with label_selector for endpoint discovery - Create Backend for each Pod with status.pod_ip - Handle target_port mapping (Service port → Container port) **Result:** - ✓ Compilation successful - ✓ Complete Service→VIP→Pool→Listener→Backend flow - ✓ Automatic Pod endpoint registration - ✓ Addresses user concern: "どのエンドポイントにアクセスしたら良いかわからない" **Next Steps:** - E2E validation: Deploy Service + Pods, verify VIP connectivity - S4: Health check failover validation - step: S4 name: Health Check and Failover done: Unhealthy backends removed from pool status: pending owner: peerB priority: P1 notes: | Test: - Active health checks - Remove failed backend - Recovery when backend returns evidence: [] notes: | **Strategic Value:** - LB is foundational for production deployment - Without working LB, multi-instance deployments are impossible - Critical for T039 production readiness **Related Work:** - T028: k8shost FiberLB Controller (already implemented) - T050.S6: k8shost REST API (includes Service endpoints)