168 lines
5.8 KiB
YAML
id: T051
name: FiberLB Integration Testing
goal: Validate FiberLB works correctly and integrates with other services for endpoint discovery
status: planned
priority: P1
owner: peerA
created: 2025-12-12
depends_on: []
blocks: [T039]

context: |
  **User Direction (2025-12-12):**
  "We also need to think about whether the LB actually works. This is another important task we have to take on (integration testing between the LB and other services)."
  "If the LB doesn't work properly in the first place, we won't know which endpoint to access."

  **Rationale:**
  - LB is critical for service discovery
  - Without a working LB, clients don't know which endpoint to access
  - Multiple instances of services need load balancing

  PROJECT.md Item 7:
  - L4 load balancing with Maglev
  - L2 load balancing with BGP Anycast
  - L7 load balancing

acceptance:
  - FiberLB basic health check passes
  - L4 load balancing works (round-robin or Maglev)
  - Service registration/discovery works
  - Integration with k8shost Service objects
  - Integration with PlasmaVMC (VM endpoints)

steps:
  - step: S1
    name: FiberLB Current State Assessment
    done: Understand existing FiberLB implementation
    status: complete
    completed: 2025-12-12 01:50 JST
    owner: peerB
    priority: P0
    notes: |
      **Architecture:** ~3100 lines of Rust code, 3 crates
      - Control Plane: 5 gRPC services (LB, Pool, Backend, Listener, HealthCheck)
      - Data Plane: L4 TCP proxy (tokio bidirectional copy)
      - Metadata: ChainFire/FlareDB/InMemory backends
      - Integration: k8shost FiberLB controller (T028, 226 lines)

      **✓ IMPLEMENTED:**
      - L4 TCP load balancing (round-robin)
      - Health checks (TCP, HTTP, configurable intervals)
      - VIP allocation (203.0.113.0/24 TEST-NET-3)
      - Multi-tenant scoping (org_id/project_id)
      - k8shost Service integration (controller reconciles every 10s)
      - Graceful backend exclusion on health failure
      - NixOS packaging (systemd service)

      **✗ GAPS (Blocking Production):**

      CRITICAL:
      1. Single Algorithm - Only round-robin works
         - Missing: Maglev (PROJECT.md requirement)
         - Missing: LeastConnections, IpHash, WeightedRR
         - No session persistence/affinity

      2. No L7 HTTP Load Balancing
         - Only L4 TCP proxying
         - No path/host routing
         - No HTTP header inspection
         - No TLS termination

      3. No BGP Anycast (PROJECT.md requirement)
         - Single-node data plane
         - No VIP advertisement
         - No ECMP support

      4. Backend Discovery Gap
         - k8shost controller creates LB but doesn't register Pod endpoints
         - Need: Automatic backend registration from Service Endpoints

      HIGH:
      5. MVP VIP Management - Sequential allocation, no reclamation
      6. No HA/Failover - Single FiberLB instance
      7. No Metrics - Missing request rate, latency, error metrics
      8. No UDP Support - TCP only

      **Test Coverage:**
      - Control plane: 12 unit tests, 4 integration tests ✓
      - Data plane: 1 ignored E2E test (requires real server)
      - k8shost integration: NO tests

      **Production Readiness:** LOW-MEDIUM
      - Works for basic L4 TCP
      - Needs: endpoint discovery, Maglev/IpHash, BGP, HA, metrics

      **Recommendation:**
      S2 Focus: E2E L4 test with 3 backends
      S3 Focus: Fix endpoint discovery gap, validate k8shost flow
      S4 Focus: Health check failover validation

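      **Illustrative sketch (not FiberLB code):**
      Gap 1 lists Maglev as a missing algorithm. The block below is a rough sketch of how a Maglev-style lookup table could be populated and queried. The names `seeded_hash`, `build_maglev_table`, and `pick_backend`, the hash seeds, and the table size are all assumptions made for illustration; none of them come from the FiberLB codebase.

      ```rust
      use std::collections::hash_map::DefaultHasher;
      use std::hash::{Hash, Hasher};

      /// Hash a name together with a seed so independent values can be derived.
      fn seeded_hash(name: &str, seed: u64) -> u64 {
          let mut hasher = DefaultHasher::new();
          seed.hash(&mut hasher);
          name.hash(&mut hasher);
          hasher.finish()
      }

      /// Build a Maglev-style lookup table mapping each slot to a backend index.
      /// `table_size` must be prime and larger than the number of backends.
      fn build_maglev_table(backends: &[&str], table_size: usize) -> Vec<usize> {
          assert!(!backends.is_empty(), "at least one backend is required");
          let m = table_size as u64;
          // Each backend visits slots in its own order: (offset + j * skip) mod M.
          let params: Vec<(u64, u64)> = backends
              .iter()
              .map(|b| (seeded_hash(b, 1) % m, seeded_hash(b, 2) % (m - 1) + 1))
              .collect();
          let mut next = vec![0u64; backends.len()];
          let mut table: Vec<Option<usize>> = vec![None; table_size];
          let mut filled = 0;
          while filled < table_size {
              // Backends take turns, each claiming its next free preferred slot.
              for (i, &(offset, skip)) in params.iter().enumerate() {
                  loop {
                      let slot = ((offset + next[i] * skip) % m) as usize;
                      next[i] += 1;
                      if table[slot].is_none() {
                          table[slot] = Some(i);
                          filled += 1;
                          break;
                      }
                  }
                  if filled == table_size {
                      break;
                  }
              }
          }
          table.into_iter().map(|s| s.expect("every slot is filled")).collect()
      }

      /// Pick a backend index for a connection key (for example a rendered 5-tuple).
      fn pick_backend(table: &[usize], key: &str) -> usize {
          table[(seeded_hash(key, 3) % table.len() as u64) as usize]
      }
      ```

      Because backends claim slots in turns, their slot counts differ by at most one, and the same connection key always maps to the same backend while the backend set is unchanged. A production table would use a much larger prime than the small sizes suited to a test.
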
  - step: S2
    name: Basic LB Functionality Test
    done: Round-robin or Maglev L4 LB working
    status: pending
    owner: peerB
    priority: P0
    notes: |
      Test:
      - Start multiple backend servers
      - Configure FiberLB
      - Verify requests are distributed

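      **Illustrative sketch (not FiberLB code):**
      The "verify requests are distributed" check could start from something like the following: a minimal round-robin selector plus a helper that counts how selections spread across backends. `RoundRobin`, `pick`, and `distribution` are hypothetical names, and the real test would drive connections through the FiberLB data plane rather than call a selector directly.

      ```rust
      use std::collections::HashMap;
      use std::sync::atomic::{AtomicUsize, Ordering};

      /// Minimal round-robin selector over a fixed list of backend addresses.
      struct RoundRobin {
          backends: Vec<String>,
          cursor: AtomicUsize,
      }

      impl RoundRobin {
          fn new(backends: Vec<String>) -> Self {
              Self { backends, cursor: AtomicUsize::new(0) }
          }

          /// Return the next backend in rotation, or None when the pool is empty.
          fn pick(&self) -> Option<&str> {
              if self.backends.is_empty() {
                  return None;
              }
              let i = self.cursor.fetch_add(1, Ordering::Relaxed) % self.backends.len();
              Some(self.backends[i].as_str())
          }
      }

      /// Count how many of `requests` selections land on each backend.
      fn distribution(lb: &RoundRobin, requests: usize) -> HashMap<String, usize> {
          let mut counts = HashMap::new();
          for _ in 0..requests {
              if let Some(backend) = lb.pick() {
                  *counts.entry(backend.to_string()).or_insert(0) += 1;
              }
          }
          counts
      }
      ```

      With 3 backends and a request count that is a multiple of 3, every backend should receive the same number of selections; an uneven count points at a selection or health-state bug.
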
  - step: S3
    name: k8shost Service Integration
    done: FiberLB provides VIP for k8shost Services with endpoint discovery
    status: complete
    completed: 2025-12-12 02:05 JST
    owner: peerB
    priority: P0
    notes: |
      **Implementation (k8shost/crates/k8shost-server/src/fiberlb_controller.rs):**
      Enhanced FiberLB controller with complete endpoint discovery workflow:

      1. Create LoadBalancer → receive VIP (existing)
      2. Create Pool (RoundRobin, TCP) → NEW
      3. Create Listener for each Service port → VIP:port → Pool → NEW
      4. Query Pods matching Service.spec.selector → NEW
      5. Create Backend for each Pod IP:targetPort → NEW

      **Changes:**
      - Added client connections: PoolService, ListenerService, BackendService
      - Store pool_id in Service annotations
      - Create Listener for each Service.spec.ports[] entry
      - Use storage.list_pods() with label_selector for endpoint discovery
      - Create Backend for each Pod with status.pod_ip
      - Handle target_port mapping (Service port → Container port)

      **Result:**
      - ✓ Compilation successful
      - ✓ Complete Service→VIP→Pool→Listener→Backend flow
      - ✓ Automatic Pod endpoint registration
      - ✓ Addresses user concern: "we won't know which endpoint to access"

      **Next Steps:**
      - E2E validation: Deploy Service + Pods, verify VIP connectivity
      - S4: Health check failover validation

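      **Illustrative sketch (not the controller code):**
      Steps 3 to 5 of the workflow can be pictured as a pure planning function: one Listener per Service port, one Backend per matching Pod IP and target port. Every type and function below (`Service`, `Pod`, `Listener`, `Backend`, `selects`, `plan`) is a simplified, hypothetical stand-in and not the k8shost or FiberLB API. Treating an empty selector as matching no Pods is an assumption, and how backends are grouped into pools per port is not covered by these notes and is left out.

      ```rust
      use std::collections::BTreeMap;

      /// Hypothetical, simplified stand-ins for k8shost Service and Pod objects.
      struct ServicePort {
          port: u16,
          target_port: u16,
      }
      struct Service {
          selector: BTreeMap<String, String>,
          ports: Vec<ServicePort>,
      }
      struct Pod {
          labels: BTreeMap<String, String>,
          pod_ip: Option<String>,
      }

      /// What the controller would ask FiberLB to create (illustrative only).
      #[derive(Debug, PartialEq)]
      struct Listener {
          vip: String,
          port: u16,
      }
      #[derive(Debug, PartialEq)]
      struct Backend {
          address: String,
          port: u16,
      }

      /// A Pod is selected when it carries every key/value pair in the selector.
      /// Assumption: an empty selector selects nothing.
      fn selects(selector: &BTreeMap<String, String>, pod: &Pod) -> bool {
          !selector.is_empty() && selector.iter().all(|(k, v)| pod.labels.get(k) == Some(v))
      }

      /// Plan Listeners (VIP:port) and Backends (Pod IP:targetPort) for one Service.
      fn plan(service: &Service, vip: &str, pods: &[Pod]) -> (Vec<Listener>, Vec<Backend>) {
          let listeners = service
              .ports
              .iter()
              .map(|p| Listener { vip: vip.to_string(), port: p.port })
              .collect();
          let mut backends = Vec::new();
          for pod in pods.iter().filter(|p| selects(&service.selector, p)) {
              // Pods without an assigned IP are skipped until a later reconcile pass.
              if let Some(ip) = &pod.pod_ip {
                  for p in &service.ports {
                      backends.push(Backend { address: ip.clone(), port: p.target_port });
                  }
              }
          }
          (listeners, backends)
      }
      ```

      Keeping the planning step free of gRPC calls makes the selector and target_port mapping testable without a running FiberLB instance, which may help close the "k8shost integration: NO tests" gap noted in S1.
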
  - step: S4
    name: Health Check and Failover
    done: Unhealthy backends removed from pool
    status: pending
    owner: peerB
    priority: P1
    notes: |
      Test:
      - Active health checks
      - Remove failed backend
      - Recovery when backend returns

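      **Illustrative sketch (not FiberLB code):**
      The three test points above (active checks, removal, recovery) map onto a small state machine driven by probe results. The sketch below pairs a TCP connect probe with consecutive-result thresholds. `tcp_probe`, `HealthState`, and the threshold parameters are hypothetical; the actual FiberLB health checker and its defaults may differ.

      ```rust
      use std::net::{SocketAddr, TcpStream};
      use std::time::Duration;

      /// Active TCP probe: passes if a connection opens within the timeout.
      fn tcp_probe(addr: &SocketAddr, timeout: Duration) -> bool {
          TcpStream::connect_timeout(addr, timeout).is_ok()
      }

      /// Tracks probe results for one backend.
      /// The thresholds are assumed values, not FiberLB defaults.
      struct HealthState {
          healthy: bool,
          /// Consecutive results that contradict the current state.
          streak: u32,
          unhealthy_threshold: u32,
          healthy_threshold: u32,
      }

      impl HealthState {
          fn new(unhealthy_threshold: u32, healthy_threshold: u32) -> Self {
              Self { healthy: true, streak: 0, unhealthy_threshold, healthy_threshold }
          }

          /// Record one probe result and return whether the backend stays in rotation.
          fn observe(&mut self, probe_ok: bool) -> bool {
              if probe_ok == self.healthy {
                  // The result agrees with the current state, so reset the streak.
                  self.streak = 0;
              } else {
                  self.streak += 1;
                  let needed = if self.healthy {
                      self.unhealthy_threshold
                  } else {
                      self.healthy_threshold
                  };
                  if self.streak >= needed {
                      self.healthy = !self.healthy;
                      self.streak = 0;
                  }
              }
              self.healthy
          }
      }
      ```

      Requiring several consecutive contradicting results before flipping state keeps a single dropped probe from removing a backend, and keeps a flapping backend out of rotation until it has passed a few checks in a row.
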
evidence: []
notes: |
  **Strategic Value:**
  - LB is foundational for production deployment
  - Without working LB, multi-instance deployments are impossible
  - Critical for T039 production readiness

  **Related Work:**
  - T028: k8shost FiberLB Controller (already implemented)
  - T050.S6: k8shost REST API (includes Service endpoints)