
FiberLB Specification

Version: 1.0 | Status: Draft | Last Updated: 2025-12-08

1. Overview

1.1 Purpose

FiberLB is a multi-tenant load balancer service providing L4 (TCP/UDP) and L7 (HTTP/HTTPS) traffic distribution for the cloud platform. It enables organizations and projects to create and manage load balancers that distribute traffic across backend pools of PlasmaVMC virtual machines with configurable algorithms, health checking, and TLS termination.

The name "FiberLB" reflects high-speed, reliable traffic distribution: the "Fiber" prefix denotes high throughput and ties into the cloud platform family branding.

1.2 Scope

  • In scope: L4 load balancing (TCP, UDP); L7 load balancing (HTTP, HTTPS); TLS termination and passthrough; backend pool management; multiple load balancing algorithms (RoundRobin, LeastConnections, IpHash, Random, WeightedRoundRobin); active health checks (TCP, HTTP, HTTPS, DNS); circuit breaker patterns; multi-tenant LBs (org/project scoped); gRPC management API; aegis integration for access control; ChainFire storage backend; FlashDNS integration for DNS-based health checks
  • Out of scope: global load balancing (GeoDNS-based); API gateway features (rate limiting, request transformation); Web Application Firewall (WAF); DDoS mitigation (handled at the network layer); service mesh integration (planned); automatic TLS certificate provisioning (planned via LightningStor integration)

1.3 Design Goals

  • Dual-mode operation: Both L4 and L7 load balancing in a single service
  • Multi-tenant from day one: Full org/project LB isolation with aegis integration
  • High-performance data plane: Low-latency traffic forwarding with minimal overhead
  • Flexible health checking: Multiple health check types with circuit breaker support
  • Algorithm diversity: Support common load balancing algorithms for different use cases
  • Cloud-native management: gRPC API for LB/pool/backend management, Prometheus metrics
  • Consistent storage: ChainFire for LB configuration persistence with strong consistency

2. Architecture

2.1 Crate Structure

fiberlb/
├── crates/
│   ├── fiberlb-api/       # gRPC service implementations
│   ├── fiberlb-client/    # Rust client library
│   ├── fiberlb-server/    # Server binary (control + data plane)
│   ├── fiberlb-proxy/     # L4/L7 proxy engine
│   └── fiberlb-types/     # Shared types (LoadBalancer, Pool, Backend, etc.)
└── proto/
    └── fiberlb.proto      # gRPC API definitions

2.2 Component Topology

┌─────────────────────────────────────────────────────────────────────┐
│                        FiberLB Server                                │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────────┐  │
│  │  fiberlb-proxy  │  │   fiberlb-api   │  │   fiberlb-types     │  │
│  │  (L4/L7 data    │  │     (gRPC)      │  │   (core types)      │  │
│  │   plane)        │  │                 │  │                     │  │
│  └────────┬────────┘  └────────┬────────┘  └──────────┬──────────┘  │
│           │                    │                      │             │
│           └────────────────────┼──────────────────────┘             │
│                                │                                     │
│                         ┌──────▼──────┐                             │
│                         │    Core     │                             │
│                         │  (config,   │                             │
│                         │   health,   │                             │
│                         │   routing)  │                             │
│                         └──────┬──────┘                             │
└────────────────────────────────┼────────────────────────────────────┘
                                 │
                    ┌────────────┼────────────┬────────────┐
                    ▼            ▼            ▼            ▼
             ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐
             │ ChainFire │ │   Aegis   │ │ FlashDNS  │ │ PlasmaVMC │
             │ (storage) │ │   (IAM)   │ │ (DNS HC)  │ │ (backends)│
             └───────────┘ └───────────┘ └───────────┘ └───────────┘

2.3 Data Flow

L4 Traffic Flow (TCP/UDP):

[Client] → [Listener :port] → [Connection Accept]
                                     │
                              [Backend Selection]
                               (algorithm + health)
                                     │
                              [Connection Forward]
                                     │
                              [Backend Server]
                                     │
                              [Response Relay]
                                     │
                              [Client]
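The "Backend Selection" step above combines the pool's algorithm with health state. As a minimal sketch (assuming RoundRobin; the types and names here are illustrative stand-ins, not the fiberlb-proxy API), selection walks the pool from a shared cursor and skips unhealthy members:

```rust
pub struct SimpleBackend {
    pub name: String,
    pub healthy: bool,
}

/// Pick the next healthy backend, advancing a shared round-robin cursor.
/// Returns None when no backend is healthy (the connection is refused).
pub fn pick_backend(backends: &[SimpleBackend], cursor: &mut usize) -> Option<usize> {
    let n = backends.len();
    for _ in 0..n {
        let idx = *cursor % n;
        *cursor = cursor.wrapping_add(1);
        if backends[idx].healthy {
            return Some(idx);
        }
    }
    None // every backend is unhealthy (or the pool is empty)
}
```

Bounding the scan at one full pass keeps the selection O(n) even when all backends are down.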

L7 Traffic Flow (HTTP/HTTPS):

[Client] → [Listener :port] → [TLS Termination (if HTTPS)]
                                     │
                              [HTTP Parse]
                                     │
                              [Route Matching]
                               (host, path, headers)
                                     │
                              [Backend Selection]
                               (pool + algorithm)
                                     │
                              [Request Forward]
                               (+ headers: X-Forwarded-*)
                                     │
                              [Backend Server]
                                     │
                              [Response Relay]
                                     │
                              [Client]
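The "(+ headers: X-Forwarded-*)" step in the L7 flow can be sketched as below. The exact header set FiberLB injects is an assumption here; this shows the conventional trio, appending the client IP to any existing X-Forwarded-For chain:

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// Hypothetical helper building the X-Forwarded-* headers added on
/// request forward; not the actual fiberlb-proxy API.
pub fn forwarded_headers(
    client_ip: IpAddr,
    tls_terminated: bool,
    listener_host: &str,
    existing_xff: Option<&str>,
) -> HashMap<String, String> {
    let mut h = HashMap::new();
    // Append the client IP to any X-Forwarded-For chain from upstream proxies.
    let xff = match existing_xff {
        Some(chain) => format!("{}, {}", chain, client_ip),
        None => client_ip.to_string(),
    };
    h.insert("X-Forwarded-For".into(), xff);
    h.insert(
        "X-Forwarded-Proto".into(),
        if tls_terminated { "https" } else { "http" }.into(),
    );
    h.insert("X-Forwarded-Host".into(), listener_host.to_string());
    h
}
```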

Management API Flow:

[gRPC Client] → [fiberlb-api] → [Aegis AuthZ] → [Core Service]
                                                       │
                                               [ChainFire Store]
                                                       │
                                               [Config Reload]
                                                       │
                                               [Data Plane Update]

2.4 Dependencies

Crate         Version   Purpose
tokio         1.x       Async runtime
tonic         0.12      gRPC framework
prost         0.13      Protocol buffers
hyper         1.x       HTTP/1.1 and HTTP/2
tokio-rustls  0.26      TLS for HTTPS
dashmap       6.x       Concurrent backend/pool state
uuid          1.x       LB/pool/backend identifiers

3. Core Concepts

3.1 LoadBalancer

A load balancer instance scoped to an organization and optionally a project.

pub struct LoadBalancer {
    pub id: String,                      // UUID
    pub name: String,                    // Display name (unique within scope)
    pub org_id: String,                  // Owner organization
    pub project_id: Option<String>,      // Optional project scope
    pub description: Option<String>,
    pub listeners: Vec<ListenerId>,      // Associated listeners
    pub default_pool_id: Option<String>, // Default backend pool
    pub status: LbStatus,
    pub created_at: u64,                 // Creation timestamp (Unix ms)
    pub updated_at: u64,                 // Last modification
    pub created_by: String,              // Principal ID
    pub metadata: HashMap<String, String>,
    pub tags: HashMap<String, String>,
}

pub enum LbStatus {
    Creating,                            // Being provisioned
    Active,                              // Operational
    Updating,                            // Configuration change in progress
    Error,                               // Provisioning/config error
    Deleting,                            // Being removed
    Disabled,                            // Manually disabled
}

pub struct LbStatusInfo {
    pub status: LbStatus,
    pub message: Option<String>,         // Error details if applicable
    pub last_transition: u64,            // When status changed
}

Naming Rules:

  • 1-63 characters
  • Lowercase alphanumeric and hyphens
  • Must start with a letter
  • Must end with an alphanumeric character
  • Unique within org (or project if project-scoped)
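The naming rules above (other than scope uniqueness, which requires a storage lookup) can be checked locally. A minimal sketch, not the actual fiberlb validator:

```rust
/// Validate a load balancer name against the spec's naming rules:
/// 1-63 chars, lowercase alphanumerics and hyphens, starts with a
/// letter, ends with an alphanumeric.
pub fn valid_lb_name(name: &str) -> bool {
    let bytes = name.as_bytes();
    if bytes.is_empty() || bytes.len() > 63 {
        return false;
    }
    // Must start with a (lowercase) letter.
    if !bytes[0].is_ascii_lowercase() {
        return false;
    }
    // Must end with an alphanumeric, not a hyphen.
    if !bytes[bytes.len() - 1].is_ascii_alphanumeric() {
        return false;
    }
    // Only lowercase alphanumerics and hyphens throughout.
    name.chars()
        .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '-')
}
```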

3.2 Listener

A network endpoint that accepts incoming traffic for a load balancer.

pub struct Listener {
    pub id: String,                      // UUID
    pub lb_id: String,                   // Parent load balancer
    pub name: String,                    // Display name
    pub protocol: ListenerProtocol,
    pub port: u16,                       // Listen port (1-65535)
    pub tls_config: Option<TlsConfig>,   // For HTTPS/TLS listeners
    pub default_pool_id: Option<String>, // Pool for unmatched requests
    pub rules: Vec<RoutingRule>,         // L7 routing rules
    pub connection_limit: Option<u32>,   // Max concurrent connections
    pub timeout_client: u32,             // Client timeout (ms)
    pub timeout_backend: u32,            // Backend timeout (ms)
    pub enabled: bool,
    pub created_at: u64,
    pub updated_at: u64,
}

pub enum ListenerProtocol {
    Tcp,                                 // L4 TCP
    Udp,                                 // L4 UDP
    Http,                                // L7 HTTP
    Https,                               // L7 HTTPS (TLS termination)
    TcpTls,                              // L4 TCP with TLS passthrough
}

pub struct TlsConfig {
    pub certificate_id: String,          // Reference to cert in LightningStor
    pub min_version: TlsVersion,         // Minimum TLS version
    pub cipher_suites: Vec<String>,      // Allowed cipher suites (or default)
    pub client_auth: ClientAuthMode,     // mTLS settings
    pub sni_certificates: HashMap<String, String>, // SNI -> cert_id mapping
}

pub enum TlsVersion {
    Tls12,
    Tls13,
}

pub enum ClientAuthMode {
    None,                                // No client cert required
    Optional,                            // Request but don't require
    Required,                            // Require valid client cert
}

Port Restrictions:

  • Privileged ports (1-1023) require elevated permissions
  • Port conflicts prevented within same LB
  • Well-known ports: 80 (HTTP), 443 (HTTPS), custom for TCP/UDP
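The port restrictions above can be sketched as an admission check. The `elevated` flag standing in for an aegis permission and the error strings are assumptions of this sketch:

```rust
/// Decide whether a new listener may bind `port` on a load balancer.
/// `used_ports` are ports already taken by listeners on the same LB.
pub fn admit_port(port: u16, elevated: bool, used_ports: &[u16]) -> Result<(), String> {
    if port == 0 {
        return Err("port must be 1-65535".into());
    }
    // Privileged ports require elevated permissions.
    if port <= 1023 && !elevated {
        return Err(format!("privileged port {} requires elevated permissions", port));
    }
    // Port conflicts are prevented within the same LB.
    if used_ports.contains(&port) {
        return Err(format!("port {} already in use on this load balancer", port));
    }
    Ok(())
}
```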

3.3 RoutingRule

L7 routing rules for HTTP/HTTPS listeners.

pub struct RoutingRule {
    pub id: String,                      // UUID
    pub name: String,
    pub priority: u32,                   // Lower = higher priority
    pub conditions: Vec<RuleCondition>,  // AND logic
    pub action: RuleAction,
    pub enabled: bool,
}

pub enum RuleCondition {
    HostHeader {
        values: Vec<String>,             // Exact match or wildcard (*.example.com)
    },
    PathPrefix {
        value: String,                   // e.g., "/api/"
    },
    PathExact {
        value: String,                   // e.g., "/health"
    },
    PathRegex {
        pattern: String,                 // Regex pattern
    },
    Header {
        name: String,
        values: Vec<String>,             // Match any
    },
    Method {
        methods: Vec<HttpMethod>,
    },
    QueryParam {
        name: String,
        value: String,
    },
    SourceIp {
        cidrs: Vec<String>,              // Client IP CIDR match
    },
}

pub enum RuleAction {
    ForwardToPool {
        pool_id: String,
    },
    Redirect {
        url: String,
        status_code: u16,                // 301, 302, 307, 308
    },
    FixedResponse {
        status_code: u16,
        content_type: String,
        body: String,
    },
}

pub enum HttpMethod {
    Get, Post, Put, Delete, Patch, Head, Options,
}
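Rule evaluation follows from the definitions above: conditions within a rule are ANDed, and among matching rules the lowest `priority` value wins. A trimmed-down sketch with stand-in types (`MiniRule` supports only host and path-prefix conditions; the full RuleCondition set works the same way):

```rust
pub struct MiniRule {
    pub priority: u32,           // lower value = higher precedence
    pub host: Option<String>,    // exact host match, if set
    pub path_prefix: Option<String>, // path prefix match, if set
    pub pool_id: String,
}

/// Return the pool for the best-matching rule, or None if no rule
/// matches (the listener's default pool would then apply).
pub fn route<'a>(rules: &'a [MiniRule], host: &str, path: &str) -> Option<&'a str> {
    rules
        .iter()
        // All conditions on a rule must hold (AND logic); absent
        // conditions match anything.
        .filter(|r| r.host.as_deref().map_or(true, |h| h == host))
        .filter(|r| r.path_prefix.as_deref().map_or(true, |p| path.starts_with(p)))
        // Among matches, the lowest priority value wins.
        .min_by_key(|r| r.priority)
        .map(|r| r.pool_id.as_str())
}
```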

3.4 Pool

A backend pool containing a group of servers with a load balancing algorithm.

pub struct Pool {
    pub id: String,                      // UUID
    pub lb_id: String,                   // Parent load balancer
    pub name: String,                    // Display name
    pub algorithm: Algorithm,
    pub backends: Vec<BackendId>,        // Member backends
    pub health_check: Option<HealthCheck>,
    pub session_persistence: Option<SessionPersistence>,
    pub circuit_breaker: Option<CircuitBreakerConfig>,
    pub enabled: bool,
    pub created_at: u64,
    pub updated_at: u64,
}

pub enum Algorithm {
    RoundRobin,                          // Sequential distribution
    LeastConnections,                    // Fewest active connections
    IpHash,                              // Consistent hashing by client IP
    Random,                              // Random selection
    WeightedRoundRobin,                  // Round robin with weights
    LeastResponseTime,                   // Fastest backend (requires active monitoring)
}

pub struct SessionPersistence {
    pub mode: PersistenceMode,
    pub cookie_name: Option<String>,     // For cookie-based
    pub ttl_seconds: u32,                // Session timeout
}

pub enum PersistenceMode {
    None,
    SourceIp,                            // Stick by client IP
    Cookie,                              // Insert/track LB cookie
    AppCookie,                           // Track application cookie
}

Algorithm Selection Guide:

Algorithm           Best For                          Considerations
RoundRobin          Equal-capacity backends           Simple, even distribution
LeastConnections    Variable request duration         Tracks connection count
IpHash              Session affinity without cookies  May unbalance behind NAT
Random              Simple, stateless                 Needs good entropy
WeightedRoundRobin  Mixed-capacity backends           Manual weight tuning
LeastResponseTime   Performance-sensitive workloads   Requires active probing
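The WeightedRoundRobin distribution can be illustrated as a static schedule in which each backend index appears `weight` times. Production implementations typically use smooth weighted round robin to interleave picks; this naive sketch only shows the resulting proportions:

```rust
/// Build a naive WRR schedule: backend index `i` appears `weights[i]`
/// times, so a weight-3 backend receives 3x the traffic of a weight-1
/// backend over one full cycle.
pub fn wrr_schedule(weights: &[u32]) -> Vec<usize> {
    let mut schedule = Vec::new();
    for (idx, &w) in weights.iter().enumerate() {
        schedule.extend(std::iter::repeat(idx).take(w as usize));
    }
    schedule
}
```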

3.5 Backend

An individual backend server (typically a PlasmaVMC VM) in a pool.

pub struct Backend {
    pub id: String,                      // UUID
    pub pool_id: String,                 // Parent pool
    pub name: String,                    // Display name
    pub address: BackendAddress,
    pub port: u16,                       // Backend port
    pub weight: u32,                     // For weighted algorithms (1-100, default: 1)
    pub status: BackendStatus,
    pub health_status: HealthStatus,
    pub enabled: bool,                   // Admin enable/disable
    pub metadata: HashMap<String, String>,
    pub created_at: u64,
    pub updated_at: u64,
}

pub enum BackendAddress {
    Ip(IpAddr),                          // Direct IP address
    Hostname(String),                    // DNS hostname (resolved)
    VmId(String),                        // PlasmaVMC VM ID (resolved via API)
}

pub enum BackendStatus {
    Creating,                            // Being added
    Active,                              // Ready to receive traffic
    Draining,                            // Graceful removal (no new connections)
    Removed,                             // Removed from pool
}

pub struct HealthStatus {
    pub healthy: bool,
    pub last_check: u64,                 // Timestamp of last check
    pub last_healthy: Option<u64>,       // Last time marked healthy
    pub consecutive_failures: u32,
    pub consecutive_successes: u32,
    pub last_error: Option<String>,      // Most recent failure reason
}

3.6 HealthCheck

Configuration for backend health monitoring.

pub struct HealthCheck {
    pub id: String,                      // UUID
    pub check_type: HealthCheckType,
    pub interval: u32,                   // Check interval (seconds)
    pub timeout: u32,                    // Check timeout (seconds)
    pub healthy_threshold: u32,          // Consecutive successes to mark healthy
    pub unhealthy_threshold: u32,        // Consecutive failures to mark unhealthy
    pub enabled: bool,
}

pub enum HealthCheckType {
    Tcp {
        // Just TCP connection success
    },
    Http {
        path: String,                    // e.g., "/health"
        method: HttpMethod,              // GET, HEAD
        expected_codes: Vec<u16>,        // e.g., [200, 204]
        host_header: Option<String>,
        headers: HashMap<String, String>,
    },
    Https {
        path: String,
        method: HttpMethod,
        expected_codes: Vec<u16>,
        host_header: Option<String>,
        headers: HashMap<String, String>,
        verify_tls: bool,                // Verify backend cert
    },
    Dns {
        hostname: String,                // Query via FlashDNS
        record_type: DnsRecordType,      // A, AAAA
        expected_address: Option<String>, // Expected response
    },
    Grpc {
        service: Option<String>,         // gRPC health check service
    },
}

pub enum DnsRecordType {
    A,
    Aaaa,
}

impl Default for HealthCheck {
    fn default() -> Self {
        Self {
            id: uuid::Uuid::new_v4().to_string(),
            check_type: HealthCheckType::Tcp {},
            interval: 30,                // 30 seconds
            timeout: 10,                 // 10 seconds
            healthy_threshold: 2,        // 2 consecutive successes
            unhealthy_threshold: 3,      // 3 consecutive failures
            enabled: true,
        }
    }
}

3.7 CircuitBreaker

Circuit breaker configuration for backend fault isolation.

pub struct CircuitBreakerConfig {
    pub enabled: bool,
    pub failure_threshold: u32,          // Failures to open circuit
    pub success_threshold: u32,          // Successes to close circuit
    pub timeout_seconds: u32,            // Time in open state before half-open
    pub failure_rate_threshold: f32,     // 0.0-1.0, alternative to count
    pub min_request_volume: u32,         // Min requests before rate calculation
    pub slow_call_threshold_ms: u32,     // Response time to count as slow
    pub slow_call_rate_threshold: f32,   // Slow call rate to open circuit
}

pub enum CircuitState {
    Closed,                              // Normal operation
    Open,                                // Failing, reject requests
    HalfOpen,                            // Testing recovery
}

impl Default for CircuitBreakerConfig {
    fn default() -> Self {
        Self {
            enabled: false,
            failure_threshold: 5,
            success_threshold: 3,
            timeout_seconds: 60,
            failure_rate_threshold: 0.5,
            min_request_volume: 10,
            slow_call_threshold_ms: 5000,
            slow_call_rate_threshold: 0.8,
        }
    }
}

Circuit Breaker States:

[Closed] ──(failures >= threshold)──► [Open]
    ▲                                    │
    │                              (timeout expires)
    │                                    ▼
    └───(successes >= threshold)─── [Half-Open]
                                         │
                                   (failure)
                                         ▼
                                      [Open]
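The state diagram above can be sketched as a count-based state machine. Time is injected as a plain seconds value so transitions are easy to test; the rate- and slow-call-based triggers from CircuitBreakerConfig are omitted, and the type names are illustrative:

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum State { Closed, Open, HalfOpen }

pub struct Breaker {
    pub state: State,
    failures: u32,
    successes: u32,
    opened_at: u64,          // seconds; when the circuit last opened
    failure_threshold: u32,
    success_threshold: u32,
    timeout_seconds: u64,
}

impl Breaker {
    pub fn new(failure_threshold: u32, success_threshold: u32, timeout_seconds: u64) -> Self {
        Self { state: State::Closed, failures: 0, successes: 0, opened_at: 0,
               failure_threshold, success_threshold, timeout_seconds }
    }

    /// Should this request pass? Open transitions to HalfOpen once the
    /// timeout expires, letting probe traffic through.
    pub fn allow(&mut self, now: u64) -> bool {
        if self.state == State::Open && now >= self.opened_at + self.timeout_seconds {
            self.state = State::HalfOpen;
            self.successes = 0;
        }
        self.state != State::Open
    }

    /// Record a request outcome and apply the diagram's transitions.
    pub fn record(&mut self, success: bool, now: u64) {
        match (self.state, success) {
            (State::Closed, true) => self.failures = 0,
            (State::Closed, false) => {
                self.failures += 1;
                if self.failures >= self.failure_threshold {
                    self.state = State::Open;
                    self.opened_at = now;
                }
            }
            (State::HalfOpen, true) => {
                self.successes += 1;
                if self.successes >= self.success_threshold {
                    self.state = State::Closed;
                    self.failures = 0;
                }
            }
            // Any failure while probing re-opens the circuit.
            (State::HalfOpen, false) => {
                self.state = State::Open;
                self.opened_at = now;
            }
            (State::Open, _) => {}
        }
    }
}
```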

4. API

4.1 gRPC Services

LbService (fiberlb.v1.LbService)

service LbService {
  // Load Balancer CRUD
  rpc CreateLoadBalancer(CreateLoadBalancerRequest) returns (LoadBalancer);
  rpc GetLoadBalancer(GetLoadBalancerRequest) returns (LoadBalancer);
  rpc UpdateLoadBalancer(UpdateLoadBalancerRequest) returns (LoadBalancer);
  rpc DeleteLoadBalancer(DeleteLoadBalancerRequest) returns (DeleteLoadBalancerResponse);
  rpc ListLoadBalancers(ListLoadBalancersRequest) returns (ListLoadBalancersResponse);

  // Status operations
  rpc GetLoadBalancerStatus(GetLoadBalancerStatusRequest) returns (LoadBalancerStatus);
  rpc EnableLoadBalancer(EnableLoadBalancerRequest) returns (LoadBalancer);
  rpc DisableLoadBalancer(DisableLoadBalancerRequest) returns (LoadBalancer);
}

message CreateLoadBalancerRequest {
  string name = 1;
  string org_id = 2;
  optional string project_id = 3;
  optional string description = 4;
  map<string, string> tags = 5;
  map<string, string> metadata = 6;
}

message GetLoadBalancerRequest {
  string lb_id = 1;
}

message UpdateLoadBalancerRequest {
  string lb_id = 1;
  optional string name = 2;
  optional string description = 3;
  optional string default_pool_id = 4;
  map<string, string> tags = 5;
}

message DeleteLoadBalancerRequest {
  string lb_id = 1;
  bool force = 2;                        // Delete even with active listeners
}

message ListLoadBalancersRequest {
  string org_id = 1;
  optional string project_id = 2;
  optional string name_filter = 3;       // Prefix match
  optional LbStatus status_filter = 4;
  uint32 limit = 5;                      // Max results (default: 100)
  string page_token = 6;
}

message ListLoadBalancersResponse {
  repeated LoadBalancer load_balancers = 1;
  string next_page_token = 2;
  uint32 total_count = 3;
}

ListenerService (fiberlb.v1.ListenerService)

service ListenerService {
  rpc CreateListener(CreateListenerRequest) returns (Listener);
  rpc GetListener(GetListenerRequest) returns (Listener);
  rpc UpdateListener(UpdateListenerRequest) returns (Listener);
  rpc DeleteListener(DeleteListenerRequest) returns (DeleteListenerResponse);
  rpc ListListeners(ListListenersRequest) returns (ListListenersResponse);

  // Routing rules
  rpc AddRoutingRule(AddRoutingRuleRequest) returns (Listener);
  rpc UpdateRoutingRule(UpdateRoutingRuleRequest) returns (Listener);
  rpc RemoveRoutingRule(RemoveRoutingRuleRequest) returns (Listener);
}

message CreateListenerRequest {
  string lb_id = 1;
  string name = 2;
  ListenerProtocol protocol = 3;
  uint32 port = 4;
  optional TlsConfig tls_config = 5;
  optional string default_pool_id = 6;
  optional uint32 connection_limit = 7;
  optional uint32 timeout_client = 8;
  optional uint32 timeout_backend = 9;
}

message UpdateListenerRequest {
  string listener_id = 1;
  optional string name = 2;
  optional TlsConfig tls_config = 3;
  optional string default_pool_id = 4;
  optional uint32 connection_limit = 5;
  optional uint32 timeout_client = 6;
  optional uint32 timeout_backend = 7;
  optional bool enabled = 8;
}

message ListListenersRequest {
  string lb_id = 1;
  optional ListenerProtocol protocol_filter = 2;
  uint32 limit = 3;
  string page_token = 4;
}

PoolService (fiberlb.v1.PoolService)

service PoolService {
  rpc CreatePool(CreatePoolRequest) returns (Pool);
  rpc GetPool(GetPoolRequest) returns (Pool);
  rpc UpdatePool(UpdatePoolRequest) returns (Pool);
  rpc DeletePool(DeletePoolRequest) returns (DeletePoolResponse);
  rpc ListPools(ListPoolsRequest) returns (ListPoolsResponse);

  // Health check management
  rpc SetHealthCheck(SetHealthCheckRequest) returns (Pool);
  rpc RemoveHealthCheck(RemoveHealthCheckRequest) returns (Pool);

  // Pool status
  rpc GetPoolStatus(GetPoolStatusRequest) returns (PoolStatus);
}

message CreatePoolRequest {
  string lb_id = 1;
  string name = 2;
  Algorithm algorithm = 3;
  optional HealthCheck health_check = 4;
  optional SessionPersistence session_persistence = 5;
  optional CircuitBreakerConfig circuit_breaker = 6;
}

message UpdatePoolRequest {
  string pool_id = 1;
  optional string name = 2;
  optional Algorithm algorithm = 3;
  optional SessionPersistence session_persistence = 4;
  optional CircuitBreakerConfig circuit_breaker = 5;
  optional bool enabled = 6;
}

message ListPoolsRequest {
  string lb_id = 1;
  uint32 limit = 2;
  string page_token = 3;
}

message PoolStatus {
  string pool_id = 1;
  uint32 total_backends = 2;
  uint32 healthy_backends = 3;
  uint32 unhealthy_backends = 4;
  uint32 draining_backends = 5;
  CircuitState circuit_state = 6;
}

BackendService (fiberlb.v1.BackendService)

service BackendService {
  rpc AddBackend(AddBackendRequest) returns (Backend);
  rpc GetBackend(GetBackendRequest) returns (Backend);
  rpc UpdateBackend(UpdateBackendRequest) returns (Backend);
  rpc RemoveBackend(RemoveBackendRequest) returns (RemoveBackendResponse);
  rpc ListBackends(ListBackendsRequest) returns (ListBackendsResponse);

  // Drain operations
  rpc DrainBackend(DrainBackendRequest) returns (Backend);
  rpc UndrainBackend(UndrainBackendRequest) returns (Backend);

  // Health status
  rpc GetBackendHealth(GetBackendHealthRequest) returns (HealthStatus);
  rpc ListBackendHealth(ListBackendHealthRequest) returns (ListBackendHealthResponse);
}

message AddBackendRequest {
  string pool_id = 1;
  string name = 2;
  BackendAddress address = 3;
  uint32 port = 4;
  optional uint32 weight = 5;            // Default: 1
  map<string, string> metadata = 6;
}

message UpdateBackendRequest {
  string backend_id = 1;
  optional string name = 2;
  optional uint32 port = 3;
  optional uint32 weight = 4;
  optional bool enabled = 5;
}

message RemoveBackendRequest {
  string backend_id = 1;
  bool drain_first = 2;                  // Graceful removal
  optional uint32 drain_timeout_seconds = 3;
}

message ListBackendsRequest {
  string pool_id = 1;
  optional bool healthy_only = 2;
  uint32 limit = 3;
  string page_token = 4;
}

message DrainBackendRequest {
  string backend_id = 1;
  optional uint32 timeout_seconds = 2;   // Max drain time
}

message ListBackendHealthResponse {
  repeated BackendHealthEntry entries = 1;
}

message BackendHealthEntry {
  string backend_id = 1;
  string backend_name = 2;
  HealthStatus health_status = 3;
}

4.2 Authentication

gRPC API:

  • aegis bearer tokens in authorization metadata
  • mTLS for service-to-service communication
  • API key header (x-api-key) for programmatic access

Data Plane:

  • No authentication for load balanced traffic (passthrough)
  • mTLS between LB and backends (optional)
  • Client certificate validation (optional, for HTTPS listeners)

4.3 Client Library

use fiberlb_client::FiberLbClient;

let client = FiberLbClient::connect("http://127.0.0.1:6300").await?;

// Create load balancer
let lb = client.create_load_balancer(CreateLoadBalancerRequest {
    name: "web-lb".into(),
    org_id: "acme".into(),
    project_id: Some("web-prod".into()),
    description: Some("Production web load balancer".into()),
    ..Default::default()
}).await?;

// Create HTTP listener
let listener = client.create_listener(CreateListenerRequest {
    lb_id: lb.id.clone(),
    name: "http".into(),
    protocol: ListenerProtocol::Http,
    port: 80,
    ..Default::default()
}).await?;

// Create HTTPS listener with TLS
let https_listener = client.create_listener(CreateListenerRequest {
    lb_id: lb.id.clone(),
    name: "https".into(),
    protocol: ListenerProtocol::Https,
    port: 443,
    tls_config: Some(TlsConfig {
        certificate_id: "cert-123".into(),
        min_version: TlsVersion::Tls12,
        ..Default::default()
    }),
    ..Default::default()
}).await?;

// Create backend pool
let pool = client.create_pool(CreatePoolRequest {
    lb_id: lb.id.clone(),
    name: "web-backends".into(),
    algorithm: Algorithm::LeastConnections,
    health_check: Some(HealthCheck {
        check_type: HealthCheckType::Http {
            path: "/health".into(),
            method: HttpMethod::Get,
            expected_codes: vec![200],
            ..Default::default()
        },
        interval: 10,
        timeout: 5,
        healthy_threshold: 2,
        unhealthy_threshold: 3,
        ..Default::default()
    }),
    session_persistence: Some(SessionPersistence {
        mode: PersistenceMode::Cookie,
        cookie_name: Some("SERVERID".into()),
        ttl_seconds: 3600,
    }),
    ..Default::default()
}).await?;

// Add backends (PlasmaVMC VMs)
for (i, vm_id) in ["vm-001", "vm-002", "vm-003"].iter().enumerate() {
    client.add_backend(AddBackendRequest {
        pool_id: pool.id.clone(),
        name: format!("web-{}", i + 1),
        address: BackendAddress::VmId(vm_id.to_string()),
        port: 8080,
        weight: Some(1),
        ..Default::default()
    }).await?;
}

// Update listener to use pool
client.update_listener(UpdateListenerRequest {
    listener_id: listener.id.clone(),
    default_pool_id: Some(pool.id.clone()),
    ..Default::default()
}).await?;

// Add L7 routing rules
client.add_routing_rule(AddRoutingRuleRequest {
    listener_id: https_listener.id.clone(),
    rule: RoutingRule {
        name: "api-route".into(),
        priority: 10,
        conditions: vec![
            RuleCondition::PathPrefix { value: "/api/".into() },
        ],
        action: RuleAction::ForwardToPool { pool_id: pool.id.clone() },
        enabled: true,
        ..Default::default()
    },
}).await?;

// Check pool health status
let status = client.get_pool_status(GetPoolStatusRequest {
    pool_id: pool.id.clone(),
}).await?;
println!("Healthy backends: {}/{}", status.healthy_backends, status.total_backends);

// Drain backend for maintenance
client.drain_backend(DrainBackendRequest {
    backend_id: "backend-001".into(),
    timeout_seconds: Some(30),
}).await?;

5. Multi-Tenancy

5.1 Scope Hierarchy

System (platform operators)
  └─ Organization (tenant boundary)
      ├─ Org-level load balancers (shared across projects)
      └─ Project (workload isolation)
          └─ Project-level load balancers

5.2 LoadBalancer Scoping

pub enum LbScope {
    /// LB accessible to all projects in org
    Organization { org_id: String },

    /// LB scoped to specific project
    Project { org_id: String, project_id: String },
}

impl LoadBalancer {
    pub fn scope(&self) -> LbScope {
        match &self.project_id {
            Some(pid) => LbScope::Project {
                org_id: self.org_id.clone(),
                project_id: pid.clone()
            },
            None => LbScope::Organization {
                org_id: self.org_id.clone()
            },
        }
    }
}

5.3 Access Control Integration

// aegis action patterns for fiberlb
const ACTIONS: &[&str] = &[
    "fiberlb:loadbalancers:create",
    "fiberlb:loadbalancers:get",
    "fiberlb:loadbalancers:list",
    "fiberlb:loadbalancers:update",
    "fiberlb:loadbalancers:delete",
    "fiberlb:listeners:create",
    "fiberlb:listeners:get",
    "fiberlb:listeners:list",
    "fiberlb:listeners:update",
    "fiberlb:listeners:delete",
    "fiberlb:pools:create",
    "fiberlb:pools:get",
    "fiberlb:pools:list",
    "fiberlb:pools:update",
    "fiberlb:pools:delete",
    "fiberlb:backends:add",
    "fiberlb:backends:get",
    "fiberlb:backends:list",
    "fiberlb:backends:update",
    "fiberlb:backends:remove",
    "fiberlb:backends:drain",
];

// Resource path format
// org/{org_id}/project/{project_id}/lb/{lb_id}
// org/{org_id}/project/{project_id}/lb/{lb_id}/listener/{listener_id}
// org/{org_id}/project/{project_id}/lb/{lb_id}/pool/{pool_id}
// org/{org_id}/project/{project_id}/lb/{lb_id}/pool/{pool_id}/backend/{backend_id}

async fn authorize_lb_access(
    iam: &IamClient,
    principal: &PrincipalRef,
    action: &str,
    lb: &LoadBalancer,
) -> Result<()> {
    let resource = ResourceRef {
        kind: "loadbalancer".into(),
        id: lb.id.clone(),
        org_id: lb.org_id.clone(),
        project_id: lb.project_id.clone().unwrap_or_default(),
        ..Default::default()
    };

    let allowed = iam.authorize(principal, action, &resource).await?;
    if !allowed {
        return Err(Error::AccessDenied);
    }
    Ok(())
}

5.4 Resource Isolation

  • Load balancers with same name can exist in different orgs/projects
  • Backends can only reference VMs within same org/project scope
  • Cross-project backend references require explicit binding
  • Listener ports unique within a load balancer

6. Health Checking

6.1 Health Check Engine

pub struct HealthChecker {
    pools: Arc<DashMap<String, PoolState>>,
    check_interval: Duration,
    http_client: reqwest::Client,
    dns_client: Arc<FlashDnsClient>,
}

pub struct PoolState {
    pub pool: Pool,
    pub backends: DashMap<String, BackendState>,
}

pub struct BackendState {
    pub backend: Backend,
    pub health: HealthStatus,
    pub circuit: CircuitState,
    pub last_response_time: Option<Duration>,
}

impl HealthChecker {
    /// Start background health checking for all pools
    pub async fn start(&self);

    /// Run single health check for a backend
    pub async fn check_backend(
        &self,
        backend: &Backend,
        check: &HealthCheck,
    ) -> HealthCheckResult;

    /// Update backend health status based on check result
    pub fn update_health(
        &self,
        backend_id: &str,
        result: HealthCheckResult,
    );
}

pub struct HealthCheckResult {
    pub success: bool,
    pub response_time: Duration,
    pub error: Option<String>,
    pub details: HealthCheckDetails,
}

pub enum HealthCheckDetails {
    Tcp { connected: bool },
    Http { status_code: u16, body_preview: String },
    Dns { resolved: bool, addresses: Vec<IpAddr> },
}

6.2 Health Check Types

TCP Health Check:

async fn check_tcp(addr: SocketAddr, timeout: Duration) -> HealthCheckResult {
    match tokio::time::timeout(timeout, TcpStream::connect(addr)).await {
        Ok(Ok(_)) => HealthCheckResult::success(),
        Ok(Err(e)) => HealthCheckResult::failure(format!("Connection failed: {}", e)),
        Err(_) => HealthCheckResult::failure("Connection timeout"),
    }
}

HTTP Health Check:

async fn check_http(
    http_client: &reqwest::Client,
    url: &str,
    method: HttpMethod,
    headers: &HashMap<String, String>,
    expected_codes: &[u16],
    timeout: Duration,
) -> HealthCheckResult {
    let mut request = http_client.request(method, url).timeout(timeout);
    for (name, value) in headers {
        request = request.header(name, value);
    }

    let response = match request.send().await {
        Ok(resp) => resp,
        Err(e) => return HealthCheckResult::failure(format!("Request failed: {}", e)),
    };

    let status = response.status().as_u16();
    if expected_codes.contains(&status) {
        let body = response.text().await.unwrap_or_default();
        HealthCheckResult::success_with_details(HealthCheckDetails::Http {
            status_code: status,
            body_preview: body.chars().take(100).collect(),
        })
    } else {
        HealthCheckResult::failure(format!("Unexpected status: {}", status))
    }
}

DNS Health Check (via FlashDNS):

async fn check_dns(
    hostname: &str,
    record_type: DnsRecordType,
    expected: Option<&str>,
    dns_client: &FlashDnsClient,
) -> HealthCheckResult {
    let records = match dns_client.resolve(hostname, record_type).await {
        Ok(records) => records,
        Err(e) => return HealthCheckResult::failure(format!("Resolution failed: {}", e)),
    };

    if records.is_empty() {
        return HealthCheckResult::failure("No DNS records found");
    }

    if let Some(expected_addr) = expected {
        if records.iter().any(|r| r.address() == expected_addr) {
            HealthCheckResult::success()
        } else {
            HealthCheckResult::failure(format!("Expected {} not found", expected_addr))
        }
    } else {
        HealthCheckResult::success()
    }
}

6.3 Circuit Breaker Implementation

impl CircuitBreaker {
    pub fn record_success(&mut self) {
        self.consecutive_successes += 1;
        self.consecutive_failures = 0;

        match self.state {
            CircuitState::HalfOpen => {
                if self.consecutive_successes >= self.config.success_threshold {
                    self.transition_to(CircuitState::Closed);
                }
            }
            CircuitState::Closed => {
                // Reset failure rate window
            }
            CircuitState::Open => {
                // Shouldn't happen, but handle gracefully
            }
        }
    }

    pub fn record_failure(&mut self) {
        self.consecutive_failures += 1;
        self.consecutive_successes = 0;

        match self.state {
            CircuitState::Closed => {
                if self.should_open() {
                    self.transition_to(CircuitState::Open);
                }
            }
            CircuitState::HalfOpen => {
                self.transition_to(CircuitState::Open);
            }
            CircuitState::Open => {
                // Already open
            }
        }
    }

    pub fn allow_request(&mut self) -> bool {
        match self.state {
            CircuitState::Closed => true,
            CircuitState::Open => {
                if self.timeout_elapsed() {
                    self.transition_to(CircuitState::HalfOpen);
                    true  // Allow probe request
                } else {
                    false
                }
            }
            CircuitState::HalfOpen => {
                // Limited requests allowed
                self.half_open_requests < self.config.half_open_max_requests
            }
        }
    }
}
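The `should_open` and `timeout_elapsed` helpers referenced above are not shown in the spec; a minimal consecutive-failure sketch might look like the following. The struct here is a standalone illustration with assumed field names, not the actual FiberLB `CircuitBreaker`:

```rust
use std::time::{Duration, Instant};

pub struct CircuitBreakerConfig {
    pub failure_threshold: u32, // e.g. 5, matching [circuit_breaker] defaults
    pub open_timeout: Duration, // e.g. 60s before allowing a probe request
}

pub struct CircuitBreaker {
    pub consecutive_failures: u32,
    pub opened_at: Option<Instant>,
    pub config: CircuitBreakerConfig,
}

impl CircuitBreaker {
    /// Open once consecutive failures reach the configured threshold.
    pub fn should_open(&self) -> bool {
        self.consecutive_failures >= self.config.failure_threshold
    }

    /// After the open timeout elapses, a probe may transition to HalfOpen.
    pub fn timeout_elapsed(&self) -> bool {
        self.opened_at
            .map(|opened| opened.elapsed() >= self.config.open_timeout)
            .unwrap_or(false)
    }
}
```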

6.4 PlasmaVMC Integration

/// Resolve backend address from PlasmaVMC VM ID
async fn resolve_vm_address(
    plasmavmc: &PlasmaVmcClient,
    vm_id: &str,
) -> Result<IpAddr> {
    let vm = plasmavmc.get_vm(vm_id).await?;

    // Select the private IPv4 here; an external LB would pick the public address instead
    vm.network_interfaces
        .iter()
        .find_map(|nic| nic.private_ipv4)
        .ok_or(Error::NoBackendAddress)
}

/// Watch VM status changes for backend health
async fn watch_vm_health(
    plasmavmc: &PlasmaVmcClient,
    backend: &Backend,
) -> Result<()> {
    if let BackendAddress::VmId(vm_id) = &backend.address {
        let vm = plasmavmc.get_vm(vm_id).await?;
        match vm.status {
            VmStatus::Running => Ok(()),
            VmStatus::Stopped | VmStatus::Terminated => {
                Err(Error::BackendUnavailable)
            }
            _ => Err(Error::BackendUnknownState),
        }
    } else {
        Ok(())
    }
}

7. Storage

7.1 ChainFire Key Schema

Load Balancers:

fiberlb/lbs/{lb_id}                                  # LB record (by ID)
fiberlb/lbs/by-name/{org_id}/{name}                  # Name lookup (org-level)
fiberlb/lbs/by-name/{org_id}/{project_id}/{name}     # Name lookup (project-level)
fiberlb/lbs/by-org/{org_id}/{lb_id}                  # Org index
fiberlb/lbs/by-project/{project_id}/{lb_id}          # Project index

Listeners:

fiberlb/listeners/{listener_id}                      # Listener by ID
fiberlb/listeners/by-lb/{lb_id}/{listener_id}        # LB index
fiberlb/listeners/by-port/{lb_id}/{port}             # Port lookup

Pools:

fiberlb/pools/{pool_id}                              # Pool by ID
fiberlb/pools/by-lb/{lb_id}/{pool_id}                # LB index
fiberlb/pools/by-name/{lb_id}/{name}                 # Name lookup

Backends:

fiberlb/backends/{backend_id}                        # Backend by ID
fiberlb/backends/by-pool/{pool_id}/{backend_id}      # Pool index
fiberlb/backends/by-address/{address_hash}           # Address lookup

Health State (ephemeral):

fiberlb/health/{backend_id}                          # Current health status
fiberlb/circuit/{pool_id}                            # Circuit breaker state
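Keys like these are easiest to keep consistent behind small builder helpers. The functions below are illustrative, not part of the storage API:

```rust
/// Illustrative key builders for the ChainFire schema above.
pub fn lb_key(lb_id: &str) -> String {
    format!("fiberlb/lbs/{}", lb_id)
}

/// Name lookup key: project-level when a project_id is present, org-level otherwise.
pub fn lb_name_key(org_id: &str, project_id: Option<&str>, name: &str) -> String {
    match project_id {
        Some(project) => format!("fiberlb/lbs/by-name/{}/{}/{}", org_id, project, name),
        None => format!("fiberlb/lbs/by-name/{}/{}", org_id, name),
    }
}

/// Pool-index key for a backend.
pub fn backend_pool_key(pool_id: &str, backend_id: &str) -> String {
    format!("fiberlb/backends/by-pool/{}/{}", pool_id, backend_id)
}
```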

7.2 Storage Operations

#[async_trait]
pub trait LbStore: Send + Sync {
    async fn create_lb(&self, lb: &LoadBalancer) -> Result<()>;
    async fn get_lb(&self, lb_id: &str) -> Result<Option<LoadBalancer>>;
    async fn get_lb_by_name(
        &self,
        org_id: &str,
        project_id: Option<&str>,
        name: &str,
    ) -> Result<Option<LoadBalancer>>;
    async fn update_lb(&self, lb: &LoadBalancer) -> Result<()>;
    async fn delete_lb(&self, lb_id: &str) -> Result<bool>;
    async fn list_lbs(
        &self,
        org_id: &str,
        project_id: Option<&str>,
        limit: usize,
        page_token: Option<&str>,
    ) -> Result<(Vec<LoadBalancer>, Option<String>)>;
}

#[async_trait]
pub trait PoolStore: Send + Sync {
    async fn create_pool(&self, pool: &Pool) -> Result<()>;
    async fn get_pool(&self, pool_id: &str) -> Result<Option<Pool>>;
    async fn update_pool(&self, pool: &Pool) -> Result<()>;
    async fn delete_pool(&self, pool_id: &str) -> Result<bool>;
    async fn list_pools_by_lb(
        &self,
        lb_id: &str,
        limit: usize,
        page_token: Option<&str>,
    ) -> Result<(Vec<Pool>, Option<String>)>;
}

#[async_trait]
pub trait BackendStore: Send + Sync {
    async fn add_backend(&self, backend: &Backend) -> Result<()>;
    async fn get_backend(&self, backend_id: &str) -> Result<Option<Backend>>;
    async fn update_backend(&self, backend: &Backend) -> Result<()>;
    async fn remove_backend(&self, backend_id: &str) -> Result<bool>;
    async fn list_backends_by_pool(
        &self,
        pool_id: &str,
        limit: usize,
        page_token: Option<&str>,
    ) -> Result<(Vec<Backend>, Option<String>)>;
    async fn get_healthy_backends(&self, pool_id: &str) -> Result<Vec<Backend>>;
}

7.3 Configuration Cache

pub struct ConfigCache {
    load_balancers: DashMap<String, LoadBalancer>,
    listeners: DashMap<String, Listener>,
    pools: DashMap<String, Pool>,
    backends: DashMap<String, Vec<Backend>>,  // pool_id -> backends
    config: CacheConfig,
}

impl ConfigCache {
    /// Load all config for an LB into cache
    pub async fn load_lb(&self, store: &dyn LbStore, lb_id: &str) -> Result<()>;

    /// Invalidate and reload on config change
    pub fn invalidate_lb(&self, lb_id: &str);

    /// Get routing config for data plane
    pub fn get_routing_config(&self, lb_id: &str) -> Option<RoutingConfig>;
}

8. Configuration

8.1 Config File Format (TOML)

[server]
grpc_addr = "0.0.0.0:6300"               # gRPC management API
metrics_addr = "0.0.0.0:9090"            # Prometheus metrics

[server.tls]
cert_file = "/etc/fiberlb/tls/server.crt"
key_file = "/etc/fiberlb/tls/server.key"
ca_file = "/etc/fiberlb/tls/ca.crt"

[storage]
backend = "chainfire"                     # "chainfire" | "memory"
chainfire_endpoints = ["http://chainfire-1:2379", "http://chainfire-2:2379"]

[proxy]
# L4 settings
tcp_keepalive_seconds = 60
tcp_nodelay = true
connection_timeout_ms = 5000

# L7 settings
http_idle_timeout_seconds = 60
max_header_size_bytes = 8192
max_body_size_bytes = 10485760           # 10MB

# Buffer sizes
recv_buffer_size = 65536
send_buffer_size = 65536

[proxy.tls]
default_min_version = "tls12"
session_cache_size = 10000
session_timeout_seconds = 3600

[health_check]
default_interval_seconds = 30
default_timeout_seconds = 10
default_healthy_threshold = 2
default_unhealthy_threshold = 3
max_concurrent_checks = 100
check_jitter_percent = 10                # Spread checks over interval

[circuit_breaker]
default_failure_threshold = 5
default_success_threshold = 3
default_timeout_seconds = 60

[iam]
endpoint = "http://aegis:9090"
service_account = "fiberlb"
token_path = "/var/run/secrets/iam/token"

[plasmavmc]
endpoint = "http://plasmavmc:8080"
cache_ttl_seconds = 30

[flashdns]
endpoint = "http://flashdns:5300"
dns_addr = "127.0.0.1:53"                # For DNS health checks

[logging]
level = "info"
format = "json"
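`check_jitter_percent` spreads health checks so all backends are not probed at the same instant. One deterministic way to implement it (a sketch under assumptions; the actual scheduler may use random jitter instead) is to hash the backend ID into an offset within ±N% of the base interval, which spreads checks while keeping each backend's schedule stable:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::time::Duration;

/// Deterministically jitter `base` by up to ±`jitter_percent`, keyed on the
/// backend ID so each backend keeps a stable but staggered schedule.
pub fn jittered_interval(base: Duration, jitter_percent: u32, backend_id: &str) -> Duration {
    let mut hasher = DefaultHasher::new();
    backend_id.hash(&mut hasher);
    let base_ms = base.as_millis() as u64;
    let span = base_ms * jitter_percent as u64 / 100; // maximum offset in ms
    let offset = hasher.finish() % (2 * span + 1);    // 0..=2*span
    Duration::from_millis(base_ms - span + offset)
}
```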

8.2 Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `FIBERLB_CONFIG` | - | Config file path |
| `FIBERLB_GRPC_ADDR` | `0.0.0.0:6300` | gRPC listen address |
| `FIBERLB_METRICS_ADDR` | `0.0.0.0:9090` | Metrics listen address |
| `FIBERLB_LOG_LEVEL` | `info` | Log level |
| `FIBERLB_STORE_BACKEND` | `memory` | Storage backend |

8.3 CLI Arguments

fiberlb-server [OPTIONS]
  -c, --config <PATH>       Config file path
  --grpc-addr <ADDR>        gRPC listen address
  --metrics-addr <ADDR>     Metrics listen address
  -l, --log-level <LEVEL>   Log level
  -h, --help                Print help
  -V, --version             Print version

9. Security

9.1 Authentication

gRPC API:

  • aegis bearer tokens for user/service authentication
  • mTLS for service-to-service communication
  • API key header for programmatic access

Data Plane:

  • TLS termination at listener (for HTTPS)
  • mTLS to backends (optional)
  • Client certificate validation (optional)

9.2 Authorization

  • All management operations authorized via aegis
  • LB-level, pool-level, and backend-level permissions
  • Scope enforcement (org/project boundaries)
  • Owner-based access patterns supported

9.3 Data Security

  • TLS 1.2/1.3 for HTTPS listeners
  • Certificate storage in LightningStor (reference by ID)
  • Private keys never exposed via API
  • Backend traffic encryption (optional mTLS)

9.4 Network Security

pub struct SecurityConfig {
    /// Restrict listener binding to specific IPs
    pub allowed_listener_ips: Vec<IpAddr>,

    /// Restrict backend addresses to trusted ranges
    pub allowed_backend_cidrs: Vec<IpNetwork>,

    /// Maximum connections per source IP
    pub max_connections_per_ip: Option<u32>,

    /// Rate limiting for new connections
    pub connection_rate_limit: Option<RateLimit>,
}
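`allowed_backend_cidrs` implies a containment check when a backend address is registered. A minimal IPv4-only sketch of that check (real code would likely use an IP-network crate such as `ipnetwork` and also cover IPv6):

```rust
use std::net::Ipv4Addr;

/// Return true if `addr` falls inside the `base/prefix_len` network.
pub fn cidr_contains(base: Ipv4Addr, prefix_len: u8, addr: Ipv4Addr) -> bool {
    if prefix_len == 0 {
        return true; // 0.0.0.0/0 matches every address
    }
    // Build the network mask, then compare the masked network bits.
    let mask = u32::MAX << (32 - prefix_len as u32);
    (u32::from(base) & mask) == (u32::from(addr) & mask)
}
```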

9.5 Audit

  • All management API calls logged with principal, action, resource
  • Connection logs for traffic analytics
  • Health check results logged
  • Integration with platform audit system

10. Operations

10.1 Deployment

Single Node (Development):

fiberlb-server --config config.toml

Production Cluster:

# Multiple FiberLB instances
# - Stateless control plane (shared ChainFire)
# - Data plane with health state sync
fiberlb-server --config config.toml

# Behind external load balancer for HA
# Or with BGP anycast for direct traffic distribution

10.2 Monitoring

Metrics (Prometheus):

| Metric | Type | Description |
|--------|------|-------------|
| `fiberlb_connections_total` | Counter | Total connections accepted |
| `fiberlb_connections_active` | Gauge | Current active connections |
| `fiberlb_requests_total` | Counter | Total L7 requests |
| `fiberlb_request_duration_seconds` | Histogram | Request latency |
| `fiberlb_bytes_in_total` | Counter | Total bytes received |
| `fiberlb_bytes_out_total` | Counter | Total bytes sent |
| `fiberlb_backend_health{status}` | Gauge | Backends by health status |
| `fiberlb_health_checks_total` | Counter | Total health checks |
| `fiberlb_health_check_duration_seconds` | Histogram | Health check latency |
| `fiberlb_circuit_breaker_state` | Gauge | Circuit breaker states |
| `fiberlb_pools_total` | Gauge | Total pools |
| `fiberlb_backends_total` | Gauge | Total backends |
| `fiberlb_grpc_requests_total` | Counter | gRPC API requests |

Health Endpoints:

  • GET /health - Liveness check
  • GET /ready - Readiness check (storage connected, data plane ready)

10.3 Backup & Recovery

  • LB configuration: ChainFire snapshots
  • Export: Configuration export via gRPC API
  • Import: Configuration import for disaster recovery

10.4 Graceful Operations

/// Graceful backend removal
pub async fn drain_backend(&self, backend_id: &str, timeout: Duration) -> Result<()> {
    let backend = self.backends.get(backend_id).ok_or(Error::BackendNotFound)?;

    // 1. Mark backend as draining
    backend.set_status(BackendStatus::Draining);

    // 2. Stop sending new connections
    self.routing.exclude_backend(backend_id);

    // 3. Wait for existing connections to complete
    let deadline = Instant::now() + timeout;
    while backend.active_connections() > 0 && Instant::now() < deadline {
        tokio::time::sleep(Duration::from_secs(1)).await;
    }

    // 4. Force-close remaining connections
    if backend.active_connections() > 0 {
        backend.force_close_connections();
    }

    // 5. Remove backend
    backend.set_status(BackendStatus::Removed);
    Ok(())
}

/// Graceful LB shutdown
pub async fn graceful_shutdown(&self, timeout: Duration) -> Result<()> {
    // 1. Stop accepting new connections
    self.listeners.stop_accepting();

    // 2. Drain all backends
    for pool in self.pools.values() {
        for backend in pool.backends.values() {
            self.drain_backend(&backend.id, timeout / 2).await?;
        }
    }

    // 3. Close management API
    self.grpc_server.shutdown().await;

    Ok(())
}

11. Compatibility

11.1 API Versioning

  • gRPC package: fiberlb.v1
  • Semantic versioning for breaking changes
  • Backward compatible additions within major version

11.2 Protocol Support

| Protocol | Version | Status |
|----------|---------|--------|
| HTTP | 1.0, 1.1 | Supported |
| HTTP | 2 | Supported |
| HTTP | 3 (QUIC) | Planned |
| TLS | 1.2, 1.3 | Supported |
| TCP | - | Supported |
| UDP | - | Supported |
| WebSocket | - | Supported (via HTTP upgrade) |

11.3 Backend Compatibility

  • Direct IP addresses
  • DNS hostnames (with periodic re-resolution)
  • PlasmaVMC VM IDs (resolved via API)

Appendix

A. Error Codes

gRPC Errors:

| Error | Description |
|-------|-------------|
| `LB_NOT_FOUND` | Load balancer does not exist |
| `LISTENER_NOT_FOUND` | Listener does not exist |
| `POOL_NOT_FOUND` | Pool does not exist |
| `BACKEND_NOT_FOUND` | Backend does not exist |
| `LB_ALREADY_EXISTS` | LB name already in use |
| `LISTENER_PORT_CONFLICT` | Port already in use on LB |
| `INVALID_LB_NAME` | LB name format invalid |
| `INVALID_BACKEND_ADDRESS` | Backend address invalid |
| `ACCESS_DENIED` | Permission denied |
| `POOL_NOT_EMPTY` | Cannot delete pool with backends |
| `BACKEND_UNHEALTHY` | Backend failed health check |
| `CIRCUIT_OPEN` | Circuit breaker is open |
| `QUOTA_EXCEEDED` | LB/pool/backend quota exceeded |

B. Port Assignments

| Port | Protocol | Purpose |
|------|----------|---------|
| 6300 | gRPC | Management API |
| 9090 | HTTP | Prometheus metrics |
| 80 | HTTP | Default HTTP listener (configurable) |
| 443 | HTTPS | Default HTTPS listener (configurable) |

C. Glossary

  • Load Balancer: A logical grouping of listeners and pools for traffic distribution
  • Listener: A network endpoint that accepts incoming traffic
  • Pool: A group of backend servers with a load balancing algorithm
  • Backend: An individual server that receives traffic from a pool
  • Health Check: Periodic probe to verify backend availability
  • Circuit Breaker: Pattern to prevent cascading failures by failing fast
  • Drain: Graceful removal of a backend by stopping new connections
  • L4: Layer 4 (transport layer) - TCP/UDP load balancing
  • L7: Layer 7 (application layer) - HTTP/HTTPS load balancing with routing
  • TLS Termination: Decrypting TLS at the load balancer
  • TLS Passthrough: Forwarding encrypted traffic directly to backends

D. Example Configurations

Basic HTTP Load Balancer:

// Create LB with HTTP listener and round-robin pool
let lb = client.create_load_balancer(CreateLoadBalancerRequest {
    name: "simple-http".into(),
    org_id: "acme".into(),
    project_id: Some("web".into()),
    ..Default::default()
}).await?;

let pool = client.create_pool(CreatePoolRequest {
    lb_id: lb.id.clone(),
    name: "backends".into(),
    algorithm: Algorithm::RoundRobin,
    health_check: Some(HealthCheck::http("/health", vec![200])),
    ..Default::default()
}).await?;

// Add backends
for addr in ["10.0.1.10", "10.0.1.11", "10.0.1.12"] {
    client.add_backend(AddBackendRequest {
        pool_id: pool.id.clone(),
        name: format!("backend-{}", addr),
        address: BackendAddress::Ip(addr.parse()?),
        port: 8080,
        ..Default::default()
    }).await?;
}

let listener = client.create_listener(CreateListenerRequest {
    lb_id: lb.id.clone(),
    name: "http".into(),
    protocol: ListenerProtocol::Http,
    port: 80,
    default_pool_id: Some(pool.id.clone()),
    ..Default::default()
}).await?;

HTTPS with Path-Based Routing:

// Create pools for different services
let api_pool = create_pool("api-pool", Algorithm::LeastConnections);
let static_pool = create_pool("static-pool", Algorithm::RoundRobin);

// HTTPS listener with routing rules
let https = client.create_listener(CreateListenerRequest {
    lb_id: lb.id.clone(),
    name: "https".into(),
    protocol: ListenerProtocol::Https,
    port: 443,
    tls_config: Some(TlsConfig {
        certificate_id: "cert-web-prod".into(),
        min_version: TlsVersion::Tls12,
        ..Default::default()
    }),
    default_pool_id: Some(static_pool.id.clone()),
    ..Default::default()
}).await?;

// Route /api/* to API pool
client.add_routing_rule(AddRoutingRuleRequest {
    listener_id: https.id.clone(),
    rule: RoutingRule {
        name: "api-route".into(),
        priority: 10,
        conditions: vec![
            RuleCondition::PathPrefix { value: "/api/".into() },
        ],
        action: RuleAction::ForwardToPool { pool_id: api_pool.id.clone() },
        enabled: true,
        ..Default::default()
    },
}).await?;

TCP Load Balancer (Database):

let lb = client.create_load_balancer(CreateLoadBalancerRequest {
    name: "postgres-lb".into(),
    org_id: "acme".into(),
    project_id: Some("data".into()),
    ..Default::default()
}).await?;

let pool = client.create_pool(CreatePoolRequest {
    lb_id: lb.id.clone(),
    name: "postgres-replicas".into(),
    algorithm: Algorithm::LeastConnections,
    health_check: Some(HealthCheck {
        check_type: HealthCheckType::Tcp {},
        interval: 10,
        timeout: 5,
        ..Default::default()
    }),
    ..Default::default()
}).await?;

let listener = client.create_listener(CreateListenerRequest {
    lb_id: lb.id.clone(),
    name: "postgres".into(),
    protocol: ListenerProtocol::Tcp,
    port: 5432,
    default_pool_id: Some(pool.id.clone()),
    ..Default::default()
}).await?;

E. Performance Considerations

  • Connection pooling: Reuse backend connections for L7
  • Zero-copy forwarding: Use splice/sendfile where possible for L4
  • Health check spreading: Jitter to avoid thundering herd
  • Circuit breaker: Fail fast to prevent cascade failures
  • Backend caching: Cache VM IP resolution from PlasmaVMC
  • Hot config reload: Update routing without connection drops
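As an example of keeping the hot path cheap, a round-robin pick over the cached healthy backend list can be a single atomic increment with no locks. This is a sketch only; the real data plane also has to honor weights, session affinity, and circuit state:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Minimal round-robin picker: one atomic counter per pool, lock-free on the hot path.
pub struct RoundRobin {
    next: AtomicUsize,
}

impl RoundRobin {
    pub fn new() -> Self {
        Self { next: AtomicUsize::new(0) }
    }

    /// Pick the next backend from a slice of currently healthy backends.
    pub fn pick<'a, T>(&self, backends: &'a [T]) -> Option<&'a T> {
        if backends.is_empty() {
            return None; // all backends unhealthy or circuit open
        }
        let i = self.next.fetch_add(1, Ordering::Relaxed) % backends.len();
        backends.get(i)
    }
}
```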