T055.S2: L7 Load Balancing Design Specification

Author: PeerA
Date: 2025-12-12
Status: DRAFT

1. Executive Summary

This document specifies the L7 (HTTP/HTTPS) load balancing implementation for FiberLB. The design extends the existing L4 TCP proxy with HTTP-aware routing, TLS termination, and policy-based backend selection.

2. Current State Analysis

2.1 Existing L7 Type Foundation

File: fiberlb-types/src/listener.rs

pub enum ListenerProtocol {
    Tcp,              // L4
    Udp,              // L4
    Http,             // L7 - exists but unused
    Https,            // L7 - exists but unused
    TerminatedHttps,  // L7 - exists but unused
}

pub struct TlsConfig {
    pub certificate_id: String,
    pub min_version: TlsVersion,
    pub cipher_suites: Vec<String>,
}

File: fiberlb-types/src/pool.rs

pub enum PoolProtocol {
    Tcp,   // L4
    Udp,   // L4
    Http,  // L7 - exists but unused
    Https, // L7 - exists but unused
}

pub enum PersistenceType {
    SourceIp,   // L4
    Cookie,     // L7 - exists but unused
    AppCookie,  // L7 - exists but unused
}

2.2 L4 DataPlane Architecture

File: fiberlb-server/src/dataplane.rs

Current architecture:

  • TCP proxy using tokio::net::TcpListener
  • Bidirectional copy via tokio::io::copy
  • Round-robin backend selection (Maglev ready but not integrated)

Gap: No HTTP parsing, no L7 routing rules, no TLS termination.
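
The existing L4 code is not reproduced in this document, so the following is only a sketch of what the round-robin selection listed above typically looks like. The `RoundRobin` type and its `pick` method are hypothetical names chosen for illustration; the real `dataplane.rs` may differ.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical round-robin selector (not taken from dataplane.rs).
pub struct RoundRobin {
    next: AtomicUsize,
}

impl RoundRobin {
    pub fn new() -> Self {
        Self { next: AtomicUsize::new(0) }
    }

    /// Return the next backend in rotation, or None if the pool is empty.
    pub fn pick<'a, T>(&self, backends: &'a [T]) -> Option<&'a T> {
        if backends.is_empty() {
            return None;
        }
        // fetch_add wraps on overflow, so the index stays valid after the modulo.
        let index = self.next.fetch_add(1, Ordering::Relaxed) % backends.len();
        backends.get(index)
    }
}
```

Because the counter is atomic, a single selector can be shared across connection tasks without a lock. The L7 proxy could reuse the same kind of selector once a pool has been chosen by policy evaluation.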

3. L7 Architecture Design

3.1 High-Level Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                          FiberLB Server                                  │
│                                                                          │
│  ┌─────────────────────────────────────────────────────────────────────┐│
│  │                        L7 Data Plane                                 ││
│  │                                                                      ││
│  │  ┌──────────────┐    ┌─────────────────┐    ┌──────────────────────┐││
│  │  │  TLS         │    │   HTTP Router   │    │   Backend Connector  │││
│  │  │  Termination │───>│   (Policy Eval) │───>│   (Connection Pool)  │││
│  │  │  (rustls)    │    │                 │    │                      │││
│  │  └──────────────┘    └─────────────────┘    └──────────────────────┘││
│  │          ▲                   │                        │              ││
│  │          │                   ▼                        ▼              ││
│  │  ┌───────┴──────┐    ┌─────────────────┐    ┌──────────────────────┐││
│  │  │  axum/hyper  │    │   L7Policy      │    │   Health Check       │││
│  │  │  HTTP Server │    │   Evaluator     │    │   Integration        │││
│  │  └──────────────┘    └─────────────────┘    └──────────────────────┘││
│  └─────────────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────────────┘

3.2 Technology Selection

| Component       | Selection              | Rationale                             |
|-----------------|------------------------|---------------------------------------|
| HTTP Server     | axum                   | Already in workspace, familiar API    |
| TLS             | rustls via axum-server | Pure Rust, no OpenSSL dependency      |
| HTTP Client     | hyper                  | Low-level control for proxy scenarios |
| Connection Pool | hyper-util             | Efficient backend connection reuse    |

Alternative Considered: Cloudflare Pingora

  • Pros: High performance, battle-tested
  • Cons: Heavy dependency, different paradigm, learning curve
  • Decision: Start with axum/hyper; revisit Pingora for v2 if performance proves insufficient

4. New Types

4.1 L7Policy

Content-based routing policy attached to a Listener.

// File: fiberlb-types/src/l7policy.rs

/// Unique identifier for an L7 policy
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Serialize, Deserialize)]
pub struct L7PolicyId(Uuid);

/// L7 routing policy
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct L7Policy {
    pub id: L7PolicyId,
    pub listener_id: ListenerId,
    pub name: String,

    /// Evaluation order (lower = higher priority)
    pub position: u32,

    /// Action to take when rules match
    pub action: L7PolicyAction,

    /// Redirect URL (for RedirectToUrl action)
    pub redirect_url: Option<String>,

    /// Target pool (for RedirectToPool action)
    pub redirect_pool_id: Option<PoolId>,

    /// HTTP status code for redirects/rejects
    pub redirect_http_status_code: Option<u16>,

    pub enabled: bool,
    pub created_at: u64,
    pub updated_at: u64,
}

/// Policy action when rules match
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
pub enum L7PolicyAction {
    /// Route to a specific pool
    RedirectToPool,
    /// Return HTTP redirect to URL
    RedirectToUrl,
    /// Reject request with status code
    Reject,
}
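
Since `redirect_url`, `redirect_pool_id`, and `redirect_http_status_code` are all optional, each action implicitly depends on a different subset of them. The sketch below shows one way the control plane might validate a policy before storing it. The `PolicyFields` struct, `PolicyError` enum, and the status-code ranges are assumptions made for illustration and are not part of this specification.

```rust
/// Simplified stand-in for the spec's L7PolicyAction (illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum L7PolicyAction {
    RedirectToPool,
    RedirectToUrl,
    Reject,
}

/// Hypothetical subset of L7Policy holding only the action-dependent fields.
#[derive(Debug, Clone)]
pub struct PolicyFields {
    pub action: L7PolicyAction,
    pub redirect_url: Option<String>,
    pub redirect_pool_id: Option<String>,
    pub redirect_http_status_code: Option<u16>,
}

/// Hypothetical validation error; not defined by the spec.
#[derive(Debug, PartialEq, Eq)]
pub enum PolicyError {
    MissingPoolId,
    MissingUrl,
    InvalidStatus(u16),
}

/// Check that the optional fields required by `action` are present.
pub fn validate_policy(p: &PolicyFields) -> Result<(), PolicyError> {
    match p.action {
        L7PolicyAction::RedirectToPool => {
            if p.redirect_pool_id.is_none() {
                return Err(PolicyError::MissingPoolId);
            }
        }
        L7PolicyAction::RedirectToUrl => {
            if p.redirect_url.as_deref().map_or(true, str::is_empty) {
                return Err(PolicyError::MissingUrl);
            }
            // Assumed rule: a redirect status, when given, must be 3xx.
            if let Some(code) = p.redirect_http_status_code {
                if !(300..400).contains(&code) {
                    return Err(PolicyError::InvalidStatus(code));
                }
            }
        }
        L7PolicyAction::Reject => {
            // Assumed rule: a reject status, when given, must be 4xx or 5xx.
            if let Some(code) = p.redirect_http_status_code {
                if !(400..600).contains(&code) {
                    return Err(PolicyError::InvalidStatus(code));
                }
            }
        }
    }
    Ok(())
}
```

Rejecting inconsistent policies at creation time keeps the request path free of "action set but target missing" cases.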

4.2 L7Rule

Match conditions for L7Policy evaluation.

// File: fiberlb-types/src/l7rule.rs

/// Unique identifier for an L7 rule
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Serialize, Deserialize)]
pub struct L7RuleId(Uuid);

/// L7 routing rule (match condition)
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct L7Rule {
    pub id: L7RuleId,
    pub policy_id: L7PolicyId,

    /// Type of comparison
    pub rule_type: L7RuleType,

    /// Comparison operator
    pub compare_type: L7CompareType,

    /// Value to compare against
    pub value: String,

    /// Key for header/cookie rules
    pub key: Option<String>,

    /// Invert the match result
    pub invert: bool,

    pub created_at: u64,
    pub updated_at: u64,
}

/// What to match against
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
pub enum L7RuleType {
    /// Match request hostname (Host header or SNI)
    HostName,
    /// Match request path
    Path,
    /// Match file extension (e.g., .jpg, .css)
    FileType,
    /// Match HTTP header value
    Header,
    /// Match cookie value
    Cookie,
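    /// Match SSL SNI hostname
    // Explicit name: snake_case would otherwise render this variant as "ssl_conn_sn_i"
    #[serde(rename = "ssl_conn_sni")]
    SslConnSnI,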
}

/// How to compare
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
pub enum L7CompareType {
    /// Exact match
    EqualTo,
    /// Regex match
    Regex,
    /// String starts with
    StartsWith,
    /// String ends with
    EndsWith,
    /// String contains
    Contains,
}
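
To make the interaction between `compare_type` and `invert` concrete, here is a small dependency-free sketch. `CompareType` is a simplified stand-in for `L7CompareType` with the `Regex` variant left out to keep the example standard-library only, and `rule_matches` is a hypothetical helper name.

```rust
/// Simplified stand-in for L7CompareType (Regex omitted for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CompareType {
    EqualTo,
    StartsWith,
    EndsWith,
    Contains,
}

/// Compare a value extracted from the request against the rule's pattern.
pub fn compare(value: &str, pattern: &str, compare_type: CompareType) -> bool {
    match compare_type {
        CompareType::EqualTo => value == pattern,
        CompareType::StartsWith => value.starts_with(pattern),
        CompareType::EndsWith => value.ends_with(pattern),
        CompareType::Contains => value.contains(pattern),
    }
}

/// Apply the comparison, then the `invert` flag.
/// A missing value (e.g. an absent header) counts as "not matched" first,
/// so an inverted rule matches requests that lack the value entirely.
pub fn rule_matches(
    value: Option<&str>,
    pattern: &str,
    compare_type: CompareType,
    invert: bool,
) -> bool {
    let matched = value.map_or(false, |v| compare(v, pattern, compare_type));
    matched != invert
}
```

One consequence worth noting: a rule such as "header X-Debug is not equal to 1" (with `invert` set) also matches requests that do not send `X-Debug` at all.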

5. L7DataPlane Implementation

5.1 Module Structure

fiberlb-server/src/
├── dataplane.rs      (L4 - existing)
├── l7_dataplane.rs   (NEW - L7 HTTP proxy)
├── l7_router.rs      (NEW - Policy/Rule evaluation)
├── tls.rs            (NEW - TLS configuration)
└── maglev.rs         (existing)

5.2 L7DataPlane Core

// File: fiberlb-server/src/l7_dataplane.rs

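use std::collections::HashMap;
use std::sync::Arc;

use axum::{
    body::Body,
    extract::State,
    http::{Request, StatusCode},
    response::{IntoResponse, Redirect},
    Router,
};
use hyper_util::client::legacy::{connect::HttpConnector, Client};
use hyper_util::rt::TokioExecutor;
use tokio::sync::RwLock;
use tower::ServiceExt;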

/// L7 HTTP/HTTPS Data Plane
pub struct L7DataPlane {
    metadata: Arc<LbMetadataStore>,
    router: Arc<L7Router>,
    http_client: Client<HttpConnector, Body>,
    listeners: Arc<RwLock<HashMap<ListenerId, L7ListenerHandle>>>,
}

impl L7DataPlane {
    pub fn new(metadata: Arc<LbMetadataStore>) -> Self {
        let http_client = Client::builder(TokioExecutor::new())
            .pool_max_idle_per_host(32)
            .build_http();

        Self {
            metadata: metadata.clone(),
            router: Arc::new(L7Router::new(metadata)),
            http_client,
            listeners: Arc::new(RwLock::new(HashMap::new())),
        }
    }

    /// Start an HTTP/HTTPS listener
    pub async fn start_listener(&self, listener_id: ListenerId) -> Result<()> {
        let listener = self.find_listener(&listener_id).await?;

        let app = self.build_router(&listener).await?;

        let bind_addr = format!("0.0.0.0:{}", listener.port);

        match listener.protocol {
            ListenerProtocol::Http => {
                self.start_http_server(listener_id, &bind_addr, app).await
            }
            ListenerProtocol::Https | ListenerProtocol::TerminatedHttps => {
                let tls_config = listener.tls_config
                    .ok_or(L7Error::TlsConfigMissing)?;
                self.start_https_server(listener_id, &bind_addr, app, tls_config).await
            }
            _ => Err(L7Error::InvalidProtocol),
        }
    }

    /// Build axum router for a listener
    async fn build_router(&self, listener: &Listener) -> Result<Router> {
        let state = ProxyState {
            metadata: self.metadata.clone(),
            router: self.router.clone(),
            http_client: self.http_client.clone(),
            listener_id: listener.id,
            default_pool_id: listener.default_pool_id,
        };

        Ok(Router::new()
            .fallback(proxy_handler)
            .with_state(state))
    }
}

/// Proxy request handler
async fn proxy_handler(
    State(state): State<ProxyState>,
    request: Request<Body>,
) -> impl IntoResponse {
    // 1. Evaluate L7 policies to determine target pool
    let routing_result = state.router
        .evaluate(&state.listener_id, &request)
        .await;

    match routing_result {
        RoutingResult::Pool(pool_id) => {
            proxy_to_pool(&state, pool_id, request).await
        }
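        RoutingResult::Redirect { url, status } => {
            // Honor the policy's status code (Redirect::to would always send 303);
            // fall back to 302 Found if the stored code is not a valid status.
            let code = StatusCode::from_u16(status).unwrap_or(StatusCode::FOUND);
            (code, [(axum::http::header::LOCATION, url)]).into_response()
        }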
        RoutingResult::Reject { status } => {
            StatusCode::from_u16(status)
                .unwrap_or(StatusCode::FORBIDDEN)
                .into_response()
        }
        RoutingResult::Default => {
            match state.default_pool_id {
                Some(pool_id) => proxy_to_pool(&state, pool_id, request).await,
                None => StatusCode::SERVICE_UNAVAILABLE.into_response(),
            }
        }
    }
}

5.3 L7Router (Policy Evaluation)

// File: fiberlb-server/src/l7_router.rs

/// L7 routing engine
pub struct L7Router {
    metadata: Arc<LbMetadataStore>,
}

impl L7Router {
    /// Evaluate policies for a request
    pub async fn evaluate(
        &self,
        listener_id: &ListenerId,
        request: &Request<Body>,
    ) -> RoutingResult {
        // Load policies ordered by position
        let policies = self.metadata
            .list_l7_policies(listener_id)
            .await
            .unwrap_or_default();

        for policy in policies.iter().filter(|p| p.enabled) {
            // Load rules for this policy
            let rules = self.metadata
                .list_l7_rules(&policy.id)
                .await
                .unwrap_or_default();

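            // All rules must match (AND logic). Note that `all` is vacuously true for
            // an empty rule list, so a policy without rules matches every request.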
            if rules.iter().all(|rule| self.evaluate_rule(rule, request)) {
                return self.apply_policy_action(policy);
            }
        }

        RoutingResult::Default
    }

    /// Evaluate a single rule
    fn evaluate_rule(&self, rule: &L7Rule, request: &Request<Body>) -> bool {
        let value = match rule.rule_type {
            L7RuleType::HostName => {
                request.headers()
                    .get("host")
                    .and_then(|v| v.to_str().ok())
                    .map(|s| s.to_string())
            }
            L7RuleType::Path => {
                Some(request.uri().path().to_string())
            }
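            L7RuleType::FileType => {
                // Extension of the last path segment; None when it contains no '.'
                request.uri().path()
                    .rsplit('/')
                    .next()
                    .and_then(|segment| segment.rsplit_once('.'))
                    .map(|(_, ext)| ext.to_string())
            }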
            L7RuleType::Header => {
                rule.key.as_ref().and_then(|key| {
                    request.headers()
                        .get(key)
                        .and_then(|v| v.to_str().ok())
                        .map(|s| s.to_string())
                })
            }
            L7RuleType::Cookie => {
                self.extract_cookie(request, rule.key.as_deref())
            }
            L7RuleType::SslConnSnI => {
                // SNI extracted during TLS handshake, stored in extension
                request.extensions()
                    .get::<SniHostname>()
                    .map(|s| s.0.clone())
            }
        };

        let matched = match value {
            Some(v) => self.compare(&v, &rule.value, rule.compare_type),
            None => false,
        };

        if rule.invert { !matched } else { matched }
    }

    fn compare(&self, value: &str, pattern: &str, compare_type: L7CompareType) -> bool {
        match compare_type {
            L7CompareType::EqualTo => value == pattern,
            L7CompareType::StartsWith => value.starts_with(pattern),
            L7CompareType::EndsWith => value.ends_with(pattern),
            L7CompareType::Contains => value.contains(pattern),
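            L7CompareType::Regex => {
                // Compiled on every call; cache compiled patterns if this shows up
                // in profiles. An invalid pattern never matches.
                regex::Regex::new(pattern)
                    .map(|r| r.is_match(value))
                    .unwrap_or(false)
            }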
        }
    }
}
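
The evaluation order implemented by `L7Router::evaluate` (enabled policies only, ascending `position`, AND across rules, first match wins, otherwise the default) can be sketched without any dependencies. In this sketch `Policy`, `Routing`, and `route` are hypothetical simplifications: rules are reduced to path prefixes, and the sort is done explicitly rather than relying on the metadata store's ordering.

```rust
/// Simplified policy: rules are reduced to path prefixes that must all match.
#[derive(Debug, Clone)]
pub struct Policy {
    pub position: u32,
    pub enabled: bool,
    pub path_prefixes: Vec<String>,
    pub pool: String,
}

/// Simplified stand-in for RoutingResult.
#[derive(Debug, PartialEq, Eq)]
pub enum Routing {
    Pool(String),
    Default,
}

/// Pick the pool of the first enabled policy, in position order, whose rules all match.
pub fn route(policies: &[Policy], path: &str) -> Routing {
    let mut ordered: Vec<&Policy> = policies.iter().filter(|p| p.enabled).collect();
    // Lower position means higher priority.
    ordered.sort_by_key(|p| p.position);

    for policy in ordered {
        let all_match = policy
            .path_prefixes
            .iter()
            .all(|prefix| path.starts_with(prefix.as_str()));
        if all_match {
            return Routing::Pool(policy.pool.clone());
        }
    }
    // No policy matched: the caller falls back to the listener's default pool.
    Routing::Default
}
```

Sorting inside the function makes the priority rule explicit; in the real router the same effect could come from the store returning policies already ordered by `position`.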

6. TLS Termination

6.1 Certificate Management

// File: fiberlb-types/src/certificate.rs

/// TLS Certificate
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Certificate {
    pub id: CertificateId,
    pub loadbalancer_id: LoadBalancerId,
    pub name: String,

    /// PEM-encoded certificate chain
    pub certificate: String,

    /// PEM-encoded private key (encrypted at rest)
    pub private_key: String,

    /// Certificate type
    pub cert_type: CertificateType,

    /// Expiration timestamp
    pub expires_at: u64,

    pub created_at: u64,
    pub updated_at: u64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
pub enum CertificateType {
    /// Standard certificate
    Server,
    /// CA certificate for client auth
    ClientCa,
    /// SNI certificate
    Sni,
}

6.2 TLS Configuration

// File: fiberlb-server/src/tls.rs

use std::collections::HashMap;
use std::sync::Arc;

use rustls::pki_types::{CertificateDer, PrivateKeyDer};
use rustls::server::{ClientHello, ResolvesServerCert};
use rustls::sign::CertifiedKey;
use rustls::ServerConfig;
use rustls_pemfile::{certs, pkcs8_private_keys};

// Uses the rustls 0.23 / rustls-pemfile 2.x APIs listed in section 12.
pub fn build_tls_config(
    cert_pem: &str,
    key_pem: &str,
    min_version: TlsVersion,
) -> Result<ServerConfig> {
    // rustls-pemfile 2.x yields one io::Result per PEM item
    let certs: Vec<CertificateDer<'static>> = certs(&mut cert_pem.as_bytes())
        .collect::<std::io::Result<_>>()?;

    let key: PrivateKeyDer<'static> = pkcs8_private_keys(&mut key_pem.as_bytes())
        .next()
        .ok_or(TlsError::NoPrivateKey)??
        .into();

    // Protocol versions are fixed when the builder is created
    let builder = match min_version {
        TlsVersion::Tls12 => ServerConfig::builder_with_protocol_versions(
            &[&rustls::version::TLS12, &rustls::version::TLS13],
        ),
        TlsVersion::Tls13 => {
            ServerConfig::builder_with_protocol_versions(&[&rustls::version::TLS13])
        }
    };

    let config = builder
        .with_no_client_auth()
        .with_single_cert(certs, key)?;

    Ok(config)
}

/// SNI-based certificate resolver for multiple domains
#[derive(Debug)]
pub struct SniCertResolver {
    certs: HashMap<String, Arc<CertifiedKey>>,
    default: Arc<CertifiedKey>,
}

impl ResolvesServerCert for SniCertResolver {
    fn resolve(&self, client_hello: ClientHello<'_>) -> Option<Arc<CertifiedKey>> {
        // Fall back to the default certificate when SNI is absent or unknown
        let key = client_hello
            .server_name()
            .and_then(|sni| self.certs.get(sni))
            .unwrap_or(&self.default);
        Some(Arc::clone(key))
    }
}

7. Session Persistence (L7)

impl L7DataPlane {
    /// Add session persistence cookie to response
    fn add_persistence_cookie(
        &self,
        response: &mut Response<Body>,
        persistence: &SessionPersistence,
        backend_id: &str,
    ) {
        if persistence.persistence_type != PersistenceType::Cookie {
            return;
        }

        let cookie_name = persistence.cookie_name
            .as_deref()
            .unwrap_or("SERVERID");

        let cookie_value = format!(
            "{}={}; Max-Age={}; Path=/; HttpOnly",
            cookie_name,
            backend_id,
            persistence.timeout_seconds
        );

        // Skip the cookie rather than panic if the value is not a valid header
        if let Ok(value) = HeaderValue::from_str(&cookie_value) {
            response.headers_mut().append("Set-Cookie", value);
        }
    }

    /// Extract backend from persistence cookie
    fn get_persistent_backend(
        &self,
        request: &Request<Body>,
        persistence: &SessionPersistence,
    ) -> Option<String> {
        let cookie_name = persistence.cookie_name
            .as_deref()
            .unwrap_or("SERVERID");

        request.headers()
            .get("cookie")
            .and_then(|v| v.to_str().ok())
            .and_then(|cookies| {
                cookies.split(';')
                    .find_map(|c| {
                        let parts: Vec<_> = c.trim().splitn(2, '=').collect();
                        if parts.len() == 2 && parts[0] == cookie_name {
                            Some(parts[1].to_string())
                        } else {
                            None
                        }
                    })
            })
    }
}
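
The cookie round trip above can be exercised in isolation with plain strings. The sketch below mirrors the formatting and parsing logic and adds one assumption that is not stated in this specification: because the cookie is client-controlled, the backend it names is only honored if it is still in the healthy set. `persistence_cookie`, `cookie_value`, and `sticky_backend` are hypothetical helper names.

```rust
/// Build the Set-Cookie value that pins a client to `backend_id`.
pub fn persistence_cookie(name: &str, backend_id: &str, max_age_secs: u64) -> String {
    format!("{name}={backend_id}; Max-Age={max_age_secs}; Path=/; HttpOnly")
}

/// Find `name` in a Cookie request header value such as "a=1; SERVERID=b2".
pub fn cookie_value<'a>(cookie_header: &'a str, name: &str) -> Option<&'a str> {
    cookie_header.split(';').find_map(|pair| {
        let (key, value) = pair.trim().split_once('=')?;
        (key == name).then_some(value)
    })
}

/// Assumed policy: honor the cookie only if it names a currently healthy backend.
pub fn sticky_backend<'a>(
    cookie_header: &str,
    name: &str,
    healthy: &'a [String],
) -> Option<&'a String> {
    let wanted = cookie_value(cookie_header, name)?;
    healthy.iter().find(|id| id.as_str() == wanted)
}
```

When `sticky_backend` returns `None`, the proxy would fall back to normal backend selection and issue a fresh cookie on the response.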

8. Health Checks (L7)

8.1 HTTP Health Check

// Extend existing health check for L7

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct HttpHealthCheck {
    /// HTTP method (GET, HEAD, POST)
    pub method: String,
    /// URL path to check
    pub url_path: String,
    /// Expected HTTP status codes (e.g., [200, 201, 204])
    pub expected_codes: Vec<u16>,
    /// Host header to send
    pub host_header: Option<String>,
}

impl HealthChecker {
    async fn check_http_backend(&self, backend: &Backend, config: &HttpHealthCheck) -> bool {
        let url = format!("http://{}:{}{}", backend.address, backend.port, config.url_path);

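        // An invalid method or URL marks the backend unhealthy instead of panicking
        let request = match Request::builder()
            .method(config.method.as_str())
            .uri(&url)
            .header("Host", config.host_header.as_deref().unwrap_or(&backend.address))
            .body(Body::empty())
        {
            Ok(request) => request,
            Err(_) => return false,
        };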

        match self.http_client.request(request).await {
            Ok(response) => {
                config.expected_codes.contains(&response.status().as_u16())
            }
            Err(_) => false,
        }
    }
}

9. Integration Points

9.1 Server Integration

// File: fiberlb-server/src/server.rs

impl FiberLBServer {
    pub async fn run(&self) -> Result<()> {
        let l4_dataplane = DataPlane::new(self.metadata.clone());
        let l7_dataplane = L7DataPlane::new(self.metadata.clone());

        // Watch for listener changes
        tokio::spawn(async move {
            // Start L4 listeners (TCP/UDP)
            // Start L7 listeners (HTTP/HTTPS)
        });

        // Run gRPC control plane
        // ...
    }
}
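
The listener watcher has to decide which data plane owns each listener. A minimal sketch of that dispatch, based on the `ListenerProtocol` variants from section 2.1, might look as follows. The `Plane` enum and the `plane_for` and `requires_tls` helpers are hypothetical names introduced for illustration.

```rust
/// Copy of the protocol variants listed in section 2.1.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ListenerProtocol {
    Tcp,
    Udp,
    Http,
    Https,
    TerminatedHttps,
}

/// Hypothetical marker for which data plane serves a listener.
#[derive(Debug, PartialEq, Eq)]
pub enum Plane {
    L4,
    L7,
}

/// Route TCP/UDP listeners to the L4 plane and HTTP-family listeners to the L7 plane.
pub fn plane_for(protocol: ListenerProtocol) -> Plane {
    match protocol {
        ListenerProtocol::Tcp | ListenerProtocol::Udp => Plane::L4,
        ListenerProtocol::Http
        | ListenerProtocol::Https
        | ListenerProtocol::TerminatedHttps => Plane::L7,
    }
}

/// Listeners that need a TlsConfig before they can start (see start_listener).
pub fn requires_tls(protocol: ListenerProtocol) -> bool {
    matches!(
        protocol,
        ListenerProtocol::Https | ListenerProtocol::TerminatedHttps
    )
}
```

Keeping the match exhaustive means that adding a new protocol variant later forces a compile-time decision about which plane handles it.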

9.2 gRPC API Extensions

// Additions to fiberlb.proto

message L7Policy {
  string id = 1;
  string listener_id = 2;
  string name = 3;
  uint32 position = 4;
  L7PolicyAction action = 5;
  optional string redirect_url = 6;
  optional string redirect_pool_id = 7;
  optional uint32 redirect_http_status_code = 8;
  bool enabled = 9;
}

message L7Rule {
  string id = 1;
  string policy_id = 2;
  L7RuleType rule_type = 3;
  L7CompareType compare_type = 4;
  string value = 5;
  optional string key = 6;
  bool invert = 7;
}

service FiberLBService {
  // Existing methods...

  // L7 Policy management
  rpc CreateL7Policy(CreateL7PolicyRequest) returns (CreateL7PolicyResponse);
  rpc GetL7Policy(GetL7PolicyRequest) returns (GetL7PolicyResponse);
  rpc ListL7Policies(ListL7PoliciesRequest) returns (ListL7PoliciesResponse);
  rpc UpdateL7Policy(UpdateL7PolicyRequest) returns (UpdateL7PolicyResponse);
  rpc DeleteL7Policy(DeleteL7PolicyRequest) returns (DeleteL7PolicyResponse);

  // L7 Rule management
  rpc CreateL7Rule(CreateL7RuleRequest) returns (CreateL7RuleResponse);
  rpc GetL7Rule(GetL7RuleRequest) returns (GetL7RuleResponse);
  rpc ListL7Rules(ListL7RulesRequest) returns (ListL7RulesResponse);
  rpc UpdateL7Rule(UpdateL7RuleRequest) returns (UpdateL7RuleResponse);
  rpc DeleteL7Rule(DeleteL7RuleRequest) returns (DeleteL7RuleResponse);

  // Certificate management
  rpc CreateCertificate(CreateCertificateRequest) returns (CreateCertificateResponse);
  rpc GetCertificate(GetCertificateRequest) returns (GetCertificateResponse);
  rpc ListCertificates(ListCertificatesRequest) returns (ListCertificatesResponse);
  rpc DeleteCertificate(DeleteCertificateRequest) returns (DeleteCertificateResponse);
}

10. Implementation Plan

Phase 1: Types & Storage (Day 1)

  1. Add L7Policy, L7Rule, Certificate types to fiberlb-types
  2. Add protobuf definitions
  3. Implement metadata storage for L7 policies

Phase 2: L7DataPlane (Day 1-2)

  1. Create l7_dataplane.rs with axum-based HTTP server
  2. Implement basic HTTP proxy (no routing)
  3. Add connection pooling to backends

Phase 3: TLS Termination (Day 2)

  1. Implement TLS configuration building
  2. Add SNI-based certificate selection
  3. HTTPS listener support

Phase 4: L7 Routing (Day 2-3)

  1. Implement L7Router policy evaluation
  2. Add all rule types (HostName, Path, FileType, Header, Cookie, SslConnSnI)
  3. Cookie-based session persistence

Phase 5: API & Integration (Day 3)

  1. gRPC API for L7Policy/L7Rule CRUD
  2. REST API endpoints
  3. Integration with control plane

11. Configuration Example

# Example: Route /api/* to api-pool and static asset extensions (.js, .css, .png, .jpg, .svg) to cdn-pool
listeners:
  - name: https-frontend
    port: 443
    protocol: https
    tls_config:
      certificate_id: cert-main
      min_version: tls12
    default_pool_id: default-pool

l7_policies:
  - name: api-routing
    listener_id: https-frontend
    position: 10
    action: redirect_to_pool
    redirect_pool_id: api-pool
    rules:
      - rule_type: path
        compare_type: starts_with
        value: "/api/"

  - name: static-routing
    listener_id: https-frontend
    position: 20
    action: redirect_to_pool
    redirect_pool_id: cdn-pool
    rules:
      - rule_type: path
        compare_type: regex
        value: "\\.(js|css|png|jpg|svg)$"

12. Dependencies

Add to fiberlb-server/Cargo.toml:

[dependencies]
# HTTP/TLS
axum = { version = "0.8", features = ["http2"] }
axum-server = { version = "0.7", features = ["tls-rustls"] }
hyper = { version = "1.0", features = ["full"] }
hyper-util = { version = "0.1", features = ["client", "client-legacy", "http1", "http2"] }
rustls = "0.23"
rustls-pemfile = "2.0"
tokio-rustls = "0.26"

# Routing
regex = "1.10"

13. Decision Summary

| Aspect              | Decision          | Rationale                                    |
|---------------------|-------------------|----------------------------------------------|
| HTTP Framework      | axum              | Consistent with other services, familiar API |
| TLS Library         | rustls            | Pure Rust, no OpenSSL complexity             |
| L7 Routing          | Policy/Rule model | OpenStack Octavia-compatible, flexible       |
| Certificate Storage | ChainFire         | Consistent with metadata, encrypted at rest  |
| Session Persistence | Cookie-based      | Standard approach for L7                     |
