T057.S1: IPAM System Design Specification
Author: PeerA
Date: 2025-12-12
Status: DRAFT
1. Executive Summary
This document specifies the IPAM (IP Address Management) system for k8shost integration with PrismNET. The design extends PrismNET's existing IPAM capabilities to support Kubernetes Service ClusterIP and LoadBalancer IP allocation.
2. Current State Analysis
2.1 k8shost Service IP Allocation (Current)
File: k8shost/crates/k8shost-server/src/services/service.rs:28-37
pub fn allocate_cluster_ip() -> String {
    // Simple counter-based allocation in 10.96.0.0/16
    static COUNTER: AtomicU32 = AtomicU32::new(100);
    let counter = COUNTER.fetch_add(1, Ordering::SeqCst);
    format!("10.96.{}.{}", (counter >> 8) & 0xff, counter & 0xff)
}
Issues:
- No persistence (counter resets on restart)
- No collision detection
- No integration with network layer
- Hard-coded CIDR range
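To make the collision issue concrete, the sketch below isolates the formatting step of the function above in a pure helper. The name `cluster_ip_for_counter` is hypothetical and exists only for illustration. Because only the low 16 bits of the counter reach the address, two counter values that differ by 65,536 map to the same IP, and the counter also produces addresses ending in `.0` and `.255`.

```rust
/// Hypothetical helper: the same formatting as `allocate_cluster_ip()`,
/// with the counter passed in so individual values can be inspected.
fn cluster_ip_for_counter(counter: u32) -> String {
    format!("10.96.{}.{}", (counter >> 8) & 0xff, counter & 0xff)
}

/// True when two counter values produce the same address.
fn collides(a: u32, b: u32) -> bool {
    cluster_ip_for_counter(a) == cluster_ip_for_counter(b)
}
```

For example, `collides(100, 100 + 65_536)` is true, and a restart that resets the counter to 100 reproduces the first address immediately.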
2.2 PrismNET IPAM (Current)
File: prismnet/crates/prismnet-server/src/metadata.rs:577-662
Capabilities:
- CIDR parsing and IP enumeration
- Allocated IP tracking via Port resources
- Gateway IP avoidance
- Subnet-scoped allocation
- ChainFire persistence
Limitations:
- Designed for VM/container ports, not K8s Services
- No dedicated Service IP subnet concept
3. Architecture Design
3.1 Conceptual Model
┌─────────────────────────────────────────────────────────────┐
│ Tenant Scope │
│ │
│ ┌────────────────┐ ┌────────────────┐ │
│ │ VPC │ │ Service Subnet │ │
│ │ (10.0.0.0/16) │ │ (10.96.0.0/16) │ │
│ └───────┬────────┘ └───────┬─────────┘ │
│ │ │ │
│ ┌───────┴────────┐ ┌───────┴─────────┐ │
│ │ Subnet │ │ Service IPs │ │
│ │ (10.0.1.0/24) │ │ ClusterIP │ │
│ └───────┬────────┘ │ LoadBalancerIP │ │
│ │ └─────────────────┘ │
│ ┌───────┴────────┐ │
│ │ Ports (VMs) │ │
│ └────────────────┘ │
└─────────────────────────────────────────────────────────────┘
3.2 New Resource: ServiceIPPool
A dedicated IP pool for Kubernetes Services within a tenant.
/// Service IP Pool for k8shost Service allocation
pub struct ServiceIPPool {
    pub id: ServiceIPPoolId,
    pub org_id: String,
    pub project_id: String,
    pub name: String,
    pub cidr_block: String, // e.g., "10.96.0.0/16"
    pub pool_type: ServiceIPPoolType,
    pub allocated_ips: HashSet<String>,
    pub created_at: u64,
    pub updated_at: u64,
}

pub enum ServiceIPPoolType {
    ClusterIP,    // For ClusterIP services
    LoadBalancer, // For LoadBalancer services (VIPs)
    NodePort,     // Reserved NodePort range
}
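One way allocation against such a pool could work is sketched below. This is not the PrismNET implementation: `Pool` is a trimmed-down stand-in that keeps only `cidr_block` and `allocated_ips`, and `AllocError`, `host_range`, `allocate`, and `release` are hypothetical names. The sketch assumes IPv4, skips the network and broadcast addresses, and treats an explicit request as an optional argument.

```rust
use std::collections::HashSet;
use std::net::Ipv4Addr;

#[derive(Debug, PartialEq)]
enum AllocError {
    InvalidCidr,
    OutOfRange,
    AlreadyAllocated,
    PoolExhausted,
}

/// Trimmed-down stand-in for ServiceIPPool (assumption: IPv4 only).
struct Pool {
    cidr_block: String,
    allocated_ips: HashSet<String>,
}

impl Pool {
    /// Inclusive range of usable host addresses, excluding network and broadcast.
    fn host_range(&self) -> Result<(u32, u32), AllocError> {
        let (addr, len) = self.cidr_block.split_once('/').ok_or(AllocError::InvalidCidr)?;
        let addr: Ipv4Addr = addr.parse().map_err(|_| AllocError::InvalidCidr)?;
        let len: u32 = len.parse().map_err(|_| AllocError::InvalidCidr)?;
        if !(1..=30).contains(&len) {
            return Err(AllocError::InvalidCidr);
        }
        let mask = u32::MAX << (32 - len);
        let network = u32::from(addr) & mask;
        Ok((network + 1, (network | !mask) - 1))
    }

    /// Allocate the requested IP if given, otherwise the lowest free address.
    fn allocate(&mut self, requested_ip: Option<&str>) -> Result<String, AllocError> {
        let (first, last) = self.host_range()?;
        let ip = match requested_ip {
            Some(req) => {
                let parsed: Ipv4Addr = req.parse().map_err(|_| AllocError::OutOfRange)?;
                let n = u32::from(parsed);
                if n < first || n > last {
                    return Err(AllocError::OutOfRange);
                }
                let text = parsed.to_string();
                if self.allocated_ips.contains(&text) {
                    return Err(AllocError::AlreadyAllocated);
                }
                text
            }
            None => (first..=last)
                .map(|n| Ipv4Addr::from(n).to_string())
                .find(|candidate| !self.allocated_ips.contains(candidate))
                .ok_or(AllocError::PoolExhausted)?,
        };
        self.allocated_ips.insert(ip.clone());
        Ok(ip)
    }

    /// Return an IP to the pool; false if it was not allocated.
    fn release(&mut self, ip: &str) -> bool {
        self.allocated_ips.remove(ip)
    }
}
```

A linear scan is acceptable for a sketch; a production allocator would more likely track a cursor or a bitmap so that allocation in a large, nearly full pool does not walk every address.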
3.3 Integration Architecture
┌──────────────────────────────────────────────────────────────────┐
│ k8shost Server │
│ │
│ ┌─────────────────────┐ ┌──────────────────────┐ │
│ │ ServiceService │─────>│ IpamClient │ │
│ │ create_service() │ │ allocate_ip() │ │
│ │ delete_service() │ │ release_ip() │ │
│ └─────────────────────┘ └──────────┬───────────┘ │
└──────────────────────────────────────────┼───────────────────────┘
│ gRPC
┌──────────────────────────────────────────┼───────────────────────┐
│ PrismNET Server │ │
│ ▼ │
│ ┌─────────────────────┐ ┌──────────────────────┐ │
│ │ IpamService (new) │<─────│ NetworkMetadataStore│ │
│ │ AllocateServiceIP │ │ service_ip_pools │ │
│ │ ReleaseServiceIP │ │ allocated_ips │ │
│ └─────────────────────┘ └──────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
4. API Design
4.1 PrismNET IPAM gRPC Service
service IpamService {
  // Create a Service IP Pool
  rpc CreateServiceIPPool(CreateServiceIPPoolRequest)
      returns (CreateServiceIPPoolResponse);

  // Get Service IP Pool
  rpc GetServiceIPPool(GetServiceIPPoolRequest)
      returns (GetServiceIPPoolResponse);

  // List Service IP Pools
  rpc ListServiceIPPools(ListServiceIPPoolsRequest)
      returns (ListServiceIPPoolsResponse);

  // Allocate IP from pool
  rpc AllocateServiceIP(AllocateServiceIPRequest)
      returns (AllocateServiceIPResponse);

  // Release IP back to pool
  rpc ReleaseServiceIP(ReleaseServiceIPRequest)
      returns (ReleaseServiceIPResponse);

  // Get IP allocation status
  rpc GetIPAllocation(GetIPAllocationRequest)
      returns (GetIPAllocationResponse);
}

message AllocateServiceIPRequest {
  string org_id = 1;
  string project_id = 2;
  string pool_id = 3;               // Optional: specific pool
  ServiceIPPoolType pool_type = 4;  // Required: ClusterIP or LoadBalancer
  string service_uid = 5;           // K8s service UID for tracking
  string requested_ip = 6;          // Optional: specific IP request
}

message AllocateServiceIPResponse {
  string ip_address = 1;
  string pool_id = 2;
}
4.2 k8shost IpamClient
/// IPAM client for k8shost
pub struct IpamClient {
    client: IpamServiceClient<Channel>,
}

impl IpamClient {
    /// Allocate ClusterIP for a Service
    pub async fn allocate_cluster_ip(
        &mut self,
        org_id: &str,
        project_id: &str,
        service_uid: &str,
    ) -> Result<String>;

    /// Allocate LoadBalancer IP for a Service
    pub async fn allocate_loadbalancer_ip(
        &mut self,
        org_id: &str,
        project_id: &str,
        service_uid: &str,
    ) -> Result<String>;

    /// Release an allocated IP
    pub async fn release_ip(
        &mut self,
        org_id: &str,
        project_id: &str,
        ip_address: &str,
    ) -> Result<()>;
}
5. Storage Schema
5.1 ChainFire Key Structure
/prismnet/ipam/pools/{org_id}/{project_id}/{pool_id}
/prismnet/ipam/allocations/{org_id}/{project_id}/{ip_address}
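The key layout can be captured in a few small builders, sketched below. The function names are hypothetical. The only design point the sketch adds is the trailing slash on the tenant prefix, which keeps a prefix scan for project `proj-1` from also matching `proj-10`.

```rust
/// Key for a pool record: /prismnet/ipam/pools/{org_id}/{project_id}/{pool_id}
fn pool_key(org_id: &str, project_id: &str, pool_id: &str) -> String {
    format!("/prismnet/ipam/pools/{org_id}/{project_id}/{pool_id}")
}

/// Key for an allocation record:
/// /prismnet/ipam/allocations/{org_id}/{project_id}/{ip_address}
fn allocation_key(org_id: &str, project_id: &str, ip_address: &str) -> String {
    format!("/prismnet/ipam/allocations/{org_id}/{project_id}/{ip_address}")
}

/// Prefix covering every allocation of one tenant. The trailing slash
/// prevents "proj-1" from matching keys that belong to "proj-10".
fn tenant_allocations_prefix(org_id: &str, project_id: &str) -> String {
    format!("/prismnet/ipam/allocations/{org_id}/{project_id}/")
}
```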
5.2 Allocation Record
pub struct IPAllocation {
    pub ip_address: String,
    pub pool_id: ServiceIPPoolId,
    pub org_id: String,
    pub project_id: String,
    pub resource_type: String, // "k8s-service", "vm-port", etc.
    pub resource_id: String,   // Service UID, Port ID, etc.
    pub allocated_at: u64,
}
6. Implementation Plan
Phase 1: PrismNET IPAM Service (S1 deliverable)
- Add ServiceIPPool type to prismnet-types
- Add IpamService gRPC service to prismnet-api
- Implement IpamServiceImpl in prismnet-server
- Storage: pools and allocations in ChainFire
Phase 2: k8shost Integration (S2)
- Create IpamClient in k8shost
- Replace allocate_cluster_ip() with PrismNET call
- Add IP release on Service deletion
- Configuration: PrismNET endpoint env var
Phase 3: Default Pool Provisioning
- Auto-create default ClusterIP pool per tenant
- Default CIDR: 10.96.{tenant_hash}.0/20 (4096 IPs)
- LoadBalancer pool: 192.168.{tenant_hash}.0/24 (256 IPs)
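The sketch below shows one possible way to derive a tenant's block from a shared range. How a tenant is mapped to a slot is not specified here, so the code takes a plain `index` and leaves the hash out; `tenant_subnet` is a hypothetical name. Note that a /20 spans 16 values of the third octet, so consecutive /20 blocks start at third-octet multiples of 16 rather than at every value.

```rust
use std::net::Ipv4Addr;

/// Return the `index`-th block of prefix length `tenant_len` inside
/// `base/base_len`, or None if the arguments are invalid or the index
/// is beyond the number of available blocks.
fn tenant_subnet(
    base: Ipv4Addr,
    base_len: u8,
    tenant_len: u8,
    index: u32,
) -> Option<(Ipv4Addr, u8)> {
    if base_len == 0 || base_len > tenant_len || tenant_len > 32 {
        return None;
    }
    // Number of tenant-sized blocks that fit in the shared range.
    let slots = 1u64 << (tenant_len - base_len);
    if u64::from(index) >= slots {
        return None;
    }
    // Addresses per tenant block, e.g. 4096 for a /20.
    let block_size = 1u64 << (32 - tenant_len);
    let mask = u32::MAX << (32 - base_len);
    let network = u64::from(u32::from(base) & mask);
    let start = network + u64::from(index) * block_size;
    Some((Ipv4Addr::from(start as u32), tenant_len))
}
```

With a shared 10.96.0.0/12 and /20 tenant blocks there are 256 slots: index 0 yields 10.96.0.0/20, index 1 yields 10.96.16.0/20, and index 256 is rejected.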
7. Tenant Isolation
7.1 Pool Isolation
Each tenant (org_id + project_id) has:
- Separate ClusterIP pool
- Separate LoadBalancer pool
- Non-overlapping IP ranges
7.2 IP Collision Prevention
- IP uniqueness enforced at pool level
- CAS (Compare-And-Swap) for concurrent allocation
- ChainFire transactions for atomicity
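The compare-and-swap idea can be illustrated with an in-memory stand-in. `VersionedStore` below is hypothetical and is not ChainFire's API; it only models a store where a write succeeds if the key's version matches the caller's expectation. Claiming an IP then becomes "create the allocation key only if it does not exist", and a caller that loses the race moves on to the next candidate.

```rust
use std::collections::HashMap;

/// Hypothetical versioned key-value store (not ChainFire's API).
#[derive(Default)]
struct VersionedStore {
    entries: HashMap<String, (u64, String)>, // key -> (version, value)
}

impl VersionedStore {
    fn get(&self, key: &str) -> Option<(u64, String)> {
        self.entries.get(key).cloned()
    }

    /// Write `value` only if the key's current version equals
    /// `expected_version`. Version 0 means "the key must not exist".
    fn compare_and_swap(&mut self, key: &str, expected_version: u64, value: &str) -> bool {
        let current = self.entries.get(key).map(|(v, _)| *v).unwrap_or(0);
        if current != expected_version {
            return false;
        }
        self.entries
            .insert(key.to_string(), (current + 1, value.to_string()));
        true
    }
}

/// Try candidates in order; the first successful create-if-absent wins.
fn claim_first_free(
    store: &mut VersionedStore,
    prefix: &str,
    candidates: &[&str],
    owner: &str,
) -> Option<String> {
    for ip in candidates {
        let key = format!("{prefix}/{ip}");
        if store.compare_and_swap(&key, 0, owner) {
            return Some(ip.to_string());
        }
    }
    None
}
```

Keying the claim on the IP address itself means uniqueness is enforced by the store rather than by a read-then-write sequence in the server.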
8. Default Configuration
# k8shost config
ipam:
  enabled: true
  prismnet_endpoint: "http://prismnet:9090"

  # Default pools (auto-created if missing)
  default_cluster_ip_cidr: "10.96.0.0/12"      # 1M IPs shared
  default_loadbalancer_cidr: "192.168.0.0/16"  # 64K IPs shared

  # Per-tenant allocation
  cluster_ip_pool_size: "/20"     # 4096 IPs per tenant
  loadbalancer_pool_size: "/24"   # 256 IPs per tenant
9. Backward Compatibility
9.1 Migration Path
- Deploy new IPAM service in PrismNET
- k8shost checks for IPAM availability on startup
- If IPAM unavailable, fall back to local counter
- Log warning for fallback mode
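A minimal sketch of the fallback path follows. It is synchronous for brevity, whereas the planned client is async. The `ClusterIpAllocator` trait and the `StaticIpam` test double are hypothetical names, and the sketch falls back on every failed call, which is one possible reading of the startup check described above.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical abstraction over the remote IPAM client.
trait ClusterIpAllocator {
    fn allocate_cluster_ip(
        &mut self,
        org_id: &str,
        project_id: &str,
        service_uid: &str,
    ) -> Result<String, String>;
}

/// Test double: returns a fixed IP, or an error when `ip` is None.
struct StaticIpam {
    ip: Option<String>,
}

impl ClusterIpAllocator for StaticIpam {
    fn allocate_cluster_ip(&mut self, _: &str, _: &str, _: &str) -> Result<String, String> {
        self.ip.clone().ok_or_else(|| "IPAM unreachable".to_string())
    }
}

/// Legacy local counter, kept only as the fallback.
fn allocate_from_local_counter() -> String {
    static COUNTER: AtomicU32 = AtomicU32::new(100);
    let counter = COUNTER.fetch_add(1, Ordering::SeqCst);
    format!("10.96.{}.{}", (counter >> 8) & 0xff, counter & 0xff)
}

/// Prefer IPAM; on absence or failure, warn and use the local counter.
fn allocate_with_fallback(
    ipam: Option<&mut dyn ClusterIpAllocator>,
    org_id: &str,
    project_id: &str,
    service_uid: &str,
) -> String {
    match ipam {
        Some(client) => match client.allocate_cluster_ip(org_id, project_id, service_uid) {
            Ok(ip) => return ip,
            Err(err) => eprintln!("warning: IPAM allocation failed ({err}); using local counter"),
        },
        None => eprintln!("warning: IPAM not configured; using local counter"),
    }
    allocate_from_local_counter()
}
```

IPs handed out in fallback mode are unknown to IPAM, so they need to be reconciled once the service becomes reachable again.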
9.2 Existing Services
- Existing Services retain their IPs
- On next restart, k8shost syncs with IPAM
- Conflict resolution: IPAM is source of truth
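The sync step could be planned along the lines of the sketch below. The exact conflict policy is not spelled out here, so the sketch assumes that "IPAM is source of truth" means a Service whose IP is already owned by someone else in IPAM must be given a new address. `SyncAction` and `plan_sync` are hypothetical names, and the IPAM state is modeled as a plain map from IP to owning Service UID.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum SyncAction {
    /// IPAM has no record of the IP: register it for this Service.
    Register { service_uid: String, ip: String },
    /// IPAM already maps the IP to this Service: nothing to do.
    Keep { service_uid: String },
    /// IPAM maps the IP to another owner: this Service needs a new IP.
    Reallocate { service_uid: String, conflicting_ip: String },
}

/// Compare locally known (service_uid, ip) pairs against IPAM's view
/// (ip -> owner uid) and decide what to do for each Service.
fn plan_sync(local: &[(&str, &str)], ipam: &HashMap<String, String>) -> Vec<SyncAction> {
    // Work on a copy so registrations planned earlier in the same pass
    // are visible to later entries (catches local duplicates).
    let mut view = ipam.clone();
    let mut actions = Vec::new();
    for &(uid, ip) in local {
        let action = match view.get(ip).cloned() {
            None => {
                view.insert(ip.to_string(), uid.to_string());
                SyncAction::Register { service_uid: uid.to_string(), ip: ip.to_string() }
            }
            Some(owner) if owner == uid => SyncAction::Keep { service_uid: uid.to_string() },
            Some(_) => SyncAction::Reallocate {
                service_uid: uid.to_string(),
                conflicting_ip: ip.to_string(),
            },
        };
        actions.push(action);
    }
    actions
}
```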
10. Observability
10.1 Metrics
# Pool utilization
prismnet_ipam_pool_total{org_id, project_id, pool_type}
prismnet_ipam_pool_allocated{org_id, project_id, pool_type}
prismnet_ipam_pool_available{org_id, project_id, pool_type}
# Allocation rate
prismnet_ipam_allocations_total{org_id, project_id, pool_type}
prismnet_ipam_releases_total{org_id, project_id, pool_type}
10.2 Alerts
- Pool exhaustion warning at 80% utilization
- Allocation failure alerts
- Pool not found errors
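The exhaustion warning can be derived directly from the total and allocated gauges. The sketch below uses a hypothetical `PoolUsage` type and integer arithmetic so the 80% threshold has no floating-point edge cases; it assumes "at 80%" means 80% or more.

```rust
/// Snapshot of one pool's gauges (hypothetical type for illustration).
struct PoolUsage {
    total: u64,
    allocated: u64,
}

impl PoolUsage {
    /// Value for the `prismnet_ipam_pool_available` gauge.
    fn available(&self) -> u64 {
        self.total.saturating_sub(self.allocated)
    }

    /// True when utilization is at or above 80%.
    /// Equivalent to allocated / total >= 0.8 without floating point.
    fn exhaustion_warning(&self) -> bool {
        self.total > 0
            && self.allocated.saturating_mul(100) >= self.total.saturating_mul(80)
    }
}
```

For a 4096-address ClusterIP pool, the warning first fires at 3277 allocated addresses.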
11. References
- Kubernetes Service IP allocation
- OpenStack Neutron IPAM
- PrismNET metadata.rs IPAM implementation
12. Decision Summary
| Aspect | Decision | Rationale |
|---|---|---|
| IPAM Location | PrismNET | Network layer owns IP management |
| Storage | ChainFire | Consistency with existing PrismNET storage |
| Pool Type | Per-tenant | Tenant isolation, quota enforcement |
| Integration | gRPC client | Consistent with other PlasmaCloud services |
| Fallback | Local counter | Backward compatibility |