Overlay Network Specification

Version: 1.0 | Status: Draft | Last Updated: 2025-12-08

1. Overview

1.1 Purpose

Overlay Network provides multi-tenant network isolation for PlasmaVMC virtual machines. It enables secure, isolated network environments per organization and project using OVN (Open Virtual Network) as the underlying network virtualization platform.

The overlay network abstracts physical network infrastructure and provides logical networking constructs (VPCs, subnets, security groups) that ensure complete tenant isolation while maintaining flexibility for inter-tenant communication when explicitly configured.

1.2 Scope

  • In scope: Multi-tenant network isolation, VPC/subnet management, IP address allocation (DHCP/static), security groups, NAT (SNAT/DNAT), OVN integration, network policies, tenant-scoped networking
  • Out of scope: Physical network infrastructure management, BGP routing configuration, hardware load balancer integration (handled by FiberLB), DNS resolution (handled by FlashDNS), network monitoring/analytics (future)

1.3 Design Goals

  • Strong tenant isolation: Complete network separation between organizations and projects by default
  • OVN-based: Leverage mature OVN platform for proven multi-tenant networking
  • PlasmaVMC integration: Seamless integration with VM lifecycle management
  • Automatic IPAM: DHCP-based IP allocation with optional static assignment
  • Security-first: Security groups and network policies enforced at the network layer
  • Scalable: Support thousands of VMs across multiple tenants

2. Architecture

2.1 Crate Structure

overlay-network/
├── crates/
│   ├── overlay-network-api/      # gRPC service definitions
│   ├── overlay-network-client/   # Client library
│   ├── overlay-network-core/     # Core network logic
│   ├── overlay-network-server/   # Server binary
│   ├── overlay-network-ovn/      # OVN integration layer
│   ├── overlay-network-storage/  # Persistence layer (ChainFire)
│   └── overlay-network-types/    # Shared types
└── proto/
    └── overlay-network.proto     # Protocol definitions

2.2 Component Topology

┌─────────────────────────────────────────────────────────────┐
│                  Overlay Network Service                     │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐     │
│  │ Network API  │──│ Network Core │──│ OVN Adapter  │     │
│  │  (gRPC)      │  │  (VPC/Subnet)│  │ (OVN Client) │     │
│  └──────────────┘  └──────┬───────┘  └──────┬───────┘     │
│                           │                 │              │
│                    ┌──────▼──────┐   ┌──────▼──────┐      │
│                    │ ChainFire   │   │ OVN North   │      │
│                    │ (state)     │   │ (logical)   │      │
│                    └─────────────┘   └─────────────┘      │
└─────────────────────────────────────────────────────────────┘
                           │
        ┌──────────────────┼──────────────────┐
        │                  │                  │
┌───────▼──────┐   ┌───────▼──────┐   ┌───────▼──────┐
│   Node 1     │   │   Node 2     │   │   Node N     │
│ ┌──────────┐ │   │ ┌──────────┐ │   │ ┌──────────┐ │
│ │ OVN      │ │   │ │ OVN      │ │   │ │ OVN      │ │
│ │Controller│ │   │ │Controller│ │   │ │Controller│ │
│ └────┬─────┘ │   │ └────┬─────┘ │   │ └────┬─────┘ │
│      │       │   │      │       │   │      │       │
│ ┌────▼─────┐ │   │ ┌────▼─────┐ │   │ ┌────▼─────┐ │
│ │ OVS      │ │   │ │ OVS      │ │   │ │ OVS      │ │
│ │ Bridge   │ │   │ │ Bridge   │ │   │ │ Bridge   │ │
│ └────┬─────┘ │   │ └────┬─────┘ │   │ └────┬─────┘ │
│      │       │   │      │       │   │      │       │
│ ┌────▼─────┐ │   │ ┌────▼─────┐ │   │ ┌────▼─────┐ │
│ │ VM Ports │ │   │ │ VM Ports │ │   │ │ VM Ports │ │
│ └──────────┘ │   │ └──────────┘ │   │ └──────────┘ │
└──────────────┘   └──────────────┘   └──────────────┘

2.3 Data Flow

[PlasmaVMC VmService] → [NetworkService.create_port()]
                              │
                              ▼
                    [Network Core Logic]
                              │
        ┌─────────────────────┼─────────────────────┐
        │                     │                     │
        ▼                     ▼                     ▼
[ChainFire]          [OVN Northbound]        [IPAM Logic]
(state storage)      (logical network)       (IP allocation)
        │                     │                     │
        └─────────────────────┼─────────────────────┘
                              │
                              ▼
                    [OVN Southbound]
                              │
                              ▼
                    [OVN Controller]
                              │
                              ▼
                    [OVS Bridge]
                              │
                              ▼
                    [VM TAP Interface]

2.4 Dependencies

| Crate | Purpose |
|---|---|
| tokio | Async runtime |
| tonic | gRPC framework |
| prost | Protocol buffers |
| chainfire-client | State persistence |
| ovsdb-client | OVN Northbound API client |
| ipnet | IP address/CIDR handling |
| uuid | Resource identifiers |

3. Core Concepts

3.1 Tenant Hierarchy

Organization (org_id)
  └── Project (project_id)
       └── VPC (Virtual Private Cloud)
            └── Subnet(s)
                 └── VM Port(s)

3.2 VPC (Virtual Private Cloud)

Each project has exactly one VPC (1:1 relationship).

VPC Identifier:

vpc_id = "{org_id}/{project_id}"

OVN Mapping:

  • OVN Logical Router: Project VPC router
  • OVN Logical Switches: Subnets within VPC

CIDR Allocation:

  • Default pool: 10.0.0.0/8 divided into 256 /16 blocks
  • Each project: /16 subnet (65,536 IPs)
  • Example: Project 1 → 10.1.0.0/16, Project 2 → 10.2.0.0/16
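
A minimal sketch of the block allocation described above, assuming the allocator tracks used blocks by their second octet and skips 10.0.0.0/16 so that project blocks start at 10.1.0.0/16. The function name `allocate_vpc_cidr` is hypothetical and not part of the specified API.

```rust
use std::collections::BTreeSet;
use std::net::Ipv4Addr;

/// Returns the first unused /16 block inside 10.0.0.0/8, or None when the
/// pool is exhausted. `used` holds the second octet of every block that has
/// already been handed out.
/// Assumption: 10.0.0.0/16 is never allocated, so the first block is 10.1.0.0/16.
fn allocate_vpc_cidr(used: &mut BTreeSet<u8>) -> Option<String> {
    let octet = (1..=255u8).find(|o| !used.contains(o))?;
    used.insert(octet);
    Some(format!("{}/16", Ipv4Addr::new(10, octet, 0, 0)))
}
```

Freed blocks are reused before higher ones, which keeps the used set dense; a real allocator would also need to persist the set atomically.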

3.3 Subnet

Each VPC contains one or more subnets.

Subnet Identifier:

subnet_id = "{org_id}/{project_id}/{subnet_name}"

Default Subnet:

  • Created automatically with project
  • Name: default
  • CIDR: /24 within VPC CIDR (256 IPs)
  • Example: VPC 10.1.0.0/16 → Subnet 10.1.0.0/24
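
The default subnet can be derived from the VPC block with plain integer arithmetic. The sketch below assumes the default subnet is always the first /24 of the VPC's /16 and that the gateway is its first host address; `DefaultSubnet` and `default_subnet` are illustrative names.

```rust
use std::net::Ipv4Addr;

/// Default subnet derived from a VPC network address.
#[derive(Debug, PartialEq)]
struct DefaultSubnet {
    cidr: String,
    gateway_ip: Ipv4Addr,
}

/// Takes the first /24 of the VPC's /16 block and uses its first host
/// address as the gateway (assumption, not mandated by the text above).
fn default_subnet(vpc_network: Ipv4Addr) -> DefaultSubnet {
    let base = u32::from(vpc_network) & 0xFFFF_0000; // keep the /16 prefix
    DefaultSubnet {
        cidr: format!("{}/24", Ipv4Addr::from(base)),
        gateway_ip: Ipv4Addr::from(base + 1),
    }
}
```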

OVN Mapping:

  • OVN Logical Switch: Each subnet

3.4 Port (VM Network Interface)

A port represents a VM's network interface attached to a subnet.

Port Identifier:

port_id = UUID

Attributes:

  • MAC address (auto-generated or user-specified)
  • IP address (DHCP or static)
  • Security groups
  • Subnet attachment

OVN Mapping:

  • OVN Logical Port: VM's network interface

3.5 Security Group

A security group defines firewall rules for network traffic.

Security Group Identifier:

sg_id = "{org_id}/{project_id}/{sg_name}"

Default Security Group:

  • Created automatically with project
  • Name: default
  • Rules:
    • Ingress: Allow all from same security group
    • Egress: Allow all

OVN Mapping:

  • OVN ACL (Access Control List): Applied to logical ports

4. API

4.1 gRPC Services

service NetworkService {
  // VPC operations
  rpc CreateVpc(CreateVpcRequest) returns (Vpc);
  rpc GetVpc(GetVpcRequest) returns (Vpc);
  rpc ListVpcs(ListVpcsRequest) returns (ListVpcsResponse);
  rpc DeleteVpc(DeleteVpcRequest) returns (Empty);

  // Subnet operations
  rpc CreateSubnet(CreateSubnetRequest) returns (Subnet);
  rpc GetSubnet(GetSubnetRequest) returns (Subnet);
  rpc ListSubnets(ListSubnetsRequest) returns (ListSubnetsResponse);
  rpc DeleteSubnet(DeleteSubnetRequest) returns (Empty);

  // Port operations (VM NIC attachment)
  rpc CreatePort(CreatePortRequest) returns (Port);
  rpc GetPort(GetPortRequest) returns (Port);
  rpc ListPorts(ListPortsRequest) returns (ListPortsResponse);
  rpc DeletePort(DeletePortRequest) returns (Empty);
  rpc AttachPort(AttachPortRequest) returns (Port);
  rpc DetachPort(DetachPortRequest) returns (Empty);

  // Security Group operations
  rpc CreateSecurityGroup(CreateSecurityGroupRequest) returns (SecurityGroup);
  rpc GetSecurityGroup(GetSecurityGroupRequest) returns (SecurityGroup);
  rpc ListSecurityGroups(ListSecurityGroupsRequest) returns (ListSecurityGroupsResponse);
  rpc UpdateSecurityGroup(UpdateSecurityGroupRequest) returns (SecurityGroup);
  rpc DeleteSecurityGroup(DeleteSecurityGroupRequest) returns (Empty);

  // NAT operations
  rpc CreateSnat(CreateSnatRequest) returns (SnatConfig);
  rpc DeleteSnat(DeleteSnatRequest) returns (Empty);
  rpc CreateDnat(CreateDnatRequest) returns (DnatConfig);
  rpc DeleteDnat(DeleteDnatRequest) returns (Empty);
}

4.2 Key Request/Response Types

message CreateVpcRequest {
  string org_id = 1;
  string project_id = 2;
  string name = 3;
  string cidr = 4;  // Optional, auto-allocated if not specified
}

message CreateSubnetRequest {
  string org_id = 1;
  string project_id = 2;
  string vpc_id = 3;
  string name = 4;
  string cidr = 5;  // Must be within VPC CIDR
  bool dhcp_enabled = 6;
  repeated string dns_servers = 7;
}

message CreatePortRequest {
  string org_id = 1;
  string project_id = 2;
  string subnet_id = 3;
  string vm_id = 4;
  string mac_address = 5;  // Optional, auto-generated if not specified
  string ip_address = 6;   // Optional, DHCP if not specified
  repeated string security_group_ids = 7;
}

message CreateSecurityGroupRequest {
  string org_id = 1;
  string project_id = 2;
  string name = 3;
  string description = 4;
  repeated SecurityRule ingress_rules = 5;
  repeated SecurityRule egress_rules = 6;
}

message SecurityRule {
  Protocol protocol = 1;
  PortRange port_range = 2;  // Optional, all ports if not specified
  SourceType source_type = 3;
  string source = 4;  // CIDR or security_group_id
}

enum Protocol {
  PROTOCOL_UNSPECIFIED = 0;
  PROTOCOL_TCP = 1;
  PROTOCOL_UDP = 2;
  PROTOCOL_ICMP = 3;
  PROTOCOL_ALL = 4;
}

message PortRange {
  uint32 min = 1;
  uint32 max = 2;
}

enum SourceType {
  SOURCE_TYPE_UNSPECIFIED = 0;
  SOURCE_TYPE_CIDR = 1;
  SOURCE_TYPE_SECURITY_GROUP = 2;
}

4.3 Public Traits

pub trait NetworkService: Send + Sync {
    async fn create_vpc(&self, req: CreateVpcRequest) -> Result<Vpc>;
    async fn get_vpc(&self, org_id: &str, project_id: &str) -> Result<Option<Vpc>>;
    async fn create_subnet(&self, req: CreateSubnetRequest) -> Result<Subnet>;
    async fn create_port(&self, req: CreatePortRequest) -> Result<Port>;
    async fn attach_port_to_vm(&self, port_id: &str, vm_id: &str) -> Result<()>;
    async fn create_security_group(&self, req: CreateSecurityGroupRequest) -> Result<SecurityGroup>;
}

4.4 Client Library

let client = NetworkClient::connect("http://localhost:8081").await?;

// Create VPC
let vpc = client.create_vpc(CreateVpcRequest {
    org_id: "org1".to_string(),
    project_id: "proj1".to_string(),
    name: "my-vpc".to_string(),
    cidr: String::new(), // Empty string: auto-allocate
}).await?;

// Create subnet
let subnet = client.create_subnet(CreateSubnetRequest {
    org_id: "org1".to_string(),
    project_id: "proj1".to_string(),
    vpc_id: vpc.id.clone(),
    name: "default".to_string(),
    cidr: "10.1.0.0/24".to_string(),
    dhcp_enabled: true,
    dns_servers: vec!["8.8.8.8".to_string()],
}).await?;

// Create port for VM
let port = client.create_port(CreatePortRequest {
    org_id: "org1".to_string(),
    project_id: "proj1".to_string(),
    subnet_id: subnet.id.clone(),
    vm_id: "vm-123".to_string(),
    mac_address: String::new(), // Empty string: auto-generate
    ip_address: String::new(),  // Empty string: DHCP
    security_group_ids: vec!["org1/proj1/default".to_string()],
}).await?;

5. Data Models

5.1 Core Types

pub struct Vpc {
    pub id: String,              // "{org_id}/{project_id}"
    pub org_id: String,
    pub project_id: String,
    pub name: String,
    pub cidr: String,            // "10.1.0.0/16"
    pub created_at: u64,
    pub updated_at: u64,
}

pub struct Subnet {
    pub id: String,              // "{org_id}/{project_id}/{subnet_name}"
    pub org_id: String,
    pub project_id: String,
    pub vpc_id: String,
    pub name: String,
    pub cidr: String,            // "10.1.0.0/24"
    pub gateway_ip: String,      // "10.1.0.1"
    pub dns_servers: Vec<String>,
    pub dhcp_enabled: bool,
    pub created_at: u64,
}

pub struct Port {
    pub id: String,              // UUID
    pub org_id: String,
    pub project_id: String,
    pub subnet_id: String,
    pub vm_id: Option<String>,   // None if detached
    pub mac_address: String,
    pub ip_address: Option<String>, // None if DHCP pending
    pub security_group_ids: Vec<String>,
    pub ovn_port_uuid: String,   // OVN logical port UUID
    pub created_at: u64,
}

pub struct SecurityGroup {
    pub id: String,              // "{org_id}/{project_id}/{sg_name}"
    pub org_id: String,
    pub project_id: String,
    pub name: String,
    pub description: String,
    pub ingress_rules: Vec<SecurityRule>,
    pub egress_rules: Vec<SecurityRule>,
    pub created_at: u64,
}

pub struct SecurityRule {
    pub protocol: Protocol,
    pub port_range: Option<(u16, u16)>,
    pub source_type: SourceType,
    pub source: String,
}

pub enum Protocol {
    Tcp,
    Udp,
    Icmp,
    All,
}

pub enum SourceType {
    Cidr,           // "10.1.0.0/24"
    SecurityGroup,  // "{org_id}/{project_id}/{sg_name}"
}
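
The wire type carries port bounds as `uint32`, while the domain type uses `Option<(u16, u16)>`, so a conversion step has to validate the range. The sketch below is one way to do that; the function name is hypothetical, and treating `(0, 0)` as "not specified" is an assumption.

```rust
use std::convert::TryFrom;

/// Converts the wire-level (min, max) pair into the domain representation.
/// Assumption: (0, 0) stands for "not specified", meaning all ports.
fn port_range_from_wire(min: u32, max: u32) -> Result<Option<(u16, u16)>, String> {
    if min == 0 && max == 0 {
        return Ok(None);
    }
    let lo = u16::try_from(min).map_err(|_| format!("min port {min} exceeds 65535"))?;
    let hi = u16::try_from(max).map_err(|_| format!("max port {max} exceeds 65535"))?;
    if lo > hi {
        return Err(format!("min port {lo} is greater than max port {hi}"));
    }
    Ok(Some((lo, hi)))
}
```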

5.2 Storage Format

  • Engine: ChainFire (distributed KVS)
  • Serialization: JSON (via serde_json)
  • Key format: Hierarchical keys for tenant scoping

ChainFire Keys:

# VPC
/networks/vpcs/{org_id}/{project_id} = Vpc (JSON)

# Subnet
/networks/subnets/{org_id}/{project_id}/{subnet_name} = Subnet (JSON)

# Port
/networks/ports/{org_id}/{project_id}/{port_id} = Port (JSON)

# Security Group
/networks/security_groups/{org_id}/{project_id}/{sg_name} = SecurityGroup (JSON)

# IPAM
/networks/ipam/{org_id}/{project_id}/{subnet_name}/allocated = ["10.1.0.10", ...] (JSON)

# CIDR Allocation
/networks/cidr/allocations/{org_id}/{project_id} = "10.1.0.0/16" (string)
/networks/cidr/pool/used = ["10.1.0.0/16", "10.2.0.0/16", ...] (JSON)
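
The key layout above maps directly onto a few string builders. This is a sketch with hypothetical function names; the prefix helper assumes ChainFire supports prefix reads, which this document does not state.

```rust
/// Key of a VPC record.
fn vpc_key(org_id: &str, project_id: &str) -> String {
    format!("/networks/vpcs/{org_id}/{project_id}")
}

/// Key of a subnet record.
fn subnet_key(org_id: &str, project_id: &str, subnet_name: &str) -> String {
    format!("/networks/subnets/{org_id}/{project_id}/{subnet_name}")
}

/// Key of a port record.
fn port_key(org_id: &str, project_id: &str, port_id: &str) -> String {
    format!("/networks/ports/{org_id}/{project_id}/{port_id}")
}

/// Key of the allocated-address list of one subnet.
fn ipam_allocated_key(org_id: &str, project_id: &str, subnet_name: &str) -> String {
    format!("/networks/ipam/{org_id}/{project_id}/{subnet_name}/allocated")
}

/// Prefix covering every port of one project (assumes prefix scans exist).
/// The trailing slash keeps "proj1" from matching "proj10".
fn project_ports_prefix(org_id: &str, project_id: &str) -> String {
    format!("/networks/ports/{org_id}/{project_id}/")
}
```

Because the tenant identifiers come first in every key, all records of one project share a common prefix per resource type.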

6. Network Isolation

6.1 Inter-Tenant Isolation

Organization Level:

  • Default: Complete isolation (no communication)
  • Exception: Explicit peering configuration required

Project Level (Same Organization):

  • Default: Isolated (no communication)
  • Exception: VPC peering or shared network for connectivity

OVN Implementation:

  • Logical Switches: L2 isolation per subnet
  • Logical Routers: L3 routing control
  • ACLs: Security group enforcement

6.2 Intra-Tenant Communication

Same Subnet:

  • L2 forwarding (MAC address-based)
  • Direct communication via OVN Logical Switch

Different Subnets (Same VPC):

  • L3 routing via OVN Logical Router
  • Router forwards packets between Logical Switches

Packet Flow Example:

VM1 (10.1.0.10) → VM2 (10.1.1.10)

1. VM1 sends packet to 10.1.1.10
2. TAP interface → OVS bridge
3. OVS → OVN Logical Switch (L2; the destination is outside the subnet, so the frame is addressed to the gateway)
4. OVN → Logical Router (L3 forwarding)
5. Logical Router → Destination Logical Switch
6. OVN ACL check (security groups)
7. Packet forwarded to VM2's TAP interface
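
Whether a packet stays on the Logical Switch or goes through the Logical Router depends only on whether source and destination share a subnet prefix. The sketch below illustrates that decision; it does not model OVN's actual pipeline, and the names are illustrative.

```rust
use std::net::Ipv4Addr;

#[derive(Debug, PartialEq)]
enum ForwardingPath {
    /// Same subnet: L2 forwarding on the Logical Switch.
    LogicalSwitch,
    /// Different subnets: L3 forwarding through the Logical Router.
    LogicalRouter,
}

/// Network mask for a prefix length; lengths above 32 are clamped to 32.
fn prefix_mask(prefix_len: u8) -> u32 {
    match prefix_len.min(32) {
        0 => 0,
        n => u32::MAX << (32 - u32::from(n)),
    }
}

/// Compares the network parts of both addresses under the source subnet's prefix.
fn forwarding_path(src: Ipv4Addr, dst: Ipv4Addr, prefix_len: u8) -> ForwardingPath {
    let mask = prefix_mask(prefix_len);
    if u32::from(src) & mask == u32::from(dst) & mask {
        ForwardingPath::LogicalSwitch
    } else {
        ForwardingPath::LogicalRouter
    }
}
```

For the example above, 10.1.0.10 and 10.1.1.10 differ in their /24 network part, so the router path is taken.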

7. IP Address Management (IPAM)

7.1 IP Allocation Strategy

Automatic (DHCP):

  • Default allocation method
  • OVN DHCP server assigns IPs from subnet pool
  • IPs tracked in ChainFire for conflict prevention

Static Assignment:

  • User-specified IP address
  • Must be within subnet CIDR
  • Duplicate check required
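
The two checks for static assignment can be sketched as a single validation function. The error variants mirror the IP_OUTSIDE_SUBNET and IP_ALREADY_ALLOCATED codes from Appendix A; the function name and the in-memory `BTreeSet` standing in for the ChainFire allocation list are assumptions.

```rust
use std::collections::BTreeSet;
use std::net::Ipv4Addr;

#[derive(Debug, PartialEq)]
enum StaticIpError {
    IpOutsideSubnet,
    IpAlreadyAllocated,
}

/// Network mask for a prefix length; lengths above 32 are clamped to 32.
fn prefix_mask(prefix_len: u8) -> u32 {
    match prefix_len.min(32) {
        0 => 0,
        n => u32::MAX << (32 - u32::from(n)),
    }
}

/// Checks that `ip` lies inside the subnet and is not already in use.
fn validate_static_ip(
    ip: Ipv4Addr,
    subnet_network: Ipv4Addr,
    prefix_len: u8,
    allocated: &BTreeSet<Ipv4Addr>,
) -> Result<(), StaticIpError> {
    let mask = prefix_mask(prefix_len);
    if u32::from(ip) & mask != u32::from(subnet_network) & mask {
        return Err(StaticIpError::IpOutsideSubnet);
    }
    if allocated.contains(&ip) {
        return Err(StaticIpError::IpAlreadyAllocated);
    }
    Ok(())
}
```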

IP Allocation Tracking:

/networks/ipam/{org_id}/{project_id}/{subnet_name}/allocated = ["10.1.0.10", "10.1.0.11", ...]
/networks/ipam/{org_id}/{project_id}/{subnet_name}/reserved = ["10.1.0.1", "10.1.0.255"] // gateway, broadcast
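
Given the allocated and reserved lists, picking the next address is a linear scan over the subnet's host range. This is a sketch under the assumption that the lowest free host address is preferred; `next_free_ip` is a hypothetical helper, and `taken` is the union of both lists.

```rust
use std::collections::BTreeSet;
use std::net::Ipv4Addr;

/// Returns the lowest host address in the subnet that is not in `taken`.
/// The network and broadcast addresses are never returned.
/// Only prefix lengths 1..=30 are handled; anything else yields None.
fn next_free_ip(network: Ipv4Addr, prefix_len: u8, taken: &BTreeSet<Ipv4Addr>) -> Option<Ipv4Addr> {
    if prefix_len == 0 || prefix_len > 30 {
        return None;
    }
    let size = 1u32 << (32 - u32::from(prefix_len)); // addresses in the subnet
    let base = u32::from(network) & !(size - 1);      // network address
    let broadcast = base + (size - 1);
    ((base + 1)..broadcast)
        .map(Ipv4Addr::from)
        .find(|ip| !taken.contains(ip))
}
```

A production allocator would additionally need a compare-and-swap or transaction on the stored list so that two concurrent requests cannot receive the same address.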

7.2 DHCP Configuration

OVN DHCP Options:

pub struct DhcpOptions {
    pub subnet_id: String,
    pub gateway_ip: String,
    pub dns_servers: Vec<String>,
    pub domain_name: Option<String>,
    pub ntp_servers: Vec<String>,
    pub lease_time: u32, // seconds
}

OVN Implementation:

  • DHCP options are created per subnet CIDR and referenced from each Logical Switch Port in that subnet
  • OVN acts as DHCP server
  • VMs receive IP, gateway, DNS via DHCP

8. Security

8.1 Security Groups

Default Security Group:

  • Created automatically with project
  • Ingress: Allow from same security group
  • Egress: Allow all

OVN ACL Implementation:

  • ACLs applied to Logical Ports
  • Direction: from-lport (egress), to-lport (ingress)
  • Action: allow, allow-related, drop, reject

ACL Example:

# Ingress: Allow TCP port 80 from security group "web"
to-lport 1000 "tcp && tcp.dst == 80 && ip4.src == $sg_web" allow-related

# Egress: Allow all
from-lport 1000 "1" allow
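
Translating a security rule into an ACL direction and match expression could look like the sketch below. It follows the direction mapping stated above (to-lport for ingress, from-lport for egress) and assumes each security group is exposed to OVN as an address set named after the group; the types are simplified local copies, and `acl_entry` is a hypothetical helper.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Direction {
    Ingress,
    Egress,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Protocol {
    Tcp,
    Udp,
    Icmp,
    All,
}

/// Builds the ACL direction and match string for one rule.
/// `peer_address_set` is the name of an OVN address set (assumption: one per security group).
fn acl_entry(
    direction: Direction,
    protocol: Protocol,
    port_range: Option<(u16, u16)>,
    peer_address_set: Option<&str>,
) -> (&'static str, String) {
    // Ingress rules match on the sender, egress rules on the receiver.
    let (acl_direction, peer_field) = match direction {
        Direction::Ingress => ("to-lport", "ip4.src"),
        Direction::Egress => ("from-lport", "ip4.dst"),
    };
    let mut terms = vec!["ip4".to_string()];
    let l4 = match protocol {
        Protocol::Tcp => Some("tcp"),
        Protocol::Udp => Some("udp"),
        Protocol::Icmp => {
            terms.push("icmp4".to_string());
            None
        }
        Protocol::All => None,
    };
    // Port constraints only make sense for TCP and UDP.
    if let Some(proto) = l4 {
        terms.push(proto.to_string());
        match port_range {
            Some((lo, hi)) if lo == hi => terms.push(format!("{proto}.dst == {lo}")),
            Some((lo, hi)) => terms.push(format!("{proto}.dst >= {lo} && {proto}.dst <= {hi}")),
            None => {}
        }
    }
    if let Some(set) = peer_address_set {
        terms.push(format!("{peer_field} == ${set}"));
    }
    (acl_direction, terms.join(" && "))
}
```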

8.2 Network Policies

Policy Types:

  1. Ingress Policy: Inbound traffic control
  2. Egress Policy: Outbound traffic control
  3. Isolation Policy: Inter-network isolation settings

Implementation:

  • OVN ACLs for enforcement
  • Combined with security groups

8.3 IP Spoofing Prevention

  • OVN validates source IP addresses
  • Blocks traffic from IPs not assigned to port

8.4 ARP Spoofing Prevention

  • OVN manages ARP tables
  • Blocks invalid ARP responses

9. NAT (Network Address Translation)

9.1 SNAT (Source NAT)

Purpose: Private IP to external (Internet) communication

Configuration:

pub struct SnatConfig {
    pub vpc_id: String,
    pub external_ip: String,
    pub enabled: bool,
}

OVN Implementation:

  • SNAT rule added to Logical Router
  • ovn-nbctl lr-nat-add <router> snat <external_ip> <internal_cidr>

9.2 DNAT (Destination NAT)

Purpose: External to specific VM communication (port forwarding)

Configuration:

pub struct DnatConfig {
    pub vpc_id: String,
    pub external_ip: String,
    pub external_port: u16,
    pub internal_ip: String,
    pub internal_port: u16,
    pub protocol: Protocol,
}

OVN Implementation:

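  • ovn-nbctl lr-nat-add <router> dnat <external_ip> <internal_ip>
  • This rule translates all traffic to <external_ip>; forwarding a single external_port to internal_port is typically expressed as an OVN load balancer instead (ovn-nbctl lb-add, then lr-lb-add)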
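
If the OVN adapter shells out to ovn-nbctl rather than speaking OVSDB directly (an assumption; this document lists ovsdb-client as the dependency), the two NAT commands above share one argument shape. The helper name below is hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum NatKind {
    Snat,
    Dnat,
}

/// Builds the argument list for `ovn-nbctl lr-nat-add`.
/// For SNAT, `logical` is the internal CIDR; for DNAT, it is the internal IP.
fn lr_nat_add_args(router: &str, kind: NatKind, external_ip: &str, logical: &str) -> Vec<String> {
    let kind = match kind {
        NatKind::Snat => "snat",
        NatKind::Dnat => "dnat",
    };
    vec![
        "lr-nat-add".to_string(),
        router.to_string(),
        kind.to_string(),
        external_ip.to_string(),
        logical.to_string(),
    ]
}
```

The resulting vector can be passed to `std::process::Command::new("ovn-nbctl").args(...)`. Keeping the arguments as separate strings avoids shell quoting issues.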

10. Configuration

10.1 Config File Format (TOML)

[network]
ovn_northbound_endpoint = "tcp:127.0.0.1:6641"
ovn_southbound_endpoint = "tcp:127.0.0.1:6642"
cidr_pool = "10.0.0.0/8"
default_subnet_size = 24

[storage]
chainfire_endpoint = "http://127.0.0.1:50051"

[server]
listen_addr = "0.0.0.0:8081"

10.2 Environment Variables

| Variable | Default | Description |
|---|---|---|
| OVERLAY_NETWORK_OVN_NB_ENDPOINT | tcp:127.0.0.1:6641 | OVN Northbound DB endpoint |
| OVERLAY_NETWORK_OVN_SB_ENDPOINT | tcp:127.0.0.1:6642 | OVN Southbound DB endpoint |
| OVERLAY_NETWORK_CHAINFIRE_ENDPOINT | http://127.0.0.1:50051 | ChainFire endpoint |
| OVERLAY_NETWORK_CIDR_POOL | 10.0.0.0/8 | CIDR pool for VPC allocation |
| OVERLAY_NETWORK_LISTEN_ADDR | 0.0.0.0:8081 | gRPC server listen address |
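
Reading these variables with their defaults might be sketched as follows. The `Config` struct and `config_from_env` are hypothetical; the lookup is passed in as a closure so the function can be exercised without touching the process environment. How environment variables rank against the config file and CLI arguments is not specified here and is left out.

```rust
#[derive(Debug, PartialEq)]
struct Config {
    ovn_nb_endpoint: String,
    ovn_sb_endpoint: String,
    chainfire_endpoint: String,
    cidr_pool: String,
    listen_addr: String,
}

/// Resolves each setting from `lookup`, falling back to the documented default.
fn config_from_env(lookup: impl Fn(&str) -> Option<String>) -> Config {
    let get = |name: &str, default: &str| lookup(name).unwrap_or_else(|| default.to_string());
    Config {
        ovn_nb_endpoint: get("OVERLAY_NETWORK_OVN_NB_ENDPOINT", "tcp:127.0.0.1:6641"),
        ovn_sb_endpoint: get("OVERLAY_NETWORK_OVN_SB_ENDPOINT", "tcp:127.0.0.1:6642"),
        chainfire_endpoint: get("OVERLAY_NETWORK_CHAINFIRE_ENDPOINT", "http://127.0.0.1:50051"),
        cidr_pool: get("OVERLAY_NETWORK_CIDR_POOL", "10.0.0.0/8"),
        listen_addr: get("OVERLAY_NETWORK_LISTEN_ADDR", "0.0.0.0:8081"),
    }
}
```

In the server binary the call would be `config_from_env(|name| std::env::var(name).ok())`.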

10.3 CLI Arguments

overlay-network-server [OPTIONS]
  --config <PATH>              Config file path
  --ovn-nb-endpoint <ENDPOINT> OVN Northbound endpoint
  --ovn-sb-endpoint <ENDPOINT> OVN Southbound endpoint
  --chainfire-endpoint <URL>   ChainFire endpoint
  --listen-addr <ADDR>         gRPC listen address

11. Operations

11.1 Deployment

Single Node:

  • Network service runs alongside PlasmaVMC control plane
  • OVN Northbound/Southbound DBs on same node
  • Suitable for development/testing

Cluster Mode:

  • Network service can be distributed
  • OVN databases replicated (OVSDB clustering)
  • ChainFire provides distributed state

11.2 Monitoring

Metrics (Prometheus format):

  • overlay_network_vpcs_total: Total number of VPCs
  • overlay_network_subnets_total: Total number of subnets
  • overlay_network_ports_total: Total number of ports
  • overlay_network_ip_allocations_total: Total IP allocations
  • overlay_network_ovn_operations_duration_seconds: OVN operation latency

Health Endpoints:

  • /health: Service health check
  • /ready: Readiness check (OVN connectivity)

11.3 Integration with PlasmaVMC

VM Creation Flow:

1. VmService.create_vm() called with NetworkSpec
2. NetworkService.create_port() creates OVN Logical Port
3. OVN assigns IP address (DHCP or static)
4. Security groups applied to port (OVN ACLs)
5. VM NIC attached to port (TAP interface)

NetworkSpec Extension:

pub struct NetworkSpec {
    pub id: String,
    pub network_id: String,        // subnet_id: "{org_id}/{project_id}/{subnet_name}"
    pub mac_address: Option<String>,
    pub ip_address: Option<String>, // None = DHCP
    pub model: NicModel,
    pub security_groups: Vec<String>, // security_group_ids
}
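
Step 2 of the flow needs a `CreatePortRequest` built from each `NetworkSpec`. The sketch below uses trimmed local copies of both types and a hypothetical `port_request_for` helper. It assumes the tenant identifiers are recovered from `network_id` and that an empty string means "not specified" on the wire.

```rust
/// Trimmed copy of NetworkSpec with only the fields used here.
struct NetworkSpec {
    network_id: String, // "{org_id}/{project_id}/{subnet_name}"
    mac_address: Option<String>,
    ip_address: Option<String>,
    security_groups: Vec<String>,
}

/// Local stand-in for the generated request type.
#[derive(Debug, PartialEq)]
struct CreatePortRequest {
    org_id: String,
    project_id: String,
    subnet_id: String,
    vm_id: String,
    mac_address: String,
    ip_address: String,
    security_group_ids: Vec<String>,
}

/// Maps one NIC specification to a port creation request.
fn port_request_for(vm_id: &str, spec: &NetworkSpec) -> Result<CreatePortRequest, String> {
    let mut parts = spec.network_id.splitn(3, '/');
    let (org_id, project_id) = match (parts.next(), parts.next(), parts.next()) {
        (Some(org), Some(project), Some(name))
            if !org.is_empty() && !project.is_empty() && !name.is_empty() =>
        {
            (org, project)
        }
        _ => return Err(format!("malformed network_id: {}", spec.network_id)),
    };
    Ok(CreatePortRequest {
        org_id: org_id.to_string(),
        project_id: project_id.to_string(),
        subnet_id: spec.network_id.clone(),
        vm_id: vm_id.to_string(),
        // Empty string = let the service auto-generate or use DHCP (assumption).
        mac_address: spec.mac_address.clone().unwrap_or_default(),
        ip_address: spec.ip_address.clone().unwrap_or_default(),
        security_group_ids: spec.security_groups.clone(),
    })
}
```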

12. Compatibility

12.1 API Versioning

  • Version scheme: Semantic versioning (v1.0, v1.1, etc.)
  • Deprecation policy: Deprecated APIs remain supported for two major versions
  • Breaking changes: Introduced only in a new major version

12.2 Wire Protocol

  • Protocol buffer version: proto3
  • Backward compatibility: Maintained within major version

Appendix

A. Error Codes

| Code | Meaning |
|---|---|
| INVALID_CIDR | Invalid CIDR format |
| CIDR_OVERLAP | CIDR overlaps with existing allocation |
| SUBNET_OUTSIDE_VPC | Subnet CIDR not within VPC CIDR |
| IP_OUTSIDE_SUBNET | IP address not within subnet CIDR |
| IP_ALREADY_ALLOCATED | IP address already in use |
| OVN_CONNECTION_FAILED | Failed to connect to OVN |
| SECURITY_GROUP_NOT_FOUND | Security group does not exist |
| PORT_ALREADY_ATTACHED | Port already attached to VM |
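
These codes could be carried through the Rust crates as a single enum. The sketch below is one possible shape; the enum name and the "CODE: meaning" display format are assumptions.

```rust
use std::fmt;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum NetworkError {
    InvalidCidr,
    CidrOverlap,
    SubnetOutsideVpc,
    IpOutsideSubnet,
    IpAlreadyAllocated,
    OvnConnectionFailed,
    SecurityGroupNotFound,
    PortAlreadyAttached,
}

impl NetworkError {
    /// Stable code string as listed in the table above.
    fn code(self) -> &'static str {
        match self {
            Self::InvalidCidr => "INVALID_CIDR",
            Self::CidrOverlap => "CIDR_OVERLAP",
            Self::SubnetOutsideVpc => "SUBNET_OUTSIDE_VPC",
            Self::IpOutsideSubnet => "IP_OUTSIDE_SUBNET",
            Self::IpAlreadyAllocated => "IP_ALREADY_ALLOCATED",
            Self::OvnConnectionFailed => "OVN_CONNECTION_FAILED",
            Self::SecurityGroupNotFound => "SECURITY_GROUP_NOT_FOUND",
            Self::PortAlreadyAttached => "PORT_ALREADY_ATTACHED",
        }
    }

    /// Human-readable meaning as listed in the table above.
    fn meaning(self) -> &'static str {
        match self {
            Self::InvalidCidr => "Invalid CIDR format",
            Self::CidrOverlap => "CIDR overlaps with existing allocation",
            Self::SubnetOutsideVpc => "Subnet CIDR not within VPC CIDR",
            Self::IpOutsideSubnet => "IP address not within subnet CIDR",
            Self::IpAlreadyAllocated => "IP address already in use",
            Self::OvnConnectionFailed => "Failed to connect to OVN",
            Self::SecurityGroupNotFound => "Security group does not exist",
            Self::PortAlreadyAttached => "Port already attached to VM",
        }
    }
}

impl fmt::Display for NetworkError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}", self.code(), self.meaning())
    }
}

impl std::error::Error for NetworkError {}
```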

B. Glossary

  • VPC (Virtual Private Cloud): Isolated network environment per project
  • Subnet: L2 network segment within VPC
  • Port: VM network interface attached to subnet
  • Security Group: Firewall rules for network traffic
  • OVN (Open Virtual Network): Network virtualization platform
  • Logical Switch: OVN L2 network construct
  • Logical Router: OVN L3 routing construct
  • Logical Port: OVN port attached to Logical Switch
  • ACL (Access Control List): OVN firewall rules
  • IPAM (IP Address Management): IP allocation and tracking
  • SNAT (Source NAT): Outbound NAT for external connectivity
  • DNAT (Destination NAT): Inbound NAT for port forwarding

C. OVN Integration Details

OVN Northbound Operations:

  • Create Logical Switch: ovn-nbctl ls-add <switch>
  • Create Logical Router: ovn-nbctl lr-add <router>
  • Create Logical Port: ovn-nbctl lsp-add <switch> <port>
  • Set DHCP Options: ovn-nbctl dhcp-options-create <cidr>, then ovn-nbctl lsp-set-dhcpv4-options <port> <dhcp_options_uuid>
  • Add ACL: ovn-nbctl acl-add <switch> <direction> <priority> <match> <action>
  • Add NAT: ovn-nbctl lr-nat-add <router> <type> <external_ip> <internal_ip>

OVN Southbound State:

  • Physical port bindings
  • Flow table entries
  • Chassis mappings