PlasmaCloud Netboot Image Builder - Technical Overview
Introduction
This document provides a technical overview of the PlasmaCloud NixOS Image Builder, which generates bootable netboot images for bare-metal provisioning. This is part of T032 (Bare-Metal Provisioning) and specifically implements deliverable S3 (NixOS Image Builder).
System Architecture
High-Level Flow
┌─────────────────────┐
│  Nix Flake          │
│  (flake.nix)        │
└──────────┬──────────┘
           │
           ├─── nixosConfigurations
           │    ├── netboot-control-plane
           │    ├── netboot-worker
           │    └── netboot-all-in-one
           │
           ├─── packages (T024)
           │    ├── chainfire-server
           │    ├── flaredb-server
           │    └── ... (8 services)
           │
           └─── modules (T024)
                ├── chainfire.nix
                ├── flaredb.nix
                └── ... (8 modules)
Build Process
           ↓
┌─────────────────────┐
│  build-images.sh    │
└──────────┬──────────┘
           │
           ├─── nix build netbootRamdisk
           ├─── nix build kernel
           └─── copy to artifacts/
Output
           ↓
┌─────────────────────┐
│  Netboot Artifacts  │
├─────────────────────┤
│  bzImage (kernel)   │
│  initrd (ramdisk)   │
│  netboot.ipxe       │
└─────────────────────┘
           │
           ├─── PXE Server
           │    (HTTP/TFTP)
           │
           └─── Target Machine
                (PXE Boot)
Component Breakdown
1. Netboot Configurations
Located in nix/images/, these NixOS configurations define the netboot environment:
netboot-base.nix
Purpose: Common base configuration for all profiles
Key Features:
- Extends netboot-minimal.nix from nixpkgs
- SSH server with root login (key-based only)
- Generic kernel with broad hardware support
- Disk management tools (disko, parted, cryptsetup, lvm2)
- Network configuration (DHCP, predictable interface names)
- Serial console support (ttyS0, tty0)
- Minimal system (no docs, no sound)
Package Inclusions:
disko, parted, gptfdisk # Disk management
cryptsetup, lvm2 # Encryption and LVM
e2fsprogs, xfsprogs # Filesystem tools
iproute2, curl, tcpdump # Network tools
vim, tmux, htop # System tools
Kernel Configuration:
boot.kernelPackages = pkgs.linuxPackages_latest;
boot.kernelParams = [
  "console=ttyS0,115200"
  "console=tty0"
  "loglevel=4"
];
netboot-control-plane.nix
Purpose: Full control plane deployment
Imports:
- netboot-base.nix (base configuration)
- ../modules (PlasmaCloud service modules)
Service Inclusions:
- Chainfire (ports 2379, 2380, 2381)
- FlareDB (ports 2479, 2480)
- IAM (port 8080)
- PlasmaVMC (port 8081)
- PrismNET (port 8082)
- FlashDNS (port 53)
- FiberLB (port 8083)
- LightningStor (port 8084)
- K8sHost (port 8085)
Service State: All services disabled by default via lib.mkDefault false
Resource Limits (for netboot environment):
MemoryMax = "512M"
CPUQuota = "50%"
netboot-worker.nix
Purpose: Compute-focused worker nodes
Imports:
- netboot-base.nix
- ../modules
Service Inclusions:
- PlasmaVMC (VM management)
- PrismNET (SDN)
Additional Features:
- KVM virtualization support
- Open vSwitch for SDN
- QEMU and libvirt tools
- Optimized sysctl for VM workloads
Performance Tuning:
"fs.file-max" = 1000000;
"net.ipv4.ip_forward" = 1;
"net.core.netdev_max_backlog" = 5000;
netboot-all-in-one.nix
Purpose: Single-node deployment with all services
Imports:
- netboot-base.nix
- ../modules
Combines: All features from control-plane + worker
Use Cases:
- Development environments
- Small deployments
- Edge locations
- POC installations
2. Flake Integration
The main flake.nix exposes netboot configurations:
nixosConfigurations = {
  netboot-control-plane = nixpkgs.lib.nixosSystem {
    system = "x86_64-linux";
    modules = [ ./nix/images/netboot-control-plane.nix ];
  };
  netboot-worker = nixpkgs.lib.nixosSystem {
    system = "x86_64-linux";
    modules = [ ./nix/images/netboot-worker.nix ];
  };
  netboot-all-in-one = nixpkgs.lib.nixosSystem {
    system = "x86_64-linux";
    modules = [ ./nix/images/netboot-all-in-one.nix ];
  };
};
3. Build Script
build-images.sh orchestrates the build process:
Workflow:
- Parse command-line arguments (--profile, --output-dir)
- Create output directories
- For each profile:
  - Build netboot ramdisk: nix build ...netbootRamdisk
  - Build kernel: nix build ...kernel
  - Copy artifacts (bzImage, initrd)
  - Generate iPXE boot script
  - Calculate and display sizes
- Verify outputs (file existence, size sanity checks)
- Copy to PXE server (if available)
- Print summary
Build Commands:
nix build .#nixosConfigurations.netboot-$profile.config.system.build.netbootRamdisk
nix build .#nixosConfigurations.netboot-$profile.config.system.build.kernel
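The per-profile build step could be wired together roughly as follows. This is a minimal sketch, not the actual build-images.sh: the function names (`run`, `build_attr`, `build_profile`), the `DRY_RUN` variable, and the use of `-o` out-links named initrd-link and kernel-link are assumptions made for illustration.

```shell
# Hypothetical sketch of the per-profile build step; not the real build-images.sh.

# Run a command, or only print it when DRY_RUN=1 (an assumption of this sketch).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Print the flake attribute for a profile ($1) and a build output ($2),
# where the output is netbootRamdisk or kernel as in the build commands.
build_attr() {
  printf '.#nixosConfigurations.netboot-%s.config.system.build.%s\n' "$1" "$2"
}

# Build profile $1 into directory $2/$1 and copy out bzImage and initrd.
build_profile() {
  bp_dest="$2/$1"
  run mkdir -p "$bp_dest" &&
    run nix build "$(build_attr "$1" netbootRamdisk)" -o "$bp_dest/initrd-link" &&
    run nix build "$(build_attr "$1" kernel)" -o "$bp_dest/kernel-link" &&
    # -f because files copied out of the Nix store are read-only on a rerun.
    run cp -f "$bp_dest/initrd-link/initrd" "$bp_dest/initrd" &&
    run cp -f "$bp_dest/kernel-link/bzImage" "$bp_dest/bzImage"
}
```

Calling `DRY_RUN=1; build_profile worker artifacts` prints the five commands without touching the Nix store, which is a cheap way to review attribute paths before a long build.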
Output Structure:
artifacts/
├── control-plane/
│   ├── bzImage       # ~10-30 MB
│   ├── initrd        # ~100-300 MB
│   ├── netboot.ipxe  # iPXE script
│   ├── build.log     # Build log
│   ├── initrd-link   # Nix result symlink
│   └── kernel-link   # Nix result symlink
├── worker/
│   └── ... (same structure)
└── all-in-one/
    └── ... (same structure)
Integration Points
T024 NixOS Modules
The netboot configurations leverage T024 service modules:
Module Structure (example: chainfire.nix):
{ config, lib, pkgs, ... }:
let
  cfg = config.services.chainfire; # conventional shorthand for this module's options
in
{
  options.services.chainfire = {
    enable = lib.mkEnableOption "chainfire service";
    port = lib.mkOption { ... };
    raftPort = lib.mkOption { ... };
    package = lib.mkOption { ... };
  };
  config = lib.mkIf cfg.enable {
    users.users.chainfire = { ... };
    systemd.services.chainfire = { ... };
  };
}
Package Availability:
# In netboot-control-plane.nix
environment.systemPackages = with pkgs; [
  chainfire-server # From flake overlay
  flaredb-server   # From flake overlay
  # ...
];
T032.S2 PXE Infrastructure
The build script integrates with the PXE server:
Copy Workflow:
# Build script copies to:
chainfire/baremetal/pxe-server/assets/nixos/
├── control-plane/
│   ├── bzImage
│   └── initrd
├── worker/
│   ├── bzImage
│   └── initrd
└── all-in-one/
    ├── bzImage
    └── initrd
iPXE Boot Script (generated):
#!ipxe
kernel ${boot-server}/control-plane/bzImage init=/nix/store/*/init console=ttyS0,115200
initrd ${boot-server}/control-plane/initrd
boot
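A generator for this script might look like the sketch below. The function name `generate_ipxe` is hypothetical, and the kernel line is reproduced from the script shown here as-is. Note that the kernel does not expand the `*` in `init=`, so a real generator would presumably substitute the concrete store path of the system's init.

```shell
# Hypothetical helper: print a netboot.ipxe script for profile $1.
# The format strings are single-quoted so that ${boot-server} stays literal
# and is resolved by iPXE at boot time, not by the shell. Unquoted, the shell
# would read it as "value of $boot, defaulting to the word server".
generate_ipxe() {
  printf '#!ipxe\n'
  printf 'kernel ${boot-server}/%s/bzImage init=/nix/store/*/init console=ttyS0,115200\n' "$1"
  printf 'initrd ${boot-server}/%s/initrd\n' "$1"
  printf 'boot\n'
}
```

Usage would be along the lines of `generate_ipxe worker > artifacts/worker/netboot.ipxe`.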
Build Process Deep Dive
NixOS Netboot Build Internals
- netboot-minimal.nix (from nixpkgs):
  - Provides base netboot functionality
  - Configures initrd with kexec support
  - Sets up squashfs for Nix store
- Our Extensions:
  - Add PlasmaCloud service packages
  - Configure SSH for nixos-anywhere
  - Include provisioning tools (disko, etc.)
  - Customize kernel and modules
- Build Outputs:
  - bzImage: Compressed Linux kernel
  - initrd: Squashfs-compressed initial ramdisk containing:
    - Minimal NixOS system
    - Nix store with service packages
    - Init scripts for booting
Size Optimization Strategies
Current Optimizations:
documentation.enable = false; # -50MB
documentation.nixos.enable = false; # -20MB
i18n.supportedLocales = [ "en_US.UTF-8/UTF-8" ]; # -100MB
Additional Strategies (if needed):
- Use linuxPackages_hardened (smaller kernel)
- Remove unused kernel modules
- Compress with xz instead of gzip
- On-demand package fetching from HTTP substituter
Expected Sizes:
- Control Plane: ~250-350 MB (initrd)
- Worker: ~150-250 MB (initrd)
- All-in-One: ~300-400 MB (initrd)
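The upper ends of these ranges can serve as a size budget in the build script. The following is a sketch under stated assumptions: the function names are hypothetical, and 1 MB is taken to mean 1048576 bytes, which the document does not specify.

```shell
# Upper bound in MB for a profile's initrd, taken from the expected sizes above.
initrd_budget_mb() {
  case "$1" in
    control-plane) echo 350 ;;
    worker)        echo 250 ;;
    all-in-one)    echo 400 ;;
    *) echo "unknown profile: $1" >&2; return 1 ;;
  esac
}

# Succeed if the initrd file $2 is within the budget for profile $1.
check_initrd_size() {
  budget=$(initrd_budget_mb "$1") || return 1
  bytes=$(wc -c < "$2") || return 1
  # Round up to whole MB (assumption: 1 MB = 1048576 bytes).
  size_mb=$(( ($bytes + 1048575) / 1048576 ))
  if [ "$size_mb" -gt "$budget" ]; then
    echo "initrd for $1 is ${size_mb} MB, over the ${budget} MB budget" >&2
    return 1
  fi
  echo "initrd for $1: ${size_mb} MB (budget ${budget} MB)"
}
```

Failing the build when a budget is exceeded makes size regressions visible at the commit that adds the offending package.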
Boot Flow
From PXE to Running System
1. PXE Boot
   ├─ DHCP discovers boot server
   ├─ TFTP loads iPXE binary
   └─ iPXE executes boot script
2. Netboot Download
   ├─ HTTP downloads bzImage (~20MB)
   ├─ HTTP downloads initrd (~200MB)
   └─ iPXE boots the kernel into the NixOS installer
3. NixOS Installer (in RAM)
   ├─ Init system starts
   ├─ Network configuration (DHCP)
   ├─ SSH server starts
   └─ Ready for nixos-anywhere
4. Installation (nixos-anywhere)
   ├─ SSH connection established
   ├─ Disk partitioning (disko)
   ├─ NixOS system installation
   ├─ Secret injection
   └─ Bootloader installation
5. First Boot (from disk)
   ├─ GRUB/systemd-boot loads
   ├─ Services start (enabled)
   ├─ Cluster join (if configured)
   └─ Running PlasmaCloud node
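Between step 3 and step 4, an orchestration script has to wait until the installer's SSH server is reachable. One way to do that is a small retry loop. This is an illustrative sketch: the helper name `wait_until` and the attempt and delay values in the example are assumptions, not part of the existing tooling.

```shell
# Retry a probe command until it succeeds or the attempts run out.
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example probe (not executed here), assuming key-based root login on the
# netboot image:
#   wait_until 60 5 ssh -o BatchMode=yes -o ConnectTimeout=5 root@target true
```

BatchMode=yes makes ssh fail instead of prompting, so the loop cannot hang on a password or passphrase prompt.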
Customization Guide
Adding a New Service
Step 1: Create NixOS module
# nix/modules/myservice.nix
{ config, lib, pkgs, ... }:
let
  cfg = config.services.myservice; # shorthand for this module's options
in
{
  options.services.myservice = {
    enable = lib.mkEnableOption "myservice";
  };
  config = lib.mkIf cfg.enable {
    systemd.services.myservice = { ... };
  };
}
Step 2: Add to flake packages
# flake.nix
packages.myservice-server = buildRustWorkspace { ... };
Step 3: Include in netboot profile
# nix/images/netboot-control-plane.nix
environment.systemPackages = with pkgs; [
  myservice-server
];
services.myservice = {
  enable = lib.mkDefault false;
};
Creating a Custom Profile
Step 1: Create new netboot configuration
# nix/images/netboot-custom.nix
{ config, pkgs, lib, ... }:
{
  imports = [
    ./netboot-base.nix
    ../modules
  ];
  # Your customizations
  environment.systemPackages = [ ... ];
}
Step 2: Add to flake
# flake.nix
nixosConfigurations.netboot-custom = nixpkgs.lib.nixosSystem {
  system = "x86_64-linux";
  modules = [ ./nix/images/netboot-custom.nix ];
};
Step 3: Update build script
# build-images.sh
profiles_to_build=("control-plane" "worker" "all-in-one" "custom")
Security Model
Netboot Phase
Risk: Netboot image has root SSH access enabled
Mitigations:
- Key-based authentication only (no passwords)
- Isolated provisioning VLAN
- MAC address whitelist in DHCP
- Firewall disabled only during install
Post-Installation
Services remain disabled until final configuration enables them:
# In installed system configuration
services.chainfire.enable = true; # Overrides lib.mkDefault false
Secret Management
Secrets are NOT embedded in netboot images:
# During nixos-anywhere installation:
scp secrets/* root@target:/tmp/secrets/
# Installed system references:
services.chainfire.settings.tls = {
  cert_path = "/etc/nixos/secrets/tls-cert.pem";
};
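An alternative to a manual scp is to stage the secrets in a directory tree that mirrors the target filesystem and hand it to nixos-anywhere through its `--extra-files` option, which copies the tree into the root of the new installation. The sketch below only prepares the staging tree. The function name `stage_secrets` and the flake configuration name `myhost` are hypothetical.

```shell
# Copy the files in directory $1 into a staging tree rooted at $2, laid out
# so that they land under /etc/nixos/secrets on the installed system.
stage_secrets() {
  src="$1"
  dest="$2/etc/nixos/secrets"
  mkdir -p "$dest" &&
    chmod 700 "$dest" &&
    cp "$src"/* "$dest"/ &&
    chmod 600 "$dest"/*
}

# Example (not executed here); "myhost" is a hypothetical configuration name:
#   stage=$(mktemp -d)
#   stage_secrets secrets "$stage"
#   nixos-anywhere --extra-files "$stage" --flake .#myhost root@target
```

Staging under the final path keeps the location referenced by the installed configuration and the location the files are copied to in agreement.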
Performance Characteristics
Build Times
- First build: 30-60 minutes (downloads all dependencies)
- Incremental builds: 5-15 minutes (reuses cached artifacts)
- With local cache: 2-5 minutes
Network Requirements
- Initial download: ~2GB (nixpkgs + dependencies)
- Netboot download: ~200-400MB per node
- Installation: ~500MB-2GB (depending on services)
Hardware Requirements
Build Machine:
- CPU: 4+ cores recommended
- RAM: 8GB minimum, 16GB recommended
- Disk: 50GB free space
- Network: Broadband connection
Target Machine:
- RAM: 4GB minimum for netboot (8GB+ for production)
- Network: PXE boot support, DHCP
- Disk: Depends on disko configuration
Testing Strategy
Verification Steps
- Syntax Validation:
  nix flake check
- Build Test:
  ./build-images.sh --profile control-plane
- Artifact Verification:
  file artifacts/control-plane/bzImage # Should be Linux kernel
  file artifacts/control-plane/initrd # Should be compressed data
- PXE Boot Test:
  - Boot VM from netboot image
  - Verify SSH access
  - Check available tools (disko, parted, etc.)
- Installation Test:
  - Run nixos-anywhere on test target
  - Verify successful installation
  - Check service availability
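The artifact check can be scripted so that it runs after every build. This is a minimal sketch with a hypothetical function name. It only confirms that both files exist and are non-empty, and leaves format inspection to the `file` command.

```shell
# Check that the kernel and initrd in profile directory $1 exist and are non-empty.
verify_artifacts() {
  dir="$1"
  for f in bzImage initrd; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $dir/$f" >&2
      return 1
    fi
  done
  echo "ok: $dir"
}
```

For example, `verify_artifacts artifacts/control-plane` prints `ok: artifacts/control-plane` when both files are present and returns a non-zero status otherwise.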
Troubleshooting Matrix
| Symptom | Possible Cause | Solution |
|---|---|---|
| Build fails | Flakes not enabled | Enable experimental-features (nix-command flakes) |
| Large initrd | Too many packages | Remove unused packages |
| SSH fails | Wrong SSH key | Update authorized_keys |
| Boot hangs | Wrong kernel params | Check console= settings |
| No network | DHCP issues | Verify useDHCP = true |
| Service missing | Package not built | Check flake overlay |
Future Enhancements
Planned Improvements
- Image Variants:
  - Minimal installer (no services)
  - Debug variant (with extra tools)
  - Rescue mode (for recovery)
- Build Optimizations:
  - Parallel profile builds
  - Incremental rebuild detection
  - Binary cache integration
- Security Enhancements:
  - Per-node SSH keys
  - TPM-based secrets
  - Measured boot support
- Monitoring:
  - Build metrics collection
  - Size trend tracking
  - Performance benchmarking
References
- NixOS Netboot: https://nixos.wiki/wiki/Netboot
- nixos-anywhere: https://github.com/nix-community/nixos-anywhere
- disko: https://github.com/nix-community/disko
- T032 Design: docs/por/T032-baremetal-provisioning/design.md
- T024 Modules: nix/modules/
Revision History
| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0 | 2025-12-10 | T032.S3 | Initial implementation |