# T032 Bare-Metal Provisioning Design Document

**Status:** Draft
**Author:** peerB
**Created:** 2025-12-10
**Last Updated:** 2025-12-10

## 1. Architecture Overview

This document outlines the design for automated bare-metal provisioning of the PlasmaCloud platform, which consists of 8 core services (Chainfire, FlareDB, IAM, PlasmaVMC, PrismNET, FlashDNS, FiberLB, and K8sHost). The provisioning system leverages NixOS's declarative configuration capabilities to enable fully automated deployment from bare hardware to a running, clustered platform.

The high-level flow follows this sequence: **PXE Boot → kexec NixOS Installer → disko Disk Partitioning → nixos-anywhere Installation → First-Boot Configuration → Running Cluster**. A bare-metal server performs a network boot via PXE/iPXE, which loads a minimal NixOS installer into RAM using kexec. The installer then connects to a provisioning server, which uses nixos-anywhere to declaratively partition disks (via disko), install NixOS with pre-configured services, and inject node-specific configuration (SSH keys, network settings, cluster join parameters, TLS certificates). On first boot, the system automatically joins existing Raft clusters (Chainfire/FlareDB) or bootstraps new ones, and all 8 services start with proper dependencies and TLS enabled.

The key components are:

- **PXE/iPXE Boot Server**: Serves boot binaries and configuration scripts via TFTP/HTTP
- **nixos-anywhere**: SSH-based remote installation tool that orchestrates the entire deployment
- **disko**: Declarative disk partitioning engine integrated with nixos-anywhere
- **kexec**: Linux kernel feature enabling a fast boot into the NixOS installer without a full reboot
- **NixOS Flake** (from T024): Provides all service packages and NixOS modules
- **Configuration Injection System**: Manages node-specific secrets, network config, and cluster metadata
- **First-Boot Automation**: systemd units that perform cluster join and service initialization

## 2. PXE Boot Flow

### 2.1 Boot Sequence

```
┌─────────────┐
│ Bare Metal  │
│   Server    │
└──────┬──────┘
       │ 1. UEFI/BIOS PXE ROM
       ▼
┌──────────────┐
│ DHCP Server  │ Option 93: Client Architecture (0=BIOS, 7=UEFI x64)
│              │ Option 67: Boot filename (undionly.kpxe or ipxe.efi)
│              │ Option 66: TFTP server address
└──────┬───────┘
       │ 2. DHCP OFFER with boot parameters
       ▼
┌──────────────┐
│  TFTP/HTTP   │
│   Server     │ Serves: undionly.kpxe (BIOS) or ipxe.efi (UEFI)
└──────┬───────┘
       │ 3. Download iPXE bootloader
       ▼
┌──────────────┐
│ iPXE Running │ User-class="iPXE" in DHCP request
│   (in RAM)   │
└──────┬───────┘
       │ 4. Second DHCP request (now with iPXE user-class)
       ▼
┌──────────────┐
│ DHCP Server  │ Detects user-class="iPXE"
│              │ Option 67: http://boot.server/boot.ipxe
└──────┬───────┘
       │ 5. DHCP OFFER with script URL
       ▼
┌──────────────┐
│ HTTP Server  │ Serves: boot.ipxe (iPXE script)
└──────┬───────┘
       │ 6. Download and execute boot script
       ▼
┌──────────────┐
│ iPXE Script  │ Loads: NixOS kernel + initrd + kexec
│  Execution   │
└──────┬───────┘
       │ 7. kexec into NixOS installer
       ▼
┌──────────────┐
│  NixOS Live  │ SSH enabled, waiting for nixos-anywhere
│  Installer   │
└──────────────┘
```

### 2.2 DHCP Configuration Requirements

The DHCP server must support architecture-specific boot file selection and iPXE user-class detection.
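As an alternative to the ISC configuration that follows, the same chainloading logic can be expressed with dnsmasq, which bundles DHCP and TFTP in one daemon. This is a sketch, not a drop-in config; the IPs and file names mirror the ISC example and are assumptions:

```
# /etc/dnsmasq.conf (sketch)
dhcp-range=10.0.0.100,10.0.0.200,12h
dhcp-option=option:router,10.0.0.1

# Tag requests: iPXE sets user-class "iPXE"; client-arch 7 = UEFI x86_64
dhcp-userclass=set:ipxe,iPXE
dhcp-match=set:efi64,option:client-arch,7

# Chainload: iPXE gets the HTTP script, everyone else gets an iPXE binary
dhcp-boot=tag:ipxe,http://10.0.0.2:8080/boot.ipxe
dhcp-boot=tag:efi64,tag:!ipxe,ipxe.efi,,10.0.0.2
dhcp-boot=tag:!efi64,tag:!ipxe,undionly.kpxe,,10.0.0.2

# Built-in TFTP server for the initial binaries
enable-tftp
tftp-root=/srv/tftp
```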
For ISC DHCP server (`/etc/dhcp/dhcpd.conf`):

```dhcp
# Architecture detection (RFC 4578)
option architecture-type code 93 = unsigned integer 16;
# iPXE detection
option user-class code 77 = string;

subnet 10.0.0.0 netmask 255.255.255.0 {
  range 10.0.0.100 10.0.0.200;
  option routers 10.0.0.1;
  option domain-name-servers 10.0.0.1;

  # Boot server
  next-server 10.0.0.2;  # TFTP/HTTP server IP

  # Chainloading logic
  if exists user-class and option user-class = "iPXE" {
    # iPXE is already loaded, provide boot script via HTTP
    filename "http://10.0.0.2:8080/boot.ipxe";
  } elsif option architecture-type = 00:00 {
    # BIOS (legacy) - load iPXE via TFTP
    filename "undionly.kpxe";
  } elsif option architecture-type = 00:07 {
    # UEFI x86_64 - load iPXE via TFTP
    filename "ipxe.efi";
  } elsif option architecture-type = 00:09 {
    # UEFI x86_64 (alternate) - load iPXE via TFTP
    filename "ipxe.efi";
  } else {
    # Fallback
    filename "ipxe.efi";
  }
}
```

**Key Points:**

- **Option 93** (architecture-type): Distinguishes BIOS (0x0000) vs UEFI (0x0007/0x0009)
- **Option 66** (next-server): TFTP server IP for initial boot files
- **Option 67** (filename): Boot file name, changes based on architecture and iPXE presence
- **User-class detection**: Prevents an infinite loop (iPXE downloading itself)
- **HTTP chainloading**: After iPXE loads, switch to HTTP for faster downloads

### 2.3 iPXE Script Structure

The boot script (`/srv/boot/boot.ipxe`) provides a menu for deployment profiles:

```ipxe
#!ipxe

# Variables
set boot-server 10.0.0.2:8080
set nix-cache http://${boot-server}/nix-cache

# Display system info
echo System information:
echo - Platform: ${platform}
echo - Architecture: ${buildarch}
echo - MAC: ${net0/mac}
echo - IP: ${net0/ip}
echo

# Menu with timeout
:menu
menu PlasmaCloud Bare-Metal Provisioning
item --gap -- ──────────── Deployment Profiles ────────────
item control-plane Install Control Plane Node (Chainfire + FlareDB + IAM)
item worker        Install Worker Node (PlasmaVMC + PrismNET + Storage)
item all-in-one    Install All-in-One (All 8 Services)
item shell         Boot to NixOS Installer Shell
item --gap -- ─────────────────────────────────────────────
item --key r reboot Reboot System
choose --timeout 30000 --default all-in-one target || goto menu

# Execute selection
goto ${target}

:control-plane
echo Booting Control Plane installer...
set profile control-plane
goto boot

:worker
echo Booting Worker Node installer...
set profile worker
goto boot

:all-in-one
echo Booting All-in-One installer...
set profile all-in-one
goto boot

:shell
echo Booting to installer shell...
set profile shell
goto boot

:boot
# Load NixOS netboot artifacts (from nixos-images or custom build)
kernel http://${boot-server}/nixos/bzImage init=/nix/store/...-nixos-system/init loglevel=4 console=ttyS0 console=tty0 nixos.profile=${profile}
initrd http://${boot-server}/nixos/initrd
boot

:reboot
reboot

:failed
echo Boot failed, dropping to shell...
sleep 10
shell
```

**Features:**

- **Multi-profile support**: Different service combinations per node type
- **Hardware detection**: Shows MAC/IP for inventory tracking
- **Timeout with default**: Unattended deployment after 30 seconds
- **Kernel parameters**: Pass the profile to the NixOS installer for conditional configuration
- **Error handling**: Falls back to a shell on failure

### 2.4 HTTP vs TFTP Trade-offs

| Aspect | TFTP | HTTP |
|--------|------|------|
| **Speed** | ~1-5 MB/s (UDP, no windowing) | ~50-100+ MB/s (TCP with pipelining) |
| **Reliability** | Low (UDP, prone to timeouts) | High (TCP with retries) |
| **Firmware Support** | Universal (all PXE ROMs) | UEFI 2.5+ only (HTTP Boot) |
| **Complexity** | Simple protocol, minimal config | Requires web server (nginx/apache) |
| **Use Case** | Initial iPXE binary (~100KB) | Kernel/initrd/images (~100-500MB) |

**Recommended Hybrid Approach:**

1. **TFTP** for initial iPXE binary delivery (universal compatibility)
2. **HTTP** for all subsequent artifacts (kernel, initrd, scripts, packages)
3. Configure iPXE with embedded HTTP support
4. NixOS netboot images served via HTTP with range-request support for resumability

**UEFI HTTP Boot Alternative:** For pure UEFI environments, skip TFTP entirely by using DHCP Option 60 (Vendor Class = "HTTPClient") and Option 67 (HTTP URI). However, this lacks BIOS compatibility and requires newer firmware (2015+).

## 3. Image Generation Strategy

### 3.1 Building NixOS Netboot Images

NixOS provides built-in netboot image generation. We extend this to include the PlasmaCloud services.

**Option 1: Custom Netboot Configuration (Recommended)**

Create `nix/images/netboot.nix`:

```nix
{ config, pkgs, lib, modulesPath, ... }:

{
  imports = [
    "${modulesPath}/installer/netboot/netboot-minimal.nix"
    ../../nix/modules  # PlasmaCloud service modules
  ];

  # Networking for installer phase
  networking = {
    usePredictableInterfaceNames = false;  # Use eth0 instead of enpXsY
    useDHCP = true;
    firewall.enable = false;  # Open during installation
  };

  # SSH for nixos-anywhere
  services.openssh = {
    enable = true;
    settings = {
      PermitRootLogin = "yes";
      PasswordAuthentication = false;  # Key-based only
    };
  };

  # Authorized keys for provisioning server
  users.users.root.openssh.authorizedKeys.keys = [
    "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIProvisioning Server Key..."
  ];

  # Minimal kernel for hardware support
  boot.kernelPackages = pkgs.linuxPackages_latest;
  boot.supportedFilesystems = [ "ext4" "xfs" "btrfs" "zfs" ];

  # Include disko for disk management
  environment.systemPackages = with pkgs; [
    disko
    parted
    cryptsetup
    lvm2
  ];

  # Disable unnecessary services for the installer
  documentation.enable = false;
  documentation.nixos.enable = false;
  sound.enable = false;

  # Build artifacts needed for netboot
  system.build = {
    netbootRamdisk = config.system.build.initialRamdisk;
    kernel = config.system.build.kernel;
    netbootIpxeScript = pkgs.writeText "netboot.ipxe" ''
      #!ipxe
      kernel \${boot-url}/bzImage init=${config.system.build.toplevel}/init ${toString config.boot.kernelParams}
      initrd \${boot-url}/initrd
      boot
    '';
  };
}
```

Build the netboot artifacts:

```bash
nix build .#nixosConfigurations.netboot.config.system.build.netbootRamdisk
nix build .#nixosConfigurations.netboot.config.system.build.kernel

# Copy to HTTP server
cp result/bzImage /srv/boot/nixos/
cp result/initrd /srv/boot/nixos/
```

**Option 2: Use Pre-built Images (Faster Development)**

The [nix-community/nixos-images](https://github.com/nix-community/nixos-images) project provides pre-built netboot images:

```bash
# Use their iPXE chainload directly
chain https://github.com/nix-community/nixos-images/releases/download/nixos-unstable/netboot-x86_64-linux.ipxe

# Or download artifacts
curl -L https://github.com/nix-community/nixos-images/releases/download/nixos-unstable/bzImage -o /srv/boot/nixos/bzImage
curl -L https://github.com/nix-community/nixos-images/releases/download/nixos-unstable/initrd -o /srv/boot/nixos/initrd
```

### 3.2 Configuration Injection Approach

Configuration must be injected at installation time (not baked into the netboot image) to support:

- Node-specific networking (static IPs, VLANs)
- Cluster join parameters (existing Raft leader addresses)
- TLS certificates (unique per node)
- Hardware-specific disk layouts

**Three-Phase Configuration Model:**

**Phase 1: Netboot Image (Generic)**

- Universal kernel with broad hardware support
- SSH server with provisioning key
- disko + installer tools
- No node-specific data

**Phase 2: nixos-anywhere Deployment (Node-Specific)**

- Pull node configuration from the provisioning server based on MAC/hostname
- Partition disks per the disko spec
- Install NixOS with flake: `github:yourorg/plasmacloud#node-hostname`
- Inject secrets: `/etc/nixos/secrets/` (TLS certs, cluster tokens)

**Phase 3: First Boot (Service Initialization)**

- A systemd service reads `/etc/nixos/secrets/cluster-config.json`
- Auto-join the Chainfire cluster (or bootstrap if first node)
- FlareDB joins after Chainfire is healthy
- IAM initializes with the FlareDB backend
- Other services start with proper dependencies

**Configuration Repository Structure:**

```
/srv/provisioning/
├── nodes/
│   ├── node01.example.com/
│   │   ├── hardware.nix          # Generated from nixos-generate-config
│   │   ├── configuration.nix     # Node-specific service config
│   │   ├── disko.nix             # Disk layout
│   │   └── secrets/
│   │       ├── tls-cert.pem
│   │       ├── tls-key.pem
│   │       ├── tls-ca.pem
│   │       └── cluster-config.json
│   └── node02.example.com/
│       └── ...
├── profiles/
│   ├── control-plane.nix         # Chainfire + FlareDB + IAM
│   ├── worker.nix                # PlasmaVMC + storage
│   └── all-in-one.nix            # All 8 services
└── common/
    ├── base.nix                  # Common settings (SSH, users, firewall)
    └── networking.nix            # Network defaults
```

**Node Configuration Example (`nodes/node01.example.com/configuration.nix`):**

```nix
{ config, pkgs, lib, ... }:

{
  imports = [
    ../../profiles/control-plane.nix
    ../../common/base.nix
    ./hardware.nix
    ./disko.nix
  ];

  networking = {
    hostName = "node01";
    domain = "example.com";
    interfaces.eth0 = {
      useDHCP = false;
      ipv4.addresses = [{
        address = "10.0.1.10";
        prefixLength = 24;
      }];
    };
    defaultGateway = "10.0.1.1";
    nameservers = [ "10.0.1.1" ];
  };

  # Service configuration
  services.chainfire = {
    enable = true;
    port = 2379;
    raftPort = 2380;
    gossipPort = 2381;
    settings = {
      node_id = "node01";
      cluster_name = "prod-cluster";
      # Initial cluster peers (for bootstrap)
      initial_peers = [
        "node01.example.com:2380"
        "node02.example.com:2380"
        "node03.example.com:2380"
      ];
      tls = {
        cert_path = "/etc/nixos/secrets/tls-cert.pem";
        key_path = "/etc/nixos/secrets/tls-key.pem";
        ca_path = "/etc/nixos/secrets/tls-ca.pem";
      };
    };
  };

  services.flaredb = {
    enable = true;
    port = 2479;
    raftPort = 2480;
    settings = {
      node_id = "node01";
      cluster_name = "prod-cluster";
      chainfire_endpoint = "https://localhost:2379";
      tls = {
        cert_path = "/etc/nixos/secrets/tls-cert.pem";
        key_path = "/etc/nixos/secrets/tls-key.pem";
        ca_path = "/etc/nixos/secrets/tls-ca.pem";
      };
    };
  };

  services.iam = {
    enable = true;
    port = 8080;
    settings = {
      flaredb_endpoint = "https://localhost:2479";
      tls = {
        cert_path = "/etc/nixos/secrets/tls-cert.pem";
        key_path = "/etc/nixos/secrets/tls-key.pem";
      };
    };
  };

  system.stateVersion = "24.11";
}
```

### 3.3 Hardware Detection vs Explicit Hardware Config

**Hardware Detection (Automatic):**

During installation, `nixos-generate-config` scans hardware and creates `hardware-configuration.nix`:

```bash
# On the live installer, after disk setup
nixos-generate-config --root /mnt --show-hardware-config > /tmp/hardware.nix

# Upload to the provisioning server
curl -X POST -F "file=@/tmp/hardware.nix" http://provisioning-server/api/hardware/node01
```

**Explicit Hardware Config (Declarative):**

For homogeneous hardware (e.g., a fleet of identical servers), use a template:

```nix
# profiles/hardware/dell-r640.nix
{ config, lib, pkgs, modulesPath, ... }:

{
  imports = [ (modulesPath + "/installer/scan/not-detected.nix") ];

  boot.initrd.availableKernelModules = [ "xhci_pci" "ahci" "nvme" "usbhid" "sd_mod" ];
  boot.kernelModules = [ "kvm-intel" ];

  # Network interfaces (predictable naming)
  networking.interfaces = {
    enp59s0f0 = {};  # 10GbE Port 1
    enp59s0f1 = {};  # 10GbE Port 2
  };

  # CPU microcode updates
  hardware.cpu.intel.updateMicrocode = true;

  # Power management
  powerManagement.cpuFreqGovernor = "performance";

  nixpkgs.hostPlatform = "x86_64-linux";
}
```

**Recommendation:**

- **Phase 1 (Development):** Auto-detect hardware for flexibility
- **Phase 2 (Production):** Standardize on explicit hardware profiles for consistency and faster deployments

### 3.4 Image Size Optimization

Netboot images must fit in RAM (typically 1-4 GB available after kexec). Strategies:

**1. Exclude Documentation and Locales:**

```nix
documentation.enable = false;
documentation.nixos.enable = false;
i18n.supportedLocales = [ "en_US.UTF-8/UTF-8" ];
```

**2. Minimal Kernel:**

```nix
boot.kernelPackages = pkgs.linuxPackages_latest;
boot.kernelParams = [ "modprobe.blacklist=nouveau" ];  # Exclude unused drivers
```

**3. Squashfs Compression:**

NixOS netboot uses squashfs for the Nix store, achieving ~2.5x compression:

```nix
# Automatically applied by netboot-minimal.nix
system.build.squashfsStore = ...;  # Default: gzip compression
```

**4. On-Demand Package Fetching:**

Instead of bundling all packages, fetch from an HTTP substituter during installation:

```nix
nix.settings.substituters = [ "http://10.0.0.2:8080/nix-cache" ];
nix.settings.trusted-public-keys = [ "cache-key-here" ];
```

**Expected Sizes:**

- **Minimal installer (no services):** ~150-250 MB (initrd)
- **Installer + PlasmaCloud packages:** ~400-600 MB (with on-demand fetch)
- **Full offline installer:** ~1-2 GB (includes all service closures)

## 4. Installation Flow

### 4.1 Step-by-Step Process

**1. PXE Boot to NixOS Installer (Automated)**

- Server powers on, sends a DHCP request
- DHCP provides the iPXE binary (via TFTP)
- iPXE loads, sends a second DHCP request with its user-class
- DHCP provides the boot script URL (via HTTP)
- iPXE downloads the script, executes it, loads kernel + initrd
- kexec into the NixOS installer (in RAM, ~30-60 seconds)
- Installer boots, acquires an IP via DHCP, starts the SSH server

**2. Provisioning Server Detects Node (Semi-Automated)**

The provisioning server monitors DHCP leases or receives a webhook from the installer:

```bash
# Installer sends registration on boot (custom init script)
curl -X POST http://provisioning-server/api/register \
  -d '{"mac":"aa:bb:cc:dd:ee:ff","ip":"10.0.0.100","hostname":"node01"}'
```

The provisioning server looks up the node in its inventory (`/srv/provisioning/inventory.json`):

```json
{
  "nodes": {
    "aa:bb:cc:dd:ee:ff": {
      "hostname": "node01.example.com",
      "profile": "control-plane",
      "config_path": "/srv/provisioning/nodes/node01.example.com"
    }
  }
}
```

**3. Run nixos-anywhere (Automated)**

The provisioning server executes nixos-anywhere:

```bash
#!/bin/bash
# /srv/provisioning/scripts/provision-node.sh

NODE_MAC="$1"
NODE_IP=$(get_ip_from_dhcp "$NODE_MAC")
NODE_HOSTNAME=$(lookup_hostname "$NODE_MAC")
CONFIG_PATH="/srv/provisioning/nodes/$NODE_HOSTNAME"

# Copy secrets to the installer (will be injected during install)
ssh root@$NODE_IP "mkdir -p /tmp/secrets"
scp $CONFIG_PATH/secrets/* root@$NODE_IP:/tmp/secrets/

# Run nixos-anywhere with disko
nix run github:nix-community/nixos-anywhere -- \
  --flake "/srv/provisioning#$NODE_HOSTNAME" \
  --build-on-remote \
  --disk-encryption-keys /tmp/disk.key <(cat $CONFIG_PATH/secrets/disk-encryption.key) \
  root@$NODE_IP
```

nixos-anywhere performs:

- Detects an existing OS (if any)
- Loads kexec if needed (already done via PXE)
- Runs disko to partition disks (based on `$CONFIG_PATH/disko.nix`)
- Builds the NixOS system closure (either locally or on the target)
- Copies the closure to `/mnt` (mounted root)
- Installs the bootloader (GRUB/systemd-boot)
- Copies secrets to `/mnt/etc/nixos/secrets/`
- Unmounts, reboots

**4. First Boot into Installed System (Automated)**

The server reboots from disk (GRUB/systemd-boot) and loads NixOS:

- systemd starts
- `chainfire.service` starts (waits 30s for network)
- If `initial_peers` matches only self → bootstrap a new cluster
- If `initial_peers` includes others → attempt to join the existing cluster
- `flaredb.service` starts after chainfire is healthy
- `iam.service` starts after flaredb is healthy
- Other services start based on the profile

**First-boot cluster join logic** (systemd unit):

```nix
# /etc/nixos/first-boot-cluster-join.nix
{ config, lib, pkgs, ... }:

let
  clusterConfig = builtins.fromJSON (builtins.readFile /etc/nixos/secrets/cluster-config.json);
in
{
  systemd.services.chainfire-cluster-join = {
    description = "Chainfire Cluster Join";
    after = [ "network-online.target" "chainfire.service" ];
    wants = [ "network-online.target" ];
    wantedBy = [ "multi-user.target" ];
    serviceConfig = {
      Type = "oneshot";
      RemainAfterExit = true;
    };
    script = ''
      # Wait for the local chainfire to be ready
      until ${pkgs.curl}/bin/curl -k https://localhost:2379/health; do
        echo "Waiting for local chainfire..."
        sleep 5
      done

      # Check if this is the first node (bootstrap); note booleans must be
      # rendered with lib.boolToString before string interpolation
      if [ "${lib.boolToString clusterConfig.bootstrap}" = "true" ]; then
        echo "Bootstrap node, cluster already initialized"
        exit 0
      fi

      # Join the existing cluster
      LEADER_URL="${clusterConfig.leader_url}"
      NODE_ID="${clusterConfig.node_id}"
      RAFT_ADDR="${clusterConfig.raft_addr}"

      ${pkgs.curl}/bin/curl -k -X POST "$LEADER_URL/admin/member/add" \
        -H "Content-Type: application/json" \
        -d "{\"id\":\"$NODE_ID\",\"raft_addr\":\"$RAFT_ADDR\"}"

      echo "Cluster join initiated"
    '';
  };

  # Similar for flaredb
  systemd.services.flaredb-cluster-join = {
    description = "FlareDB Cluster Join";
    after = [ "chainfire-cluster-join.service" "flaredb.service" ];
    requires = [ "chainfire-cluster-join.service" ];
    # ... similar logic
  };
}
```

**5. Validation (Manual/Automated)**

The provisioning server polls health endpoints:

```bash
# Health check script
curl -k https://10.0.1.10:2379/health   # Chainfire
curl -k https://10.0.1.10:2479/health   # FlareDB
curl -k https://10.0.1.10:8080/health   # IAM

# Cluster status
curl -k https://10.0.1.10:2379/admin/cluster/members | jq
```

### 4.2 Error Handling and Recovery

**Boot Failures:**

- **Symptom:** Server stuck in a PXE boot loop
- **Diagnosis:** Check DHCP server logs, verify TFTP/HTTP server accessibility
- **Recovery:** Fix the DHCP config, restart services, retry boot

**Disk Partitioning Failures:**

- **Symptom:** nixos-anywhere fails during the disko phase
- **Diagnosis:** SSH to the installer, run `dmesg | grep -i error`, check disk accessibility
- **Recovery:** Adjust the disko config (e.g., wrong disk device), re-run nixos-anywhere

**Installation Failures:**

- **Symptom:** nixos-anywhere fails during the installation phase
- **Diagnosis:** Check nixos-anywhere output, SSH in and inspect `/mnt`
- **Recovery:** Fix configuration errors, re-run nixos-anywhere (will reformat)

**Cluster Join Failures:**

- **Symptom:** Service starts but is not in the cluster
- **Diagnosis:** `journalctl -u chainfire-cluster-join`, check leader reachability
- **Recovery:** Manually run the join command, verify TLS certs, check the firewall

**Rollback Strategy:**

- NixOS generations provide atomic rollback: `nixos-rebuild switch --rollback`
- For catastrophic failure: re-provision from PXE (data loss if not replicated)

### 4.3 Network Requirements

**DHCP:**

- Option 66/67 for PXE boot
- Option 93 for architecture detection
- User-class filtering for iPXE chainload
- Static reservations for production nodes (optional)

**DNS:**

- Forward and reverse DNS for all nodes (required for TLS cert CN verification)
- Example: `node01.example.com` → `10.0.1.10`, `10.0.1.10` → `node01.example.com`

**Firewall:**

- Allow TFTP (UDP 69) from nodes to the boot server
- Allow HTTP (TCP 80/8080) from nodes to the boot/provisioning server
- Allow SSH (TCP 22) from the provisioning server to nodes
- Allow service ports (2379-2381, 2479-2480, 8080, etc.) between cluster nodes

**Internet Access:**

- **During installation:** Required for the Nix binary cache (cache.nixos.org) unless using a local cache
- **After installation:** Optional (recommended for updates); can run air-gapped with a local cache
- **Workaround:** Set up a local binary cache: `nix-serve` + nginx

**Bandwidth:**

- **PXE boot:** ~200 MB (kernel + initrd) per node; sequential is acceptable
- **Installation:** ~1-5 GB (Nix closures) per node; parallel is fine if the cache is local
- **Recommendation:** 1 Gbps link between the provisioning server and nodes

## 5. Integration Points

### 5.1 T024 NixOS Modules

The NixOS modules from T024 (`nix/modules/*.nix`) provide declarative service configuration. They are included in node configurations:

```nix
{ config, pkgs, lib, ... }:

{
  imports = [
    # Import PlasmaCloud service modules
    inputs.plasmacloud.nixosModules.default
  ];

  # Enable services declaratively
  services.chainfire.enable = true;
  services.flaredb.enable = true;
  services.iam.enable = true;
  # ... etc
}
```

**Module Integration Strategy:**

1. **Flake Inputs:** Node configurations reference the PlasmaCloud flake:

   ```nix
   # flake.nix for the provisioning repo
   inputs.plasmacloud.url = "github:yourorg/plasmacloud";
   # or path-based for development
   inputs.plasmacloud.url = "path:/path/to/plasmacloud/repo";
   ```

2. **Service Packages:** Packages are injected via an overlay:

   ```nix
   nixpkgs.overlays = [ inputs.plasmacloud.overlays.default ];
   # Now pkgs.chainfire-server, pkgs.flaredb-server, etc. are available
   ```

3. **Dependency Graph:** systemd units respect the T024 dependencies:

   ```
   chainfire.service
     ↓ requires/after
   flaredb.service
     ↓ requires/after
   iam.service
     ↓ requires/after
   plasmavmc.service, flashdns.service, ... (parallel)
   ```

4. **Configuration Schema:** Use `services.<name>.settings` for service-specific config:

   ```nix
   services.chainfire.settings = {
     node_id = "node01";
     cluster_name = "prod";
     tls = { ... };
   };
   ```

### 5.2 T027 Config Unification

T027 established a unified configuration approach (clap + config file/env). This integrates with NixOS in two ways:

**1. NixOS Module → Config File Generation:**

The NixOS module renders `services.<name>.settings` into a config file, e.g. minimally in `preStart`:

```nix
# In nix/modules/chainfire.nix (sketch; cfg = config.services.chainfire)
systemd.services.chainfire = {
  preStart = ''
    # Generate the config file from settings
    cat > /var/lib/chainfire/config.toml <<EOF
    node_id = "${cfg.settings.node_id}"
    cluster_name = "${cfg.settings.cluster_name}"
    EOF
  '';
};
```

**2. Secret Handling:**

- **Non-secrets:** `services.<name>.settings` (stored in the Nix store, world-readable)
- **Secrets:** Use `EnvironmentFile` or systemd credentials
- **Hybrid:** Config file with placeholders, secrets injected at runtime

### 5.3 T031 TLS Certificates

T031 added TLS to all 8 services. Provisioning must handle certificate distribution.

**Certificate Provisioning Strategies:**

**Option 1: Pre-Generated Certificates (Simple)**

1. Generate certs on the provisioning server per node:

   ```bash
   # /srv/provisioning/scripts/generate-certs.sh node01.example.com
   openssl req -x509 -newkey rsa:4096 -nodes \
     -keyout node01-key.pem -out node01-cert.pem \
     -days 365 -subj "/CN=node01.example.com"
   ```

2. Copy to the node's secrets directory:

   ```bash
   cp node01-*.pem /srv/provisioning/nodes/node01.example.com/secrets/
   ```

3. nixos-anywhere installs them to `/etc/nixos/secrets/` (mode 0400, owner root)
4. The NixOS module references them:

   ```nix
   services.chainfire.settings.tls = {
     cert_path = "/etc/nixos/secrets/tls-cert.pem";
     key_path = "/etc/nixos/secrets/tls-key.pem";
     ca_path = "/etc/nixos/secrets/tls-ca.pem";
   };
   ```

**Option 2: ACME (Let's Encrypt) for External Services**

For internet-facing services (e.g., the PlasmaVMC API):

```nix
security.acme = {
  acceptTerms = true;
  defaults.email = "admin@example.com";
};

services.plasmavmc.settings.tls = {
  cert_path = config.security.acme.certs."plasmavmc.example.com".directory + "/cert.pem";
  key_path = config.security.acme.certs."plasmavmc.example.com".directory + "/key.pem";
};

security.acme.certs."plasmavmc.example.com" = {
  domain = "plasmavmc.example.com";
  # Use the DNS-01 challenge for internal servers
  dnsProvider = "cloudflare";
  credentialsFile = "/etc/nixos/secrets/cloudflare-api-token";
};
```

**Option 3: Internal CA with Cert-Manager (Advanced)**

1. Deploy cert-manager as a service on the control plane
2. Generate per-node CSRs during first boot
3. Cert-manager signs and distributes certs
4. A systemd timer renews certs before expiry

**Recommendation:**

- **Phase 1 (MVP):** Pre-generated certs (Option 1)
- **Phase 2 (Production):** ACME for external + internal CA for internal (Option 2+3)

### 5.4 Chainfire/FlareDB Cluster Join

**Bootstrap (First 3 Nodes):**

First node (`node01`):

```nix
services.chainfire.settings = {
  node_id = "node01";
  initial_peers = [
    "node01.example.com:2380"
    "node02.example.com:2380"
    "node03.example.com:2380"
  ];
  bootstrap = true;  # This node starts the cluster
};
```

Subsequent nodes (`node02`, `node03`):

```nix
services.chainfire.settings = {
  node_id = "node02";
  initial_peers = [
    "node01.example.com:2380"
    "node02.example.com:2380"
    "node03.example.com:2380"
  ];
  bootstrap = false;  # Join the existing cluster
};
```

**Runtime Join (After Bootstrap):**

New nodes are added to a running cluster as follows:

1. Provision the node with `bootstrap = false`, `initial_peers = []`
2. The first-boot service calls the leader's admin API:

   ```bash
   curl -k -X POST https://node01.example.com:2379/admin/member/add \
     -H "Content-Type: application/json" \
     -d '{"id":"node04","raft_addr":"node04.example.com:2380"}'
   ```

3. The node receives cluster state, starts Raft
4. The leader replicates to the new node

**FlareDB Follows the Same Pattern:**

FlareDB depends on Chainfire for coordination but maintains its own Raft cluster:

```nix
services.flaredb.settings = {
  node_id = "node01";
  chainfire_endpoint = "https://localhost:2379";
  initial_peers = [ "node01:2480" "node02:2480" "node03:2480" ];
};
```

**Critical:** Ensure `chainfire.service` is healthy before starting `flaredb.service` (enforced by systemd `requires`/`after`).

### 5.5 IAM Bootstrap

IAM requires initial admin user creation. Two approaches:

**Option 1: First-Boot Initialization Script**

```nix
systemd.services.iam-bootstrap = {
  description = "IAM Initial Admin User";
  after = [ "iam.service" ];
  wantedBy = [ "multi-user.target" ];
  serviceConfig = {
    Type = "oneshot";
    RemainAfterExit = true;
  };
  script = ''
    # Check whether the admin user exists
    if ${pkgs.curl}/bin/curl -k https://localhost:8080/api/users/admin 2>&1 | grep -q "not found"; then
      # Create the admin user
      ADMIN_PASSWORD=$(cat /etc/nixos/secrets/iam-admin-password)
      ${pkgs.curl}/bin/curl -k -X POST https://localhost:8080/api/users \
        -H "Content-Type: application/json" \
        -d "{\"username\":\"admin\",\"password\":\"$ADMIN_PASSWORD\",\"role\":\"admin\"}"
      echo "Admin user created"
    else
      echo "Admin user already exists"
    fi
  '';
};
```

**Option 2: Environment Variable for Default Admin**

The IAM service creates the admin on first start if the DB is empty:

```rust
// In iam-server main.rs
if user_count() == 0 {
    let admin_password = env::var("IAM_INITIAL_ADMIN_PASSWORD")
        .expect("IAM_INITIAL_ADMIN_PASSWORD must be set for first boot");
    create_user("admin", &admin_password, Role::Admin)?;
    info!("Initial admin user created");
}
```

```nix
systemd.services.iam.serviceConfig = {
  EnvironmentFile = "/etc/nixos/secrets/iam.env";
  # File contains: IAM_INITIAL_ADMIN_PASSWORD=random-secure-password
};
```

**Recommendation:** Use Option 2 (environment variable) for simplicity. Generate a random password during node provisioning and store it in secrets.

## 6. Alternatives Considered

### 6.1 nixos-anywhere vs Custom Installer

**nixos-anywhere (Chosen):**

- **Pros:**
  - Mature, actively maintained by nix-community
  - Handles kexec, disko integration, and bootloader install automatically
  - SSH-based, works from any OS (no need for NixOS on the provisioning server)
  - Supports remote builds and disk encryption out of the box
  - Well-documented with many examples
- **Cons:**
  - Requires SSH access (not suitable for zero-touch provisioning without PXE+SSH)
  - Opinionated workflow (less flexible than custom scripts)
  - Dependency on an external project (but very stable)

**Custom Installer (Rejected):**

- **Pros:**
  - Full control over the installation flow
  - Could implement zero-touch (e.g., installer pulls config from the server without SSH)
  - Tailored to PlasmaCloud-specific needs
- **Cons:**
  - Significant development effort (partitioning, bootloader, error handling)
  - Reinvents well-tested code (disko, kexec integration)
  - Maintenance burden (keeping up with NixOS changes)
  - Higher risk of bugs (partitioning is error-prone)

**Decision:** Use nixos-anywhere for reliability and speed. The SSH requirement is acceptable since PXE boot already provides network access, and adding SSH keys to the netboot image is straightforward.
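Since nixos-anywhere drives disko from the node's flake, a minimal single-disk `disko.nix` is shown here for reference. This is a sketch; the device path and partition sizes are assumptions, and a real `nodes/<host>/disko.nix` would match the node's actual hardware:

```nix
# Hypothetical nodes/node01.example.com/disko.nix: GPT with an EFI System
# Partition and an ext4 root on a single NVMe disk.
{
  disko.devices.disk.main = {
    device = "/dev/nvme0n1";  # assumption; verify against the target host
    type = "disk";
    content = {
      type = "gpt";
      partitions = {
        esp = {
          size = "1G";
          type = "EF00";  # EFI System Partition
          content = {
            type = "filesystem";
            format = "vfat";
            mountpoint = "/boot";
          };
        };
        root = {
          size = "100%";
          content = {
            type = "filesystem";
            format = "ext4";
            mountpoint = "/";
          };
        };
      };
    };
  };
}
```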
### 6.2 Disk Management Tools

**disko (Chosen):**

- **Pros:**
  - Declarative, fits the NixOS philosophy
  - Integrates with nixos-anywhere out of the box
  - Supports complex layouts (RAID, LVM, LUKS, ZFS, btrfs)
  - Idempotent (can reformat or verify an existing layout)
- **Cons:**
  - Nix-based DSL (learning curve)
  - Limited to Linux filesystems (no Windows support, not relevant here)

**Kickstart/Preseed (Rejected):**

- Used by Fedora/Debian installers
- Not NixOS-native; would require custom integration

**Terraform with Libvirt (Rejected):**

- Good for VMs, not bare metal
- Doesn't handle disk partitioning directly

**Decision:** disko is the clear choice for NixOS deployments.

### 6.3 Boot Methods

**iPXE over TFTP/HTTP (Chosen):**

- **Pros:**
  - Universal support (BIOS + UEFI)
  - Flexible scripting (boot menus, conditional logic)
  - HTTP support for fast downloads
  - Open source, widely deployed
- **Cons:**
  - Requires DHCP configuration (Option 66/67 setup)
  - Chainloading adds complexity (but it is a solved problem)

**UEFI HTTP Boot (Rejected):**

- **Pros:**
  - Native UEFI, no TFTP needed
  - Simpler DHCP config (just Option 60/67)
- **Cons:**
  - UEFI only (no BIOS support)
  - Firmware support inconsistent (pre-2015 servers)
  - Less flexible than iPXE scripting

**Preboot USB (Rejected):**

- Manual, not scalable for fleet deployment
- Useful for one-off installs only

**Decision:** iPXE for flexibility and compatibility. UEFI HTTP Boot could be considered later for pure UEFI fleets.
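Tying the chosen tools together, a hypothetical skeleton of the provisioning repo's `flake.nix` is sketched below. The input names and node attribute are assumptions that mirror the `/srv/provisioning#$NODE_HOSTNAME` reference in Section 4.1 (`yourorg` is a placeholder):

```nix
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.11";
    disko.url = "github:nix-community/disko";
    plasmacloud.url = "github:yourorg/plasmacloud";
  };

  outputs = { self, nixpkgs, disko, plasmacloud, ... }: {
    # One attribute per node; nixos-anywhere is pointed at
    # "/srv/provisioning#node01.example.com"
    nixosConfigurations."node01.example.com" = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [
        disko.nixosModules.disko
        plasmacloud.nixosModules.default
        ./nodes/node01.example.com/configuration.nix
      ];
    };
  };
}
```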
### 6.4 Configuration Management

**NixOS Flakes (Chosen):**

- **Pros:**
  - Native to NixOS, declarative
  - Reproducible builds with lock files
  - Git-based, version controlled
  - No external agent needed (systemd handles state)
- **Cons:**
  - Steep learning curve for operators unfamiliar with Nix
  - Less dynamic than Ansible (changes require a rebuild)

**Ansible (Rejected for Provisioning, Useful for Orchestration):**

- **Pros:**
  - Agentless, SSH-based
  - Large ecosystem of modules
  - Dynamic; easy to patch running systems
- **Cons:**
  - Imperative (harder to guarantee state)
  - Doesn't integrate with NixOS packages/modules
  - Adds another tool to the stack

**Terraform (Rejected):**

- Infrastructure-as-code, not config management
- Better suited to cloud VMs than bare metal

**Decision:** Use NixOS flakes for provisioning and base config. Ansible may be added later for operational tasks (e.g., rolling updates, health checks) that don't fit NixOS's declarative model.

## 7. Open Questions / Decisions Needed

### 7.1 Hardware Inventory Management

**Question:** How do we map MAC addresses to node roles and configurations?

**Options:**

1. **Manual Inventory File:** Operator maintains a JSON/YAML file with MAC → hostname → config mapping
2. **Auto-Discovery:** First boot prompts the operator to assign a role (e.g., via serial console or web UI)
3. **External CMDB:** Integrate with an existing Configuration Management Database (e.g., NetBox, Nautobot)

**Recommendation:** Start with a manual inventory file (simple); migrate to CMDB integration in Phase 2.

### 7.2 Secrets Management

**Question:** How are secrets (TLS keys, passwords) generated, stored, and rotated?

**Options:**

1. **File-Based (Current):** Secrets in `/srv/provisioning/nodes/*/secrets/`, copied during install
2. **Vault Integration:** Fetch secrets from HashiCorp Vault at boot time
3. **systemd Credentials:** Use systemd's encrypted credentials feature (requires systemd 250+)

**Recommendation:** Phase 1 uses file-based secrets (simple, works today).
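The Phase 1 file-based flow could be sketched as follows. The directory layout follows the `/srv/provisioning/nodes/*/secrets/` convention above and the `IAM_INITIAL_ADMIN_PASSWORD` env file from Section 5; the helper names are illustrative:

```bash
#!/usr/bin/env bash
# Sketch: generate per-node file-based secrets during provisioning (Phase 1).
set -euo pipefail

gen_secret() {
  # 32 random bytes, base64-encoded, stripped of newlines and padding
  openssl rand -base64 32 | tr -d '\n='
}

write_node_secrets() {
  local node=$1 root=${2:-/srv/provisioning/nodes}
  local dir="${root}/${node}/secrets"
  umask 077                      # secrets readable only by the owner
  mkdir -p "$dir"
  printf 'IAM_INITIAL_ADMIN_PASSWORD=%s\n' "$(gen_secret)" > "${dir}/iam.env"
}

# Usage: write_node_secrets node01
# writes /srv/provisioning/nodes/node01/secrets/iam.env
```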
Phase 2 adds Vault for production (centralized, auditable, rotation support).

### 7.3 Network Boot Security

**Question:** How do we prevent rogue nodes from joining the cluster?

**Concerns:**

- An attacker boots an unauthorized server on the network
- The installer has an SSH key and could be accessed
- A node joins the cluster with malicious intent

**Mitigations:**

1. **MAC Whitelist:** DHCP only serves known MAC addresses
2. **Network Segmentation:** PXE boot on an isolated provisioning VLAN
3. **SSH Key Per Node:** Each node has unique authorized_keys in the netboot image (complex)
4. **Cluster Authentication:** Raft join requires a cluster token (not yet implemented)

**Recommendation:** Use a MAC whitelist + provisioning VLAN for Phase 1. Add cluster join tokens in Phase 2 (requires Chainfire/FlareDB changes).

### 7.4 Multi-Datacenter Deployment

**Question:** How does provisioning work across geographically distributed datacenters?

**Challenges:**

- WAN latency for Nix cache fetches
- PXE boot requires local DHCP/TFTP
- Cluster join across the WAN (Raft latency)

**Options:**

1. **Replicated Provisioning Server:** Deploy a boot server in each datacenter, sync configs
2. **Central Provisioning with Local Cache:** Single source of truth; local Nix cache mirrors
3. **Per-DC Clusters:** Each datacenter is an independent cluster, federated at the application layer

**Recommendation:** Defer to Phase 2. Phase 1 assumes a single datacenter or a low-latency LAN.

### 7.5 Disk Encryption

**Question:** Should disks be encrypted at rest?

**Trade-offs:**

- **Pros:** Compliance (GDPR, PCI-DSS), protection against physical theft
- **Cons:** Key management complexity, no unattended reboot (manual unlock), performance overhead (~5-10%)

**Options:**

1. **No Encryption:** Rely on physical security
2. **LUKS with Network Unlock:** Tang/Clevis for automated unlocking (requires network at boot)
3. **LUKS with Manual Unlock:** Operator enters a passphrase via KVM/IPMI

**Recommendation:** Optional, configurable per deployment.
Provide a disko template for LUKS and let the operator decide.

### 7.6 Rolling Updates

**Question:** How do we update a running cluster without downtime?

**Challenges:**

- Raft requires quorum (can't update a majority simultaneously)
- Service dependencies (Chainfire → FlareDB → others)
- A NixOS rebuild requires a reboot (for kernel/init changes)

**Strategy:**

1. Update one node at a time (rolling)
2. Verify health before proceeding to the next
3. Use `nixos-rebuild test` first (activates without a bootloader change), then `switch` after validation

**Tooling:**

- Ansible playbook for orchestration
- Health check scripts (curl endpoints + check Raft status)
- Rollback plan (NixOS generations + Raft snapshot restore)

**Recommendation:** Document as a runbook in Phase 1; implement automated rolling updates in Phase 2 (T033?).

### 7.7 Monitoring and Alerting

**Question:** How do we monitor provisioning success/failure?

**Options:**

1. **Manual:** Operator watches the terminal and checks health endpoints
2. **Log Aggregation:** Collect installer logs, index in Loki/Elasticsearch
3. **Event Webhook:** Installer posts events to a monitoring system (Grafana, PagerDuty)

**Recommendation:** Phase 1 uses manual monitoring. Phase 2 adds structured logging + webhooks for fleet deployments.

### 7.8 Compatibility with Existing Infrastructure

**Question:** Can this provisioning system coexist with existing PXE infrastructure (e.g., for other OS deployments)?

**Concerns:**

- Existing DHCP config may conflict
- The TFTP server may serve other boot files
- The network team may control PXE infrastructure

**Solutions:**

1. **Dedicated Provisioning VLAN:** PlasmaCloud nodes on a separate network
2. **Conditional DHCP:** Use vendor-class or subnet matching to route to the correct boot server
3. **Multi-Boot Menu:** iPXE menu includes options for PlasmaCloud and other OSes

**Recommendation:** Document network requirements and provide example DHCP configs for common scenarios (dedicated VLAN, shared infrastructure).
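One such example, sketched for dnsmasq (MAC addresses, IPs, and the boot-server hostname are placeholders), combines the MAC whitelist from Section 7.3 with the architecture- and user-class-based boot file selection from Section 2.2:

```
# Sketch: dnsmasq fragment serving PXE only to known PlasmaCloud nodes.

# Whitelist: tag known MACs and ignore requests from unknown hosts.
dhcp-host=aa:bb:cc:dd:ee:01,set:plasmacloud,10.0.10.101
dhcp-host=aa:bb:cc:dd:ee:02,set:plasmacloud,10.0.10.102
dhcp-ignore=tag:!known

# First stage: pick the bootloader by client architecture (Option 93).
dhcp-match=set:efi-x64,option:client-arch,7
dhcp-boot=tag:plasmacloud,tag:!efi-x64,undionly.kpxe
dhcp-boot=tag:plasmacloud,tag:efi-x64,ipxe.efi

# Second stage: iPXE identifies itself via user-class and gets the script URL.
dhcp-userclass=set:ipxe,iPXE
dhcp-boot=tag:plasmacloud,tag:ipxe,http://boot.server/boot.ipxe
```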
Coordinate with the network team.

---

## Appendices

### A. Example Disko Configuration

**Single Disk with GPT and ext4:**

```nix
# nodes/node01/disko.nix
{ disks ? [ "/dev/sda" ], ... }: {
  disko.devices = {
    disk = {
      main = {
        type = "disk";
        device = builtins.head disks;
        content = {
          type = "gpt";
          partitions = {
            ESP = {
              size = "512M";
              type = "EF00";
              content = {
                type = "filesystem";
                format = "vfat";
                mountpoint = "/boot";
              };
            };
            root = {
              size = "100%";
              content = {
                type = "filesystem";
                format = "ext4";
                mountpoint = "/";
              };
            };
          };
        };
      };
    };
  };
}
```

**RAID1 with LUKS Encryption:**

```nix
{ disks ? [ "/dev/sda" "/dev/sdb" ], ... }: {
  disko.devices = {
    disk = {
      disk1 = {
        device = builtins.elemAt disks 0;
        type = "disk";
        content = {
          type = "gpt";
          partitions = {
            boot = {
              size = "1M";
              type = "EF02"; # BIOS boot
            };
            mdraid = {
              size = "100%";
              content = {
                type = "mdraid";
                name = "raid1";
              };
            };
          };
        };
      };
      disk2 = {
        device = builtins.elemAt disks 1;
        type = "disk";
        content = {
          type = "gpt";
          partitions = {
            boot = {
              size = "1M";
              type = "EF02";
            };
            mdraid = {
              size = "100%";
              content = {
                type = "mdraid";
                name = "raid1";
              };
            };
          };
        };
      };
    };
    mdadm = {
      raid1 = {
        type = "mdadm";
        level = 1;
        content = {
          type = "luks";
          name = "cryptroot";
          settings.allowDiscards = true;
          content = {
            type = "filesystem";
            format = "ext4";
            mountpoint = "/";
          };
        };
      };
    };
  };
}
```

### B. Complete nixos-anywhere Command Examples

**Basic Deployment:**

```bash
nix run github:nix-community/nixos-anywhere -- \
  --flake .#node01 \
  root@10.0.0.100
```

**With Build on Remote (Slow Local Machine):**

```bash
nix run github:nix-community/nixos-anywhere -- \
  --flake .#node01 \
  --build-on-remote \
  root@10.0.0.100
```

**With Disk Encryption Key:**

```bash
nix run github:nix-community/nixos-anywhere -- \
  --flake .#node01 \
  --disk-encryption-keys /tmp/luks.key <(cat /secrets/node01-luks.key) \
  root@10.0.0.100
```

**Debug Mode (Keep Installer After Failure):**

```bash
nix run github:nix-community/nixos-anywhere -- \
  --flake .#node01 \
  --debug \
  --no-reboot \
  root@10.0.0.100
```

### C. Provisioning Server Setup Script

```bash
#!/bin/bash
# /srv/provisioning/scripts/setup-provisioning-server.sh
set -euo pipefail

# Install dependencies
apt-get update
apt-get install -y nginx tftpd-hpa dnsmasq curl

# Configure TFTP (values follow the standard tftpd-hpa defaults file)
cat > /etc/default/tftpd-hpa <<'EOF'
TFTP_USERNAME="tftp"
TFTP_DIRECTORY="/srv/provisioning/tftp"
TFTP_ADDRESS=":69"
TFTP_OPTIONS="--secure"
EOF

# Configure nginx to serve boot scripts and images over HTTP
cat > /etc/nginx/sites-available/pxe <<'EOF'
```
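The health-check gate referenced in the rolling-update tooling (Section 7.6) could be sketched as follows. The endpoint URLs, port, and Raft status JSON shape are assumptions, not fixed service APIs:

```bash
#!/usr/bin/env bash
# Sketch: per-node health gate run between rolling-update steps.
set -euo pipefail

# Retry a command up to $1 times with $2 seconds between attempts.
retry() {
  local attempts=$1 delay=$2 i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then return 0; fi
    sleep "$delay"
  done
  return 1
}

# Hypothetical health endpoints; adjust to the services' real APIs.
node_healthy() {
  local host=$1
  curl -fsS "https://${host}:8443/healthz" > /dev/null &&
    curl -fsS "https://${host}:8443/raft/status" |
    grep -Eq '"state":"(leader|follower)"'
}

# Usage during a rolling update (abort if the node never becomes healthy):
#   retry 30 10 node_healthy node01 || { echo "node01 unhealthy" >&2; exit 1; }
```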