Bare-Metal Provisioning Operator Runbook
Document Version: 1.0
Last Updated: 2025-12-10
Status: Production Ready
Author: PlasmaCloud Infrastructure Team
1. Overview
1.1 What This Runbook Covers
This runbook provides comprehensive, step-by-step instructions for deploying PlasmaCloud infrastructure on bare-metal servers using automated PXE-based provisioning. By following this guide, operators will be able to:
- Deploy a complete PlasmaCloud cluster from bare hardware to running services
- Bootstrap a 3-node Raft cluster (Chainfire + FlareDB)
- Add additional nodes to an existing cluster
- Validate cluster health and troubleshoot common issues
- Perform operational tasks (updates, maintenance, recovery)
1.2 Prerequisites
Required Access and Permissions:
- Root/sudo access on provisioning server
- Physical or IPMI/BMC access to bare-metal servers
- Network access to provisioning VLAN
- SSH key pair for nixos-anywhere
Required Tools:
- NixOS with flakes enabled (provisioning workstation)
- curl, jq, ssh client
- ipmitool (optional, for remote management)
- Serial console access tool (optional)
Required Knowledge:
- Basic understanding of PXE boot process
- Linux system administration
- Network configuration (DHCP, DNS, firewall)
- NixOS basics (declarative configuration, flakes)
1.3 Architecture Diagram
┌─────────────────────────────────────────────────────────────────────────┐
│ Bare-Metal Provisioning Flow │
└─────────────────────────────────────────────────────────────────────────┘
Phase 1: PXE Boot Phase 2: Installation
┌──────────────┐ ┌──────────────────┐
│ Bare-Metal │ 1. DHCP Request │ DHCP Server │
│ Server ├─────────────────>│ (PXE Server) │
│ │ └──────────────────┘
│ (powered │ 2. TFTP Get │
│ on, PXE │ bootloader │
│ enabled) │<───────────────────────────┘
│ │
│ 3. iPXE │ 4. HTTP Get ┌──────────────────┐
│ loads │ boot.ipxe │ HTTP Server │
│ ├──────────────────>│ (nginx) │
│ │ └──────────────────┘
│ 5. iPXE │ 6. HTTP Get │
│ menu │ kernel+initrd │
│ │<───────────────────────────┘
│ │
│ 7. Boot │
│ NixOS │
│ Installer│
└──────┬───────┘
│
│ 8. SSH Connection ┌──────────────────┐
└───────────────────────────>│ Provisioning │
│ Workstation │
│ │
│ 9. Run │
│ nixos- │
│ anywhere │
└──────┬───────────┘
│
┌────────────────────┴────────────────────┐
│ │
v v
┌──────────────────────────┐ ┌──────────────────────────┐
│ 10. Partition disks │ │ 11. Install NixOS │
│ (disko) │ │ - Build system │
│ - GPT/LVM/LUKS │ │ - Copy closures │
│ - Format filesystems │ │ - Install bootloader│
│ - Mount /mnt │ │ - Inject secrets │
└──────────────────────────┘ └──────────────────────────┘
Phase 3: First Boot Phase 4: Running Cluster
┌──────────────┐ ┌──────────────────┐
│ Bare-Metal │ 12. Reboot │ NixOS System │
│ Server │ ────────────> │ (from disk) │
└──────────────┘ └──────────────────┘
│
┌───────────────────┴────────────────────┐
│ 13. First-boot automation │
│ - Chainfire cluster join/bootstrap │
│ - FlareDB cluster join/bootstrap │
│ - IAM initialization │
│ - Health checks │
└───────────────────┬────────────────────┘
│
v
┌──────────────────┐
│ Running Cluster │
│ - All services │
│ healthy │
│ - Raft quorum │
│ - TLS enabled │
└──────────────────┘
2. Hardware Requirements
2.1 Minimum Specifications Per Node
Control Plane Nodes (3-5 recommended):
- CPU: 8 cores / 16 threads (Intel Xeon or AMD EPYC)
- RAM: 32 GB DDR4 ECC
- Storage: 500 GB SSD (NVMe preferred)
- Network: 2x 10 GbE (bonded/redundant)
- BMC: IPMI 2.0 or Redfish compatible
Worker Nodes:
- CPU: 16+ cores / 32+ threads
- RAM: 64 GB+ DDR4 ECC
- Storage: 1 TB+ NVMe SSD
- Network: 2x 10 GbE or 2x 25 GbE
- BMC: IPMI 2.0 or Redfish compatible
All-in-One (Development/Testing):
- CPU: 16 cores / 32 threads
- RAM: 64 GB DDR4
- Storage: 1 TB SSD
- Network: 1x 10 GbE (minimum)
- BMC: Optional but recommended
2.2 Recommended Production Specifications
Control Plane Nodes:
- CPU: 16-32 cores (Intel Xeon Gold/Platinum or AMD EPYC)
- RAM: 64-128 GB DDR4 ECC
- Storage: 1-2 TB NVMe SSD (RAID1 for redundancy)
- Network: 2x 25 GbE (active/active bonding)
- BMC: Redfish with SOL (Serial-over-LAN)
Worker Nodes:
- CPU: 32-64 cores
- RAM: 128-256 GB DDR4 ECC
- Storage: 2-4 TB NVMe SSD
- Network: 2x 25 GbE or 2x 100 GbE
- GPU: Optional (NVIDIA/AMD for ML workloads)
2.3 Hardware Compatibility Matrix
| Vendor | Model | Tested | BIOS | UEFI | Notes |
|---|---|---|---|---|---|
| Dell | PowerEdge R640 | Yes | Yes | Yes | Requires BIOS A19+ |
| Dell | PowerEdge R650 | Yes | Yes | Yes | Best PXE compatibility |
| HPE | ProLiant DL360 | Yes | Yes | Yes | Disable Secure Boot |
| HPE | ProLiant DL380 | Yes | Yes | Yes | Latest firmware recommended |
| Supermicro | SYS-2029U | Yes | Yes | Yes | Requires BMC 1.73+ |
| Lenovo | ThinkSystem | Partial | Yes | Yes | Some NIC issues on older models |
| Generic | Whitebox x86 | Partial | Yes | Maybe | UEFI support varies |
2.4 BIOS/UEFI Settings
Required Settings:
- Boot Mode: UEFI (preferred) or Legacy BIOS
- PXE/Network Boot: Enabled on primary NIC
- Boot Order: Network → Disk
- Secure Boot: Disabled (for PXE boot)
- Virtualization: Enabled (VT-x/AMD-V)
- SR-IOV: Enabled (if using advanced networking)
Dell-Specific (iDRAC):
System BIOS → Boot Settings:
Boot Mode: UEFI
UEFI Network Stack: Enabled
PXE Device 1: Integrated NIC 1
System BIOS → System Profile:
Profile: Performance
HPE-Specific (iLO):
System Configuration → BIOS/Platform:
Boot Mode: UEFI Mode
Network Boot: Enabled
PXE Support: UEFI Only
System Configuration → UEFI Boot Order:
1. Network Adapter (NIC 1)
2. Hard Disk
Supermicro-Specific (IPMI):
BIOS Setup → Boot:
Boot mode select: UEFI
UEFI Network Stack: Enabled
Boot Option #1: UEFI Network
BIOS Setup → Advanced → CPU Configuration:
Intel Virtualization Technology: Enabled
2.5 BMC/IPMI Requirements
Mandatory Features:
- Remote power control (on/off/reset)
- Boot device selection (PXE/disk)
- Remote console access (KVM-over-IP or SOL)
Recommended Features:
- Virtual media mounting
- Sensor monitoring (temperature, fans, PSU)
- Event logging
- SMTP alerting
Network Configuration:
- Dedicated BMC network (separate VLAN recommended)
- Static IP or DHCP reservation
- HTTPS access enabled
- Default credentials changed
3. Network Setup
3.1 Network Topology
Single-Segment Topology (Simple):
┌─────────────────────────────────────────────────────┐
│ Provisioning Server PXE/DHCP/HTTP │
│ 10.0.100.10 │
└──────────────┬──────────────────────────────────────┘
│
│ Layer 2 Switch (unmanaged)
│
┌──────────┴──────────┬─────────────┐
│ │ │
┌───┴────┐ ┌────┴─────┐ ┌───┴────┐
│ Node01 │ │ Node02 │ │ Node03 │
│10.0.100│ │ 10.0.100 │ │10.0.100│
│ .50 │ │ .51 │ │ .52 │
└────────┘ └──────────┘ └────────┘
Multi-VLAN Topology (Production):
┌──────────────────────────────────────────────────────┐
│ Management Network (VLAN 10) │
│ - Provisioning Server: 10.0.10.10 │
│ - BMC/IPMI: 10.0.10.50-99 │
└──────────────────┬───────────────────────────────────┘
│
┌──────────────────┴───────────────────────────────────┐
│ Provisioning Network (VLAN 100) │
│ - PXE Boot: 10.0.100.0/24 │
│ - DHCP Range: 10.0.100.100-200 │
└──────────────────┬───────────────────────────────────┘
│
┌──────────────────┴───────────────────────────────────┐
│ Production Network (VLAN 200) │
│ - Static IPs: 10.0.200.10-99 │
│ - Service Traffic │
└──────────────────┬───────────────────────────────────┘
│
┌────────┴────────┐
│ L3 Switch │
│ (VLANs, Routing)│
└────────┬─────────┘
│
┌───────────┴──────────┬─────────┐
│ │ │
┌────┴────┐ ┌────┴────┐ │
│ Node01 │ │ Node02 │ │...
│ eth0: │ │ eth0: │
│ VLAN100│ │ VLAN100│
│ eth1: │ │ eth1: │
│ VLAN200│ │ VLAN200│
└─────────┘ └─────────┘
3.2 DHCP Server Configuration
ISC DHCP Configuration (/etc/dhcp/dhcpd.conf):
# Global options
option architecture-type code 93 = unsigned integer 16;
default-lease-time 600;
max-lease-time 7200;
authoritative;
# Provisioning subnet
subnet 10.0.100.0 netmask 255.255.255.0 {
range 10.0.100.100 10.0.100.200;
option routers 10.0.100.1;
option domain-name-servers 10.0.100.1, 8.8.8.8;
option domain-name "prov.example.com";
# PXE boot server
next-server 10.0.100.10;
# Architecture-specific boot file selection
if exists user-class and option user-class = "iPXE" {
# iPXE already loaded, provide boot script via HTTP
filename "http://10.0.100.10:8080/boot/ipxe/boot.ipxe";
} elsif option architecture-type = 00:00 {
# BIOS (legacy) - load iPXE via TFTP
filename "undionly.kpxe";
} elsif option architecture-type = 00:07 {
# UEFI x86_64 - load iPXE via TFTP
filename "ipxe.efi";
} elsif option architecture-type = 00:09 {
# UEFI x86_64 (alternate) - load iPXE via TFTP
filename "ipxe.efi";
} else {
# Fallback to UEFI
filename "ipxe.efi";
}
}
# Static reservations for control plane nodes
host node01 {
hardware ethernet 52:54:00:12:34:56;
fixed-address 10.0.100.50;
option host-name "node01";
}
host node02 {
hardware ethernet 52:54:00:12:34:57;
fixed-address 10.0.100.51;
option host-name "node02";
}
host node03 {
hardware ethernet 52:54:00:12:34:58;
fixed-address 10.0.100.52;
option host-name "node03";
}
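Hand-maintaining host stanzas like the three above gets error-prone as the fleet grows. The sketch below generates them from a simple "mac hostname ip" inventory; the inventory format and `emit_host` helper are illustrative, not part of any existing tooling.

```shell
# Sketch: emit ISC dhcpd host stanzas from "mac hostname ip" inventory lines.
emit_host() {
  local mac=$1 host=$2 ip=$3
  printf 'host %s {\n' "$host"
  printf '  hardware ethernet %s;\n' "$mac"
  printf '  fixed-address %s;\n' "$ip"
  printf '  option host-name "%s";\n' "$host"
  printf '}\n'
}

# Generate stanzas for the three control-plane nodes documented above
while read -r mac host ip; do
  emit_host "$mac" "$host" "$ip"
done <<'EOF'
52:54:00:12:34:56 node01 10.0.100.50
52:54:00:12:34:57 node02 10.0.100.51
52:54:00:12:34:58 node03 10.0.100.52
EOF
```

Redirect the output into an include file and reference it from dhcpd.conf, then re-run `dhcpd -t` before reloading.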
Validation Commands:
# Test DHCP configuration syntax
sudo dhcpd -t -cf /etc/dhcp/dhcpd.conf
# Start DHCP server
sudo systemctl start isc-dhcp-server
sudo systemctl enable isc-dhcp-server
# Monitor DHCP leases
sudo tail -f /var/lib/dhcp/dhcpd.leases
# Test DHCP response
sudo nmap --script broadcast-dhcp-discover -e eth0
3.3 DNS Requirements
Forward DNS Zone (example.com):
; Control plane nodes
node01.example.com. IN A 10.0.200.10
node02.example.com. IN A 10.0.200.11
node03.example.com. IN A 10.0.200.12
; Worker nodes
worker01.example.com. IN A 10.0.200.20
worker02.example.com. IN A 10.0.200.21
; Service VIPs (optional, for load balancing)
chainfire.example.com. IN A 10.0.200.100
flaredb.example.com. IN A 10.0.200.101
iam.example.com. IN A 10.0.200.102
Reverse DNS Zone (200.0.10.in-addr.arpa):
; Control plane nodes
10.200.0.10.in-addr.arpa. IN PTR node01.example.com.
11.200.0.10.in-addr.arpa. IN PTR node02.example.com.
12.200.0.10.in-addr.arpa. IN PTR node03.example.com.
Validation:
# Test forward resolution
dig +short node01.example.com
# Test reverse resolution
dig +short -x 10.0.200.10
# Test from target node after provisioning
ssh root@10.0.100.50 'hostname -f'
3.4 Firewall Rules
Service Port Matrix (see NETWORK.md for complete reference):
| Service | API Port | Raft Port | Additional | Protocol |
|---|---|---|---|---|
| Chainfire | 2379 | 2380 | 2381 (gossip) | TCP |
| FlareDB | 2479 | 2480 | - | TCP |
| IAM | 8080 | - | - | TCP |
| PlasmaVMC | 9090 | - | - | TCP |
| NovaNET | 9091 | - | - | TCP |
| FlashDNS | 53 | - | - | TCP/UDP |
| FiberLB | 9092 | - | - | TCP |
| K8sHost | 10250 | - | - | TCP |
iptables Rules (Provisioning Server):
#!/bin/bash
# Provisioning server firewall rules
# Allow DHCP
iptables -A INPUT -p udp --dport 67 -j ACCEPT
iptables -A INPUT -p udp --dport 68 -j ACCEPT
# Allow TFTP
iptables -A INPUT -p udp --dport 69 -j ACCEPT
# Allow HTTP (boot server)
iptables -A INPUT -p tcp --dport 80 -j ACCEPT
iptables -A INPUT -p tcp --dport 8080 -j ACCEPT
# Allow SSH (for nixos-anywhere)
iptables -A INPUT -p tcp --dport 22 -j ACCEPT
iptables Rules (Cluster Nodes):
#!/bin/bash
# Cluster node firewall rules
# Allow established/related traffic and loopback first; without these,
# the final DROP rule breaks replies to outbound connections
iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -i lo -j ACCEPT
# Allow SSH (management)
iptables -A INPUT -p tcp --dport 22 -s 10.0.0.0/8 -j ACCEPT
# Allow Chainfire (from cluster subnet only)
iptables -A INPUT -p tcp --dport 2379 -s 10.0.200.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 2380 -s 10.0.200.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 2381 -s 10.0.200.0/24 -j ACCEPT
# Allow FlareDB
iptables -A INPUT -p tcp --dport 2479 -s 10.0.200.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 2480 -s 10.0.200.0/24 -j ACCEPT
# Allow IAM (from cluster and client subnets)
iptables -A INPUT -p tcp --dport 8080 -s 10.0.0.0/8 -j ACCEPT
# Drop all other traffic
iptables -A INPUT -j DROP
nftables Rules (Modern Alternative):
#!/usr/sbin/nft -f
flush ruleset
table inet filter {
chain input {
type filter hook input priority 0; policy drop;
# Allow established connections
ct state established,related accept
# Allow loopback
iif lo accept
# Allow SSH
tcp dport 22 ip saddr 10.0.0.0/8 accept
# Allow cluster services from cluster subnet
tcp dport { 2379, 2380, 2381, 2479, 2480 } ip saddr 10.0.200.0/24 accept
# Allow IAM from internal network
tcp dport 8080 ip saddr 10.0.0.0/8 accept
}
}
3.5 Static IP Allocation Strategy
IP Allocation Plan:
10.0.100.0/24 - Provisioning network (DHCP during install)
.1 - Gateway
.10 - PXE/DHCP/HTTP server
.50-.79 - Control plane nodes (static reservations)
.80-.99 - Worker nodes (static reservations)
.100-.200 - DHCP pool (temporary during provisioning)
10.0.200.0/24 - Production network (static IPs)
.1 - Gateway
.10-.19 - Control plane nodes
.20-.99 - Worker nodes
.100-.199 - Service VIPs
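The allocation plan above is deterministic: control-plane node N gets 10.0.100.(49+N) on the provisioning network and 10.0.200.(9+N) in production. A minimal sketch of that mapping (the helper names are illustrative, not part of any existing tooling):

```shell
# Control-plane node N -> provisioning and production addresses,
# following the allocation plan above.
prov_ip() { echo "10.0.100.$((49 + $1))"; }
prod_ip() { echo "10.0.200.$((9 + $1))"; }

prov_ip 1   # node01 -> 10.0.100.50
prod_ip 3   # node03 -> 10.0.200.12
```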
3.6 Network Bandwidth Requirements
Per-Node During Provisioning:
- PXE boot: ~200-500 MB (kernel + initrd)
- nixos-anywhere: ~1-5 GB (NixOS closures)
- Time: 5-15 minutes on 1 Gbps link
Production Cluster:
- Control plane: 1 Gbps minimum, 10 Gbps recommended
- Workers: 10 Gbps minimum, 25 Gbps recommended
- Inter-node latency: <1ms ideal, <5ms acceptable
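As a sanity check on the figures above, raw wire-transfer time can be estimated from payload size and link speed. This ignores Nix build time and protocol overhead, which is why real provisioning takes the quoted 5-15 minutes rather than seconds:

```shell
# Rough wire-transfer time in seconds: megabytes * 8 bits / (Gbps * 1000 Mbps)
transfer_seconds() {
  local mb=$1 gbps=$2
  echo $(( mb * 8 / (gbps * 1000) ))
}

transfer_seconds 500 1    # ~4 s for a 500 MB netboot image on 1 Gbps
transfer_seconds 5000 1   # ~40 s for 5 GB of closures on 1 Gbps
```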
4. Pre-Deployment Checklist
Complete this checklist before beginning deployment:
4.1 Hardware Checklist
- All servers racked and powered
- All network cables connected (data + BMC)
- All power supplies connected (redundant if available)
- BMC/IPMI network configured
- BMC credentials documented
- BIOS/UEFI settings configured per section 2.4
- PXE boot enabled and first in boot order
- Secure Boot disabled (if using UEFI)
- Hardware inventory recorded (MAC addresses, serial numbers)
4.2 Network Checklist
- Network switches configured (VLANs, trunking)
- DHCP server configured and tested
- DNS forward/reverse zones created
- Firewall rules configured
- Network connectivity verified (ping tests)
- Bandwidth validated (iperf between nodes)
- DHCP relay configured (if multi-subnet)
- NTP server configured for time sync
4.3 PXE Server Checklist
- PXE server deployed (see T032.S2)
- DHCP service running and healthy
- TFTP service running and healthy
- HTTP service running and healthy
- iPXE bootloaders downloaded (undionly.kpxe, ipxe.efi)
- NixOS netboot images built and uploaded (see T032.S3)
- Boot script configured (boot.ipxe)
- Health endpoints responding
Validation:
# On PXE server
sudo systemctl status isc-dhcp-server
sudo systemctl status atftpd
sudo systemctl status nginx
# Test HTTP access
curl http://10.0.100.10:8080/boot/ipxe/boot.ipxe
curl http://10.0.100.10:8080/health
# Test TFTP access
tftp 10.0.100.10 -c get undionly.kpxe /tmp/test.kpxe
4.4 Node Configuration Checklist
- Per-node NixOS configurations created (/srv/provisioning/nodes/)
- Hardware configurations generated or templated
- Disko disk layouts defined
- Network settings configured (static IPs, VLANs)
- Service selections defined (control-plane vs worker)
- Cluster configuration JSON files created
- Node inventory documented (MAC → hostname → role)
4.5 TLS Certificates Checklist
- CA certificate generated
- Per-node certificates generated
- Certificate files copied to secrets directories
- Certificate permissions set (0400 for private keys)
- Certificate expiry dates documented
- Rotation procedure documented
Generate Certificates:
# Generate CA (if not already done)
openssl genrsa -out ca-key.pem 4096
openssl req -x509 -new -nodes -key ca-key.pem -days 3650 \
-out ca-cert.pem -subj "/CN=PlasmaCloud CA"
# Generate per-node certificate with a SAN (modern TLS clients reject CN-only certs)
# Note: -copy_extensions requires OpenSSL 3.0+; -addext requires OpenSSL 1.1.1+
for node in node01 node02 node03; do
openssl genrsa -out ${node}-key.pem 4096
openssl req -new -key ${node}-key.pem -out ${node}-csr.pem \
-subj "/CN=${node}.example.com" \
-addext "subjectAltName=DNS:${node}.example.com"
openssl x509 -req -in ${node}-csr.pem -CA ca-cert.pem -CAkey ca-key.pem \
-CAcreateserial -copy_extensions copy -out ${node}-cert.pem -days 365
done
4.6 Provisioning Workstation Checklist
- NixOS or Nix package manager installed
- Nix flakes enabled
- SSH key pair generated for provisioning
- SSH public key added to netboot images
- Network access to provisioning VLAN
- Git repository cloned (if using version control)
- nixos-anywhere installed:
nix profile install github:nix-community/nixos-anywhere
5. Deployment Workflow
5.1 Phase 1: PXE Server Setup
Reference: See /home/centra/cloud/chainfire/baremetal/pxe-server/ (T032.S2)
Step 1.1: Deploy PXE Server Using NixOS Module
Create PXE server configuration:
# /etc/nixos/pxe-server.nix
{ config, pkgs, lib, ... }:
{
imports = [
/path/to/chainfire/baremetal/pxe-server/nixos-module.nix
];
services.centra-pxe-server = {
enable = true;
interface = "eth0";
serverAddress = "10.0.100.10";
dhcp = {
subnet = "10.0.100.0";
netmask = "255.255.255.0";
broadcast = "10.0.100.255";
range = {
start = "10.0.100.100";
end = "10.0.100.200";
};
router = "10.0.100.1";
domainNameServers = [ "10.0.100.1" "8.8.8.8" ];
};
nodes = {
"52:54:00:12:34:56" = {
profile = "control-plane";
hostname = "node01";
ipAddress = "10.0.100.50";
};
"52:54:00:12:34:57" = {
profile = "control-plane";
hostname = "node02";
ipAddress = "10.0.100.51";
};
"52:54:00:12:34:58" = {
profile = "control-plane";
hostname = "node03";
ipAddress = "10.0.100.52";
};
};
};
}
Apply configuration:
sudo nixos-rebuild switch -I nixos-config=/etc/nixos/pxe-server.nix
Step 1.2: Verify PXE Services
# Check all services are running
sudo systemctl status dhcpd4.service
sudo systemctl status atftpd.service
sudo systemctl status nginx.service
# Test DHCP server
sudo journalctl -u dhcpd4 -f &
# Power on a test server and watch for DHCP requests
# Test TFTP server
tftp localhost -c get undionly.kpxe /tmp/test.kpxe
ls -lh /tmp/test.kpxe # Should show ~100KB file
# Test HTTP server
curl http://localhost:8080/health
# Expected: {"status":"healthy","services":{"dhcp":"running","tftp":"running","http":"running"}}
curl http://localhost:8080/boot/ipxe/boot.ipxe
# Expected: iPXE boot script content
5.2 Phase 2: Build Netboot Images
Reference: See /home/centra/cloud/baremetal/image-builder/ (T032.S3)
Step 2.1: Build Images for All Profiles
cd /home/centra/cloud/baremetal/image-builder
# Build all profiles
./build-images.sh
# Or build specific profile
./build-images.sh --profile control-plane
./build-images.sh --profile worker
./build-images.sh --profile all-in-one
Expected Output:
Building netboot image for control-plane...
Building initrd...
[... Nix build output ...]
✓ Build complete: artifacts/control-plane/initrd (234 MB)
✓ Build complete: artifacts/control-plane/bzImage (12 MB)
Step 2.2: Copy Images to PXE Server
# Automatic (if PXE server directory exists)
./build-images.sh --deploy
# Manual copy
sudo cp artifacts/control-plane/* /var/lib/pxe-boot/nixos/control-plane/
sudo cp artifacts/worker/* /var/lib/pxe-boot/nixos/worker/
sudo cp artifacts/all-in-one/* /var/lib/pxe-boot/nixos/all-in-one/
Step 2.3: Verify Image Integrity
# Check file sizes (should be reasonable)
ls -lh /var/lib/pxe-boot/nixos/*/
# Verify images are accessible via HTTP
curl -I http://10.0.100.10:8080/boot/nixos/control-plane/bzImage
# Expected: HTTP/1.1 200 OK, Content-Length: ~12000000
curl -I http://10.0.100.10:8080/boot/nixos/control-plane/initrd
# Expected: HTTP/1.1 200 OK, Content-Length: ~234000000
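Eyeballing `ls -lh` output can miss a truncated upload. A small sketch that fails fast when an artifact is missing or suspiciously small; the minimum sizes are assumptions derived from the expected sizes above:

```shell
# Fail if an artifact is missing or smaller than a sane minimum.
check_artifact() {
  local path=$1 min_bytes=$2 size
  size=$(stat -c%s "$path" 2>/dev/null) || { echo "MISSING: $path"; return 1; }
  if [ "$size" -lt "$min_bytes" ]; then
    echo "TOO SMALL: $path ($size bytes, expected >= $min_bytes)"
    return 1
  fi
  echo "OK: $path ($size bytes)"
}

# Example usage against the paths from Step 2.2:
# check_artifact /var/lib/pxe-boot/nixos/control-plane/bzImage $((8 * 1024 * 1024))
# check_artifact /var/lib/pxe-boot/nixos/control-plane/initrd  $((100 * 1024 * 1024))
```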
5.3 Phase 3: Prepare Node Configurations
Step 3.1: Generate Node-Specific NixOS Configs
Create directory structure:
mkdir -p /srv/provisioning/nodes/{node01,node02,node03}.example.com/secrets
Node Configuration Template (nodes/node01.example.com/configuration.nix):
{ config, pkgs, lib, ... }:
{
imports = [
../../profiles/control-plane.nix
../../common/base.nix
./hardware.nix
./disko.nix
];
# Hostname and domain
networking = {
hostName = "node01";
domain = "example.com";
usePredictableInterfaceNames = false; # Use eth0, eth1
# Provisioning interface (temporary)
interfaces.eth0 = {
useDHCP = false;
ipv4.addresses = [{
address = "10.0.100.50";
prefixLength = 24;
}];
};
# Production interface
interfaces.eth1 = {
useDHCP = false;
ipv4.addresses = [{
address = "10.0.200.10";
prefixLength = 24;
}];
};
defaultGateway = "10.0.200.1";
nameservers = [ "10.0.200.1" "8.8.8.8" ];
};
# Enable PlasmaCloud services
services.chainfire = {
enable = true;
port = 2379;
raftPort = 2380;
gossipPort = 2381;
settings = {
node_id = "node01";
cluster_name = "prod-cluster";
tls = {
cert_path = "/etc/nixos/secrets/node01-cert.pem";
key_path = "/etc/nixos/secrets/node01-key.pem";
ca_path = "/etc/nixos/secrets/ca-cert.pem";
};
};
};
services.flaredb = {
enable = true;
port = 2479;
raftPort = 2480;
settings = {
node_id = "node01";
cluster_name = "prod-cluster";
chainfire_endpoint = "https://localhost:2379";
tls = {
cert_path = "/etc/nixos/secrets/node01-cert.pem";
key_path = "/etc/nixos/secrets/node01-key.pem";
ca_path = "/etc/nixos/secrets/ca-cert.pem";
};
};
};
services.iam = {
enable = true;
port = 8080;
settings = {
flaredb_endpoint = "https://localhost:2479";
tls = {
cert_path = "/etc/nixos/secrets/node01-cert.pem";
key_path = "/etc/nixos/secrets/node01-key.pem";
ca_path = "/etc/nixos/secrets/ca-cert.pem";
};
};
};
# Enable first-boot automation
services.first-boot-automation = {
enable = true;
configFile = "/etc/nixos/secrets/cluster-config.json";
};
system.stateVersion = "24.11";
}
Step 3.2: Create cluster-config.json for Each Node
Bootstrap Node (node01):
{
"node_id": "node01",
"node_role": "control-plane",
"bootstrap": true,
"cluster_name": "prod-cluster",
"leader_url": "https://node01.example.com:2379",
"raft_addr": "10.0.200.10:2380",
"initial_peers": [
"node01.example.com:2380",
"node02.example.com:2380",
"node03.example.com:2380"
],
"flaredb_peers": [
"node01.example.com:2480",
"node02.example.com:2480",
"node03.example.com:2480"
]
}
Copy to secrets:
cp cluster-config-node01.json /srv/provisioning/nodes/node01.example.com/secrets/cluster-config.json
cp cluster-config-node02.json /srv/provisioning/nodes/node02.example.com/secrets/cluster-config.json
cp cluster-config-node03.json /srv/provisioning/nodes/node03.example.com/secrets/cluster-config.json
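Hand-editing three near-identical JSON files invites drift. Between the per-node configs, only `node_id`, `bootstrap`, and `raft_addr` differ; a plain-shell sketch of generating them (the peer lists from the full example above are omitted for brevity, and the helper name is illustrative):

```shell
# Generate a minimal per-node cluster-config.json; shared fields are fixed,
# varying fields are parameters.
gen_cluster_config() {
  local id=$1 bootstrap=$2 raft_addr=$3
  cat <<EOF
{
  "node_id": "$id",
  "node_role": "control-plane",
  "bootstrap": $bootstrap,
  "cluster_name": "prod-cluster",
  "leader_url": "https://node01.example.com:2379",
  "raft_addr": "$raft_addr"
}
EOF
}

gen_cluster_config node01 true  10.0.200.10:2380 > cluster-config-node01.json
gen_cluster_config node02 false 10.0.200.11:2380 > cluster-config-node02.json
gen_cluster_config node03 false 10.0.200.12:2380 > cluster-config-node03.json
```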
Step 3.3: Generate Disko Disk Layouts
Simple Single-Disk Layout (nodes/node01.example.com/disko.nix):
{ disks ? [ "/dev/sda" ], ... }:
{
disko.devices = {
disk = {
main = {
type = "disk";
device = builtins.head disks;
content = {
type = "gpt";
partitions = {
ESP = {
size = "1G";
type = "EF00";
content = {
type = "filesystem";
format = "vfat";
mountpoint = "/boot";
};
};
root = {
size = "100%";
content = {
type = "filesystem";
format = "ext4";
mountpoint = "/";
};
};
};
};
};
};
};
}
Step 3.4: Pre-Generate TLS Certificates
# Copy per-node certificates
cp ca-cert.pem /srv/provisioning/nodes/node01.example.com/secrets/
cp node01-cert.pem /srv/provisioning/nodes/node01.example.com/secrets/
cp node01-key.pem /srv/provisioning/nodes/node01.example.com/secrets/
# Set permissions (0400 on private keys, per the checklist in section 4.5)
chmod 644 /srv/provisioning/nodes/node01.example.com/secrets/*-cert.pem
chmod 644 /srv/provisioning/nodes/node01.example.com/secrets/ca-cert.pem
chmod 400 /srv/provisioning/nodes/node01.example.com/secrets/*-key.pem
5.4 Phase 4: Bootstrap First 3 Nodes
Step 4.1: Power On Nodes via BMC
# Using ipmitool (example for Dell/HP/Supermicro)
# Prefer -E (reads the IPMI_PASSWORD env var) or -f <file> over -P on the
# command line, which leaks the password into the process list
for ip in 10.0.10.50 10.0.10.51 10.0.10.52; do
ipmitool -I lanplus -H $ip -U admin -P password chassis bootdev pxe options=persistent
ipmitool -I lanplus -H $ip -U admin -P password chassis power on
done
Step 4.2: Verify PXE Boot Success
Watch serial console (if available):
# Connect via IPMI SOL
ipmitool -I lanplus -H 10.0.10.50 -U admin -P password sol activate
# Expected output:
# ... DHCP discovery ...
# ... TFTP download undionly.kpxe or ipxe.efi ...
# ... iPXE menu appears ...
# ... Kernel and initrd download ...
# ... NixOS installer boots ...
# ... SSH server starts ...
Verify installer is ready:
# Wait for nodes to appear in DHCP leases
sudo tail -f /var/lib/dhcp/dhcpd.leases
# Test SSH connectivity
ssh root@10.0.100.50 'uname -a'
# Expected: Linux node01 ... nixos
Step 4.3: Run nixos-anywhere Simultaneously on All 3
Create provisioning script:
#!/bin/bash
# /srv/provisioning/scripts/provision-bootstrap-nodes.sh
set -euo pipefail
NODES=("node01" "node02" "node03")
PROVISION_IPS=("10.0.100.50" "10.0.100.51" "10.0.100.52")
FLAKE_ROOT="/srv/provisioning"
pids=()
for i in "${!NODES[@]}"; do
node="${NODES[$i]}"
ip="${PROVISION_IPS[$i]}"
echo "Provisioning $node at $ip..."
nix run github:nix-community/nixos-anywhere -- \
--flake "$FLAKE_ROOT#$node" \
--build-on-remote \
root@$ip &
pids+=($!)
done
# Wait on each job individually; a bare "wait" returns 0 even if a
# background job failed, which would falsely report success
for pid in "${pids[@]}"; do
wait "$pid"
done
echo "All nodes provisioned successfully!"
Run provisioning:
chmod +x /srv/provisioning/scripts/provision-bootstrap-nodes.sh
./provision-bootstrap-nodes.sh
Expected output per node:
Provisioning node01 at 10.0.100.50...
Connecting via SSH...
Running disko to partition disks...
Building NixOS system...
Installing bootloader...
Copying secrets...
Installation complete. Rebooting...
Step 4.4: Wait for First-Boot Automation
After reboot, nodes will boot from disk and run first-boot automation. Monitor progress:
# Watch logs on node01 (via SSH after it reboots)
ssh root@10.0.200.10 # Note: now on production network
# Check cluster join services
journalctl -u chainfire-cluster-join.service -f
journalctl -u flaredb-cluster-join.service -f
# Expected log output:
# {"level":"INFO","message":"Waiting for local chainfire service..."}
# {"level":"INFO","message":"Local chainfire healthy"}
# {"level":"INFO","message":"Bootstrap node, cluster initialized"}
# {"level":"INFO","message":"Cluster join complete"}
Step 4.5: Verify Cluster Health
# Check Chainfire cluster
curl -k https://node01.example.com:2379/admin/cluster/members | jq
# Expected output:
# {
# "members": [
# {"id":"node01","raft_addr":"10.0.200.10:2380","status":"healthy","role":"leader"},
# {"id":"node02","raft_addr":"10.0.200.11:2380","status":"healthy","role":"follower"},
# {"id":"node03","raft_addr":"10.0.200.12:2380","status":"healthy","role":"follower"}
# ]
# }
# Check FlareDB cluster
curl -k https://node01.example.com:2479/admin/cluster/members | jq
# Check IAM service
curl -k https://node01.example.com:8080/health | jq
# Expected: {"status":"healthy","database":"connected"}
5.5 Phase 5: Add Additional Nodes
Step 5.1: Prepare Join-Mode Configurations
Create configuration for node04 (worker profile):
{
"node_id": "node04",
"node_role": "worker",
"bootstrap": false,
"cluster_name": "prod-cluster",
"leader_url": "https://node01.example.com:2379",
"raft_addr": "10.0.200.20:2380"
}
Step 5.2: Power On and Provision Nodes
# Power on node via BMC
ipmitool -I lanplus -H 10.0.10.54 -U admin -P password chassis bootdev pxe
ipmitool -I lanplus -H 10.0.10.54 -U admin -P password chassis power on
# Wait for PXE boot and SSH ready
sleep 60
# Provision node
nix run github:nix-community/nixos-anywhere -- \
--flake /srv/provisioning#node04 \
--build-on-remote \
root@10.0.100.60
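The fixed `sleep 60` above is fragile: slow firmware POST makes it too short, fast boots waste a minute. A generic retry helper is more robust (the function name and the `nc` probe in the usage comment are illustrative):

```shell
# Retry a command up to N times with a fixed delay between attempts.
retry() {
  local attempts=$1 delay=$2 i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    "$@" && return 0
    sleep "$delay"
  done
  return 1
}

# Wait up to ~5 minutes for the installer's SSH port, then provision:
# retry 60 5 nc -z 10.0.100.60 22
```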
Step 5.3: Verify Cluster Join via API
# Check cluster members (should include node04)
curl -k https://node01.example.com:2379/admin/cluster/members | jq '.members[] | select(.id=="node04")'
# Expected:
# {"id":"node04","raft_addr":"10.0.200.20:2380","status":"healthy","role":"follower"}
Step 5.4: Validate Replication and Service Distribution
# Write test data on leader
curl -k -X PUT https://node01.example.com:2379/v1/kv/test \
-H "Content-Type: application/json" \
-d '{"value":"hello world"}'
# Read from follower (should be replicated)
curl -k https://node02.example.com:2379/v1/kv/test | jq
# Expected: {"key":"test","value":"hello world"}
6. Verification & Validation
6.1 Health Check Commands for All Services
Chainfire:
curl -k https://node01.example.com:2379/health | jq
# Expected: {"status":"healthy","raft":"leader","cluster_size":3}
# Check cluster membership
curl -k https://node01.example.com:2379/admin/cluster/members | jq '.members | length'
# Expected: 3 (for initial bootstrap)
FlareDB:
curl -k https://node01.example.com:2479/health | jq
# Expected: {"status":"healthy","raft":"leader","chainfire":"connected"}
# Query test metric
curl -k https://node01.example.com:2479/v1/query \
-H "Content-Type: application/json" \
-d '{"query":"up{job=\"node\"}","time":"now"}'
IAM:
curl -k https://node01.example.com:8080/health | jq
# Expected: {"status":"healthy","database":"connected","version":"1.0.0"}
# List users (requires authentication)
curl -k https://node01.example.com:8080/api/users \
-H "Authorization: Bearer $IAM_TOKEN" | jq
PlasmaVMC:
curl -k https://node01.example.com:9090/health | jq
# Expected: {"status":"healthy","vms_running":0}
# List VMs
curl -k https://node01.example.com:9090/api/vms | jq
NovaNET:
curl -k https://node01.example.com:9091/health | jq
# Expected: {"status":"healthy","networks":0}
FlashDNS:
dig @node01.example.com example.com
# Expected: DNS response with ANSWER section
# Health check
curl -k https://node01.example.com:853/health | jq
FiberLB:
curl -k https://node01.example.com:9092/health | jq
# Expected: {"status":"healthy","backends":0}
K8sHost:
kubectl --kubeconfig=/etc/kubernetes/admin.conf get nodes
# Expected: Node list including this node
6.2 Cluster Membership Verification
#!/bin/bash
# /srv/provisioning/scripts/verify-cluster.sh
echo "Checking Chainfire cluster..."
curl -k https://node01.example.com:2379/admin/cluster/members | jq '.members[] | {id, status, role}'
echo ""
echo "Checking FlareDB cluster..."
curl -k https://node01.example.com:2479/admin/cluster/members | jq '.members[] | {id, status, role}'
echo ""
echo "Cluster health summary:"
echo " Chainfire nodes: $(curl -sk https://node01.example.com:2379/admin/cluster/members | jq '.members | length')"
echo " FlareDB nodes: $(curl -sk https://node01.example.com:2479/admin/cluster/members | jq '.members | length')"
echo " Raft leaders: Chainfire=$(curl -sk https://node01.example.com:2379/admin/cluster/members | jq -r '.members[] | select(.role=="leader") | .id'), FlareDB=$(curl -sk https://node01.example.com:2479/admin/cluster/members | jq -r '.members[] | select(.role=="leader") | .id')"
6.3 Raft Leader Election Check
# Identify current leader
LEADER=$(curl -sk https://node01.example.com:2379/admin/cluster/members | jq -r '.members[] | select(.role=="leader") | .id')
echo "Current Chainfire leader: $LEADER"
# Verify all followers can reach leader
for node in node01 node02 node03; do
echo "Checking $node..."
curl -sk https://$node.example.com:2379/admin/cluster/leader | jq
done
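Leader election only succeeds while a majority of nodes is reachable: Raft's quorum for an n-node cluster is n/2 + 1. A quick sketch of the arithmetic, useful when deciding how many node failures a given cluster size tolerates:

```shell
# Raft majority quorum and the number of failures it tolerates.
quorum() { echo $(( $1 / 2 + 1 )); }
tolerated_failures() { echo $(( $1 - ($1 / 2 + 1) )); }

quorum 3               # -> 2: a 3-node cluster survives 1 failure
tolerated_failures 5   # -> 2: a 5-node cluster survives 2 failures
```

Note that 4 nodes tolerate no more failures than 3 (quorum 3, one failure), which is why odd cluster sizes are recommended in section 2.1.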
6.4 TLS Certificate Validation
# Check certificate expiry
for node in node01 node02 node03; do
echo "Checking $node certificate..."
echo | openssl s_client -connect $node.example.com:2379 2>/dev/null | openssl x509 -noout -dates
done
# Verify certificate chain
echo | openssl s_client -connect node01.example.com:2379 -CAfile /srv/provisioning/ca-cert.pem -verify 1
# Expected: Verify return code: 0 (ok)
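The expiry dates printed above are easier to alert on as "days remaining". A minimal sketch that converts the notAfter timestamp to a day count; it assumes GNU `date` (the `-d` flag), which is standard on NixOS:

```shell
# Days until a certificate's notAfter date (floor; negative means expired).
days_until() {
  local end_epoch now_epoch
  end_epoch=$(date -d "$1" +%s) || return 1
  now_epoch=$(date +%s)
  echo $(( (end_epoch - now_epoch) / 86400 ))
}

# Example, combined with the openssl check above:
# not_after=$(echo | openssl s_client -connect node01.example.com:2379 2>/dev/null \
#   | openssl x509 -noout -enddate | cut -d= -f2)
# days_until "$not_after"
```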
6.5 Network Connectivity Tests
# Test inter-node connectivity (from node01)
ssh root@node01.example.com '
for node in node02 node03; do
echo "Testing connectivity to $node..."
nc -zv $node.example.com 2379
nc -zv $node.example.com 2380
done
'
# Test bandwidth (iperf3)
ssh root@node02.example.com 'iperf3 -s' &
ssh root@node01.example.com 'iperf3 -c node02.example.com -t 10'
# Expected: ~10 Gbps on 10GbE, ~1 Gbps on 1GbE
6.6 Performance Smoke Tests
Chainfire Write Performance:
# 1000 writes
time for i in {1..1000}; do
curl -sk -X PUT https://node01.example.com:2379/v1/kv/test$i \
-H "Content-Type: application/json" \
-d "{\"value\":\"test data $i\"}" > /dev/null
done
# Expected: <10 seconds on healthy cluster
FlareDB Query Performance:
# Insert test metrics
curl -k -X POST https://node01.example.com:2479/v1/write \
-H "Content-Type: application/json" \
-d '{"metric":"test_metric","value":42,"timestamp":"'$(date -Iseconds)'"}'
# Query performance
time curl -k https://node01.example.com:2479/v1/query \
-H "Content-Type: application/json" \
-d '{"query":"test_metric","start":"1h","end":"now"}'
# Expected: <100ms response time
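To turn the raw timings above into a comparable throughput number, divide the operation count by elapsed seconds. A trivial hypothetical helper:

```shell
# Writes (or queries) per second from a timed loop: count / elapsed seconds.
wps() {  # usage: wps <op_count> <elapsed_seconds>
  awk -v n="$1" -v s="$2" 'BEGIN { printf "%.1f\n", n / s }'
}
# e.g. 1000 writes that took 8.3s:
# wps 1000 8.3
```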
7. Common Operations
7.1 Adding a New Node
Step 1: Prepare Node Configuration
# Create node directory
mkdir -p /srv/provisioning/nodes/node05.example.com/secrets
# Copy template configuration
cp /srv/provisioning/nodes/node01.example.com/configuration.nix \
/srv/provisioning/nodes/node05.example.com/
# Edit for new node
vim /srv/provisioning/nodes/node05.example.com/configuration.nix
# Update: hostName, ipAddresses, node_id
Step 2: Generate Cluster Config (Join Mode)
{
"node_id": "node05",
"node_role": "worker",
"bootstrap": false,
"cluster_name": "prod-cluster",
"leader_url": "https://node01.example.com:2379",
"raft_addr": "10.0.200.21:2380"
}
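Rather than hand-editing this JSON for every new node, it can be generated with jq. A sketch with the leader URL and cluster name hard-coded from the example above (`gen_join_config` is a hypothetical helper):

```shell
# Generate a join-mode cluster-config.json for a new worker node.
gen_join_config() {  # usage: gen_join_config <node_id> <raft_addr>
  jq -n --arg id "$1" --arg raft "$2" '{
    node_id: $id,
    node_role: "worker",
    bootstrap: false,
    cluster_name: "prod-cluster",
    leader_url: "https://node01.example.com:2379",
    raft_addr: $raft
  }'
}

# gen_join_config node05 10.0.200.21:2380 \
#   > /srv/provisioning/nodes/node05.example.com/secrets/cluster-config.json
```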
Step 3: Provision Node
# Power on and PXE boot
ipmitool -I lanplus -H 10.0.10.55 -U admin -P password chassis bootdev pxe
ipmitool -I lanplus -H 10.0.10.55 -U admin -P password chassis power on
# Wait for SSH
sleep 60
# Run nixos-anywhere
nix run github:nix-community/nixos-anywhere -- \
--flake /srv/provisioning#node05 \
root@10.0.100.65
Step 4: Verify Join
# Check cluster membership
curl -k https://node01.example.com:2379/admin/cluster/members | jq '.members[] | select(.id=="node05")'
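The join is asynchronous, so the one-shot check above can race the first-boot automation. A polling sketch against the same endpoint (`is_member` is a hypothetical helper operating on the membership JSON):

```shell
# Succeeds when the given node ID appears in a membership document on stdin.
is_member() {  # usage: curl ... | is_member <node_id>
  jq -e --arg id "$1" 'any(.members[]; .id == $id)' >/dev/null
}

# Poll for up to ~2 minutes (uncomment to run):
# for i in $(seq 1 24); do
#   curl -sk https://node01.example.com:2379/admin/cluster/members | is_member node05 \
#     && { echo "node05 joined"; break; }
#   sleep 5
# done
```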
7.2 Replacing a Failed Node
Step 1: Remove Failed Node from Cluster
# Remove from Chainfire cluster
curl -k -X DELETE https://node01.example.com:2379/admin/member/node02
# Remove from FlareDB cluster
curl -k -X DELETE https://node01.example.com:2479/admin/member/node02
Step 2: Physically Replace Hardware
- Power off old node
- Remove from rack
- Install new node
- Connect all cables
- Configure BMC
Step 3: Provision Replacement Node
# Use same node ID and configuration
nix run github:nix-community/nixos-anywhere -- \
--flake /srv/provisioning#node02 \
root@10.0.100.51
Step 4: Verify Rejoin
# Cluster should automatically add node during first-boot
curl -k https://node01.example.com:2379/admin/cluster/members | jq
7.3 Updating Node Configuration
Step 1: Edit Configuration
vim /srv/provisioning/nodes/node01.example.com/configuration.nix
# Make changes (e.g., add service, change network config)
Step 2: Build and Deploy
# Build configuration locally
nix build /srv/provisioning#node01
# Deploy to node (from node or remote)
nixos-rebuild switch --flake /srv/provisioning#node01
Step 3: Verify Changes
# Check active configuration
ssh root@node01.example.com 'nixos-rebuild list-generations'
# Test services still healthy
curl -k https://node01.example.com:2379/health | jq
7.4 Rolling Updates
Update Process (One Node at a Time):
#!/bin/bash
# /srv/provisioning/scripts/rolling-update.sh
NODES=("node01" "node02" "node03")
for node in "${NODES[@]}"; do
echo "Updating $node..."
# Build new configuration
nix build /srv/provisioning#$node
# Deploy (test mode first)
ssh root@$node.example.com "nixos-rebuild test --flake /srv/provisioning#$node"
# Verify health
if ! curl -sk https://$node.example.com:2379/health | jq -e '.status == "healthy"' > /dev/null; then
echo "ERROR: $node unhealthy after test, aborting"
ssh root@$node.example.com "nixos-rebuild switch --rollback"
exit 1
fi
# Apply permanently
ssh root@$node.example.com "nixos-rebuild switch --flake /srv/provisioning#$node"
# Wait for reboot if kernel changed
echo "Waiting 30s for stabilization..."
sleep 30
# Final health check
curl -k https://$node.example.com:2379/health | jq
echo "$node updated successfully"
done
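A rolling update started on an already-degraded cluster can drop quorum, so it is worth gating the script above on cluster-wide health first. A sketch (`healthy` is a hypothetical helper over the same /health JSON the script polls):

```shell
# Succeeds when a /health document on stdin reports "healthy".
healthy() {
  jq -e '.status == "healthy"' >/dev/null
}

# Preflight gate before starting the rolling update (uncomment to run):
# for node in node01 node02 node03; do
#   curl -sk "https://$node.example.com:2379/health" | healthy \
#     || { echo "ABORT: $node is not healthy"; exit 1; }
# done
# /srv/provisioning/scripts/rolling-update.sh
```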
7.5 Draining a Node for Maintenance
Step 1: Mark Node for Drain
# Disable node in load balancer (if using one)
curl -k -X POST https://node01.example.com:9092/api/backend/node02 \
-d '{"status":"drain"}'
Step 2: Migrate VMs (PlasmaVMC)
# List VMs on node
ssh root@node02.example.com 'systemctl list-units | grep plasmavmc-vm@'
# Migrate each VM
curl -k -X POST https://node01.example.com:9090/api/vms/vm-001/migrate \
-d '{"target_node":"node03"}'
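Migration is asynchronous, so wait until the node reports no remaining VM units before stopping services. A sketch using the unit name pattern from the listing above (`vm_count` is a hypothetical helper):

```shell
# Counts plasmavmc VM units in a `systemctl list-units` listing on stdin.
vm_count() {
  grep -c 'plasmavmc-vm@' || true
}

# Poll until the node is empty (uncomment to run):
# while [ "$(ssh root@node02.example.com 'systemctl list-units' | vm_count)" -gt 0 ]; do
#   echo "VMs still running on node02, waiting..."; sleep 10
# done
```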
Step 3: Stop Services
ssh root@node02.example.com '
systemctl stop plasmavmc.service
systemctl stop chainfire.service
systemctl stop flaredb.service
'
Step 4: Perform Maintenance
# Reboot for kernel update, hardware maintenance, etc.
ssh root@node02.example.com 'reboot'
Step 5: Re-enable Node
# Verify all services healthy
ssh root@node02.example.com 'systemctl status chainfire flaredb plasmavmc'
# Re-enable in load balancer
curl -k -X POST https://node01.example.com:9092/api/backend/node02 \
-d '{"status":"active"}'
7.6 Decommissioning a Node
Step 1: Drain Node (see 7.5)
Step 2: Remove from Cluster
# Remove from Chainfire
curl -k -X DELETE https://node01.example.com:2379/admin/member/node02
# Remove from FlareDB
curl -k -X DELETE https://node01.example.com:2479/admin/member/node02
# Verify removal
curl -k https://node01.example.com:2379/admin/cluster/members | jq
Step 3: Power Off
# Via BMC
ipmitool -I lanplus -H 10.0.10.51 -U admin -P password chassis power off
# Or via SSH
ssh root@node02.example.com 'poweroff'
Step 4: Update Inventory
# Remove from node inventory
vim /srv/provisioning/inventory.json
# Remove node02 entry
# Remove from DNS
# Update DNS zone to remove node02.example.com
# Remove from monitoring
# Update Prometheus targets to remove node02
8. Troubleshooting
8.1 PXE Boot Failures
Symptom: Server does not obtain IP address or does not boot from network
Diagnosis:
# Monitor DHCP server logs
sudo journalctl -u dhcpd4 -f
# Monitor TFTP requests
sudo tcpdump -i eth0 -n port 69
# Check PXE server services
sudo systemctl status dhcpd4 atftpd nginx
Common Causes:
- DHCP server not running: Start it with sudo systemctl start dhcpd4
- Wrong network interface: Check interfaces in dhcpd.conf
- Firewall blocking DHCP/TFTP: Check sudo iptables -L -n | grep -E "67|68|69"
- PXE not enabled in BIOS: Enter BIOS and enable Network Boot
- Network cable disconnected: Check physical connection
Solution:
# Restart all PXE services
sudo systemctl restart dhcpd4 atftpd nginx
# Verify DHCP configuration
sudo dhcpd -t -cf /etc/dhcp/dhcpd.conf
# Test TFTP
tftp localhost -c get undionly.kpxe /tmp/test.kpxe
# Power cycle server
ipmitool -I lanplus -H <bmc-ip> -U admin -P password chassis power cycle
8.2 Installation Failures (nixos-anywhere)
Symptom: nixos-anywhere fails during disk partitioning, installation, or bootloader setup
Diagnosis:
# Check nixos-anywhere output for errors
# Common errors: disk not found, partition table errors, out of space
# SSH to installer for manual inspection
ssh root@10.0.100.50
# Check disk status
lsblk
dmesg | grep -i error
Common Causes:
- Disk device wrong: Update disko.nix with correct device (e.g., /dev/nvme0n1)
- Disk not wiped: Previous partition table conflicts
- Out of disk space: Insufficient storage for Nix closures
- Network issues: Cannot download packages from binary cache
Solution:
# Manual disk wipe (on installer)
ssh root@10.0.100.50 '
wipefs -a /dev/sda
sgdisk --zap-all /dev/sda
'
# Retry nixos-anywhere
nix run github:nix-community/nixos-anywhere -- \
--flake /srv/provisioning#node01 \
--debug \
root@10.0.100.50
8.3 Cluster Join Failures
Symptom: Node boots successfully but does not join cluster
Diagnosis:
# Check first-boot logs on node
ssh root@node04.example.com 'journalctl -u chainfire-cluster-join.service -u flaredb-cluster-join.service'
# Common errors:
# - "Health check timeout after 120s"
# - "Join request failed: connection refused"
# - "Configuration file not found"
Bootstrap Mode vs Join Mode:
- Bootstrap: Node expects to create new cluster with peers
- Join: Node expects to connect to existing leader
Common Causes:
- Wrong bootstrap flag: Check cluster-config.json
- Leader unreachable: Network/firewall issue
- TLS certificate errors: Verify cert paths and validity
- Service not starting: Check main service (chainfire.service)
Solution:
# Verify cluster-config.json
ssh root@node04.example.com 'jq . /etc/nixos/secrets/cluster-config.json'
# Test leader connectivity
ssh root@node04.example.com 'curl -k https://node01.example.com:2379/health'
# Check TLS certificates
ssh root@node04.example.com 'ls -l /etc/nixos/secrets/*.pem'
# Manual cluster join (if automation fails)
curl -k -X POST https://node01.example.com:2379/admin/member/add \
-H "Content-Type: application/json" \
-d '{"id":"node04","raft_addr":"10.0.200.20:2380"}'
8.4 Service Start Failures
Symptom: Service fails to start after boot
Diagnosis:
# Check service status
ssh root@node01.example.com 'systemctl status chainfire.service'
# View logs
ssh root@node01.example.com 'journalctl -u chainfire.service -n 100'
# Common errors:
# - "bind: address already in use" (port conflict)
# - "certificate verify failed" (TLS issue)
# - "permission denied" (file permissions)
Common Causes:
- Port already in use: Another service using same port
- Missing dependencies: Required service not running
- Configuration error: Invalid config file
- File permissions: Cannot read secrets
Solution:
# Check port usage
ssh root@node01.example.com 'ss -tlnp | grep 2379'
# Verify dependencies
ssh root@node01.example.com 'systemctl list-dependencies chainfire.service'
# Test configuration manually
ssh root@node01.example.com 'chainfire-server --config /etc/nixos/chainfire.toml --check-config'
# Fix permissions
ssh root@node01.example.com 'chmod 600 /etc/nixos/secrets/*-key.pem'
8.5 Network Connectivity Issues
Symptom: Nodes cannot communicate with each other or external services
Diagnosis:
# Test basic connectivity
ssh root@node01.example.com 'ping -c 3 node02.example.com'
# Test specific ports
ssh root@node01.example.com 'nc -zv node02.example.com 2379'
# Check firewall rules
ssh root@node01.example.com 'iptables -L -n | grep 2379'
# Check routing
ssh root@node01.example.com 'ip route show'
Common Causes:
- Firewall blocking traffic: Missing iptables rules
- Wrong IP address: Configuration mismatch
- Network interface down: Interface not configured
- DNS resolution failure: Cannot resolve hostnames
Solution:
# Add firewall rules
ssh root@node01.example.com '
iptables -A INPUT -p tcp --dport 2379 -s 10.0.200.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 2380 -s 10.0.200.0/24 -j ACCEPT
iptables-save > /etc/iptables/rules.v4
'
# Fix DNS resolution
ssh root@node01.example.com '
echo "10.0.200.11 node02.example.com node02" >> /etc/hosts
'
# Restart networking
ssh root@node01.example.com 'systemctl restart systemd-networkd'
8.6 TLS Certificate Errors
Symptom: Services cannot establish TLS connections
Diagnosis:
# Test TLS connection
openssl s_client -connect node01.example.com:2379 -CAfile /srv/provisioning/ca-cert.pem
# Check certificate validity
ssh root@node01.example.com '
openssl x509 -in /etc/nixos/secrets/node01-cert.pem -noout -dates
'
# Common errors:
# - "certificate verify failed" (wrong CA)
# - "certificate has expired" (cert expired)
# - "certificate subject name mismatch" (wrong CN)
Common Causes:
- Expired certificate: Regenerate certificate
- Wrong CA certificate: Verify CA cert is correct
- Hostname mismatch: CN does not match hostname
- File permissions: Cannot read certificate files
Solution:
# Regenerate certificate
openssl req -new -key /srv/provisioning/secrets/node01-key.pem \
-out /srv/provisioning/secrets/node01-csr.pem \
-subj "/CN=node01.example.com"
openssl x509 -req -in /srv/provisioning/secrets/node01-csr.pem \
-CA /srv/provisioning/ca-cert.pem \
-CAkey /srv/provisioning/ca-key.pem \
-CAcreateserial \
-out /srv/provisioning/secrets/node01-cert.pem \
-days 365
# Copy to node
scp /srv/provisioning/secrets/node01-cert.pem root@node01.example.com:/etc/nixos/secrets/
# Restart service
ssh root@node01.example.com 'systemctl restart chainfire.service'
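A mismatched certificate/key pair will take the service down on the restart above, so it is worth confirming the pair is consistent before deploying. A sketch (`cert_matches_key` is a hypothetical helper):

```shell
# Succeeds only when the certificate and private key share the same public key.
cert_matches_key() {  # usage: cert_matches_key <cert.pem> <key.pem>
  [ "$(openssl x509 -in "$1" -noout -pubkey)" = "$(openssl pkey -in "$2" -pubout)" ]
}

# cert_matches_key /srv/provisioning/secrets/node01-cert.pem \
#                  /srv/provisioning/secrets/node01-key.pem && echo "pair OK"
```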
8.7 Performance Degradation
Symptom: Services are slow or unresponsive
Diagnosis:
# Check system load
ssh root@node01.example.com 'uptime'
ssh root@node01.example.com 'top -bn1 | head -20'
# Check disk I/O
ssh root@node01.example.com 'iostat -x 1 5'
# Check network bandwidth
ssh root@node01.example.com 'iftop -i eth1'
# Check Raft logs for slow operations
ssh root@node01.example.com 'journalctl -u chainfire.service | grep "slow operation"'
Common Causes:
- High CPU usage: Too many requests, inefficient queries
- Disk I/O bottleneck: Slow disk, too many writes
- Network saturation: Bandwidth exhausted
- Memory pressure: OOM killer active
- Raft slow commits: Network latency between nodes
Solution:
# Add more resources (vertical scaling)
# Or add more nodes (horizontal scaling)
# Check for resource leaks
ssh root@node01.example.com 'systemctl status chainfire | grep Memory'
# Restart service to clear memory leaks (temporary)
ssh root@node01.example.com 'systemctl restart chainfire.service'
# Optimize disk I/O (enable write caching if safe)
ssh root@node01.example.com 'hdparm -W1 /dev/sda'
9. Rollback & Recovery
9.1 NixOS Generation Rollback
NixOS provides atomic rollback capability via generations:
List Available Generations:
ssh root@node01.example.com 'nixos-rebuild list-generations'
# Example output:
# 1 2025-12-10 10:30:00
# 2 2025-12-10 12:45:00 (current)
Rollback to Previous Generation:
# Rollback and reboot
ssh root@node01.example.com 'nixos-rebuild switch --rollback'
# Or boot into previous generation once (no permanent change)
ssh root@node01.example.com 'nixos-rebuild boot --rollback && reboot'
Rollback to Specific Generation:
ssh root@node01.example.com 'nix-env --switch-generation 1 -p /nix/var/nix/profiles/system'
ssh root@node01.example.com 'reboot'
9.2 Re-Provisioning from PXE
Complete re-provisioning wipes all data and reinstalls from scratch:
Step 1: Remove Node from Cluster
curl -k -X DELETE https://node01.example.com:2379/admin/member/node02
curl -k -X DELETE https://node01.example.com:2479/admin/member/node02
Step 2: Set Boot to PXE
ipmitool -I lanplus -H 10.0.10.51 -U admin -P password chassis bootdev pxe
Step 3: Reboot Node
ssh root@node02.example.com 'reboot'
# Or via BMC
ipmitool -I lanplus -H 10.0.10.51 -U admin -P password chassis power cycle
Step 4: Run nixos-anywhere
# Wait for PXE boot and SSH ready
sleep 90
nix run github:nix-community/nixos-anywhere -- \
--flake /srv/provisioning#node02 \
root@10.0.100.51
9.3 Disaster Recovery Procedures
Complete Cluster Loss (All Nodes Down):
Step 1: Restore from Backup (if available)
# Restore Chainfire data
ssh root@node01.example.com '
systemctl stop chainfire.service
rm -rf /var/lib/chainfire/*
tar -xzf /backup/chainfire-$(date +%Y%m%d).tar.gz -C /  # archive stores var/lib/chainfire paths (see backup script in 9.4)
systemctl start chainfire.service
'
Step 2: Bootstrap New Cluster If no backup, re-provision all nodes as bootstrap:
# Update cluster-config.json for all nodes
# Set bootstrap=true, same initial_peers
# Provision all 3 nodes
for node in node01 node02 node03; do
nix run github:nix-community/nixos-anywhere -- \
--flake /srv/provisioning#$node \
root@<node-ip> &
done
wait
Single Node Failure:
Step 1: Verify Cluster Quorum
# Check remaining nodes have quorum
curl -k https://node01.example.com:2379/admin/cluster/members | jq '.members | length'
# Expected: 2 (if 3-node cluster with 1 failure)
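Quorum for an N-node Raft cluster is floor(N/2) + 1 (2 of 3, 3 of 5), so a 3-node cluster survives exactly one failure. A sketch that makes the check explicit (`quorum_ok` is a hypothetical helper):

```shell
# Succeeds when the number of reachable members still forms a Raft majority.
quorum_ok() {  # usage: quorum_ok <cluster_size> <alive_members>
  [ "$2" -ge "$(( $1 / 2 + 1 ))" ]
}

# ALIVE=$(curl -sk https://node01.example.com:2379/admin/cluster/members | jq '.members | length')
# quorum_ok 3 "$ALIVE" || echo "QUORUM LOST: do not remove more members"
```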
Step 2: Remove Failed Node
curl -k -X DELETE https://node01.example.com:2379/admin/member/node02
Step 3: Provision Replacement
# Use same node ID and configuration
nix run github:nix-community/nixos-anywhere -- \
--flake /srv/provisioning#node02 \
root@10.0.100.51
9.4 Backup and Restore
Automated Backup Script:
#!/bin/bash
# /srv/provisioning/scripts/backup-cluster.sh
BACKUP_DIR="/backup/cluster-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$BACKUP_DIR"
# Backup Chainfire data
for node in node01 node02 node03; do
ssh root@$node.example.com \
"tar -czf - /var/lib/chainfire" > "$BACKUP_DIR/chainfire-$node.tar.gz"
done
# Backup FlareDB data
for node in node01 node02 node03; do
ssh root@$node.example.com \
"tar -czf - /var/lib/flaredb" > "$BACKUP_DIR/flaredb-$node.tar.gz"
done
# Backup configurations
cp -r /srv/provisioning/nodes "$BACKUP_DIR/configs"
echo "Backup complete: $BACKUP_DIR"
Restore Script:
#!/bin/bash
# /srv/provisioning/scripts/restore-cluster.sh
BACKUP_DIR="$1"
if [ -z "$BACKUP_DIR" ]; then
echo "Usage: $0 <backup-dir>"
exit 1
fi
# Stop services on all nodes
for node in node01 node02 node03; do
ssh root@$node.example.com 'systemctl stop chainfire flaredb'
done
# Restore Chainfire data
for node in node01 node02 node03; do
cat "$BACKUP_DIR/chainfire-$node.tar.gz" | \
ssh root@$node.example.com "cd / && tar -xzf -"
done
# Restore FlareDB data
for node in node01 node02 node03; do
cat "$BACKUP_DIR/flaredb-$node.tar.gz" | \
ssh root@$node.example.com "cd / && tar -xzf -"
done
# Restart services
for node in node01 node02 node03; do
ssh root@$node.example.com 'systemctl start chainfire flaredb'
done
echo "Restore complete"
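Restoring from a corrupt archive is worse than not restoring at all, so verify every archive is readable before any service is stopped. A sketch (`verify_backups` is a hypothetical helper):

```shell
# Verify each .tar.gz in a backup directory is a readable gzip tarball;
# corrupt backups should fail here, not halfway through a restore.
verify_backups() {  # usage: verify_backups <backup-dir>
  local f bad=0
  for f in "$1"/*.tar.gz; do
    if tar -tzf "$f" >/dev/null 2>&1; then
      echo "OK:  $f"
    else
      echo "BAD: $f"; bad=1
    fi
  done
  return $bad
}

# verify_backups /backup/cluster-20251210-120000 || echo "refusing to restore"
```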
10. Security Best Practices
10.1 SSH Key Management
Generate Dedicated Provisioning Key:
ssh-keygen -t ed25519 -C "provisioning@example.com" -f ~/.ssh/id_ed25519_provisioning
Add to Netboot Image:
# In netboot-base.nix
users.users.root.openssh.authorizedKeys.keys = [
"ssh-ed25519 AAAAC3Nza... provisioning@example.com"
];
Rotate Keys Regularly:
# Generate new key
ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_provisioning_new
# Add to all nodes
for node in node01 node02 node03; do
ssh-copy-id -i ~/.ssh/id_ed25519_provisioning_new.pub root@$node.example.com
done
# Remove old key from authorized_keys
# Update netboot image with new key
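The "remove old key" step above can be done without hand-editing authorized_keys: filter by the retired key's comment field. A sketch (`drop_key` is a hypothetical helper; the comment string is the one used when the key was generated):

```shell
# Emits an authorized_keys stream with lines containing the given comment removed.
drop_key() {  # usage: drop_key <key-comment>
  grep -vF "$1" || true
}

# Apply on each node (uncomment to run):
# for node in node01 node02 node03; do
#   ssh "root@$node.example.com" \
#     "grep -vF 'provisioning@example.com' /root/.ssh/authorized_keys > /root/.ssh/ak.new \
#      && mv /root/.ssh/ak.new /root/.ssh/authorized_keys \
#      && chmod 600 /root/.ssh/authorized_keys"
# done
```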
10.2 TLS Certificate Rotation
Automated Rotation Script:
#!/bin/bash
# /srv/provisioning/scripts/rotate-certs.sh
# Generate new certificates
for node in node01 node02 node03; do
openssl genrsa -out ${node}-key-new.pem 4096
openssl req -new -key ${node}-key-new.pem -out ${node}-csr.pem \
-subj "/CN=${node}.example.com"
openssl x509 -req -in ${node}-csr.pem \
-CA ca-cert.pem -CAkey ca-key.pem \
-CAcreateserial -out ${node}-cert-new.pem -days 365
done
# Deploy new certificates (without restarting services yet)
for node in node01 node02 node03; do
scp ${node}-cert-new.pem root@${node}.example.com:/etc/nixos/secrets/${node}-cert-new.pem
scp ${node}-key-new.pem root@${node}.example.com:/etc/nixos/secrets/${node}-key-new.pem
done
# Update configuration to use new certs
# ... (NixOS configuration update) ...
# Rolling restart to apply new certificates
for node in node01 node02 node03; do
ssh root@${node}.example.com 'systemctl restart chainfire flaredb iam'
sleep 30 # Wait for stabilization
done
echo "Certificate rotation complete"
10.3 Secrets Management
Best Practices:
- Store secrets outside the Nix store (use /etc/nixos/secrets/)
- Set restrictive permissions (0600 for private keys, 0400 for passwords)
- Use environment variables for runtime secrets
- Never commit secrets to Git
- Use encrypted secrets (sops-nix or agenix)
Example with sops-nix:
# In configuration.nix
{
imports = [ <sops-nix/modules/sops> ];
sops.defaultSopsFile = ./secrets.yaml;
sops.secrets."node01/tls-key" = {
owner = "chainfire";
mode = "0400";
};
services.chainfire.settings.tls.key_path = config.sops.secrets."node01/tls-key".path;
}
10.4 Network Isolation
VLAN Segmentation:
- Management VLAN (10): BMC/IPMI, provisioning workstation
- Provisioning VLAN (100): PXE boot, temporary
- Production VLAN (200): Cluster services, inter-node communication
- Client VLAN (300): External clients accessing services
Firewall Zones:
# Example nftables rules
table inet filter {
chain input {
type filter hook input priority 0; policy drop;
# Management from trusted subnet only
iifname "eth0" ip saddr 10.0.10.0/24 tcp dport 22 accept
# Cluster traffic from cluster subnet only
iifname "eth1" ip saddr 10.0.200.0/24 tcp dport { 2379, 2380, 2479, 2480 } accept
# Client traffic from client subnet only (note: an IPv4 octet cannot exceed 255, so VLAN 300 cannot map to 10.0.300.0/24)
iifname "eth2" ip saddr 10.0.30.0/24 tcp dport { 8080, 9090 } accept
}
}
10.5 Audit Logging
Enable Structured Logging:
# In configuration.nix
services.chainfire.settings.logging = {
level = "info";
format = "json";
output = "journal";
};
# Enable journald forwarding to SIEM
services.journald.extraConfig = ''
ForwardToSyslog=yes
Storage=persistent
MaxRetentionSec=7days
'';
Audit Key Events:
- Cluster membership changes
- Node joins/leaves
- Authentication failures
- Configuration changes
- TLS certificate errors
Log Aggregation:
# Forward logs to central logging server
# Example: rsyslog configuration
cat > /etc/rsyslog.d/50-remote.conf <<EOF
*.* @@logging-server.example.com:514
EOF
systemctl restart rsyslog
Appendix A: Service Port Reference
See NETWORK.md for complete port matrix.
Appendix B: Hardware Vendor Commands
See HARDWARE.md for vendor-specific BIOS configurations and IPMI commands.
Appendix C: Complete Command Reference
See COMMANDS.md for all commands organized by task.
Appendix D: Quick Reference Cards
See QUICKSTART.md for condensed deployment guide.
Appendix E: Deployment Flow Diagrams
See diagrams/deployment-flow.md for visual workflow.
Appendix F: Related Documentation
- Design Document: /home/centra/cloud/docs/por/T032-baremetal-provisioning/design.md
- PXE Server: /home/centra/cloud/chainfire/baremetal/pxe-server/README.md
- Image Builder: /home/centra/cloud/baremetal/image-builder/README.md
- First-Boot Automation: /home/centra/cloud/baremetal/first-boot/README.md
End of Operator Runbook