# Bare-Metal Provisioning Operator Runbook

**Document Version:** 1.0
**Last Updated:** 2025-12-10
**Status:** Production Ready
**Author:** PlasmaCloud Infrastructure Team

## 1. Overview

### 1.1 What This Runbook Covers

This runbook provides comprehensive, step-by-step instructions for deploying PlasmaCloud infrastructure on bare-metal servers using automated PXE-based provisioning. By following this guide, operators will be able to:

- Deploy a complete PlasmaCloud cluster from bare hardware to running services
- Bootstrap a 3-node Raft cluster (Chainfire + FlareDB)
- Add additional nodes to an existing cluster
- Validate cluster health and troubleshoot common issues
- Perform operational tasks (updates, maintenance, recovery)

### 1.2 Prerequisites

**Required Access and Permissions:**
- Root/sudo access on provisioning server
- Physical or IPMI/BMC access to bare-metal servers
- Network access to provisioning VLAN
- SSH key pair for nixos-anywhere

**Required Tools:**
- NixOS with flakes enabled (provisioning workstation)
- curl, jq, ssh client
- ipmitool (optional, for remote management)
- Serial console access tool (optional)

**Required Knowledge:**
- Basic understanding of the PXE boot process
- Linux system administration
- Network configuration (DHCP, DNS, firewall)
- NixOS basics (declarative configuration, flakes)

### 1.3 Architecture Diagram

```
┌─────────────────────────────────────────────────────────────────────────┐
│                      Bare-Metal Provisioning Flow                       │
└─────────────────────────────────────────────────────────────────────────┘

Phase 1: PXE Boot                         Phase 2: Installation
┌──────────────┐                          ┌──────────────────┐
│  Bare-Metal  │   1. DHCP Request        │   DHCP Server    │
│  Server      ├─────────────────────────>│   (PXE Server)   │
│              │                          └──────────────────┘
│  (powered    │   2. TFTP Get                      │
│   on, PXE    │      bootloader                    │
│   enabled)   │<───────────────────────────────────┘
│              │
│  3. iPXE     │   4. HTTP Get            ┌──────────────────┐
│     loads    │      boot.ipxe           │   HTTP Server    │
│              ├─────────────────────────>│   (nginx)        │
│              │                          └──────────────────┘
│  5. iPXE     │   6. HTTP Get                      │
│     menu     │      kernel+initrd                 │
│              │<───────────────────────────────────┘
│              │
│  7. Boot     │
│     NixOS    │
│     Installer│
└──────┬───────┘
       │
       │  8. SSH Connection               ┌──────────────────┐
       └─────────────────────────────────>│  Provisioning    │
                                          │  Workstation     │
                                          │                  │
                                          │  9. Run          │
                                          │     nixos-       │
                                          │     anywhere     │
                                          └──────┬───────────┘
                                                 │
              ┌──────────────────────────────────┴──────────────────────┐
              │                                                         │
              v                                                         v
┌──────────────────────────┐                          ┌──────────────────────────┐
│  10. Partition disks     │                          │  11. Install NixOS       │
│      (disko)             │                          │      - Build system      │
│      - GPT/LVM/LUKS      │                          │      - Copy closures     │
│      - Format filesystems│                          │      - Install bootloader│
│      - Mount /mnt        │                          │      - Inject secrets    │
└──────────────────────────┘                          └──────────────────────────┘

Phase 3: First Boot                       Phase 4: Running Cluster
┌──────────────┐                          ┌──────────────────┐
│  Bare-Metal  │   12. Reboot             │  NixOS System    │
│  Server      │ ───────────────────────> │  (from disk)     │
└──────────────┘                          └──────────────────┘
                                                   │
                      ┌────────────────────────────┴───────────┐
                      │  13. First-boot automation             │
                      │      - Chainfire cluster join/bootstrap│
                      │      - FlareDB cluster join/bootstrap  │
                      │      - IAM initialization              │
                      │      - Health checks                   │
                      └────────────────────────────┬───────────┘
                                                   │
                                                   v
                                          ┌──────────────────┐
                                          │  Running Cluster │
                                          │  - All services  │
                                          │    healthy       │
                                          │  - Raft quorum   │
                                          │  - TLS enabled   │
                                          └──────────────────┘
```

## 2. Hardware Requirements

### 2.1 Minimum Specifications Per Node

**Control Plane Nodes (3-5 recommended):**
- CPU: 8 cores / 16 threads (Intel Xeon or AMD EPYC)
- RAM: 32 GB DDR4 ECC
- Storage: 500 GB SSD (NVMe preferred)
- Network: 2x 10 GbE (bonded/redundant)
- BMC: IPMI 2.0 or Redfish compatible

**Worker Nodes:**
- CPU: 16+ cores / 32+ threads
- RAM: 64 GB+ DDR4 ECC
- Storage: 1 TB+ NVMe SSD
- Network: 2x 10 GbE or 2x 25 GbE
- BMC: IPMI 2.0 or Redfish compatible

**All-in-One (Development/Testing):**
- CPU: 16 cores / 32 threads
- RAM: 64 GB DDR4
- Storage: 1 TB SSD
- Network: 1x 10 GbE (minimum)
- BMC: Optional but recommended

### 2.2 Recommended Production Specifications

**Control Plane Nodes:**
- CPU: 16-32 cores (Intel Xeon Gold/Platinum or AMD EPYC)
- RAM: 64-128 GB DDR4 ECC
- Storage: 1-2 TB NVMe SSD (RAID1 for redundancy)
- Network: 2x 25 GbE (active/active bonding)
- BMC: Redfish with SOL (Serial-over-LAN)

**Worker Nodes:**
- CPU: 32-64 cores
- RAM: 128-256 GB DDR4 ECC
- Storage: 2-4 TB NVMe SSD
- Network: 2x 25 GbE or 2x 100 GbE
- GPU: Optional (NVIDIA/AMD for ML workloads)

### 2.3 Hardware Compatibility Matrix

| Vendor     | Model          | Tested  | BIOS | UEFI  | Notes                           |
|------------|----------------|---------|------|-------|---------------------------------|
| Dell       | PowerEdge R640 | Yes     | Yes  | Yes   | Requires BIOS A19+              |
| Dell       | PowerEdge R650 | Yes     | Yes  | Yes   | Best PXE compatibility          |
| HPE        | ProLiant DL360 | Yes     | Yes  | Yes   | Disable Secure Boot             |
| HPE        | ProLiant DL380 | Yes     | Yes  | Yes   | Latest firmware recommended     |
| Supermicro | SYS-2029U      | Yes     | Yes  | Yes   | Requires BMC 1.73+              |
| Lenovo     | ThinkSystem    | Partial | Yes  | Yes   | Some NIC issues on older models |
| Generic    | Whitebox x86   | Partial | Yes  | Maybe | UEFI support varies             |

### 2.4 BIOS/UEFI Settings

**Required Settings:**
- Boot Mode: UEFI (preferred) or Legacy BIOS
- PXE/Network Boot: Enabled on primary NIC
- Boot Order: Network → Disk
- Secure Boot: Disabled (for PXE boot)
- Virtualization: Enabled (VT-x/AMD-V)
- SR-IOV: Enabled (if using advanced networking)

**Dell-Specific (iDRAC):**
```
System BIOS → Boot Settings:
  Boot Mode: UEFI
  UEFI Network Stack: Enabled
  PXE Device 1: Integrated NIC 1

System BIOS → System Profile:
  Profile: Performance
```

**HPE-Specific (iLO):**
```
System Configuration → BIOS/Platform:
  Boot Mode: UEFI Mode
  Network Boot: Enabled
  PXE Support: UEFI Only

System Configuration → UEFI Boot Order:
  1. Network Adapter (NIC 1)
  2. Hard Disk
```

**Supermicro-Specific (IPMI):**
```
BIOS Setup → Boot:
  Boot mode select: UEFI
  UEFI Network Stack: Enabled
  Boot Option #1: UEFI Network

BIOS Setup → Advanced → CPU Configuration:
  Intel Virtualization Technology: Enabled
```

### 2.5 BMC/IPMI Requirements

**Mandatory Features:**
- Remote power control (on/off/reset)
- Boot device selection (PXE/disk)
- Remote console access (KVM-over-IP or SOL)

**Recommended Features:**
- Virtual media mounting
- Sensor monitoring (temperature, fans, PSU)
- Event logging
- SMTP alerting

**Network Configuration:**
- Dedicated BMC network (separate VLAN recommended)
- Static IP or DHCP reservation
- HTTPS access enabled
- Default credentials changed

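The power-control commands used later in this runbook (section 5.4) can be reviewed against every BMC up front. A minimal dry-run sketch, assuming ipmitool and the same credential placeholders used elsewhere in this document; it only prints the commands, so nothing is powered on by accident:

```shell
#!/usr/bin/env bash
# Print (dry-run) the ipmitool commands that would PXE-boot a set of BMCs.
# Usage: bmc_pxe_commands <bmc-ip>...
bmc_pxe_commands() {
    local user="${BMC_USER:-admin}" pass="${BMC_PASS:-password}" ip
    for ip in "$@"; do
        echo "ipmitool -I lanplus -H $ip -U $user -P $pass chassis bootdev pxe options=persistent"
        echo "ipmitool -I lanplus -H $ip -U $user -P $pass chassis power on"
    done
}

# Review the output, then pipe it to `bash` to execute for real.
bmc_pxe_commands 10.0.10.50 10.0.10.51 10.0.10.52
```

The function name and the `BMC_USER`/`BMC_PASS` variables are illustrative; the ipmitool invocations themselves match step 4.1.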
## 3. Network Setup

### 3.1 Network Topology

**Single-Segment Topology (Simple):**
```
┌─────────────────────────────────────────────────────┐
│  Provisioning Server                PXE/DHCP/HTTP   │
│  10.0.100.10                                        │
└──────────────┬──────────────────────────────────────┘
               │
               │  Layer 2 Switch (unmanaged)
               │
    ┌──────────┴──────────┬──────────────┐
    │                     │              │
┌───┴────┐          ┌─────┴────┐    ┌────┴───┐
│ Node01 │          │  Node02  │    │ Node03 │
│10.0.100│          │ 10.0.100 │    │10.0.100│
│  .50   │          │   .51    │    │  .52   │
└────────┘          └──────────┘    └────────┘
```

**Multi-VLAN Topology (Production):**
```
┌──────────────────────────────────────────────────────┐
│  Management Network (VLAN 10)                        │
│  - Provisioning Server: 10.0.10.10                   │
│  - BMC/IPMI: 10.0.10.50-99                           │
└──────────────────┬───────────────────────────────────┘
                   │
┌──────────────────┴───────────────────────────────────┐
│  Provisioning Network (VLAN 100)                     │
│  - PXE Boot: 10.0.100.0/24                           │
│  - DHCP Range: 10.0.100.100-200                      │
└──────────────────┬───────────────────────────────────┘
                   │
┌──────────────────┴───────────────────────────────────┐
│  Production Network (VLAN 200)                       │
│  - Static IPs: 10.0.200.10-99                        │
│  - Service Traffic                                   │
└──────────────────┬───────────────────────────────────┘
                   │
          ┌────────┴────────┐
          │    L3 Switch    │
          │ (VLANs, Routing)│
          └────────┬────────┘
                   │
       ┌───────────┴──────────┬─────────┐
       │                      │         │
  ┌────┴────┐            ┌────┴────┐    │
  │ Node01  │            │ Node02  │    │ ...
  │ eth0:   │            │ eth0:   │
  │  VLAN100│            │  VLAN100│
  │ eth1:   │            │ eth1:   │
  │  VLAN200│            │  VLAN200│
  └─────────┘            └─────────┘
```

### 3.2 DHCP Server Configuration

**ISC DHCP Configuration (`/etc/dhcp/dhcpd.conf`):**

```dhcp
# Global options
option architecture-type code 93 = unsigned integer 16;
default-lease-time 600;
max-lease-time 7200;
authoritative;

# Provisioning subnet
subnet 10.0.100.0 netmask 255.255.255.0 {
  range 10.0.100.100 10.0.100.200;
  option routers 10.0.100.1;
  option domain-name-servers 10.0.100.1, 8.8.8.8;
  option domain-name "prov.example.com";

  # PXE boot server
  next-server 10.0.100.10;

  # Architecture-specific boot file selection
  if exists user-class and option user-class = "iPXE" {
    # iPXE already loaded, provide boot script via HTTP
    filename "http://10.0.100.10:8080/boot/ipxe/boot.ipxe";
  } elsif option architecture-type = 00:00 {
    # BIOS (legacy) - load iPXE via TFTP
    filename "undionly.kpxe";
  } elsif option architecture-type = 00:07 {
    # UEFI x86_64 - load iPXE via TFTP
    filename "ipxe.efi";
  } elsif option architecture-type = 00:09 {
    # UEFI x86_64 (alternate) - load iPXE via TFTP
    filename "ipxe.efi";
  } else {
    # Fallback to UEFI
    filename "ipxe.efi";
  }
}

# Static reservations for control plane nodes
host node01 {
  hardware ethernet 52:54:00:12:34:56;
  fixed-address 10.0.100.50;
  option host-name "node01";
}

host node02 {
  hardware ethernet 52:54:00:12:34:57;
  fixed-address 10.0.100.51;
  option host-name "node02";
}

host node03 {
  hardware ethernet 52:54:00:12:34:58;
  fixed-address 10.0.100.52;
  option host-name "node03";
}
```
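The three static reservations above are mechanical, and for larger inventories a small generator avoids copy-paste errors. A sketch assuming a whitespace-separated `MAC hostname IP` inventory format (the format and function name are assumptions, not something the DHCP server requires):

```shell
#!/usr/bin/env bash
# Emit ISC DHCP `host` stanzas from "MAC hostname IP" lines on stdin.
gen_dhcp_hosts() {
    while read -r mac host ip; do
        [ -z "$mac" ] && continue            # skip blank lines
        printf 'host %s {\n' "$host"
        printf '  hardware ethernet %s;\n' "$mac"
        printf '  fixed-address %s;\n' "$ip"
        printf '  option host-name "%s";\n' "$host"
        printf '}\n\n'
    done
}

# Example inventory; review the output before appending it to dhcpd.conf.
gen_dhcp_hosts <<'EOF'
52:54:00:12:34:56 node01 10.0.100.50
52:54:00:12:34:57 node02 10.0.100.51
52:54:00:12:34:58 node03 10.0.100.52
EOF
```

Validate the merged file with `dhcpd -t -cf` (shown below) before restarting the service.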

**Validation Commands:**
```bash
# Test DHCP configuration syntax
sudo dhcpd -t -cf /etc/dhcp/dhcpd.conf

# Start DHCP server
sudo systemctl start isc-dhcp-server
sudo systemctl enable isc-dhcp-server

# Monitor DHCP leases
sudo tail -f /var/lib/dhcp/dhcpd.leases

# Test DHCP response
sudo nmap --script broadcast-dhcp-discover -e eth0
```

### 3.3 DNS Requirements

**Forward DNS Zone (`example.com`):**
```zone
; Control plane nodes
node01.example.com.    IN A 10.0.200.10
node02.example.com.    IN A 10.0.200.11
node03.example.com.    IN A 10.0.200.12

; Worker nodes
worker01.example.com.  IN A 10.0.200.20
worker02.example.com.  IN A 10.0.200.21

; Service VIPs (optional, for load balancing)
chainfire.example.com. IN A 10.0.200.100
flaredb.example.com.   IN A 10.0.200.101
iam.example.com.       IN A 10.0.200.102
```

**Reverse DNS Zone (`200.0.10.in-addr.arpa`):**
```zone
; Control plane nodes
10.200.0.10.in-addr.arpa. IN PTR node01.example.com.
11.200.0.10.in-addr.arpa. IN PTR node02.example.com.
12.200.0.10.in-addr.arpa. IN PTR node03.example.com.
```
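The reverse-zone records above invert the IP octets by hand, which is easy to get wrong. A small helper that derives the PTR owner name from a dotted-quad address (a sketch; IPv4 only, and the function name is illustrative):

```shell
#!/usr/bin/env bash
# Print the in-addr.arpa PTR owner name for an IPv4 address.
ptr_name() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo "$d.$c.$b.$a.in-addr.arpa."
}

ptr_name 10.0.200.10   # -> 10.200.0.10.in-addr.arpa.
```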

**Validation:**
```bash
# Test forward resolution
dig +short node01.example.com

# Test reverse resolution
dig +short -x 10.0.200.10

# Test from target node after provisioning
ssh root@10.0.100.50 'hostname -f'
```

### 3.4 Firewall Rules

**Service Port Matrix (see NETWORK.md for complete reference):**

| Service   | API Port | Raft Port | Additional    | Protocol |
|-----------|----------|-----------|---------------|----------|
| Chainfire | 2379     | 2380      | 2381 (gossip) | TCP      |
| FlareDB   | 2479     | 2480      | -             | TCP      |
| IAM       | 8080     | -         | -             | TCP      |
| PlasmaVMC | 9090     | -         | -             | TCP      |
| PrismNET  | 9091     | -         | -             | TCP      |
| FlashDNS  | 53       | -         | -             | TCP/UDP  |
| FiberLB   | 9092     | -         | -             | TCP      |
| K8sHost   | 10250    | -         | -             | TCP      |

**iptables Rules (Provisioning Server):**
```bash
#!/bin/bash
# Provisioning server firewall rules

# Allow DHCP
iptables -A INPUT -p udp --dport 67 -j ACCEPT
iptables -A INPUT -p udp --dport 68 -j ACCEPT

# Allow TFTP
iptables -A INPUT -p udp --dport 69 -j ACCEPT

# Allow HTTP (boot server)
iptables -A INPUT -p tcp --dport 80 -j ACCEPT
iptables -A INPUT -p tcp --dport 8080 -j ACCEPT

# Allow SSH (for nixos-anywhere)
iptables -A INPUT -p tcp --dport 22 -j ACCEPT
```

**iptables Rules (Cluster Nodes):**
```bash
#!/bin/bash
# Cluster node firewall rules

# Allow established connections and loopback (required with a final DROP,
# otherwise return traffic for outbound connections is blocked)
iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -i lo -j ACCEPT

# Allow SSH (management)
iptables -A INPUT -p tcp --dport 22 -s 10.0.0.0/8 -j ACCEPT

# Allow Chainfire (from cluster subnet only)
iptables -A INPUT -p tcp --dport 2379 -s 10.0.200.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 2380 -s 10.0.200.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 2381 -s 10.0.200.0/24 -j ACCEPT

# Allow FlareDB
iptables -A INPUT -p tcp --dport 2479 -s 10.0.200.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 2480 -s 10.0.200.0/24 -j ACCEPT

# Allow IAM (from cluster and client subnets)
iptables -A INPUT -p tcp --dport 8080 -s 10.0.0.0/8 -j ACCEPT

# Drop all other traffic
iptables -A INPUT -j DROP
```

**nftables Rules (Modern Alternative):**
```nft
#!/usr/sbin/nft -f

flush ruleset

table inet filter {
  chain input {
    type filter hook input priority 0; policy drop;

    # Allow established connections
    ct state established,related accept

    # Allow loopback
    iif lo accept

    # Allow SSH
    tcp dport 22 ip saddr 10.0.0.0/8 accept

    # Allow cluster services from cluster subnet
    tcp dport { 2379, 2380, 2381, 2479, 2480 } ip saddr 10.0.200.0/24 accept

    # Allow IAM from internal network
    tcp dport 8080 ip saddr 10.0.0.0/8 accept
  }
}
```

### 3.5 Static IP Allocation Strategy

**IP Allocation Plan:**
```
10.0.100.0/24 - Provisioning network (DHCP during install)
  .1          - Gateway
  .10         - PXE/DHCP/HTTP server
  .50-.79     - Control plane nodes (static reservations)
  .80-.99     - Worker nodes (static reservations)
  .100-.200   - DHCP pool (temporary during provisioning)

10.0.200.0/24 - Production network (static IPs)
  .1          - Gateway
  .10-.19     - Control plane nodes
  .20-.99     - Worker nodes
  .100-.199   - Service VIPs
```
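The plan above can be encoded once and queried, so per-node configs and DHCP reservations stay consistent. A sketch that maps a node role and 1-based index to its provisioning and production addresses under the exact ranges listed above (the function name is illustrative):

```shell
#!/usr/bin/env bash
# Map (role, 1-based index) to "provisioning-IP production-IP" per the plan above.
node_ips() {
    local role="$1" idx="$2"
    case "$role" in
        control-plane) echo "10.0.100.$((49 + idx)) 10.0.200.$((9 + idx))" ;;
        worker)        echo "10.0.100.$((79 + idx)) 10.0.200.$((19 + idx))" ;;
        *)             echo "unknown role: $role" >&2; return 1 ;;
    esac
}

node_ips control-plane 1   # -> 10.0.100.50 10.0.200.10
node_ips worker 1          # -> 10.0.100.80 10.0.200.20
```

A range check (index ≤ 30 for control plane, ≤ 20 for workers on the provisioning side) would be a natural extension.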
### 3.6 Network Bandwidth Requirements

**Per-Node During Provisioning:**
- PXE boot: ~200-500 MB (kernel + initrd)
- nixos-anywhere: ~1-5 GB (NixOS closures)
- Time: 5-15 minutes on a 1 Gbps link

**Production Cluster:**
- Control plane: 1 Gbps minimum, 10 Gbps recommended
- Workers: 10 Gbps minimum, 25 Gbps recommended
- Inter-node latency: <1 ms ideal, <5 ms acceptable

## 4. Pre-Deployment Checklist

Complete this checklist before beginning deployment:

### 4.1 Hardware Checklist

- [ ] All servers racked and powered
- [ ] All network cables connected (data + BMC)
- [ ] All power supplies connected (redundant if available)
- [ ] BMC/IPMI network configured
- [ ] BMC credentials documented
- [ ] BIOS/UEFI settings configured per section 2.4
- [ ] PXE boot enabled and first in boot order
- [ ] Secure Boot disabled (if using UEFI)
- [ ] Hardware inventory recorded (MAC addresses, serial numbers)

### 4.2 Network Checklist

- [ ] Network switches configured (VLANs, trunking)
- [ ] DHCP server configured and tested
- [ ] DNS forward/reverse zones created
- [ ] Firewall rules configured
- [ ] Network connectivity verified (ping tests)
- [ ] Bandwidth validated (iperf between nodes)
- [ ] DHCP relay configured (if multi-subnet)
- [ ] NTP server configured for time sync

### 4.3 PXE Server Checklist

- [ ] PXE server deployed (see T032.S2)
- [ ] DHCP service running and healthy
- [ ] TFTP service running and healthy
- [ ] HTTP service running and healthy
- [ ] iPXE bootloaders downloaded (undionly.kpxe, ipxe.efi)
- [ ] NixOS netboot images built and uploaded (see T032.S3)
- [ ] Boot script configured (boot.ipxe)
- [ ] Health endpoints responding

**Validation:**
```bash
# On PXE server
sudo systemctl status isc-dhcp-server
sudo systemctl status atftpd
sudo systemctl status nginx

# Test HTTP access
curl http://10.0.100.10:8080/boot/ipxe/boot.ipxe
curl http://10.0.100.10:8080/health

# Test TFTP access
tftp 10.0.100.10 -c get undionly.kpxe /tmp/test.kpxe
```

### 4.4 Node Configuration Checklist

- [ ] Per-node NixOS configurations created (`/srv/provisioning/nodes/`)
- [ ] Hardware configurations generated or templated
- [ ] Disko disk layouts defined
- [ ] Network settings configured (static IPs, VLANs)
- [ ] Service selections defined (control-plane vs worker)
- [ ] Cluster configuration JSON files created
- [ ] Node inventory documented (MAC → hostname → role)

### 4.5 TLS Certificates Checklist

- [ ] CA certificate generated
- [ ] Per-node certificates generated
- [ ] Certificate files copied to secrets directories
- [ ] Certificate permissions set (0400 for private keys)
- [ ] Certificate expiry dates documented
- [ ] Rotation procedure documented

**Generate Certificates:**
```bash
# Generate CA (if not already done)
openssl genrsa -out ca-key.pem 4096
openssl req -x509 -new -nodes -key ca-key.pem -days 3650 \
  -out ca-cert.pem -subj "/CN=PlasmaCloud CA"

# Generate per-node certificates
for node in node01 node02 node03; do
  openssl genrsa -out ${node}-key.pem 4096
  openssl req -new -key ${node}-key.pem -out ${node}-csr.pem \
    -subj "/CN=${node}.example.com"
  openssl x509 -req -in ${node}-csr.pem -CA ca-cert.pem -CAkey ca-key.pem \
    -CAcreateserial -out ${node}-cert.pem -days 365
done
```

### 4.6 Provisioning Workstation Checklist

- [ ] NixOS or the Nix package manager installed
- [ ] Nix flakes enabled
- [ ] SSH key pair generated for provisioning
- [ ] SSH public key added to netboot images
- [ ] Network access to provisioning VLAN
- [ ] Git repository cloned (if using version control)
- [ ] nixos-anywhere installed: `nix profile install github:nix-community/nixos-anywhere`

## 5. Deployment Workflow

### 5.1 Phase 1: PXE Server Setup

**Reference:** See `/home/centra/cloud/chainfire/baremetal/pxe-server/` (T032.S2)

**Step 1.1: Deploy PXE Server Using NixOS Module**

Create the PXE server configuration:
```nix
# /etc/nixos/pxe-server.nix
{ config, pkgs, lib, ... }:

{
  imports = [
    /path/to/chainfire/baremetal/pxe-server/nixos-module.nix
  ];

  services.centra-pxe-server = {
    enable = true;
    interface = "eth0";
    serverAddress = "10.0.100.10";

    dhcp = {
      subnet = "10.0.100.0";
      netmask = "255.255.255.0";
      broadcast = "10.0.100.255";
      range = {
        start = "10.0.100.100";
        end = "10.0.100.200";
      };
      router = "10.0.100.1";
      domainNameServers = [ "10.0.100.1" "8.8.8.8" ];
    };

    nodes = {
      "52:54:00:12:34:56" = {
        profile = "control-plane";
        hostname = "node01";
        ipAddress = "10.0.100.50";
      };
      "52:54:00:12:34:57" = {
        profile = "control-plane";
        hostname = "node02";
        ipAddress = "10.0.100.51";
      };
      "52:54:00:12:34:58" = {
        profile = "control-plane";
        hostname = "node03";
        ipAddress = "10.0.100.52";
      };
    };
  };
}
```

Apply the configuration:
```bash
sudo nixos-rebuild switch -I nixos-config=/etc/nixos/pxe-server.nix
```

**Step 1.2: Verify PXE Services**

```bash
# Check all services are running
sudo systemctl status dhcpd4.service
sudo systemctl status atftpd.service
sudo systemctl status nginx.service

# Test DHCP server
sudo journalctl -u dhcpd4 -f &
# Power on a test server and watch for DHCP requests

# Test TFTP server
tftp localhost -c get undionly.kpxe /tmp/test.kpxe
ls -lh /tmp/test.kpxe  # Should show ~100KB file

# Test HTTP server
curl http://localhost:8080/health
# Expected: {"status":"healthy","services":{"dhcp":"running","tftp":"running","http":"running"}}

curl http://localhost:8080/boot/ipxe/boot.ipxe
# Expected: iPXE boot script content
```

### 5.2 Phase 2: Build Netboot Images

**Reference:** See `/home/centra/cloud/baremetal/image-builder/` (T032.S3)

**Step 2.1: Build Images for All Profiles**

```bash
cd /home/centra/cloud/baremetal/image-builder

# Build all profiles
./build-images.sh

# Or build a specific profile
./build-images.sh --profile control-plane
./build-images.sh --profile worker
./build-images.sh --profile all-in-one
```

**Expected Output:**
```
Building netboot image for control-plane...
Building initrd...
[... Nix build output ...]
✓ Build complete: artifacts/control-plane/initrd (234 MB)
✓ Build complete: artifacts/control-plane/bzImage (12 MB)
```

**Step 2.2: Copy Images to PXE Server**

```bash
# Automatic (if PXE server directory exists)
./build-images.sh --deploy

# Manual copy
sudo cp artifacts/control-plane/* /var/lib/pxe-boot/nixos/control-plane/
sudo cp artifacts/worker/* /var/lib/pxe-boot/nixos/worker/
sudo cp artifacts/all-in-one/* /var/lib/pxe-boot/nixos/all-in-one/
```

**Step 2.3: Verify Image Integrity**

```bash
# Check file sizes (should be reasonable)
ls -lh /var/lib/pxe-boot/nixos/*/

# Verify images are accessible via HTTP
curl -I http://10.0.100.10:8080/boot/nixos/control-plane/bzImage
# Expected: HTTP/1.1 200 OK, Content-Length: ~12000000

curl -I http://10.0.100.10:8080/boot/nixos/control-plane/initrd
# Expected: HTTP/1.1 200 OK, Content-Length: ~234000000
```

### 5.3 Phase 3: Prepare Node Configurations

**Step 3.1: Generate Node-Specific NixOS Configs**

Create the directory structure:
```bash
mkdir -p /srv/provisioning/nodes/{node01,node02,node03}.example.com/secrets
```

**Node Configuration Template (`nodes/node01.example.com/configuration.nix`):**
```nix
{ config, pkgs, lib, ... }:

{
  imports = [
    ../../profiles/control-plane.nix
    ../../common/base.nix
    ./hardware.nix
    ./disko.nix
  ];

  # Hostname and domain
  networking = {
    hostName = "node01";
    domain = "example.com";
    usePredictableInterfaceNames = false; # Use eth0, eth1

    # Provisioning interface (temporary)
    interfaces.eth0 = {
      useDHCP = false;
      ipv4.addresses = [{
        address = "10.0.100.50";
        prefixLength = 24;
      }];
    };

    # Production interface
    interfaces.eth1 = {
      useDHCP = false;
      ipv4.addresses = [{
        address = "10.0.200.10";
        prefixLength = 24;
      }];
    };

    defaultGateway = "10.0.200.1";
    nameservers = [ "10.0.200.1" "8.8.8.8" ];
  };

  # Enable PlasmaCloud services
  services.chainfire = {
    enable = true;
    port = 2379;
    raftPort = 2380;
    gossipPort = 2381;
    settings = {
      node_id = "node01";
      cluster_name = "prod-cluster";
      tls = {
        cert_path = "/etc/nixos/secrets/node01-cert.pem";
        key_path = "/etc/nixos/secrets/node01-key.pem";
        ca_path = "/etc/nixos/secrets/ca-cert.pem";
      };
    };
  };

  services.flaredb = {
    enable = true;
    port = 2479;
    raftPort = 2480;
    settings = {
      node_id = "node01";
      cluster_name = "prod-cluster";
      chainfire_endpoint = "https://localhost:2379";
      tls = {
        cert_path = "/etc/nixos/secrets/node01-cert.pem";
        key_path = "/etc/nixos/secrets/node01-key.pem";
        ca_path = "/etc/nixos/secrets/ca-cert.pem";
      };
    };
  };

  services.iam = {
    enable = true;
    port = 8080;
    settings = {
      flaredb_endpoint = "https://localhost:2479";
      tls = {
        cert_path = "/etc/nixos/secrets/node01-cert.pem";
        key_path = "/etc/nixos/secrets/node01-key.pem";
        ca_path = "/etc/nixos/secrets/ca-cert.pem";
      };
    };
  };

  # Enable first-boot automation
  services.first-boot-automation = {
    enable = true;
    configFile = "/etc/nixos/secrets/cluster-config.json";
  };

  system.stateVersion = "24.11";
}
```

**Step 3.2: Create cluster-config.json for Each Node**

**Bootstrap Node (node01):**
```json
{
  "node_id": "node01",
  "node_role": "control-plane",
  "bootstrap": true,
  "cluster_name": "prod-cluster",
  "leader_url": "https://node01.example.com:2379",
  "raft_addr": "10.0.200.10:2380",
  "initial_peers": [
    "node01.example.com:2380",
    "node02.example.com:2380",
    "node03.example.com:2380"
  ],
  "flaredb_peers": [
    "node01.example.com:2480",
    "node02.example.com:2480",
    "node03.example.com:2480"
  ]
}
```

Copy to secrets:
```bash
cp cluster-config-node01.json /srv/provisioning/nodes/node01.example.com/secrets/cluster-config.json
cp cluster-config-node02.json /srv/provisioning/nodes/node02.example.com/secrets/cluster-config.json
cp cluster-config-node03.json /srv/provisioning/nodes/node03.example.com/secrets/cluster-config.json
```

**Step 3.3: Generate Disko Disk Layouts**

**Simple Single-Disk Layout (`nodes/node01.example.com/disko.nix`):**
```nix
{ disks ? [ "/dev/sda" ], ... }:
{
  disko.devices = {
    disk = {
      main = {
        type = "disk";
        device = builtins.head disks;
        content = {
          type = "gpt";
          partitions = {
            ESP = {
              size = "1G";
              type = "EF00";
              content = {
                type = "filesystem";
                format = "vfat";
                mountpoint = "/boot";
              };
            };
            root = {
              size = "100%";
              content = {
                type = "filesystem";
                format = "ext4";
                mountpoint = "/";
              };
            };
          };
        };
      };
    };
  };
}
```

**Step 3.4: Pre-Generate TLS Certificates**

```bash
# Copy per-node certificates
cp ca-cert.pem /srv/provisioning/nodes/node01.example.com/secrets/
cp node01-cert.pem /srv/provisioning/nodes/node01.example.com/secrets/
cp node01-key.pem /srv/provisioning/nodes/node01.example.com/secrets/

# Set permissions
chmod 644 /srv/provisioning/nodes/node01.example.com/secrets/*-cert.pem
chmod 644 /srv/provisioning/nodes/node01.example.com/secrets/ca-cert.pem
chmod 600 /srv/provisioning/nodes/node01.example.com/secrets/*-key.pem
```

### 5.4 Phase 4: Bootstrap First 3 Nodes

**Step 4.1: Power On Nodes via BMC**

```bash
# Using ipmitool (example for Dell/HP/Supermicro)
for ip in 10.0.10.50 10.0.10.51 10.0.10.52; do
  ipmitool -I lanplus -H $ip -U admin -P password chassis bootdev pxe options=persistent
  ipmitool -I lanplus -H $ip -U admin -P password chassis power on
done
```

**Step 4.2: Verify PXE Boot Success**

Watch the serial console (if available):
```bash
# Connect via IPMI SOL
ipmitool -I lanplus -H 10.0.10.50 -U admin -P password sol activate

# Expected output:
# ... DHCP discovery ...
# ... TFTP download undionly.kpxe or ipxe.efi ...
# ... iPXE menu appears ...
# ... Kernel and initrd download ...
# ... NixOS installer boots ...
# ... SSH server starts ...
```

Verify the installer is ready:
```bash
# Wait for nodes to appear in DHCP leases
sudo tail -f /var/lib/dhcp/dhcpd.leases

# Test SSH connectivity
ssh root@10.0.100.50 'uname -a'
# Expected: Linux node01 ... nixos
```

**Step 4.3: Run nixos-anywhere Simultaneously on All 3 Nodes**

Create the provisioning script:
```bash
#!/bin/bash
# /srv/provisioning/scripts/provision-bootstrap-nodes.sh

set -euo pipefail

NODES=("node01" "node02" "node03")
PROVISION_IPS=("10.0.100.50" "10.0.100.51" "10.0.100.52")
FLAKE_ROOT="/srv/provisioning"

for i in "${!NODES[@]}"; do
  node="${NODES[$i]}"
  ip="${PROVISION_IPS[$i]}"

  echo "Provisioning $node at $ip..."

  nix run github:nix-community/nixos-anywhere -- \
    --flake "$FLAKE_ROOT#$node" \
    --build-on-remote \
    root@$ip &
done

wait
echo "All nodes provisioned successfully!"
```

Run provisioning:
```bash
chmod +x /srv/provisioning/scripts/provision-bootstrap-nodes.sh
/srv/provisioning/scripts/provision-bootstrap-nodes.sh
```

**Expected output per node:**
```
Provisioning node01 at 10.0.100.50...
Connecting via SSH...
Running disko to partition disks...
Building NixOS system...
Installing bootloader...
Copying secrets...
Installation complete. Rebooting...
```

**Step 4.4: Wait for First-Boot Automation**

After reboot, nodes will boot from disk and run first-boot automation. Monitor progress:

```bash
# Watch logs on node01 (via SSH after it reboots)
ssh root@10.0.200.10  # Note: now on the production network

# Check cluster join services
journalctl -u chainfire-cluster-join.service -f
journalctl -u flaredb-cluster-join.service -f

# Expected log output:
# {"level":"INFO","message":"Waiting for local chainfire service..."}
# {"level":"INFO","message":"Local chainfire healthy"}
# {"level":"INFO","message":"Bootstrap node, cluster initialized"}
# {"level":"INFO","message":"Cluster join complete"}
```

**Step 4.5: Verify Cluster Health**

```bash
# Check Chainfire cluster
curl -k https://node01.example.com:2379/admin/cluster/members | jq

# Expected output:
# {
#   "members": [
#     {"id":"node01","raft_addr":"10.0.200.10:2380","status":"healthy","role":"leader"},
#     {"id":"node02","raft_addr":"10.0.200.11:2380","status":"healthy","role":"follower"},
#     {"id":"node03","raft_addr":"10.0.200.12:2380","status":"healthy","role":"follower"}
#   ]
# }

# Check FlareDB cluster
curl -k https://node01.example.com:2479/admin/cluster/members | jq

# Check IAM service
curl -k https://node01.example.com:8080/health | jq
# Expected: {"status":"healthy","database":"connected"}
```

### 5.5 Phase 5: Add Additional Nodes

**Step 5.1: Prepare Join-Mode Configurations**

Create the configuration for node04 (worker profile):
```json
{
  "node_id": "node04",
  "node_role": "worker",
  "bootstrap": false,
  "cluster_name": "prod-cluster",
  "leader_url": "https://node01.example.com:2379",
  "raft_addr": "10.0.200.20:2380"
}
```
|
|
|
|
**Step 5.2: Power On and Provision Nodes**
|
|
|
|
```bash
|
|
# Power on node via BMC
|
|
ipmitool -I lanplus -H 10.0.10.54 -U admin -P password chassis bootdev pxe
|
|
ipmitool -I lanplus -H 10.0.10.54 -U admin -P password chassis power on
|
|
|
|
# Wait for PXE boot and SSH ready
|
|
sleep 60
|
|
|
|
# Provision node
|
|
nix run github:nix-community/nixos-anywhere -- \
|
|
--flake /srv/provisioning#node04 \
|
|
--build-on-remote \
|
|
root@10.0.100.60
|
|
```
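The fixed `sleep 60` is a guess at how long PXE boot takes; polling until the installer's SSH port answers is more robust. A sketch using bash's built-in `/dev/tcp` pseudo-device, so it needs no `nc`; the helper name and the 300-second timeout are illustrative:

```shell
#!/usr/bin/env bash
# wait_for_port: retry a TCP connect until it succeeds or a deadline passes.
# Usage: wait_for_port <host> <port> <timeout_seconds>
wait_for_port() {
  local host="$1" port="$2" timeout="$3"
  local deadline=$((SECONDS + timeout))
  while ((SECONDS < deadline)); do
    # The subshell opens (and implicitly closes) the probe socket; a failed
    # connect makes the redirection fail, so the subshell exits non-zero.
    if (exec > "/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  echo "Timed out waiting for $host:$port" >&2
  return 1
}

# Illustrative replacement for the blind sleep:
# wait_for_port 10.0.100.60 22 300
```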
|
|
|
|
**Step 5.3: Verify Cluster Join via API**
|
|
|
|
```bash
|
|
# Check cluster members (should include node04)
|
|
curl -k https://node01.example.com:2379/admin/cluster/members | jq '.members[] | select(.id=="node04")'
|
|
|
|
# Expected:
|
|
# {"id":"node04","raft_addr":"10.0.200.20:2380","status":"healthy","role":"follower"}
|
|
```
|
|
|
|
**Step 5.4: Validate Replication and Service Distribution**
|
|
|
|
```bash
|
|
# Write test data on leader
|
|
curl -k -X PUT https://node01.example.com:2379/v1/kv/test \
|
|
-H "Content-Type: application/json" \
|
|
-d '{"value":"hello world"}'
|
|
|
|
# Read from follower (should be replicated)
|
|
curl -k https://node02.example.com:2379/v1/kv/test | jq
|
|
|
|
# Expected: {"key":"test","value":"hello world"}
|
|
```
|
|
|
|
## 6. Verification & Validation

### 6.1 Health Check Commands for All Services

**Chainfire:**
```bash
curl -k https://node01.example.com:2379/health | jq
# Expected: {"status":"healthy","raft":"leader","cluster_size":3}

# Check cluster membership
curl -k https://node01.example.com:2379/admin/cluster/members | jq '.members | length'
# Expected: 3 (for initial bootstrap)
```

**FlareDB:**
```bash
curl -k https://node01.example.com:2479/health | jq
# Expected: {"status":"healthy","raft":"leader","chainfire":"connected"}

# Query test metric
curl -k https://node01.example.com:2479/v1/query \
  -H "Content-Type: application/json" \
  -d '{"query":"up{job=\"node\"}","time":"now"}'
```

**IAM:**
```bash
curl -k https://node01.example.com:8080/health | jq
# Expected: {"status":"healthy","database":"connected","version":"1.0.0"}

# List users (requires authentication)
curl -k https://node01.example.com:8080/api/users \
  -H "Authorization: Bearer $IAM_TOKEN" | jq
```

**PlasmaVMC:**
```bash
curl -k https://node01.example.com:9090/health | jq
# Expected: {"status":"healthy","vms_running":0}

# List VMs
curl -k https://node01.example.com:9090/api/vms | jq
```

**PrismNET:**
```bash
curl -k https://node01.example.com:9091/health | jq
# Expected: {"status":"healthy","networks":0}
```

**FlashDNS:**
```bash
dig @node01.example.com example.com
# Expected: DNS response with ANSWER section

# Health check
curl -k https://node01.example.com:853/health | jq
```

**FiberLB:**
```bash
curl -k https://node01.example.com:9092/health | jq
# Expected: {"status":"healthy","backends":0}
```

**K8sHost:**
```bash
kubectl --kubeconfig=/etc/kubernetes/admin.conf get nodes
# Expected: Node list including this node
```

### 6.2 Cluster Membership Verification

```bash
#!/bin/bash
# /srv/provisioning/scripts/verify-cluster.sh

echo "Checking Chainfire cluster..."
curl -k https://node01.example.com:2379/admin/cluster/members | jq '.members[] | {id, status, role}'

echo ""
echo "Checking FlareDB cluster..."
curl -k https://node01.example.com:2479/admin/cluster/members | jq '.members[] | {id, status, role}'

echo ""
echo "Cluster health summary:"
echo "  Chainfire nodes: $(curl -sk https://node01.example.com:2379/admin/cluster/members | jq '.members | length')"
echo "  FlareDB nodes: $(curl -sk https://node01.example.com:2479/admin/cluster/members | jq '.members | length')"
echo "  Raft leaders: Chainfire=$(curl -sk https://node01.example.com:2379/admin/cluster/members | jq -r '.members[] | select(.role=="leader") | .id'), FlareDB=$(curl -sk https://node01.example.com:2479/admin/cluster/members | jq -r '.members[] | select(.role=="leader") | .id')"
```

### 6.3 Raft Leader Election Check

```bash
# Identify current leader
LEADER=$(curl -sk https://node01.example.com:2379/admin/cluster/members | jq -r '.members[] | select(.role=="leader") | .id')
echo "Current Chainfire leader: $LEADER"

# Verify all followers can reach leader
for node in node01 node02 node03; do
  echo "Checking $node..."
  curl -sk https://$node.example.com:2379/admin/cluster/leader | jq
done
```

### 6.4 TLS Certificate Validation

```bash
# Check certificate expiry
for node in node01 node02 node03; do
  echo "Checking $node certificate..."
  echo | openssl s_client -connect $node.example.com:2379 2>/dev/null | openssl x509 -noout -dates
done

# Verify certificate chain
echo | openssl s_client -connect node01.example.com:2379 -CAfile /srv/provisioning/ca-cert.pem -verify 1
# Expected: Verify return code: 0 (ok)
```

### 6.5 Network Connectivity Tests

```bash
# Test inter-node connectivity (from node01)
ssh root@node01.example.com '
  for node in node02 node03; do
    echo "Testing connectivity to $node..."
    nc -zv $node.example.com 2379
    nc -zv $node.example.com 2380
  done
'

# Test bandwidth (iperf3)
ssh root@node02.example.com 'iperf3 -s' &
ssh root@node01.example.com 'iperf3 -c node02.example.com -t 10'
# Expected: ~10 Gbps on 10GbE, ~1 Gbps on 1GbE
```

### 6.6 Performance Smoke Tests

**Chainfire Write Performance:**
```bash
# 1000 writes
time for i in {1..1000}; do
  curl -sk -X PUT https://node01.example.com:2379/v1/kv/test$i \
    -H "Content-Type: application/json" \
    -d "{\"value\":\"test data $i\"}" > /dev/null
done

# Expected: <10 seconds on healthy cluster
```

**FlareDB Query Performance:**
```bash
# Insert test metrics
curl -k -X POST https://node01.example.com:2479/v1/write \
  -H "Content-Type: application/json" \
  -d '{"metric":"test_metric","value":42,"timestamp":"'$(date -Iseconds)'"}'

# Query performance
time curl -k https://node01.example.com:2479/v1/query \
  -H "Content-Type: application/json" \
  -d '{"query":"test_metric","start":"1h","end":"now"}'

# Expected: <100ms response time
```

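A quick way to confirm the election has converged is to check that every node reports the same leader id. A small sketch; the `same_leader` helper is illustrative, and it assumes the `/admin/cluster/leader` response carries an `id` field:

```shell
#!/usr/bin/env bash
# same_leader: succeed only when every reported leader id is identical.
same_leader() {
  local unique
  # One id per line, deduplicated; convergence means exactly one unique value.
  unique=$(printf '%s\n' "$@" | sort -u | wc -l)
  [ "$unique" -eq 1 ]
}

# Illustrative usage: gather each node's view of the leader, then compare.
# views=()
# for node in node01 node02 node03; do
#   views+=("$(curl -sk https://$node.example.com:2379/admin/cluster/leader | jq -r '.id')")
# done
# same_leader "${views[@]}" && echo "Election converged" || echo "Split view!"
```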
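To turn the `-dates` output into something alertable, the `notAfter` timestamp can be converted to days remaining. A sketch using GNU `date`; the helper name and the 30-day threshold are illustrative choices, not project conventions:

```shell
#!/usr/bin/env bash
# days_until_expiry: whole days from now until a certificate's notAfter date.
# Expects the date string openssl prints, e.g. "Dec 10 12:00:00 2026 GMT".
days_until_expiry() {
  local not_after="$1"
  local expiry_epoch now_epoch
  expiry_epoch=$(date -d "$not_after" +%s)  # GNU date parses openssl's format
  now_epoch=$(date +%s)
  echo $(( (expiry_epoch - now_epoch) / 86400 ))
}

# Illustrative check against a node (30-day warning threshold is arbitrary):
# not_after=$(echo | openssl s_client -connect node01.example.com:2379 2>/dev/null \
#   | openssl x509 -noout -enddate | cut -d= -f2)
# if [ "$(days_until_expiry "$not_after")" -lt 30 ]; then
#   echo "Certificate on node01 expires within 30 days" >&2
# fi
```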
## 7. Common Operations

### 7.1 Adding a New Node

**Step 1: Prepare Node Configuration**
```bash
# Create node directory
mkdir -p /srv/provisioning/nodes/node05.example.com/secrets

# Copy template configuration
cp /srv/provisioning/nodes/node01.example.com/configuration.nix \
  /srv/provisioning/nodes/node05.example.com/

# Edit for new node
vim /srv/provisioning/nodes/node05.example.com/configuration.nix
# Update: hostName, ipAddresses, node_id
```

**Step 2: Generate Cluster Config (Join Mode)**
```json
{
  "node_id": "node05",
  "node_role": "worker",
  "bootstrap": false,
  "cluster_name": "prod-cluster",
  "leader_url": "https://node01.example.com:2379",
  "raft_addr": "10.0.200.21:2380"
}
```

**Step 3: Provision Node**
```bash
# Power on and PXE boot
ipmitool -I lanplus -H 10.0.10.55 -U admin -P password chassis bootdev pxe
ipmitool -I lanplus -H 10.0.10.55 -U admin -P password chassis power on

# Wait for SSH
sleep 60

# Run nixos-anywhere
nix run github:nix-community/nixos-anywhere -- \
  --flake /srv/provisioning#node05 \
  root@10.0.100.65
```

**Step 4: Verify Join**
```bash
# Check cluster membership
curl -k https://node01.example.com:2379/admin/cluster/members | jq '.members[] | select(.id=="node05")'
```

### 7.2 Replacing a Failed Node

**Step 1: Remove Failed Node from Cluster**
```bash
# Remove from Chainfire cluster
curl -k -X DELETE https://node01.example.com:2379/admin/member/node02

# Remove from FlareDB cluster
curl -k -X DELETE https://node01.example.com:2479/admin/member/node02
```

**Step 2: Physically Replace Hardware**
- Power off old node
- Remove from rack
- Install new node
- Connect all cables
- Configure BMC

**Step 3: Provision Replacement Node**
```bash
# Use the same node ID and configuration
nix run github:nix-community/nixos-anywhere -- \
  --flake /srv/provisioning#node02 \
  root@10.0.100.51
```

**Step 4: Verify Rejoin**
```bash
# The cluster should automatically add the node during first boot
curl -k https://node01.example.com:2379/admin/cluster/members | jq
```

### 7.3 Updating Node Configuration

**Step 1: Edit Configuration**
```bash
vim /srv/provisioning/nodes/node01.example.com/configuration.nix
# Make changes (e.g., add a service, change network config)
```

**Step 2: Build and Deploy**
```bash
# Build configuration locally
nix build /srv/provisioning#node01

# Deploy (run on the node itself, or remotely with --target-host)
nixos-rebuild switch --flake /srv/provisioning#node01
# Remote form:
# nixos-rebuild switch --flake /srv/provisioning#node01 --target-host root@node01.example.com
```

**Step 3: Verify Changes**
```bash
# Check active configuration
ssh root@node01.example.com 'nixos-rebuild list-generations'

# Test services still healthy
curl -k https://node01.example.com:2379/health | jq
```

### 7.4 Rolling Updates

**Update Process (One Node at a Time):**

```bash
#!/bin/bash
# /srv/provisioning/scripts/rolling-update.sh

NODES=("node01" "node02" "node03")

for node in "${NODES[@]}"; do
  echo "Updating $node..."

  # Build new configuration
  nix build /srv/provisioning#$node

  # Deploy (test mode first)
  ssh root@$node.example.com "nixos-rebuild test --flake /srv/provisioning#$node"

  # Verify health
  if ! curl -k https://$node.example.com:2379/health | jq -e '.status == "healthy"'; then
    echo "ERROR: $node unhealthy after test, aborting"
    ssh root@$node.example.com "nixos-rebuild switch --rollback"
    exit 1
  fi

  # Apply permanently
  ssh root@$node.example.com "nixos-rebuild switch --flake /srv/provisioning#$node"

  # Wait for reboot if kernel changed
  echo "Waiting 30s for stabilization..."
  sleep 30

  # Final health check
  curl -k https://$node.example.com:2379/health | jq

  echo "$node updated successfully"
done
```

### 7.5 Draining a Node for Maintenance

**Step 1: Mark Node for Drain**
```bash
# Disable node in load balancer (if using one)
curl -k -X POST https://node01.example.com:9092/api/backend/node02 \
  -d '{"status":"drain"}'
```

**Step 2: Migrate VMs (PlasmaVMC)**
```bash
# List VMs on node
ssh root@node02.example.com 'systemctl list-units | grep plasmavmc-vm@'

# Migrate each VM
curl -k -X POST https://node01.example.com:9090/api/vms/vm-001/migrate \
  -d '{"target_node":"node03"}'
```

**Step 3: Stop Services**
```bash
ssh root@node02.example.com '
  systemctl stop plasmavmc.service
  systemctl stop chainfire.service
  systemctl stop flaredb.service
'
```

**Step 4: Perform Maintenance**
```bash
# Reboot for kernel update, hardware maintenance, etc.
ssh root@node02.example.com 'reboot'
```

**Step 5: Re-enable Node**
```bash
# Verify all services healthy
ssh root@node02.example.com 'systemctl status chainfire flaredb plasmavmc'

# Re-enable in load balancer
curl -k -X POST https://node01.example.com:9092/api/backend/node02 \
  -d '{"status":"active"}'
```

### 7.6 Decommissioning a Node

**Step 1: Drain Node (see 7.5)**

**Step 2: Remove from Cluster**
```bash
# Remove from Chainfire
curl -k -X DELETE https://node01.example.com:2379/admin/member/node02

# Remove from FlareDB
curl -k -X DELETE https://node01.example.com:2479/admin/member/node02

# Verify removal
curl -k https://node01.example.com:2379/admin/cluster/members | jq
```

**Step 3: Power Off**
```bash
# Via BMC
ipmitool -I lanplus -H 10.0.10.51 -U admin -P password chassis power off

# Or via SSH
ssh root@node02.example.com 'poweroff'
```

**Step 4: Update Inventory**
```bash
# Remove node02 from the node inventory
vim /srv/provisioning/inventory.json

# Remove from DNS
# Update the DNS zone to remove node02.example.com

# Remove from monitoring
# Update Prometheus targets to remove node02
```

## 8. Troubleshooting

### 8.1 PXE Boot Failures

**Symptom:** Server does not obtain an IP address or does not boot from the network

**Diagnosis:**
```bash
# Monitor DHCP server logs
sudo journalctl -u dhcpd4 -f

# Monitor TFTP requests
sudo tcpdump -i eth0 -n port 69

# Check PXE server services
sudo systemctl status dhcpd4 atftpd nginx
```

**Common Causes:**
1. **DHCP server not running:** `sudo systemctl start dhcpd4`
2. **Wrong network interface:** Check `interfaces` in dhcpd.conf
3. **Firewall blocking DHCP/TFTP:** `sudo iptables -L -n | grep -E "67|68|69"`
4. **PXE not enabled in BIOS:** Enter the BIOS and enable Network Boot
5. **Network cable disconnected:** Check the physical connection

**Solution:**
```bash
# Restart all PXE services
sudo systemctl restart dhcpd4 atftpd nginx

# Verify DHCP configuration
sudo dhcpd -t -cf /etc/dhcp/dhcpd.conf

# Test TFTP
tftp localhost -c get undionly.kpxe /tmp/test.kpxe

# Power cycle server
ipmitool -I lanplus -H <bmc-ip> -U admin chassis power cycle
```

### 8.2 Installation Failures (nixos-anywhere)

**Symptom:** nixos-anywhere fails during disk partitioning, installation, or bootloader setup

**Diagnosis:**
```bash
# Check nixos-anywhere output for errors
# Common errors: disk not found, partition table errors, out of space

# SSH to installer for manual inspection
ssh root@10.0.100.50

# Check disk status
lsblk
dmesg | grep -i error
```

**Common Causes:**
1. **Wrong disk device:** Update disko.nix with the correct device (e.g., /dev/nvme0n1)
2. **Disk not wiped:** Previous partition table conflicts
3. **Out of disk space:** Insufficient storage for Nix closures
4. **Network issues:** Cannot download packages from the binary cache

**Solution:**
```bash
# Manual disk wipe (on installer)
ssh root@10.0.100.50 '
  wipefs -a /dev/sda
  sgdisk --zap-all /dev/sda
'

# Retry nixos-anywhere
nix run github:nix-community/nixos-anywhere -- \
  --flake /srv/provisioning#node01 \
  --debug \
  root@10.0.100.50
```

### 8.3 Cluster Join Failures

**Symptom:** Node boots successfully but does not join the cluster

**Diagnosis:**
```bash
# Check first-boot logs on node
ssh root@node01.example.com 'journalctl -u chainfire-cluster-join.service -u flaredb-cluster-join.service'

# Common errors:
# - "Health check timeout after 120s"
# - "Join request failed: connection refused"
# - "Configuration file not found"
```

**Bootstrap Mode vs Join Mode:**
- **Bootstrap:** Node expects to create a new cluster with its peers
- **Join:** Node expects to connect to an existing leader

**Common Causes:**
1. **Wrong bootstrap flag:** Check cluster-config.json
2. **Leader unreachable:** Network/firewall issue
3. **TLS certificate errors:** Verify cert paths and validity
4. **Service not starting:** Check the main service (chainfire.service)

**Solution:**
```bash
# Verify cluster-config.json
ssh root@node01.example.com 'cat /etc/nixos/secrets/cluster-config.json | jq'

# Test leader connectivity
ssh root@node04.example.com 'curl -k https://node01.example.com:2379/health'

# Check TLS certificates
ssh root@node04.example.com 'ls -l /etc/nixos/secrets/*.pem'

# Manual cluster join (if automation fails)
curl -k -X POST https://node01.example.com:2379/admin/member/add \
  -H "Content-Type: application/json" \
  -d '{"id":"node04","raft_addr":"10.0.200.20:2380"}'
```

### 8.4 Service Start Failures

**Symptom:** A service fails to start after boot

**Diagnosis:**
```bash
# Check service status
ssh root@node01.example.com 'systemctl status chainfire.service'

# View logs
ssh root@node01.example.com 'journalctl -u chainfire.service -n 100'

# Common errors:
# - "bind: address already in use" (port conflict)
# - "certificate verify failed" (TLS issue)
# - "permission denied" (file permissions)
```

**Common Causes:**
1. **Port already in use:** Another service is using the same port
2. **Missing dependencies:** A required service is not running
3. **Configuration error:** Invalid config file
4. **File permissions:** Cannot read secrets

**Solution:**
```bash
# Check port usage
ssh root@node01.example.com 'ss -tlnp | grep 2379'

# Verify dependencies
ssh root@node01.example.com 'systemctl list-dependencies chainfire.service'

# Test configuration manually
ssh root@node01.example.com 'chainfire-server --config /etc/nixos/chainfire.toml --check-config'

# Fix permissions
ssh root@node01.example.com 'chmod 600 /etc/nixos/secrets/*-key.pem'
```

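A mis-set `bootstrap` flag is the most common of the causes below, so it is worth sanity-checking the config file before (re)booting the node. A minimal sketch, assuming only the `bootstrap` and `leader_url` fields shown in this runbook; the `validate_cluster_config` helper is illustrative, not part of the shipped tooling:

```shell
#!/usr/bin/env bash
# validate_cluster_config: catch the classic bootstrap/join mismatches.
# A join-mode config (bootstrap=false) must name a leader_url; a bootstrap
# config pointing at an existing leader is suspicious but only a warning.
validate_cluster_config() {
  local f="$1"
  if grep -q '"bootstrap": false' "$f" && ! grep -q '"leader_url"' "$f"; then
    echo "ERROR: join-mode config without leader_url: $f" >&2
    return 1
  fi
  if grep -q '"bootstrap": true' "$f" && grep -q '"leader_url"' "$f"; then
    echo "WARN: bootstrap config also sets leader_url: $f" >&2
  fi
  return 0
}

# Illustrative usage, before provisioning a join-mode node:
# validate_cluster_config /srv/provisioning/nodes/node04.example.com/secrets/cluster-config.json
```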
### 8.5 Network Connectivity Issues

**Symptom:** Nodes cannot communicate with each other or with external services

**Diagnosis:**
```bash
# Test basic connectivity
ssh root@node01.example.com 'ping -c 3 node02.example.com'

# Test specific ports
ssh root@node01.example.com 'nc -zv node02.example.com 2379'

# Check firewall rules
ssh root@node01.example.com 'iptables -L -n | grep 2379'

# Check routing
ssh root@node01.example.com 'ip route show'
```

**Common Causes:**
1. **Firewall blocking traffic:** Missing iptables rules
2. **Wrong IP address:** Configuration mismatch
3. **Network interface down:** Interface not configured
4. **DNS resolution failure:** Cannot resolve hostnames

**Solution:**
```bash
# Add firewall rules
ssh root@node01.example.com '
  iptables -A INPUT -p tcp --dport 2379 -s 10.0.200.0/24 -j ACCEPT
  iptables -A INPUT -p tcp --dport 2380 -s 10.0.200.0/24 -j ACCEPT
  iptables-save > /etc/iptables/rules.v4
'

# Fix DNS resolution
ssh root@node01.example.com '
  echo "10.0.200.11 node02.example.com node02" >> /etc/hosts
'

# Restart networking
ssh root@node01.example.com 'systemctl restart systemd-networkd'
```

### 8.6 TLS Certificate Errors

**Symptom:** Services cannot establish TLS connections

**Diagnosis:**
```bash
# Test TLS connection
openssl s_client -connect node01.example.com:2379 -CAfile /srv/provisioning/ca-cert.pem

# Check certificate validity
ssh root@node01.example.com '
  openssl x509 -in /etc/nixos/secrets/node01-cert.pem -noout -dates
'

# Common errors:
# - "certificate verify failed" (wrong CA)
# - "certificate has expired" (cert expired)
# - "certificate subject name mismatch" (wrong CN)
```

**Common Causes:**
1. **Expired certificate:** Regenerate the certificate
2. **Wrong CA certificate:** Verify the CA cert is correct
3. **Hostname mismatch:** CN does not match the hostname
4. **File permissions:** Cannot read certificate files

**Solution:**
```bash
# Regenerate certificate
openssl req -new -key /srv/provisioning/secrets/node01-key.pem \
  -out /srv/provisioning/secrets/node01-csr.pem \
  -subj "/CN=node01.example.com"

openssl x509 -req -in /srv/provisioning/secrets/node01-csr.pem \
  -CA /srv/provisioning/ca-cert.pem \
  -CAkey /srv/provisioning/ca-key.pem \
  -CAcreateserial \
  -out /srv/provisioning/secrets/node01-cert.pem \
  -days 365

# Copy to node
scp /srv/provisioning/secrets/node01-cert.pem root@node01.example.com:/etc/nixos/secrets/

# Restart service
ssh root@node01.example.com 'systemctl restart chainfire.service'
```

### 8.7 Performance Degradation

**Symptom:** Services are slow or unresponsive

**Diagnosis:**
```bash
# Check system load
ssh root@node01.example.com 'uptime'
ssh root@node01.example.com 'top -bn1 | head -20'

# Check disk I/O
ssh root@node01.example.com 'iostat -x 1 5'

# Check network bandwidth
ssh root@node01.example.com 'iftop -i eth1'

# Check Raft logs for slow operations
ssh root@node01.example.com 'journalctl -u chainfire.service | grep "slow operation"'
```

**Common Causes:**
1. **High CPU usage:** Too many requests, inefficient queries
2. **Disk I/O bottleneck:** Slow disk, too many writes
3. **Network saturation:** Bandwidth exhausted
4. **Memory pressure:** OOM killer active
5. **Slow Raft commits:** Network latency between nodes

**Solution:**
```bash
# Add more resources (vertical scaling)
# or add more nodes (horizontal scaling)

# Check for resource leaks
ssh root@node01.example.com 'systemctl status chainfire | grep Memory'

# Restart service to clear memory leaks (temporary)
ssh root@node01.example.com 'systemctl restart chainfire.service'

# Optimize disk I/O (enable write caching only if safe)
ssh root@node01.example.com 'hdparm -W1 /dev/sda'
```

## 9. Rollback & Recovery

### 9.1 NixOS Generation Rollback

NixOS provides atomic rollback capability via generations:

**List Available Generations:**
```bash
ssh root@node01.example.com 'nixos-rebuild list-generations'
# Example output:
# 1   2025-12-10 10:30:00
# 2   2025-12-10 12:45:00   (current)
```

**Rollback to Previous Generation:**
```bash
# Roll back and activate immediately
ssh root@node01.example.com 'nixos-rebuild switch --rollback'

# Or make the previous generation the default for the next boot, then reboot
ssh root@node01.example.com 'nixos-rebuild boot --rollback && reboot'
```

**Rollback to Specific Generation:**
```bash
ssh root@node01.example.com 'nix-env --switch-generation 1 -p /nix/var/nix/profiles/system'
ssh root@node01.example.com '/nix/var/nix/profiles/system/bin/switch-to-configuration switch'
```

### 9.2 Re-Provisioning from PXE

Complete re-provisioning wipes all data and reinstalls from scratch:

**Step 1: Remove Node from Cluster**
```bash
curl -k -X DELETE https://node01.example.com:2379/admin/member/node02
curl -k -X DELETE https://node01.example.com:2479/admin/member/node02
```

**Step 2: Set Boot to PXE**
```bash
ipmitool -I lanplus -H 10.0.10.51 -U admin chassis bootdev pxe
```

**Step 3: Reboot Node**
```bash
ssh root@node02.example.com 'reboot'
# Or via BMC
ipmitool -I lanplus -H 10.0.10.51 -U admin chassis power cycle
```

**Step 4: Run nixos-anywhere**
```bash
# Wait for PXE boot and SSH ready
sleep 90

nix run github:nix-community/nixos-anywhere -- \
  --flake /srv/provisioning#node02 \
  root@10.0.100.51
```

### 9.3 Disaster Recovery Procedures

**Complete Cluster Loss (All Nodes Down):**

**Step 1: Restore from Backup (if available)**
```bash
# Restore Chainfire data
ssh root@node01.example.com '
  systemctl stop chainfire.service
  rm -rf /var/lib/chainfire/*
  tar -xzf /backup/chainfire-$(date +%Y%m%d).tar.gz -C /var/lib/chainfire/
  systemctl start chainfire.service
'
```

**Step 2: Bootstrap New Cluster**
If no backup exists, re-provision all nodes in bootstrap mode:
```bash
# Update cluster-config.json for all nodes:
# set bootstrap=true with the same initial_peers

# Provision all 3 nodes
for node in node01 node02 node03; do
  nix run github:nix-community/nixos-anywhere -- \
    --flake /srv/provisioning#$node \
    root@<node-ip> &
done
wait
```

**Single Node Failure:**

**Step 1: Verify Cluster Quorum**
```bash
# Check remaining nodes have quorum
curl -k https://node01.example.com:2379/admin/cluster/members | jq '.members | length'
# Expected: 2 (if 3-node cluster with 1 failure)
```

**Step 2: Remove Failed Node**
```bash
curl -k -X DELETE https://node01.example.com:2379/admin/member/node02
```

**Step 3: Provision Replacement**
```bash
# Use the same node ID and configuration
nix run github:nix-community/nixos-anywhere -- \
  --flake /srv/provisioning#node02 \
  root@10.0.100.51
```

### 9.4 Backup and Restore

**Automated Backup Script:**
```bash
#!/bin/bash
# /srv/provisioning/scripts/backup-cluster.sh

BACKUP_DIR="/backup/cluster-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$BACKUP_DIR"

# Backup Chainfire data
for node in node01 node02 node03; do
  ssh root@$node.example.com \
    "tar -czf - /var/lib/chainfire" > "$BACKUP_DIR/chainfire-$node.tar.gz"
done

# Backup FlareDB data
for node in node01 node02 node03; do
  ssh root@$node.example.com \
    "tar -czf - /var/lib/flaredb" > "$BACKUP_DIR/flaredb-$node.tar.gz"
done

# Backup configurations
cp -r /srv/provisioning/nodes "$BACKUP_DIR/configs"

echo "Backup complete: $BACKUP_DIR"
```

**Restore Script:**
```bash
#!/bin/bash
# /srv/provisioning/scripts/restore-cluster.sh

BACKUP_DIR="$1"
if [ -z "$BACKUP_DIR" ]; then
  echo "Usage: $0 <backup-dir>"
  exit 1
fi

# Stop services on all nodes
for node in node01 node02 node03; do
  ssh root@$node.example.com 'systemctl stop chainfire flaredb'
done

# Restore Chainfire data
for node in node01 node02 node03; do
  cat "$BACKUP_DIR/chainfire-$node.tar.gz" | \
    ssh root@$node.example.com "cd / && tar -xzf -"
done

# Restore FlareDB data
for node in node01 node02 node03; do
  cat "$BACKUP_DIR/flaredb-$node.tar.gz" | \
    ssh root@$node.example.com "cd / && tar -xzf -"
done

# Restart services
for node in node01 node02 node03; do
  ssh root@$node.example.com 'systemctl start chainfire flaredb'
done

echo "Restore complete"
```

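Quorum for a Raft cluster of size N is floor(N/2)+1, so a 3-node cluster tolerates exactly one failure. A small helper to make the arithmetic explicit; the function names are illustrative, not part of the shipped tooling:

```shell
#!/usr/bin/env bash
# quorum: minimum number of live voters for a Raft cluster of the given size.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# max_failures: how many nodes may be lost while retaining quorum.
max_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

# For the 3-node bootstrap cluster:
# quorum 3        -> 2
# max_failures 3  -> 1
# Two survivors still hold quorum; losing a second node halts writes.
```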
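Timestamped backup directories accumulate, so a retention sweep keeps the backup volume bounded. A sketch; the `cluster-*` naming matches the script above, but the `prune_backups` helper and the 14-day retention are illustrative assumptions:

```shell
#!/usr/bin/env bash
# prune_backups: delete cluster-* backup directories older than N days.
# Usage: prune_backups <backup_root> <retention_days>
prune_backups() {
  local root="$1" days="$2"
  # -mtime +N matches directories last modified more than N days ago
  find "$root" -maxdepth 1 -type d -name 'cluster-*' -mtime +"$days" \
    -exec rm -rf {} +
}

# Illustrative usage: keep two weeks of backups, e.g. from a cron job.
# prune_backups /backup 14
```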
## 10. Security Best Practices

### 10.1 SSH Key Management

**Generate Dedicated Provisioning Key:**
```bash
ssh-keygen -t ed25519 -C "provisioning@example.com" -f ~/.ssh/id_ed25519_provisioning
```

**Add to Netboot Image:**
```nix
# In netboot-base.nix
users.users.root.openssh.authorizedKeys.keys = [
  "ssh-ed25519 AAAAC3Nza... provisioning@example.com"
];
```

**Rotate Keys Regularly:**
```bash
# Generate new key
ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_provisioning_new

# Add to all nodes
for node in node01 node02 node03; do
  ssh-copy-id -i ~/.ssh/id_ed25519_provisioning_new.pub root@$node.example.com
done

# Remove the old key from authorized_keys
# Update the netboot image with the new key
```

### 10.2 TLS Certificate Rotation

**Automated Rotation Script:**
```bash
#!/bin/bash
# /srv/provisioning/scripts/rotate-certs.sh

# Generate new certificates
for node in node01 node02 node03; do
  openssl genrsa -out ${node}-key-new.pem 4096
  openssl req -new -key ${node}-key-new.pem -out ${node}-csr.pem \
    -subj "/CN=${node}.example.com"
  openssl x509 -req -in ${node}-csr.pem \
    -CA ca-cert.pem -CAkey ca-key.pem \
    -CAcreateserial -out ${node}-cert-new.pem -days 365
done

# Deploy new certificates (without restarting services yet)
for node in node01 node02 node03; do
  scp ${node}-cert-new.pem root@${node}.example.com:/etc/nixos/secrets/${node}-cert-new.pem
  scp ${node}-key-new.pem root@${node}.example.com:/etc/nixos/secrets/${node}-key-new.pem
done

# Update configuration to use new certs
# ... (NixOS configuration update) ...

# Rolling restart to apply new certificates
for node in node01 node02 node03; do
  ssh root@${node}.example.com 'systemctl restart chainfire flaredb iam'
  sleep 30  # Wait for stabilization
done

echo "Certificate rotation complete"
```

### 10.3 Secrets Management

**Best Practices:**
- Store secrets outside the Nix store (use `/etc/nixos/secrets/`)
- Set restrictive permissions (0600 for private keys, 0400 for passwords)
- Use environment variables for runtime secrets
- Never commit secrets to Git
- Use encrypted secrets (sops-nix or agenix)

**Example with sops-nix:**
```nix
# In configuration.nix
{
  imports = [ <sops-nix/modules/sops> ];

  sops.defaultSopsFile = ./secrets.yaml;
  sops.secrets."node01/tls-key" = {
    owner = "chainfire";
    mode = "0400";
  };

  services.chainfire.settings.tls.key_path = config.sops.secrets."node01/tls-key".path;
}
```

### 10.4 Network Isolation

**VLAN Segmentation:**
- Management VLAN (10): BMC/IPMI, provisioning workstation
- Provisioning VLAN (100): PXE boot, temporary
- Production VLAN (200): Cluster services, inter-node communication
- Client VLAN (300): External clients accessing services

**Firewall Zones:**
```bash
# Example nftables rules
table inet filter {
  chain input {
    type filter hook input priority 0; policy drop;

    # Management from trusted subnet only
    iifname "eth0" ip saddr 10.0.10.0/24 tcp dport 22 accept

    # Cluster traffic from cluster subnet only
    iifname "eth1" ip saddr 10.0.200.0/24 tcp dport { 2379, 2380, 2479, 2480 } accept

    # Client traffic from client subnet only
    # (VLAN 300 must map to a valid IPv4 subnet; 10.0.30.0/24 is an example)
    iifname "eth2" ip saddr 10.0.30.0/24 tcp dport { 8080, 9090 } accept
  }
}
```

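The "remove the old key" step can be scripted by filtering on the key comment; `provisioning@example.com` is the comment used earlier in this section. A sketch; the `remove_key_by_comment` helper is illustrative, and it should only run after the new key has been confirmed to log in:

```shell
#!/usr/bin/env bash
# remove_key_by_comment: drop authorized_keys entries ending in a comment.
remove_key_by_comment() {
  local file="$1" comment="$2"
  # Keep every line that does not end with " <comment>"; grep exits non-zero
  # when nothing survives, which is fine for a file holding only the old key.
  grep -v " ${comment}\$" "$file" > "${file}.tmp" || true
  mv "${file}.tmp" "$file"
}

# Illustrative usage on each node, authenticating with the new key:
# for node in node01 node02 node03; do
#   ssh -i ~/.ssh/id_ed25519_provisioning_new root@$node.example.com \
#     "grep -v ' provisioning@example.com\$' /root/.ssh/authorized_keys > /tmp/ak && mv /tmp/ak /root/.ssh/authorized_keys"
# done
```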
### 10.5 Audit Logging

**Enable Structured Logging:**
```nix
# In configuration.nix
services.chainfire.settings.logging = {
  level = "info";
  format = "json";
  output = "journal";
};

# Enable journald forwarding to SIEM
services.journald.extraConfig = ''
  ForwardToSyslog=yes
  Storage=persistent
  MaxRetentionSec=7days
'';
```

**Audit Key Events:**
- Cluster membership changes
- Node joins/leaves
- Authentication failures
- Configuration changes
- TLS certificate errors

**Log Aggregation:**
```bash
# Forward logs to central logging server
# Example: rsyslog configuration
cat > /etc/rsyslog.d/50-remote.conf <<EOF
*.* @@logging-server.example.com:514
EOF
systemctl restart rsyslog
```

---

## Appendix A: Service Port Reference

See [NETWORK.md](NETWORK.md) for the complete port matrix.

## Appendix B: Hardware Vendor Commands

See [HARDWARE.md](HARDWARE.md) for vendor-specific BIOS configurations and IPMI commands.

## Appendix C: Complete Command Reference

See [COMMANDS.md](COMMANDS.md) for all commands organized by task.

## Appendix D: Quick Reference Cards

See [QUICKSTART.md](QUICKSTART.md) for the condensed deployment guide.

## Appendix E: Deployment Flow Diagrams

See [diagrams/deployment-flow.md](diagrams/deployment-flow.md) for the visual workflow.

## Appendix F: Related Documentation

- **Design Document:** `/home/centra/cloud/docs/por/T032-baremetal-provisioning/design.md`
- **PXE Server:** `/home/centra/cloud/chainfire/baremetal/pxe-server/README.md`
- **Image Builder:** `/home/centra/cloud/baremetal/image-builder/README.md`
- **First-Boot Automation:** `/home/centra/cloud/baremetal/first-boot/README.md`

---

**End of Operator Runbook**