photoncloud-monorepo/chainfire/baremetal/pxe-server/README.md
centra 5c6eb04a46 T036: Add VM cluster deployment configs for nixos-anywhere
- netboot-base.nix with SSH key auth
- Launch scripts for node01/02/03
- Node configuration.nix and disko.nix
- Nix modules for first-boot automation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 09:59:19 +09:00

829 lines
22 KiB
Markdown

# Centra Cloud PXE Boot Server
This directory contains the PXE (Preboot eXecution Environment) boot infrastructure for bare-metal provisioning of Centra Cloud nodes. It enables network-based installation of NixOS on physical servers with automated profile selection.
## Table of Contents
- [Architecture Overview](#architecture-overview)
- [Components](#components)
- [Quick Start](#quick-start)
- [Detailed Setup](#detailed-setup)
- [Configuration](#configuration)
- [Boot Profiles](#boot-profiles)
- [Network Requirements](#network-requirements)
- [Troubleshooting](#troubleshooting)
- [Advanced Topics](#advanced-topics)
## Architecture Overview
The PXE boot infrastructure consists of three main services:
```
┌─────────────────────────────────────────────────────────────────┐
│ PXE Boot Flow │
└─────────────────────────────────────────────────────────────────┘
Bare-Metal Server PXE Boot Server
───────────────── ───────────────
1. Power on
├─► DHCP Request ──────────────► DHCP Server
│ (ISC DHCP)
│ │
│ ├─ Assigns IP
│ ├─ Detects BIOS/UEFI
│ └─ Provides bootloader path
├◄─ DHCP Response ───────────────┤
│ (IP, next-server, filename)
├─► TFTP Get bootloader ─────────► TFTP Server
│ (undionly.kpxe or ipxe.efi) (atftpd)
├◄─ Bootloader file ─────────────┤
├─► Execute iPXE bootloader
│ │
│ ├─► HTTP Get boot.ipxe ──────► HTTP Server
│ │ (nginx)
│ │
│ ├◄─ boot.ipxe script ─────────┤
│ │
│ ├─► Display menu / Auto-select profile
│ │
│ ├─► HTTP Get kernel ──────────► HTTP Server
│ │
│ ├◄─ bzImage ───────────────────┤
│ │
│ ├─► HTTP Get initrd ───────────► HTTP Server
│ │
│ ├◄─ initrd ────────────────────┤
│ │
│ └─► Boot NixOS
└─► NixOS Installer
└─ Provisions node based on profile
```
## Components
### 1. DHCP Server (ISC DHCP)
- **Purpose**: Assigns IP addresses and directs PXE clients to bootloader
- **Config**: `dhcp/dhcpd.conf`
- **Features**:
- BIOS/UEFI detection via option 93 (architecture type)
- Per-host configuration for fixed IP assignment
- Automatic next-server and filename configuration
### 2. TFTP Server (atftpd)
- **Purpose**: Serves iPXE bootloader files to PXE clients
- **Files served**:
- `undionly.kpxe` - BIOS bootloader
- `ipxe.efi` - UEFI x86-64 bootloader
- `ipxe-i386.efi` - UEFI x86 32-bit bootloader (optional)
### 3. HTTP Server (nginx)
- **Purpose**: Serves iPXE scripts and NixOS boot images
- **Config**: `http/nginx.conf`
- **Endpoints**:
- `/boot/ipxe/boot.ipxe` - Main boot menu script
- `/boot/nixos/bzImage` - NixOS kernel
- `/boot/nixos/initrd` - NixOS initial ramdisk
- `/health` - Health check endpoint
### 4. iPXE Boot Scripts
- **Main script**: `ipxe/boot.ipxe`
- **Features**:
- Interactive boot menu with 3 profiles
- MAC-based automatic profile selection
- Serial console support for remote management
- Detailed error messages and debugging options
### 5. NixOS Service Module
- **File**: `nixos-module.nix`
- **Purpose**: Declarative NixOS configuration for all services
- **Features**:
- Single configuration file for entire stack
- Firewall rules auto-configured
- Systemd service dependencies managed
- Directory structure auto-created
## Quick Start
### Prerequisites
- NixOS server with network connectivity
- Network interface on the same subnet as bare-metal servers
- Sufficient disk space (5-10 GB for boot images)
### Installation Steps
1. **Clone this repository** (or copy `baremetal/pxe-server/` to your NixOS system)
2. **Run the setup script**:
```bash
sudo ./setup.sh --install --download --validate
```
This will:
- Create directory structure at `/var/lib/pxe-boot`
- Download iPXE bootloaders from boot.ipxe.org
- Install boot scripts
- Validate configurations
3. **Configure network settings**:
Edit `nixos-module.nix` or create a NixOS configuration:
```nix
# /etc/nixos/configuration.nix
imports = [
/path/to/baremetal/pxe-server/nixos-module.nix
];
services.centra-pxe-server = {
enable = true;
interface = "eth0"; # Your network interface
serverAddress = "10.0.100.10"; # PXE server IP
dhcp = {
subnet = "10.0.100.0";
netmask = "255.255.255.0";
broadcast = "10.0.100.255";
range = {
start = "10.0.100.100";
end = "10.0.100.200";
};
router = "10.0.100.1";
};
# Optional: Define known nodes with MAC addresses
nodes = {
"52:54:00:12:34:56" = {
profile = "control-plane";
hostname = "control-plane-01";
ipAddress = "10.0.100.50";
};
};
};
```
4. **Deploy NixOS configuration**:
```bash
sudo nixos-rebuild switch
```
5. **Verify services are running**:
```bash
sudo ./setup.sh --test
```
6. **Add NixOS boot images** (will be provided by T032.S3):
```bash
# Placeholder - actual images will be built by image builder
# For testing, you can use any NixOS netboot image
sudo mkdir -p /var/lib/pxe-boot/nixos
# Copy bzImage and initrd to /var/lib/pxe-boot/nixos/
```
7. **Boot a bare-metal server**:
- Configure server BIOS to boot from network (PXE)
- Connect to same network segment
- Power on server
- Watch for DHCP discovery and iPXE boot menu
## Detailed Setup
### Option 1: NixOS Module (Recommended)
The NixOS module provides a declarative way to configure the entire PXE server stack.
**Advantages**:
- Single configuration file
- Automatic service dependencies
- Rollback capability
- Integration with NixOS firewall
**Configuration Example**:
See the NixOS configuration example in [Quick Start](#quick-start).
### Option 2: Manual Installation
For non-NixOS systems or manual setup:
1. **Install required packages**:
```bash
# Debian/Ubuntu
apt-get install isc-dhcp-server atftpd nginx curl
# RHEL/CentOS
yum install dhcp tftp-server nginx curl
```
2. **Run setup script**:
```bash
sudo ./setup.sh --install --download
```
3. **Copy configuration files**:
```bash
# DHCP configuration
sudo cp dhcp/dhcpd.conf /etc/dhcp/dhcpd.conf
# Edit to match your network
sudo vim /etc/dhcp/dhcpd.conf
# Nginx configuration
sudo cp http/nginx.conf /etc/nginx/sites-available/pxe-boot
sudo ln -s /etc/nginx/sites-available/pxe-boot /etc/nginx/sites-enabled/
```
4. **Start services**:
```bash
sudo systemctl enable --now isc-dhcp-server
sudo systemctl enable --now atftpd
sudo systemctl enable --now nginx
```
5. **Configure firewall**:
```bash
# UFW (Ubuntu)
sudo ufw allow 67/udp # DHCP
sudo ufw allow 68/udp # DHCP
sudo ufw allow 69/udp # TFTP
sudo ufw allow 80/tcp # HTTP
# firewalld (RHEL)
sudo firewall-cmd --permanent --add-service=dhcp
sudo firewall-cmd --permanent --add-service=tftp
sudo firewall-cmd --permanent --add-service=http
sudo firewall-cmd --reload
```
## Configuration
### DHCP Configuration
The DHCP server configuration is in `dhcp/dhcpd.conf`. Key sections:
**Network Settings**:
```conf
subnet 10.0.100.0 netmask 255.255.255.0 {
range 10.0.100.100 10.0.100.200;
option routers 10.0.100.1;
option domain-name-servers 10.0.100.1, 8.8.8.8;
next-server 10.0.100.10; # PXE server IP
# ...
}
```
**Boot File Selection** (automatic BIOS/UEFI detection):
```conf
if exists user-class and option user-class = "iPXE" {
filename "http://10.0.100.10/boot/ipxe/boot.ipxe";
} elsif option architecture-type = 00:00 {
filename "undionly.kpxe"; # BIOS
} elsif option architecture-type = 00:07 {
filename "ipxe.efi"; # UEFI x86-64
}
```
**Host-Specific Configuration**:
```conf
host control-plane-01 {
hardware ethernet 52:54:00:12:34:56;
fixed-address 10.0.100.50;
option host-name "control-plane-01";
}
```
### iPXE Boot Script
The main boot script is `ipxe/boot.ipxe`. It provides:
1. **MAC-based automatic selection**:
```ipxe
iseq ${mac} 52:54:00:12:34:56 && set profile control-plane && goto boot ||
```
2. **Interactive menu** (if no MAC match):
```ipxe
:menu
menu Centra Cloud - Bare-Metal Provisioning
item control-plane 1. Control Plane Node (All Services)
item worker 2. Worker Node (Compute Services)
item all-in-one 3. All-in-One Node (Testing/Homelab)
```
3. **Kernel parameters**:
```ipxe
set kernel-params centra.profile=${profile}
set kernel-params ${kernel-params} centra.hostname=${hostname}
set kernel-params ${kernel-params} console=tty0 console=ttyS0,115200n8
```
### Adding New Nodes
To add a new node to the infrastructure:
1. **Get the MAC address** from the server (check BIOS or network card label)
2. **Add to MAC mappings** (`ipxe/mac-mappings.txt`):
```
52:54:00:12:34:5d worker worker-04
```
3. **Update boot script** (`ipxe/boot.ipxe`):
```ipxe
iseq ${mac} 52:54:00:12:34:5d && set profile worker && set hostname worker-04 && goto boot ||
```
4. **Add DHCP host entry** (`dhcp/dhcpd.conf`):
```conf
host worker-04 {
hardware ethernet 52:54:00:12:34:5d;
fixed-address 10.0.100.64;
option host-name "worker-04";
}
```
5. **Restart DHCP service**:
```bash
sudo systemctl restart dhcpd4
```
## Boot Profiles
### 1. Control Plane Profile
**Purpose**: Nodes that run core infrastructure services
**Services included**:
- FlareDB (PD, Store, TiKV-compatible database)
- IAM (Identity and Access Management)
- PlasmaVMC (Virtual Machine Controller)
- K8sHost (Kubernetes node agent)
- FlashDNS (High-performance DNS)
- ChainFire (Firewall/networking)
- Object Storage (S3-compatible)
- Monitoring (Prometheus, Grafana)
**Resource requirements**:
- CPU: 8+ cores recommended
- RAM: 32+ GB recommended
- Disk: 500+ GB SSD
**Use case**: Production control plane nodes in a cluster
### 2. Worker Profile
**Purpose**: Nodes that run customer workloads
**Services included**:
- K8sHost (Kubernetes node agent) - primary service
- PlasmaVMC (Virtual Machine Controller) - VM workloads
- ChainFire (Network policy enforcement)
- FlashDNS (Local DNS caching)
- Basic monitoring agents
**Resource requirements**:
- CPU: 16+ cores recommended
- RAM: 64+ GB recommended
- Disk: 1+ TB SSD
**Use case**: Worker nodes for running customer applications
### 3. All-in-One Profile
**Purpose**: Single-node deployment for testing and development
**Services included**:
- Complete Centra Cloud stack on one node
- All services from control-plane profile
- Suitable for testing, development, homelab
**Resource requirements**:
- CPU: 16+ cores recommended
- RAM: 64+ GB recommended
- Disk: 1+ TB SSD
**Use case**: Development, testing, homelab deployments
**Warning**: Not recommended for production use (no HA, resource intensive)
## Network Requirements
### Network Topology
The PXE server must be on the same network segment as the bare-metal servers, or you must configure DHCP relay.
**Same Segment** (recommended for initial setup):
```
┌──────────────┐ ┌──────────────────┐
│ PXE Server │ │ Bare-Metal Srv │
│ 10.0.100.10 │◄────────┤ (DHCP client) │
└──────────────┘ L2 SW └──────────────────┘
```
**Different Segments** (requires DHCP relay):
```
┌──────────────┐ ┌──────────┐ ┌──────────────────┐
│ PXE Server │ │ Router │ │ Bare-Metal Srv │
│ 10.0.100.10 │◄────────┤ (relay) │◄────────┤ (DHCP client) │
└──────────────┘ └──────────┘ └──────────────────┘
Segment A ip helper Segment B
```
### DHCP Relay Configuration
If your PXE server is on a different network segment:
**Cisco IOS**:
```
interface vlan 100
ip helper-address 10.0.100.10
```
**Linux (dhcp-helper)**:
```bash
apt-get install dhcp-helper
# Edit /etc/default/dhcp-helper
DHCPHELPER_OPTS="-s 10.0.100.10"
systemctl restart dhcp-helper
```
**Linux (dhcrelay)**:
```bash
apt-get install isc-dhcp-relay
dhcrelay -i eth0 -i eth1 10.0.100.10
```
### Firewall Rules
The following ports must be open on the PXE server:
| Port | Protocol | Service | Direction | Description |
|------|----------|---------|-----------|-------------|
| 67 | UDP | DHCP | Inbound | DHCP server |
| 68 | UDP | DHCP | Outbound | DHCP client responses |
| 69 | UDP | TFTP | Inbound | TFTP bootloader downloads |
| 80 | TCP | HTTP | Inbound | iPXE scripts and boot images |
| 443 | TCP | HTTPS | Inbound | Optional: secure boot images |
### Network Bandwidth
Estimated bandwidth requirements:
- Per-node boot: ~500 MB download (kernel + initrd)
- Concurrent boots: Multiply by number of simultaneous boots
- Recommended: 1 Gbps link for PXE server
Example: Booting 10 nodes simultaneously requires ~5 Gbps throughput burst, so stagger boots or use 10 Gbps link.
## Troubleshooting
### DHCP Issues
**Problem**: Server doesn't get IP address
**Diagnosis**:
```bash
# On PXE server, monitor DHCP requests
sudo tcpdump -i eth0 -n port 67 or port 68
# Check DHCP server logs
sudo journalctl -u dhcpd4 -f
# Verify DHCP server is running
sudo systemctl status dhcpd4
```
**Common causes**:
- DHCP server not running on correct interface
- Firewall blocking UDP 67/68
- Network cable/switch issue
- DHCP range exhausted
**Solution**:
```bash
# Check interface configuration
ip addr show
# Verify DHCP config syntax
sudo dhcpd -t -cf /etc/dhcp/dhcpd.conf
# Check firewall
sudo iptables -L -n | grep -E "67|68"
# Restart DHCP server
sudo systemctl restart dhcpd4
```
### TFTP Issues
**Problem**: PXE client gets IP but fails to download bootloader
**Diagnosis**:
```bash
# Monitor TFTP requests
sudo tcpdump -i eth0 -n port 69
# Check TFTP server logs
sudo journalctl -u atftpd -f
# Test TFTP locally
tftp localhost -c get undionly.kpxe /tmp/test.kpxe
```
**Common causes**:
- TFTP server not running
- Bootloader files missing
- Permissions incorrect
- Firewall blocking UDP 69
**Solution**:
```bash
# Check files exist
ls -la /var/lib/tftpboot/
# Fix permissions
sudo chmod 644 /var/lib/tftpboot/*.{kpxe,efi}
# Restart TFTP server
sudo systemctl restart atftpd
# Check firewall
sudo iptables -L -n | grep 69
```
### HTTP Issues
**Problem**: iPXE loads but can't download boot script or kernel
**Diagnosis**:
```bash
# Monitor HTTP requests
sudo tail -f /var/log/nginx/access.log
# Test HTTP locally
curl -v http://localhost/boot/ipxe/boot.ipxe
curl -v http://localhost/health
# Check nginx status
sudo systemctl status nginx
```
**Common causes**:
- Nginx not running
- Boot files missing
- Permissions incorrect
- Firewall blocking TCP 80
- Wrong server IP in boot.ipxe
**Solution**:
```bash
# Check nginx config
sudo nginx -t
# Verify files exist
ls -la /var/lib/pxe-boot/ipxe/
ls -la /var/lib/pxe-boot/nixos/
# Fix permissions
sudo chown -R nginx:nginx /var/lib/pxe-boot
sudo chmod -R 755 /var/lib/pxe-boot
# Restart nginx
sudo systemctl restart nginx
```
### Boot Script Issues
**Problem**: Boot menu appears but fails to load kernel
**Diagnosis**:
- Check iPXE error messages on console
- Verify URLs in boot.ipxe match actual paths
- Test kernel download manually:
```bash
curl -I http://10.0.100.10/boot/nixos/bzImage
```
**Common causes**:
- NixOS boot images not deployed yet (normal for T032.S2)
- Wrong paths in boot.ipxe
- Files too large (check disk space)
**Solution**:
```bash
# Wait for T032.S3 (Image Builder) to generate boot images
# OR manually place NixOS netboot images:
sudo mkdir -p /var/lib/pxe-boot/nixos
# Copy bzImage and initrd from NixOS netboot
```
### Serial Console Debugging
For remote debugging without physical access:
1. **Enable serial console in BIOS**:
- Configure COM1/ttyS0 at 115200 baud
- Enable console redirection
2. **Connect via IPMI SOL** (if available):
```bash
ipmitool -I lanplus -H <bmc-ip> -U admin sol activate
```
3. **Watch boot process**:
- DHCP discovery messages
- TFTP download progress
- iPXE boot menu
- Kernel boot messages
4. **Kernel parameters include serial console**:
```
console=tty0 console=ttyS0,115200n8
```
### Common Error Messages
| Error | Cause | Solution |
|-------|-------|----------|
| `PXE-E51: No DHCP or proxyDHCP offers were received` | DHCP server not responding | Check DHCP server running, network connectivity |
| `PXE-E53: No boot filename received` | DHCP not providing filename | Check dhcpd.conf has `filename` option |
| `PXE-E32: TFTP open timeout` | TFTP server not responding | Check TFTP server running, firewall rules |
| `Not found: /boot/ipxe/boot.ipxe` | HTTP 404 error | Check file exists, nginx config, permissions |
| `Could not boot: Exec format error` | Corrupted boot file | Re-download/rebuild bootloader |
## Advanced Topics
### Building iPXE from Source
For production deployments, building iPXE from source provides:
- Custom branding
- Embedded certificates for HTTPS
- Optimized size
- Security hardening
**Build instructions**:
```bash
sudo ./setup.sh --build-ipxe
```
Or manually:
```bash
git clone https://github.com/ipxe/ipxe.git
cd ipxe/src
# BIOS bootloader
make bin/undionly.kpxe
# UEFI bootloader
make bin-x86_64-efi/ipxe.efi
# Copy to PXE server
sudo cp bin/undionly.kpxe /var/lib/pxe-boot/ipxe/
sudo cp bin-x86_64-efi/ipxe.efi /var/lib/pxe-boot/ipxe/
```
### HTTPS Boot (Secure Boot)
For enhanced security, serve boot images over HTTPS:
1. **Generate SSL certificate**:
```bash
sudo openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
-keyout /etc/ssl/private/pxe-server.key \
-out /etc/ssl/certs/pxe-server.crt
```
2. **Configure nginx for HTTPS** (uncomment HTTPS block in `http/nginx.conf`)
3. **Update boot.ipxe** to use `https://` URLs
4. **Rebuild iPXE with embedded certificate** (for secure boot without prompts)
### Multiple NixOS Versions
To support multiple NixOS versions for testing/rollback:
```
/var/lib/pxe-boot/nixos/
├── 24.05/
│ ├── bzImage
│ └── initrd
├── 24.11/
│ ├── bzImage
│ └── initrd
└── latest -> 24.11/ # Symlink to current version
```
Update `boot.ipxe` to use `/boot/nixos/latest/bzImage` or add menu items for version selection.
### Integration with BMC/IPMI
For fully automated provisioning:
1. **Discover new hardware** via IPMI/Redfish API
2. **Configure PXE boot** via IPMI:
```bash
ipmitool -I lanplus -H <bmc-ip> -U admin chassis bootdev pxe options=persistent
```
3. **Power on server**:
```bash
ipmitool -I lanplus -H <bmc-ip> -U admin power on
```
4. **Monitor via SOL** (serial-over-LAN)
### Monitoring and Metrics
Track PXE boot activity:
1. **DHCP leases**:
```bash
cat /var/lib/dhcp/dhcpd.leases
```
2. **HTTP access logs**:
```bash
sudo tail -f /var/log/nginx/access.log | grep -E "boot.ipxe|bzImage|initrd"
```
3. **Prometheus metrics** (if nginx-module-vts installed):
- Boot file download counts
- Bandwidth usage
- Response times
4. **Custom metrics endpoint**:
- Parse nginx access logs
- Count boots per profile
- Alert on failed boots
## Files and Directory Structure
```
baremetal/pxe-server/
├── README.md # This file
├── setup.sh # Setup and management script
├── nixos-module.nix # NixOS service module
├── dhcp/
│ └── dhcpd.conf # DHCP server configuration
├── ipxe/
│ ├── boot.ipxe # Main boot menu script
│ └── mac-mappings.txt # MAC address documentation
├── http/
│ ├── nginx.conf # HTTP server configuration
│ └── directory-structure.txt # Directory layout documentation
└── assets/ # (Created at runtime)
└── /var/lib/pxe-boot/
├── ipxe/
│ ├── undionly.kpxe
│ ├── ipxe.efi
│ └── boot.ipxe
└── nixos/
├── bzImage
└── initrd
```
## Next Steps
After completing the PXE server setup:
1. **T032.S3 - Image Builder**: Automated NixOS image generation with profile-specific configurations
2. **T032.S4 - Provisioning Orchestrator**: API-driven provisioning workflow and node lifecycle management
3. **Integration with IAM**: Authentication for provisioning API
4. **Integration with FlareDB**: Node inventory and state management
## References
- [iPXE Documentation](https://ipxe.org/)
- [ISC DHCP Documentation](https://www.isc.org/dhcp/)
- [NixOS Manual - Netboot](https://nixos.org/manual/nixos/stable/index.html#sec-building-netboot)
- [PXE Specification](https://www.intel.com/content/www/us/en/architecture-and-technology/intel-boot-executive.html)
## Support
For issues or questions:
- Check [Troubleshooting](#troubleshooting) section
- Review logs: `sudo journalctl -u dhcpd4 -u atftpd -u nginx -f`
- Run diagnostic: `sudo ./setup.sh --test`
## License
Part of Centra Cloud infrastructure - see project root for license information.