# PlasmaCloud NixOS Image Builder

This directory contains tools and configurations for building bootable NixOS netboot images for bare-metal provisioning of PlasmaCloud infrastructure.

## Overview

The NixOS Image Builder generates netboot images (kernel + initrd) that can be served via PXE/iPXE to provision bare-metal servers with PlasmaCloud services. These images integrate with the T024 NixOS service modules and the T032.S2 PXE boot infrastructure.

## Architecture

The image builder produces three deployment profiles:

### 1. Control Plane (`netboot-control-plane`)

Full control plane deployment with all 9 PlasmaCloud services:

- **Chainfire**: Distributed configuration and coordination
- **FlareDB**: Time-series metrics and events database
- **IAM**: Identity and access management
- **PlasmaVMC**: Virtual machine control plane
- **NovaNET**: Software-defined networking controller
- **FlashDNS**: High-performance DNS server
- **FiberLB**: Layer 4/7 load balancer
- **LightningStor**: Distributed block storage
- **K8sHost**: Kubernetes hosting component

**Use Cases**:
- Multi-node production clusters (3+ control plane nodes)
- High-availability deployments
- Separation of control and data planes

### 2. Worker (`netboot-worker`)

Compute-focused deployment for running tenant workloads:

- **PlasmaVMC**: Virtual machine control plane
- **NovaNET**: Software-defined networking

**Use Cases**:
- Worker nodes in multi-node clusters
- Dedicated compute capacity
- Scalable VM hosting

### 3. All-in-One (`netboot-all-in-one`)

Single-node deployment with all 9 services:

- All services from the Control Plane profile
- Optimized for single-node operation

**Use Cases**:
- Development/testing environments
- Small deployments (1-3 nodes)
- Edge locations
- Proof-of-concept installations

## Prerequisites

### Build Environment

- **NixOS** or the **Nix package manager** installed
- **Flakes** enabled in the Nix configuration
- **Git** access to the PlasmaCloud repository
- **Sufficient disk space**: ~10 GB for build artifacts

### Enable Nix Flakes

If flakes are not already enabled, add the following line to `/etc/nix/nix.conf` or `~/.config/nix/nix.conf`:

```
experimental-features = nix-command flakes
```

### Build Dependencies

The build process fetches all dependencies automatically. Make sure the build host has:

- A working internet connection (for the Nix binary cache)
- ~4 GB RAM minimum
- ~10 GB free disk space

## Build Instructions

### Quick Start

Build all profiles:

```bash
# Run from the repository root
cd baremetal/image-builder
./build-images.sh
```

Build a specific profile:

```bash
# Control plane only
./build-images.sh --profile control-plane

# Worker nodes only
./build-images.sh --profile worker

# All-in-one deployment
./build-images.sh --profile all-in-one
```

Custom output directory:

```bash
./build-images.sh --output-dir /srv/pxe/images
```

### Build Output

Each profile generates:

- `bzImage` - Linux kernel (~10-30 MB)
- `initrd` - Initial ramdisk (~100-300 MB)
- `netboot.ipxe` - iPXE boot script
- `build.log` - Build log for troubleshooting

Artifacts are placed in:

```
./artifacts/
├── control-plane/
│   ├── bzImage
│   ├── initrd
│   ├── netboot.ipxe
│   └── build.log
├── worker/
│   ├── bzImage
│   ├── initrd
│   ├── netboot.ipxe
│   └── build.log
└── all-in-one/
    ├── bzImage
    ├── initrd
    ├── netboot.ipxe
    └── build.log
```

### Manual Build Commands

You can also build images directly with Nix:

```bash
# Use separate out-links so the second build does not overwrite ./result

# Build initrd
nix build .#nixosConfigurations.netboot-control-plane.config.system.build.netbootRamdisk -o result-initrd

# Build kernel
nix build .#nixosConfigurations.netboot-control-plane.config.system.build.kernel -o result-kernel

# Access artifacts
ls -lh result-initrd/ result-kernel/
```

## Deployment

### Integration with PXE Server (T032.S2)

The build script automatically copies artifacts to the PXE server directory if it exists:

```
chainfire/baremetal/pxe-server/assets/nixos/
├── control-plane/
├── worker/
├── all-in-one/
├── bzImage-control-plane -> control-plane/bzImage
├── initrd-control-plane -> control-plane/initrd
├── bzImage-worker -> worker/bzImage
└── initrd-worker -> worker/initrd
```

### Manual Deployment

Copy artifacts to your PXE/HTTP server:

```bash
# Example: Deploy to nginx serving directory
sudo cp -r ./artifacts/control-plane /srv/pxe/nixos/
sudo cp -r ./artifacts/worker /srv/pxe/nixos/
sudo cp -r ./artifacts/all-in-one /srv/pxe/nixos/
```

### iPXE Boot Configuration

Reference the images in your iPXE boot script:

```ipxe
#!ipxe

set boot-server 10.0.0.2:8080

# NOTE: "*" in init=/nix/store/*/init is a placeholder. The kernel does not
# expand globs, so substitute the real store path of the system's init.

:control-plane
kernel http://${boot-server}/nixos/control-plane/bzImage init=/nix/store/*/init console=ttyS0,115200 console=tty0 loglevel=4
initrd http://${boot-server}/nixos/control-plane/initrd
boot

:worker
kernel http://${boot-server}/nixos/worker/bzImage init=/nix/store/*/init console=ttyS0,115200 console=tty0 loglevel=4
initrd http://${boot-server}/nixos/worker/initrd
boot
```

## Customization

### Adding Services

To add a service to a profile, edit the corresponding configuration:

```nix
# nix/images/netboot-control-plane.nix
environment.systemPackages = with pkgs; [
  chainfire-server
  flaredb-server
  # ... existing services ...
  my-custom-service  # Add your service
];
```

### Custom Kernel Configuration

Modify `nix/images/netboot-base.nix`:

```nix
boot.kernelPackages = pkgs.linuxPackages_6_6;  # Specific kernel version
boot.kernelModules = [ "my-driver" ];          # Additional modules
boot.kernelParams = [ "my-param=value" ];      # Additional kernel parameters
```

### Additional Packages

Add packages to the netboot environment:

```nix
# nix/images/netboot-base.nix
environment.systemPackages = with pkgs; [
  # ... existing packages ...

  # Your additions
  python3
  nodejs
  custom-tool  # placeholder for your own package
];
```

### Hardware-Specific Configuration

See `examples/hardware-specific.nix` for hardware-specific customizations.

## Troubleshooting

### Build Failures

**Symptom**: Build fails with Nix errors

**Solutions**:
1. Check the build log: `cat artifacts/PROFILE/build.log`
2. Verify that Nix flakes are enabled
3. Update the flake inputs (including nixpkgs): `nix flake update`
4. Free disk space by garbage-collecting the Nix store: `nix-collect-garbage -d` (this also deletes old profile generations)

### Missing Service Packages

**Symptom**: Error: "package not found"

**Solutions**:
1. Verify the service builds: `nix build .#chainfire-server`
2. List the flake outputs to confirm the package is exported: `nix flake show`
3. Rebuild all packages: `nix build .#default`

### Image Too Large

**Symptom**: Initrd > 500 MB

**Solutions**:
1. Remove unnecessary packages from `environment.systemPackages`
2. Disable documentation (already done in base config)
3. Review the kernel choice in `boot.kernelPackages`; note that `pkgs.linuxPackages_latest_hardened` is a hardened kernel rather than a minimal one and may not reduce the initrd size

### PXE Boot Fails

**Symptom**: Server fails to boot netboot image

**Solutions**:
1. Verify artifacts are accessible via HTTP
2. Check iPXE script syntax
3. Verify kernel parameters in the boot script
4. Check serial console output (ttyS0)
5. Ensure DHCP provides the correct boot server IP

### SSH Access Issues

**Symptom**: Cannot SSH to netboot installer

**Solutions**:
1. Replace the example SSH key in `nix/images/netboot-base.nix`
2. Verify network connectivity (DHCP, firewall)
3. Check that the SSH service is running: `systemctl status sshd`

## Configuration Reference

### Service Modules (T024 Integration)

All netboot profiles import PlasmaCloud service modules from `nix/modules/`:

- `chainfire.nix` - Chainfire configuration
- `flaredb.nix` - FlareDB configuration
- `iam.nix` - IAM configuration
- `plasmavmc.nix` - PlasmaVMC configuration
- `novanet.nix` - NovaNET configuration
- `flashdns.nix` - FlashDNS configuration
- `fiberlb.nix` - FiberLB configuration
- `lightningstor.nix` - LightningStor configuration
- `k8shost.nix` - K8sHost configuration

Services are **disabled by default** in netboot images and enabled in final installed configurations.

### Netboot Base Configuration

Located at `nix/images/netboot-base.nix`, it provides:

- SSH server with root access (key-based)
- Generic kernel with broad hardware support
- Disk management tools (disko, parted, cryptsetup, lvm2)
- Network tools (iproute2, curl, tcpdump)
- Serial console support (ttyS0, tty0)
- DHCP networking
- Minimal system configuration

### Profile Configurations

- `nix/images/netboot-control-plane.nix` - All 9 services
- `nix/images/netboot-worker.nix` - Compute services (PlasmaVMC, NovaNET)
- `nix/images/netboot-all-in-one.nix` - All services for single-node

## Security Considerations

### SSH Keys

**IMPORTANT**: The default SSH key in `netboot-base.nix` is an example placeholder. You MUST replace it with your actual provisioning key:

```nix
users.users.root.openssh.authorizedKeys.keys = [
  "ssh-ed25519 AAAAC3Nza... your-provisioning-key@host"
];
```

Generate a new key:

```bash
ssh-keygen -t ed25519 -C "provisioning@plasmacloud"
```

### Network Security

- Netboot images have the **firewall disabled** for the installation phase
- Use an isolated provisioning VLAN for PXE boot
- Implement a MAC address whitelist in DHCP
- Enable the firewall in final installed configurations

### Secrets Management

- Do NOT embed secrets in netboot images
- Use nixos-anywhere to inject secrets during installation
- Store secrets in `/etc/nixos/secrets/` on installed systems
- Use proper file permissions (0400 for keys)

## Next Steps

After building images:

1. **Deploy to PXE Server**: Copy artifacts to the HTTP server
2. **Configure DHCP/iPXE**: Set up boot infrastructure (see T032.S2)
3. **Prepare Node Configurations**: Create per-node configs for nixos-anywhere
4. **Test Boot Process**: Verify PXE boot on test hardware
5. **Run nixos-anywhere**: Install NixOS on target machines

## Resources

- **Design Document**: `docs/por/T032-baremetal-provisioning/design.md`
- **PXE Infrastructure**: `chainfire/baremetal/pxe-server/`
- **Service Modules**: `nix/modules/`
- **Example Configurations**: `baremetal/image-builder/examples/`

## Support

For issues or questions:

1. Check build logs: `artifacts/PROFILE/build.log`
2. Review the design document: `docs/por/T032-baremetal-provisioning/design.md`
3. Examine example configurations: `examples/`
4. Verify service module configuration: `nix/modules/`

## License

Apache 2.0 - See LICENSE file for details
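
## Appendix: Verifying Build Artifacts

The per-profile layout described under Build Output can be checked before copying anything to a PXE server. The following is a minimal sketch, not part of `build-images.sh`: the function name `check_artifacts` is hypothetical, and it assumes the default `./artifacts/<profile>/` layout with `bzImage`, `initrd`, and `netboot.ipxe` in each profile directory. Adjust the root path if you built with `--output-dir`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report any missing or empty netboot artifact.
# Usage: check_artifacts <artifacts-root> <profile>...
check_artifacts() {
  local root="$1"; shift
  local status=0 profile file
  for profile in "$@"; do
    for file in bzImage initrd netboot.ipxe; do
      # -s succeeds only if the file exists and is non-empty
      if [ ! -s "$root/$profile/$file" ]; then
        echo "MISSING: $root/$profile/$file"
        status=1
      fi
    done
  done
  # Non-zero return signals that at least one artifact is missing
  return "$status"
}
```

For example, `check_artifacts ./artifacts control-plane worker all-in-one && echo "all artifacts present"` prints the confirmation only when every expected file exists; otherwise it lists each missing path.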