Testing
UltraCloud treats VM-first validation as the canonical local proof path and keeps the public support contract limited to three profiles.
Canonical Profiles
| Profile | Primary outputs | Required components | Optional components |
|---|---|---|---|
| single-node dev | nix run .#single-node-quickstart, nixosConfigurations.single-node-quickstart, companion install image nixosConfigurations.netboot-all-in-one | chainfire, flaredb, iam, plasmavmc, prismnet | lightningstor, coronafs, flashdns, fiberlb, apigateway, nightlight, creditservice, k8shost, deployer |
| 3-node HA control plane | nixosConfigurations.node01, node02, node03, netboot-control-plane | chainfire, flaredb, iam, nix-agent on every control-plane node, plus deployer on the bootstrap node | fleet-scheduler, node-agent, prismnet, flashdns, fiberlb, plasmavmc, lightningstor, coronafs, k8shost, apigateway, nightlight, creditservice |
| bare-metal bootstrap | nixosConfigurations.ultracloud-iso, nixosConfigurations.baremetal-qemu-control-plane, nixosConfigurations.baremetal-qemu-worker, checks.x86_64-linux.baremetal-iso-e2e | deployer, first-boot-automation, install-target, nix-agent | netboot-control-plane, netboot-worker, and netboot-all-in-one as experimental helper images, plus node-agent, fleet-scheduler, and higher-level storage or edge services after bootstrap |
Quickstart Smoke
nix flake show . --all-systems | rg -n "single|all-in-one|quickstart"
nix eval --no-eval-cache .#nixosConfigurations.single-node-quickstart.config.system.build.toplevel.drvPath --raw
nix run .#single-node-quickstart
single-node-quickstart is the supported one-box entrypoint. It boots the minimal VM stack under QEMU, waits for chainfire, flaredb, iam, prismnet, and plasmavmc, and verifies their health from inside the guest. The launcher uses the generated NixOS VM runner, so it can fall back to TCG when /dev/kvm is absent.
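Whether the VM runs accelerated or under TCG depends on the host exposing a usable /dev/kvm, so a quick check before launching tells you which mode to expect. The following is a minimal sketch only: expected_accel is a hypothetical helper, not part of the repository, and the generated VM runner makes the real decision.

```shell
# Hypothetical helper, not part of the repository: report which accelerator
# the quickstart VM is likely to use on this host.
expected_accel() {
  # Optional first argument overrides the device path (useful for testing).
  kvm_dev="${1:-/dev/kvm}"
  # QEMU needs to open the device read-write to use KVM.
  if [ -r "$kvm_dev" ] && [ -w "$kvm_dev" ]; then
    echo kvm
  else
    echo tcg
  fi
}
```

Calling `expected_accel` with no arguments inspects /dev/kvm. A result of `tcg` suggests the smoke will still run, only more slowly.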
For debugging, keep the VM alive after the smoke passes:
ULTRACLOUD_QUICKSTART_KEEP_VM=1 nix run .#single-node-quickstart
Canonical Bare-Metal Proof
nix eval --no-eval-cache .#nixosConfigurations.baremetal-qemu-control-plane.config.system.build.toplevel.drvPath --raw
nix eval --no-eval-cache .#nixosConfigurations.baremetal-qemu-worker.config.system.build.toplevel.drvPath --raw
nix run ./nix/test-cluster#cluster -- baremetal-iso
nix build .#checks.x86_64-linux.baremetal-iso-e2e
baremetal-iso is the canonical install path for QEMU-as-bare-metal validation. It boots nixosConfigurations.ultracloud-iso, waits for /api/v1/phone-home, downloads the flake bundle from deployer, runs Disko, reboots, confirms the first post-install boot markers, and waits for nix-agent to report the desired system as active for both baremetal-qemu-control-plane and baremetal-qemu-worker. baremetal-iso-e2e runs the same flow under flake check.
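Most steps in this flow are "wait until a condition holds, or give up" checks. The sketch below shows that polling pattern in isolation. It is an illustration under assumptions, not the harness's actual code: wait_until is a hypothetical name, and the one-second interval is arbitrary.

```shell
# Hypothetical polling helper: retry a command once per second until it
# succeeds or the timeout (in seconds) elapses.
wait_until() {
  timeout_s="$1"
  shift
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      return 1   # condition never held within the timeout
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}
```

For example, `wait_until 600 test -e "$MARKER"` would block for up to ten minutes until a marker file appears, where MARKER is a placeholder path chosen by the caller.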
Regression Guards
nix build .#checks.x86_64-linux.canonical-profile-eval-guards
nix build .#checks.x86_64-linux.canonical-profile-build-guards
These two checks are the fast fail-first drift gates for the supported surface:
- canonical-profile-eval-guards: forces evaluation of every canonical profile output, including netboot-worker and netboot-all-in-one, so broken attrs fail before any long-running harness work starts.
- canonical-profile-build-guards: realizes the canonical VM, ISO, control-plane, and helper-image outputs so build-time drift is caught even when a cluster harness is not running.
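The fail-first ordering can be made explicit by chaining the two guards so the build guard only starts once evaluation has passed. This is a sketch, assuming you run it from the repository root. run_step, run_profile_guards, and the DRY_RUN variable are hypothetical names introduced here for illustration.

```shell
# Hypothetical wrapper: execute a command, or just print it when DRY_RUN is set.
run_step() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Run the cheap eval guard first; attempt the build guard only if it passes.
run_profile_guards() {
  run_step nix build '.#checks.x86_64-linux.canonical-profile-eval-guards' &&
    run_step nix build '.#checks.x86_64-linux.canonical-profile-build-guards'
}
```

`DRY_RUN=1 run_profile_guards` prints the two commands in order without invoking nix, which is handy for checking the sequence on a machine without the toolchain.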
Portable Local Proof
nix build .#checks.x86_64-linux.canonical-profile-eval-guards
nix build .#checks.x86_64-linux.portable-control-plane-regressions
Use this lane on Linux hosts that do not expose /dev/kvm:
- portable-control-plane-regressions: TCG-safe aggregate check that keeps the canonical profile eval guard, deployer-bootstrap-e2e, host-lifecycle-e2e, deployer-vm-smoke, and fleet-scheduler-e2e green together.
- It intentionally does not boot the six-node nested-KVM VM suite, so it is a developer regression path, not the publishable multi-node proof.
- CI runs canonical-profile-eval-guards and portable-control-plane-regressions on every relevant change from .github/workflows/nix.yml.
Publishable Checks
nix run .#single-node-quickstart
nix run ./nix/test-cluster#cluster -- baremetal-iso
nix run ./nix/test-cluster#cluster -- fresh-smoke
nix run ./nix/test-cluster#cluster -- fresh-demo-vm-webapp
nix run ./nix/test-cluster#cluster -- fresh-matrix
./nix/test-cluster/run-publishable-kvm-suite.sh ./work/publishable-kvm-suite
nix build .#checks.x86_64-linux.baremetal-iso-e2e
nix build .#checks.x86_64-linux.deployer-vm-smoke
Use these commands as the release-facing local proof set:
- single-node-quickstart: productized one-command quickstart gate for the minimal VM platform profile
- baremetal-iso: canonical bare-metal bootstrap gate covering pre-install boot, phone-home, flake bundle fetch, Disko install, reboot, post-install boot, and desired-system activation on one control-plane node plus one worker-equivalent node
- fresh-smoke: base VM-cluster gate for the canonical multi-node topology, including readiness, core behavior, and fault injection
- fresh-demo-vm-webapp: optional VM-hosting bundle proof for plasmavmc + prismnet with state persisted through lightningstor
- fresh-matrix: optional composition proof for provider bundles such as prismnet + flashdns + fiberlb and plasmavmc + coronafs + lightningstor
- run-publishable-kvm-suite.sh: reproducible wrapper that captures the KVM environment and runs the full publishable nested-KVM trio in a single command
- baremetal-iso-e2e: flake-check wrapper around the same canonical ISO harness
- deployer-vm-smoke: lightweight regression proving that nix-agent can activate a host-built target closure without guest-side compilation
The repository-owned remote entrypoint for the same publishable KVM proof is .github/workflows/kvm-publishable-selfhosted.yml. It targets Forgejo runners labeled nix-host and expects /dev/kvm plus nested virtualization on those hosts.
Responsibility Coverage
- baremetal-iso and baremetal-iso-e2e are the canonical proof for deployer -> installer -> nix-agent. They cover phone-home, install-plan materialization, Disko, reboot, and desired-system activation.
- deployer-vm-smoke is the smallest regression for the same deployer -> nix-agent boundary. It proves that a node can receive a prebuilt target closure and activate it without guest-side compilation.
- portable-control-plane-regressions keeps the main boundaries that can be exercised without KVM under continuous coverage by composing deployer-bootstrap-e2e, host-lifecycle-e2e, deployer-vm-smoke, and fleet-scheduler-e2e behind the canonical profile eval guard.
- fresh-smoke and fresh-matrix are the canonical proof for deployer -> fleet-scheduler -> node-agent. They cover native service placement, heartbeats, failover, and runtime reconciliation.
- fresh-smoke also covers k8shost separately from fleet-scheduler: k8shost exposes tenant pod and service semantics, while fleet-scheduler handles bare-metal host services.
The three fresh-* VM-cluster commands (fresh-smoke, fresh-demo-vm-webapp, and fresh-matrix) are the publishable nested-KVM suite. They require a Linux host with /dev/kvm and nested virtualization, and the harness stops at preflight by design when that device is absent. single-node-quickstart, baremetal-iso, baremetal-iso-e2e, deployer-vm-smoke, and portable-control-plane-regressions can run on TCG-only hosts, but they are slower without host KVM.
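To check a host by hand before starting the nested-KVM suite, something like the sketch below can help. It is not the harness's own preflight, whose exact checks are not documented here. kvm_suite_preflight is a hypothetical name, and the sketch assumes the usual Linux sysfs locations for the kvm_intel and kvm_amd nested parameter.

```shell
# Hypothetical manual preflight: succeed only if the KVM device exists and a
# KVM module reports nested virtualization as enabled.
kvm_suite_preflight() {
  # Both paths are overridable so the function can be exercised in tests.
  kvm_dev="${1:-/dev/kvm}"
  sys_modules="${2:-/sys/module}"
  if [ ! -e "$kvm_dev" ]; then
    echo "missing $kvm_dev" >&2
    return 1
  fi
  for mod in kvm_intel kvm_amd; do
    param="$sys_modules/$mod/parameters/nested"
    if [ -r "$param" ]; then
      # The kernel reports the flag as Y or 1 depending on module and version.
      case "$(cat "$param")" in
        Y|1) return 0 ;;
      esac
    fi
  done
  echo "nested virtualization does not appear to be enabled" >&2
  return 1
}
```

A non-zero exit here suggests falling back to the TCG-capable commands listed above instead of the fresh-* suite.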
Release-facing completion now requires both of these to be green on the same branch:
- the canonical bare-metal proof: nix run ./nix/test-cluster#cluster -- baremetal-iso plus nix build .#checks.x86_64-linux.baremetal-iso-e2e
- the publishable nested-KVM suite: fresh-smoke, fresh-demo-vm-webapp, and fresh-matrix, preferably through ./nix/test-cluster/run-publishable-kvm-suite.sh
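One way to keep yourself honest about "both green on the same branch" is to stamp each passing proof with the commit it ran against and compare the stamps afterwards. The sketch below is purely illustrative. record_proof, proofs_on_same_commit, and the stamp file names are hypothetical and not part of the repository, and comparing commits is a stricter reading than the branch-level requirement stated above.

```shell
# Hypothetical helper: after a proof passes, record the commit it ran on.
record_proof() {
  # $1: proof name (baremetal or kvm-suite), $2: stamp directory
  mkdir -p "$2" && git rev-parse HEAD > "$2/$1.rev"
}

# Hypothetical helper: succeed only if both stamps exist and name the same commit.
proofs_on_same_commit() {
  # $1: stamp directory
  [ -s "$1/baremetal.rev" ] && [ -s "$1/kvm-suite.rev" ] &&
    cmp -s "$1/baremetal.rev" "$1/kvm-suite.rev"
}
```

In practice you would call `record_proof baremetal "$STAMPS"` after the bare-metal commands succeed and `record_proof kvm-suite "$STAMPS"` after the nested-KVM suite succeeds, with STAMPS being any scratch directory, then gate on `proofs_on_same_commit "$STAMPS"`.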
Extended Measurements
nix run ./nix/test-cluster#cluster -- fresh-bench-storage
fresh-bench-storage remains useful for storage regression tracking, but it is a benchmark path, not part of the minimal canonical publish gate.
Operational Commands
nix run ./nix/test-cluster#cluster -- status
nix run ./nix/test-cluster#cluster -- logs node01
nix run ./nix/test-cluster#cluster -- ssh node04
nix run ./nix/test-cluster#cluster -- demo-vm-webapp
nix run ./nix/test-cluster#cluster -- serve-vm-webapp
nix run ./nix/test-cluster#cluster -- matrix
nix run ./nix/test-cluster#cluster -- bench-storage
nix run ./nix/test-cluster#cluster -- fresh-matrix
nix run ./nix/test-cluster#cluster -- fresh-bench-storage
nix run ./nix/test-cluster#cluster -- stop
nix run ./nix/test-cluster#cluster -- clean
Validation Philosophy
- package unit tests are useful but not sufficient
- host-built VM clusters are the main integration signal
- bootstrap and rollout paths must stay evaluable independently of the larger VM-hosting feature set
- distributed storage and virtualization paths must be checked under failure, not only at steady state
Legacy And Experimental Paths
- baremetal/vm-cluster manual launch scripts are legacy/manual, not canonical validation
- direct nix develop ./nix/test-cluster -c ./nix/test-cluster/run-cluster.sh ... usage is a debugging path, not the publishable entrypoint
- netboot-control-plane, netboot-worker, netboot-all-in-one, netboot-base, pxe-server, and other helper images are internal or experimental building blocks, not supported profiles by themselves