# Testing

UltraCloud treats VM-first validation as the canonical local proof path and keeps the public support contract limited to three profiles.

## Canonical Profiles

| Profile | Primary outputs | Required components | Optional components |
| --- | --- | --- | --- |
| `single-node dev` | `nix run .#single-node-quickstart`, `nixosConfigurations.single-node-quickstart`, companion install image `nixosConfigurations.netboot-all-in-one` | `chainfire`, `flaredb`, `iam`, `plasmavmc`, `prismnet` | `lightningstor`, `coronafs`, `flashdns`, `fiberlb`, `apigateway`, `nightlight`, `creditservice`, `k8shost`, `deployer` |
| `3-node HA control plane` | `nixosConfigurations.node01`, `node02`, `node03`, `netboot-control-plane` | `chainfire`, `flaredb`, `iam`, `nix-agent` on every control-plane node, plus `deployer` on the bootstrap node | `fleet-scheduler`, `node-agent`, `prismnet`, `flashdns`, `fiberlb`, `plasmavmc`, `lightningstor`, `coronafs`, `k8shost`, `apigateway`, `nightlight`, `creditservice` |
| `bare-metal bootstrap` | `nixosConfigurations.ultracloud-iso`, `nixosConfigurations.baremetal-qemu-control-plane`, `nixosConfigurations.baremetal-qemu-worker`, `checks.x86_64-linux.baremetal-iso-e2e` | `deployer`, `first-boot-automation`, `install-target`, `nix-agent` | `netboot-control-plane`, `netboot-worker`, and `netboot-all-in-one` as experimental helper images, plus `node-agent`, `fleet-scheduler`, and higher-level storage or edge services after bootstrap |

## Quickstart Smoke

```bash
nix flake show . --all-systems | rg -n "single|all-in-one|quickstart"
nix eval --no-eval-cache .#nixosConfigurations.single-node-quickstart.config.system.build.toplevel.drvPath --raw
nix run .#single-node-quickstart
```

`single-node-quickstart` is the supported one-box entrypoint. It boots the minimal VM stack under QEMU, waits for `chainfire`, `flaredb`, `iam`, `prismnet`, and `plasmavmc`, and verifies their health from inside the guest. The launcher uses the generated NixOS VM runner, so it can fall back to TCG (QEMU's software emulation) when `/dev/kvm` is absent.

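Before launching, it can help to know which accelerator the run is likely to use. The following is a minimal sketch, not the launcher's own logic: `accel_mode` is a hypothetical helper that reports `kvm` when the given device node is a readable and writable character device, and `tcg` otherwise.

```bash
# Hypothetical pre-flight helper; not part of the UltraCloud launcher.
# Prints "kvm" if the device node is a usable character device, else "tcg".
accel_mode() {
  local dev="${1:-/dev/kvm}"   # device path is overridable, mainly for testing
  if [ -c "$dev" ] && [ -r "$dev" ] && [ -w "$dev" ]; then
    echo "kvm"
  else
    echo "tcg"
  fi
}
```

For example, `echo "expected accelerator: $(accel_mode)"` prints `tcg` on a host without `/dev/kvm`, which is a hint that the smoke will pass more slowly.
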
For debugging, keep the VM alive after the smoke passes:

```bash
ULTRACLOUD_QUICKSTART_KEEP_VM=1 nix run .#single-node-quickstart
```

## Canonical Bare-Metal Proof

```bash
nix eval --no-eval-cache .#nixosConfigurations.baremetal-qemu-control-plane.config.system.build.toplevel.drvPath --raw
nix eval --no-eval-cache .#nixosConfigurations.baremetal-qemu-worker.config.system.build.toplevel.drvPath --raw
nix run ./nix/test-cluster#cluster -- baremetal-iso
nix build .#checks.x86_64-linux.baremetal-iso-e2e
```

`baremetal-iso` is the canonical install path for QEMU-as-bare-metal validation. It boots `nixosConfigurations.ultracloud-iso`, waits for `/api/v1/phone-home`, downloads the flake bundle from `deployer`, runs Disko, reboots, confirms the first post-install boot markers, and waits for `nix-agent` to report the desired system as `active` for both `baremetal-qemu-control-plane` and `baremetal-qemu-worker`. `baremetal-iso-e2e` runs the same flow under `nix flake check`.

## Regression Guards

```bash
nix build .#checks.x86_64-linux.canonical-profile-eval-guards
nix build .#checks.x86_64-linux.canonical-profile-build-guards
```

These two checks are the fast fail-first drift gates for the supported surface:

- `canonical-profile-eval-guards`: forces evaluation of every canonical profile output, including `netboot-worker` and `netboot-all-in-one`, so broken attrs fail before any long-running harness work starts.
- `canonical-profile-build-guards`: realizes the canonical VM, ISO, control-plane, and helper-image outputs so build-time drift is caught even when a cluster harness is not running.

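Because both guards are meant to fail first, running them in order and stopping at the first failure is a natural local pattern. A minimal sketch, assuming a hypothetical `run_guards` helper and a hypothetical `DRY_RUN` variable (neither is part of the repository); the check names are the two listed above.

```bash
# Hypothetical helper: build the guard checks in order, stop at the first failure.
# DRY_RUN=1 (an assumption of this sketch) prints the commands instead of running them.
run_guards() {
  local check attr
  # The eval guard goes first because it fails before any build work starts.
  for check in canonical-profile-eval-guards canonical-profile-build-guards; do
    attr=".#checks.x86_64-linux.${check}"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "nix build ${attr}"
    else
      nix build "${attr}" || { echo "guard failed: ${check}" >&2; return 1; }
    fi
  done
}
```

Calling `DRY_RUN=1 run_guards` shows the two `nix build` invocations without executing them, which is handy for checking the attribute paths before a long run.
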
## Portable Local Proof

```bash
nix build .#checks.x86_64-linux.canonical-profile-eval-guards
nix build .#checks.x86_64-linux.portable-control-plane-regressions
```

Use this lane on Linux hosts that do not expose `/dev/kvm`:

- `portable-control-plane-regressions`: TCG-safe aggregate check that keeps the canonical profile eval guard, `deployer-bootstrap-e2e`, `host-lifecycle-e2e`, `deployer-vm-smoke`, and `fleet-scheduler-e2e` green together.
- It intentionally does not boot the six-node nested-KVM VM suite, so it is a developer regression path, not the publishable multi-node proof.
- CI runs `canonical-profile-eval-guards` and `portable-control-plane-regressions` on every relevant change from `.github/workflows/nix.yml`.

## Publishable Checks

```bash
nix run .#single-node-quickstart
nix run ./nix/test-cluster#cluster -- baremetal-iso
nix run ./nix/test-cluster#cluster -- fresh-smoke
nix run ./nix/test-cluster#cluster -- fresh-demo-vm-webapp
nix run ./nix/test-cluster#cluster -- fresh-matrix
./nix/test-cluster/run-publishable-kvm-suite.sh ./work/publishable-kvm-suite
nix build .#checks.x86_64-linux.baremetal-iso-e2e
nix build .#checks.x86_64-linux.deployer-vm-smoke
```

Use these commands as the release-facing local proof set:

- `single-node-quickstart`: productized one-command quickstart gate for the minimal VM platform profile
- `baremetal-iso`: canonical bare-metal bootstrap gate covering pre-install boot, phone-home, flake bundle fetch, Disko install, reboot, post-install boot, and desired-system activation on one control-plane node plus one worker-equivalent node
- `fresh-smoke`: base VM-cluster gate for the canonical multi-node topology, including readiness, core behavior, and fault injection
- `fresh-demo-vm-webapp`: optional VM-hosting bundle proof for `plasmavmc + prismnet` with state persisted through `lightningstor`
- `fresh-matrix`: optional composition proof for provider bundles such as `prismnet + flashdns + fiberlb` and `plasmavmc + coronafs + lightningstor`
- `run-publishable-kvm-suite.sh`: reproducible wrapper that captures the KVM environment and runs the full publishable nested-KVM trio in a single command
- `baremetal-iso-e2e`: flake-check wrapper around the same canonical ISO harness
- `deployer-vm-smoke`: lightweight regression proving that `nix-agent` can activate a host-built target closure without guest-side compilation

## Responsibility Coverage

- `baremetal-iso` and `baremetal-iso-e2e` are the canonical proof for `deployer -> installer -> nix-agent`. They cover phone-home, install-plan materialization, Disko, reboot, and desired-system activation.
- `deployer-vm-smoke` is the smallest regression for the same `deployer -> nix-agent` boundary. It proves that a node can receive a prebuilt target closure and activate it without guest-side compilation.
- `portable-control-plane-regressions` keeps the main boundaries that do not need KVM under continuous coverage by composing `deployer-bootstrap-e2e`, `host-lifecycle-e2e`, `deployer-vm-smoke`, and `fleet-scheduler-e2e` behind the canonical profile eval guard.
- `fresh-smoke` and `fresh-matrix` are the canonical proof for `deployer -> fleet-scheduler -> node-agent`. They cover native service placement, heartbeats, failover, and runtime reconciliation.
- `fresh-smoke` also covers `k8shost` separately from `fleet-scheduler`: `k8shost` exposes tenant pod and service semantics, while `fleet-scheduler` handles bare-metal host services.

The three `fresh-*` VM-cluster commands are the publishable nested-KVM suite. They require a Linux host with `/dev/kvm` and nested virtualization, and the harness stops at preflight by design when that device is absent. `single-node-quickstart`, `baremetal-iso`, `baremetal-iso-e2e`, `deployer-vm-smoke`, and `portable-control-plane-regressions` can run on TCG-only hosts, but they are slower without host KVM.

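The split between KVM-only and TCG-capable commands can be encoded as a small lookup, for example to skip the nested-KVM suite in a local script on hosts without `/dev/kvm`. This is a sketch only: `requires_kvm` is a hypothetical helper, and its two command lists are copied from the paragraph above.

```bash
# Hypothetical helper: exit 0 when the named command needs host KVM,
# exit 1 when it can run on TCG-only hosts, exit 2 for unknown names.
requires_kvm() {
  case "$1" in
    fresh-smoke|fresh-demo-vm-webapp|fresh-matrix)
      return 0 ;;  # publishable nested-KVM suite
    single-node-quickstart|baremetal-iso|baremetal-iso-e2e|deployer-vm-smoke|portable-control-plane-regressions)
      return 1 ;;  # runs under TCG, only slower
    *)
      echo "unknown command: $1" >&2
      return 2 ;;
  esac
}
```

A caller might guard a run with `if requires_kvm fresh-smoke && [ ! -e /dev/kvm ]; then echo "skipping fresh-smoke"; fi`. The harness already stops at preflight on its own, so a check like this only saves the startup time.
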
Release-facing completion now requires both of these to be green on the same branch:

- the canonical bare-metal proof: `nix run ./nix/test-cluster#cluster -- baremetal-iso` plus `nix build .#checks.x86_64-linux.baremetal-iso-e2e`
- the publishable nested-KVM suite: `fresh-smoke`, `fresh-demo-vm-webapp`, and `fresh-matrix`, preferably through `./nix/test-cluster/run-publishable-kvm-suite.sh`

## Extended Measurements

```bash
nix run ./nix/test-cluster#cluster -- fresh-bench-storage
```

`fresh-bench-storage` remains useful for storage regression tracking, but it is a benchmark path, not part of the minimal canonical publish gate.

## Operational Commands

```bash
nix run ./nix/test-cluster#cluster -- status
nix run ./nix/test-cluster#cluster -- logs node01
nix run ./nix/test-cluster#cluster -- ssh node04
nix run ./nix/test-cluster#cluster -- demo-vm-webapp
nix run ./nix/test-cluster#cluster -- serve-vm-webapp
nix run ./nix/test-cluster#cluster -- matrix
nix run ./nix/test-cluster#cluster -- bench-storage
nix run ./nix/test-cluster#cluster -- fresh-matrix
nix run ./nix/test-cluster#cluster -- fresh-bench-storage
nix run ./nix/test-cluster#cluster -- stop
nix run ./nix/test-cluster#cluster -- clean
```

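Since every operational command shares the same `nix run ./nix/test-cluster#cluster --` prefix, a shell function can shorten interactive use. This is a convenience sketch only; the `cluster` function name is an assumption and is not defined by the repository.

```bash
# Hypothetical convenience wrapper; forwards all arguments to the cluster app.
# Run it from the repository root so the relative flake path resolves.
cluster() {
  nix run ./nix/test-cluster#cluster -- "$@"
}
```

With the function defined in the current shell, `cluster status` and `cluster logs node01` expand to the first two commands in the block above.
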
## Validation Philosophy

- package unit tests are useful but not sufficient
- host-built VM clusters are the main integration signal
- bootstrap and rollout paths must stay evaluable independently of the larger VM-hosting feature set
- distributed storage and virtualization paths must be checked under failure, not only at steady state

## Legacy And Experimental Paths

- `baremetal/vm-cluster` manual launch scripts are `legacy/manual`, not canonical validation
- direct `nix develop ./nix/test-cluster -c ./nix/test-cluster/run-cluster.sh ...` usage is a debugging path, not the publishable entrypoint
- `netboot-control-plane`, `netboot-worker`, `netboot-all-in-one`, `netboot-base`, `pxe-server`, and other helper images are internal or experimental building blocks, not supported profiles by themselves