# PhotonCloud VM Test Cluster `nix/test-cluster` is the canonical local validation path for PhotonCloud. It boots six QEMU VMs, treats them as hardware-like nodes, and validates representative control-plane, worker, and gateway behavior over SSH and service endpoints. All VM images are built on the host in a single Nix invocation and then booted as prebuilt artifacts. The guests do not compile the stack locally. ## What it validates - 3-node control-plane formation for `chainfire`, `flaredb`, and `iam` - control-plane service health for `prismnet`, `flashdns`, `fiberlb`, `plasmavmc`, `lightningstor`, and `k8shost` - worker-node `plasmavmc` and `lightningstor` startup - PrismNet port binding for PlasmaVMC guests, including lifecycle cleanup on VM deletion - nested KVM inside worker VMs by booting an inner guest with `qemu-system-x86_64 -accel kvm` - gateway-node `apigateway`, `nightlight`, and minimal `creditservice` startup - host-forwarded access to the API gateway and NightLight HTTP surfaces - cross-node data replication smoke tests for `chainfire` and `flaredb` ## Validation layers - image build: build all six VM derivations on the host in one `nix build` - boot and unit readiness: boot the nodes in dependency order and wait for SSH plus the expected `systemd` units - protocol surfaces: probe the expected HTTP, TCP, UDP, and metrics endpoints for each role - replicated state: write and read convergence checks across the 3-node `chainfire` and `flaredb` clusters - worker virtualization: launch a nested KVM guest inside both worker VMs - external entrypoints: verify host-forwarded API gateway and NightLight access from outside the guest - auth-integrated minimal services: confirm `creditservice` stays up and actually connects to IAM ## Requirements - minimal host requirements: - Linux host with `/dev/kvm` - nested virtualization enabled on the host hypervisor - `nix` - if you do not use `nix run` or `nix develop`, install: - `qemu-system-x86_64` - `ssh` - `sshpass` - `curl` ## Main commands ```bash nix run ./nix/test-cluster#cluster -- build nix run ./nix/test-cluster#cluster -- start nix run ./nix/test-cluster#cluster -- smoke nix run ./nix/test-cluster#cluster -- fresh-smoke nix run ./nix/test-cluster#cluster -- matrix nix run ./nix/test-cluster#cluster -- fresh-matrix nix run ./nix/test-cluster#cluster -- bench-storage nix run ./nix/test-cluster#cluster -- fresh-bench-storage nix run ./nix/test-cluster#cluster -- validate nix run ./nix/test-cluster#cluster -- status nix run ./nix/test-cluster#cluster -- ssh node04 nix run ./nix/test-cluster#cluster -- stop nix run ./nix/test-cluster#cluster -- clean make cluster-smoke ``` Preferred entrypoint for publishable verification: `nix run ./nix/test-cluster#cluster -- fresh-smoke` `make cluster-smoke` is a convenience wrapper for the same clean host-build VM validation flow. `nix run ./nix/test-cluster#cluster -- matrix` reuses the current running cluster to exercise composed service scenarios such as `prismnet + flashdns + fiberlb`, PrismNet-backed VM hosting with `plasmavmc + prismnet + coronafs + lightningstor`, the Kubernetes-style hosting bundle, and API-gateway-mediated `nightlight` / `creditservice` flows. Preferred entrypoint for publishable matrix verification: `nix run ./nix/test-cluster#cluster -- fresh-matrix` `nix run ./nix/test-cluster#cluster -- bench-storage` benchmarks CoronaFS controller-export vs node-local-export I/O, worker-side materialization latency, and LightningStor large/small-object S3 throughput, then writes a report to `docs/storage-benchmarks.md`. Preferred entrypoint for publishable storage numbers: `nix run ./nix/test-cluster#cluster -- fresh-storage-bench` `nix run ./nix/test-cluster#cluster -- bench-coronafs-local-matrix` runs the local single-process CoronaFS export benchmark across the supported `cache`/`aio` combinations so software-path regressions can be separated from VM-lab network limits. On the current lab hosts, `cache=none` with `aio=io_uring` is the strongest local-export profile and should be treated as the reference point when CoronaFS remote numbers are being distorted by the nested-QEMU/VDE network path. ## Advanced usage Use the script entrypoint only for local debugging inside a prepared Nix shell: ```bash nix develop ./nix/test-cluster -c ./nix/test-cluster/run-cluster.sh smoke ``` For the strongest local check, use: ```bash nix develop ./nix/test-cluster -c ./nix/test-cluster/run-cluster.sh fresh-smoke ``` ## Runtime state The harness stores build links and VM runtime state under `${PHOTON_VM_DIR:-$HOME/.photoncloud-test-cluster}` for the default profile and uses profile-suffixed siblings such as `${PHOTON_VM_DIR:-$HOME/.photoncloud-test-cluster}-storage` for alternate build profiles. Logs for each VM are written to `//vm.log`. ## Scope note This harness is intentionally VM-first. Older ad hoc launch scripts under `baremetal/vm-cluster` are legacy/manual paths and should not be treated as the primary local validation entrypoint.