# PhotonCloud VM Test Cluster

`nix/test-cluster` is the canonical local validation path for PhotonCloud.
It boots six QEMU VMs, treats them as hardware-like nodes, and validates representative control-plane, worker, and gateway behavior over SSH and service endpoints.

All VM images are built on the host in a single Nix invocation and then booted as prebuilt artifacts. The guests do not compile the stack locally.
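The build-then-boot split means each node is just QEMU pointed at a prebuilt disk image. A minimal sketch of what such an invocation might look like follows; the image path, memory size, port number, and flags here are illustrative assumptions, not the harness's real invocation (see `run-cluster.sh` for the actual flags):

```sh
# Hypothetical sketch: assemble a headless QEMU command for one prebuilt
# node image. All values below are assumptions for illustration.
build_qemu_cmd() {
  node_image="$1"
  ssh_port="$2"
  echo "qemu-system-x86_64 -enable-kvm -m 4096 -smp 2" \
       "-drive file=${node_image},if=virtio,format=qcow2" \
       "-netdev user,id=net0,hostfwd=tcp::${ssh_port}-:22" \
       "-device virtio-net-pci,netdev=net0 -nographic"
}

# Example: node01 with SSH forwarded to host port 2201 (assumed layout).
cmd=$(build_qemu_cmd "$HOME/.photoncloud-test-cluster/node01/disk.qcow2" 2201)
echo "$cmd"
```

The `hostfwd` forward is what lets the harness reach each guest's SSH and service endpoints from the host.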
## What it validates

- 3-node control-plane formation for `chainfire`, `flaredb`, and `iam`
- control-plane service health for `prismnet`, `flashdns`, `fiberlb`, `plasmavmc`, `lightningstor`, and `k8shost`
- worker-node `plasmavmc` and `lightningstor` startup
- PrismNet port binding for PlasmaVMC guests, including lifecycle cleanup on VM deletion
- nested KVM inside worker VMs by booting an inner guest with `qemu-system-x86_64 -accel kvm`
- gateway-node `apigateway`, `nightlight`, and minimal `creditservice` startup
- host-forwarded access to the API gateway and NightLight HTTP surfaces
- cross-node data replication smoke tests for `chainfire` and `flaredb`
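The nested-KVM check above boils down to one question: is a working `/dev/kvm` visible inside the worker guest? A sketch of that probe, with the command injectable so the control flow can be exercised anywhere (the default probe and any SSH wrapping are assumptions, not the harness's exact check):

```sh
# Sketch of a nested-KVM probe. The real harness boots an inner guest with
# `qemu-system-x86_64 -accel kvm`; this only models the pass/fail decision.
check_nested_kvm() {
  # Default probe (assumed): a writable /dev/kvm inside the guest.
  probe="${1:-test -w /dev/kvm}"
  if sh -c "$probe" >/dev/null 2>&1; then
    echo "kvm-ok"
  else
    echo "kvm-missing"
  fi
}
```

In the cluster this would run over SSH, e.g. `check_nested_kvm "ssh root@node04 test -w /dev/kvm"` (hostname assumed for illustration).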
## Validation layers

- image build: build all six VM derivations on the host in one `nix build`
- boot and unit readiness: boot the nodes in dependency order and wait for SSH plus the expected `systemd` units
- protocol surfaces: probe the expected HTTP, TCP, UDP, and metrics endpoints for each role
- replicated state: write and read convergence checks across the 3-node `chainfire` and `flaredb` clusters
- worker virtualization: launch a nested KVM guest inside both worker VMs
- external entrypoints: verify host-forwarded API gateway and NightLight access from outside the guest
- auth-integrated minimal services: confirm `creditservice` stays up and actually connects to IAM
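The "boot and wait" layers all reduce to the same retry pattern: poll a probe until it succeeds or a budget runs out. A minimal sketch of such a readiness loop (the function name and retry policy are assumptions, not code from `run-cluster.sh`):

```sh
# Generic readiness poll: retry a probe command up to N times, one second
# apart, reporting success or timeout. Illustrative, not the harness's code.
wait_for() {
  desc="$1"
  tries="$2"
  shift 2
  i=0
  while [ "$i" -lt "$tries" ]; do
    if "$@"; then
      echo "ready: $desc"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "timeout: $desc" >&2
  return 1
}
```

Usage would look like `wait_for "node04 ssh" 60 ssh -o ConnectTimeout=2 root@node04 true`, with the same shape reused for systemd-unit and HTTP-endpoint probes.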
## Requirements

- minimal host requirements:
  - Linux host with `/dev/kvm`
  - nested virtualization enabled on the host hypervisor
  - `nix`
- if you do not use `nix run` or `nix develop`, install: `qemu-system-x86_64`, `ssh`, `sshpass`, `curl`
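A quick preflight for the non-Nix tool list can be sketched as a `command -v` sweep (this helper is illustrative, not part of the harness):

```sh
# Report which of the given tools are missing from PATH. Returns non-zero
# and lists the gaps on stderr if any tool is absent.
check_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  if [ -z "$missing" ]; then
    echo "ok"
  else
    echo "missing:$missing" >&2
    return 1
  fi
}

# e.g. check_tools qemu-system-x86_64 ssh sshpass curl
```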
## Main commands

```sh
nix run ./nix/test-cluster#cluster -- build
nix run ./nix/test-cluster#cluster -- start
nix run ./nix/test-cluster#cluster -- smoke
nix run ./nix/test-cluster#cluster -- fresh-smoke
nix run ./nix/test-cluster#cluster -- matrix
nix run ./nix/test-cluster#cluster -- fresh-matrix
nix run ./nix/test-cluster#cluster -- bench-storage
nix run ./nix/test-cluster#cluster -- fresh-bench-storage
nix run ./nix/test-cluster#cluster -- validate
nix run ./nix/test-cluster#cluster -- status
nix run ./nix/test-cluster#cluster -- ssh node04
nix run ./nix/test-cluster#cluster -- stop
nix run ./nix/test-cluster#cluster -- clean
make cluster-smoke
```
Preferred entrypoint for publishable verification: `nix run ./nix/test-cluster#cluster -- fresh-smoke`.

`make cluster-smoke` is a convenience wrapper for the same clean host-build VM validation flow.
`nix run ./nix/test-cluster#cluster -- matrix` reuses the current running cluster to exercise composed service scenarios such as `prismnet` + `flashdns` + `fiberlb`, PrismNet-backed VM hosting with `plasmavmc` + `prismnet` + `coronafs` + `lightningstor`, the Kubernetes-style hosting bundle, and API-gateway-mediated `nightlight` / `creditservice` flows.

Preferred entrypoint for publishable matrix verification: `nix run ./nix/test-cluster#cluster -- fresh-matrix`.

`nix run ./nix/test-cluster#cluster -- bench-storage` benchmarks CoronaFS controller-export vs node-local-export I/O, worker-side materialization latency, and LightningStor large/small-object S3 throughput, then writes a report to `docs/storage-benchmarks.md`.
Preferred entrypoint for publishable storage numbers: `nix run ./nix/test-cluster#cluster -- fresh-bench-storage`.

`nix run ./nix/test-cluster#cluster -- bench-coronafs-local-matrix` runs the local single-process CoronaFS export benchmark across the supported cache/aio combinations so software-path regressions can be separated from VM-lab network limits.

On the current lab hosts, `cache=none` with `aio=io_uring` is the strongest local-export profile and should be treated as the reference point when CoronaFS remote numbers are distorted by the nested-QEMU/VDE network path.
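The local matrix is just a sweep over cache and aio settings. A sketch of that enumeration follows; only the `cache=none` + `aio=io_uring` pairing is named in this document, so the other values listed are illustrative assumptions about what "supported combinations" might include:

```sh
# Hypothetical cache/aio sweep. Only cache=none,aio=io_uring is confirmed
# by the docs; the remaining values are placeholders for illustration.
combos=""
for cache in none writeback writethrough; do
  for aio in io_uring native threads; do
    combos="$combos cache=$cache,aio=$aio"
  done
done
echo "$combos"
```

Each combination would map to QEMU-style `-drive cache=...,aio=...` options on the export under test.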
## Advanced usage

Use the script entrypoint only for local debugging inside a prepared Nix shell:

```sh
nix develop ./nix/test-cluster -c ./nix/test-cluster/run-cluster.sh smoke
```

For the strongest local check, use:

```sh
nix develop ./nix/test-cluster -c ./nix/test-cluster/run-cluster.sh fresh-smoke
```
## Runtime state

The harness stores build links and VM runtime state under `${PHOTON_VM_DIR:-$HOME/.photoncloud-test-cluster}` for the default profile, and uses profile-suffixed siblings such as `${PHOTON_VM_DIR:-$HOME/.photoncloud-test-cluster}-storage` for alternate build profiles.

Logs for each VM are written to `<state-dir>/<node>/vm.log`.
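The state-directory convention above can be sketched as a small resolver; the logic here is inferred from the naming pattern in this README, not copied from `run-cluster.sh`:

```sh
# Resolve the state directory for a build profile. "default" maps to the
# base directory; anything else gets a "-<profile>" suffix (assumed rule).
state_dir() {
  profile="${1:-default}"
  base="${PHOTON_VM_DIR:-$HOME/.photoncloud-test-cluster}"
  if [ "$profile" = "default" ]; then
    echo "$base"
  else
    echo "${base}-${profile}"
  fi
}
```

With `PHOTON_VM_DIR` unset, `state_dir storage` would yield `$HOME/.photoncloud-test-cluster-storage`, matching the `-storage` sibling described above.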
## Scope note

This harness is intentionally VM-first. Older ad hoc launch scripts under `baremetal/vm-cluster` are legacy/manual paths and should not be treated as the primary local validation entrypoint.