UltraCloud
UltraCloud is a Nix-first cloud platform workspace that assembles a small control plane, network services, VM hosting, shared storage, object storage, and gateway services into one reproducible repository.
The fastest public entrypoint is the one-command single-node quickstart. The canonical multi-node integration proof remains the six-node VM cluster under nix/test-cluster, which builds all guest images on the host, boots them as hardware-like QEMU nodes, and validates real multi-node behavior.
The canonical bare-metal bootstrap proof is the ISO-on-QEMU path under nix/test-cluster, which drives phone-home, Disko install, reboot, and desired-system convergence for one control-plane node and one worker-equivalent node.
Components
- chainfire: replicated coordination store
- flaredb: replicated KV and metadata store
- iam: identity, token issuance, and authorization
- prismnet: tenant networking control plane
- flashdns: authoritative DNS service
- fiberlb: load balancer control plane and dataplane
- plasmavmc: VM control plane and worker agents
- coronafs: shared filesystem for mutable VM volumes
- lightningstor: object storage and VM image backing
- k8shost: Kubernetes-style hosting control plane for tenant pods and services
- apigateway: external API and proxy surface
- nightlight: metrics ingestion and query service
- creditservice: minimal reference quota/credit service
- deployer: bootstrap and phone-home deployment service that owns install plans and desired-system intent
- fleet-scheduler: non-Kubernetes service scheduler for bare-metal cluster services
Quick Start
Single-node quickstart:
nix run .#single-node-quickstart
This app builds the minimal VM stack, boots a QEMU VM, waits for chainfire, flaredb, iam, prismnet, and plasmavmc, checks their health endpoints, and verifies the in-guest VM runtime prerequisites. For an interactive session, keep the VM running:
ULTRACLOUD_QUICKSTART_KEEP_VM=1 nix run .#single-node-quickstart
The legacy name .#all-in-one-quickstart is kept as an alias.
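The "wait for each service, then check its health endpoint" step described above follows a common retry pattern. The sketch below is only an illustration of that pattern, not the quickstart app's actual code: the `wait_for` helper and its arguments are hypothetical, and the README does not state the real health endpoint URLs, so the usage line leaves the URL as a placeholder variable.

```bash
# Hypothetical helper (not part of UltraCloud): retry a probe command until it
# succeeds or the attempt budget is exhausted.
#   $1 = maximum number of attempts
#   $2 = delay in seconds between attempts
#   remaining arguments = the probe command to run
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Succeed as soon as the probe exits with status 0.
    "$@" && return 0
    # Sleep only between attempts, not after the final failure.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

With the VM kept running, a call could look like `wait_for 30 2 curl -fsS "$HEALTH_URL"`, where `HEALTH_URL` is a placeholder for whichever endpoint a service actually exposes.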
Portable local proof on hosts without /dev/kvm:
nix build .#checks.x86_64-linux.canonical-profile-eval-guards
nix build .#checks.x86_64-linux.portable-control-plane-regressions
This TCG-safe lane keeps canonical profile drift, the core chainfire / deployer control-plane path, the deployer -> nix-agent boundary, and the fleet-scheduler -> node-agent boundary under regression coverage without requiring nested virtualization.
Publishable nested-KVM suite:
nix develop
nix run ./nix/test-cluster#cluster -- fresh-smoke
nix run ./nix/test-cluster#cluster -- fresh-demo-vm-webapp
nix run ./nix/test-cluster#cluster -- fresh-matrix
./nix/test-cluster/run-publishable-kvm-suite.sh ./work/publishable-kvm-suite
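Choosing between the portable lane and the nested-KVM suite depends on whether the host exposes /dev/kvm. A minimal sketch of that decision follows. The `select_validation_lane` helper is hypothetical and only prints the commands already listed in this README; treating "readable and writable device node" as the test for usable KVM is an assumption.

```bash
# Hypothetical helper (not part of UltraCloud): print the validation commands
# that fit this host. The optional first argument overrides the device path,
# which keeps the function testable on machines without KVM.
select_validation_lane() {
  kvm_device="${1:-/dev/kvm}"
  if [ -r "$kvm_device" ] && [ -w "$kvm_device" ]; then
    # Assumption: a readable and writable device node means KVM is usable.
    echo "./nix/test-cluster/run-publishable-kvm-suite.sh ./work/publishable-kvm-suite"
  else
    # TCG-safe lane for hosts without nested virtualization.
    echo "nix build .#checks.x86_64-linux.canonical-profile-eval-guards"
    echo "nix build .#checks.x86_64-linux.portable-control-plane-regressions"
  fi
}
```

Running `select_validation_lane` with no argument inspects /dev/kvm and prints the matching commands without executing them.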
Project-done release proof now requires both halves of the public validation surface to be green:
- baremetal-iso and baremetal-iso-e2e for the canonical deployer -> installer -> nix-agent bare-metal bootstrap path
- the KVM publishable suite (fresh-smoke, fresh-demo-vm-webapp, fresh-matrix) for the nested-KVM multi-node VM-hosting path
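The "both halves must be green" rule can be expressed as a small wrapper that runs every step and fails if any of them fails. This is a sketch under assumptions: `run_release_proof` is a hypothetical name, the step list is assembled from the commands shown in this README, and it assumes each command exits non-zero on failure.

```bash
# Hypothetical wrapper (not part of UltraCloud): run both halves of the
# release proof and report every failing step.
#   $1 = runner used to execute each step (default: "sh -c");
#        pass "echo" for a dry run that only prints the steps.
run_release_proof() {
  runner="${1:-sh -c}"
  failed=0
  for step in \
    "nix run ./nix/test-cluster#cluster -- baremetal-iso" \
    "nix build .#checks.x86_64-linux.baremetal-iso-e2e" \
    "nix run ./nix/test-cluster#cluster -- fresh-smoke" \
    "nix run ./nix/test-cluster#cluster -- fresh-demo-vm-webapp" \
    "nix run ./nix/test-cluster#cluster -- fresh-matrix"
  do
    # $runner is intentionally unquoted so "sh -c" splits into two words.
    if ! $runner "$step"; then
      echo "FAILED: $step" >&2
      failed=1
    fi
  done
  # Green only if every step of both halves succeeded.
  return "$failed"
}
```

`run_release_proof echo` prints the five steps without running them. Continuing after a failure, rather than stopping at the first one, is a design choice that lets a single run report every red step.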
Canonical bare-metal bootstrap proof:
nix run ./nix/test-cluster#cluster -- baremetal-iso
nix build .#checks.x86_64-linux.baremetal-iso-e2e
Canonical Profiles
UltraCloud now fixes the public support surface to three canonical profiles:
| Profile | Primary Nix outputs | Required components | Optional components |
|---|---|---|---|
| single-node dev | nix run .#single-node-quickstart, nixosConfigurations.single-node-quickstart, companion install image nixosConfigurations.netboot-all-in-one | chainfire, flaredb, iam, plasmavmc, prismnet | lightningstor, coronafs, flashdns, fiberlb, apigateway, nightlight, creditservice, k8shost, deployer |
| 3-node HA control plane | nixosConfigurations.node01, node02, node03, netboot-control-plane | chainfire, flaredb, iam, nix-agent on every control-plane node, plus deployer on the bootstrap node | fleet-scheduler, node-agent, prismnet, flashdns, fiberlb, plasmavmc, lightningstor, coronafs, k8shost, apigateway, nightlight, creditservice |
| bare-metal bootstrap | nixosConfigurations.ultracloud-iso, nixosConfigurations.baremetal-qemu-control-plane, nixosConfigurations.baremetal-qemu-worker, checks.x86_64-linux.baremetal-iso-e2e | deployer, first-boot-automation, install-target, nix-agent | netboot-control-plane, netboot-worker, and netboot-all-in-one as experimental helper images, plus node-agent, fleet-scheduler, and higher-level storage or edge services after bootstrap |
netboot-base is an internal helper image, not a public profile. netboot-control-plane, netboot-worker, and netboot-all-in-one remain experimental helper images until they implement the same phone-home and install semantics as the ISO path. Older launch flows under baremetal/vm-cluster are legacy/manual, not canonical.
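The required-components column lends itself to a simple completeness check. The sketch below transcribes that column into a lookup and reports which required components are missing from a given list. The function names and the short profile keys (`single-node`, `ha-control-plane`, `baremetal-bootstrap`) are hypothetical labels, not identifiers used by UltraCloud.

```bash
# Required components per canonical profile, transcribed from the table.
# Note: in the HA profile, deployer is required on the bootstrap node only.
required_components() {
  case "$1" in
    single-node)         echo "chainfire flaredb iam plasmavmc prismnet" ;;
    ha-control-plane)    echo "chainfire flaredb iam nix-agent deployer" ;;
    baremetal-bootstrap) echo "deployer first-boot-automation install-target nix-agent" ;;
    *) echo "unknown profile: $1" >&2; return 1 ;;
  esac
}

# Print the required components that are absent from the given list.
#   $1 = profile key; remaining arguments = components that are present.
# Exits 0 only when nothing is missing.
missing_components() {
  profile=$1
  shift
  required=$(required_components "$profile") || return 1
  missing=""
  for c in $required; do
    case " $* " in
      *" $c "*) ;;                   # component is present
      *) missing="$missing $c" ;;    # component is absent
    esac
  done
  echo "${missing# }"
  [ -z "$missing" ]
}
```

For example, `missing_components single-node chainfire iam prismnet` prints `flaredb plasmavmc` and returns a non-zero status.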
Responsibility Boundaries
- k8shost owns Kubernetes-style pod and service APIs for tenant workloads, then translates them into prismnet, flashdns, and fiberlb objects. It does not place host-native cluster daemons.
- fleet-scheduler owns placement and failover of host-native service instances from declarative cluster state. It consumes node-agent heartbeats and writes instance placement, but it does not expose tenant-facing Kubernetes semantics.
- deployer owns machine enrollment, /api/v1/phone-home, install plans, cluster metadata, and desired-system references. It decides what a node should become, but it does not execute the host-local switch.
- nix-agent owns host-local NixOS convergence only. It reads desired-system state from deployer or chainfire, activates the target closure, and rolls back on failed health checks.
- node-agent owns host-local runtime execution only. It reports heartbeats and applies scheduled service-instance state, but it does not install the base OS or rewrite desired-system targets.
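The nix-agent boundary (activate the target closure, then roll back if health checks fail) can be pictured as the control flow below. This is a generic sketch of the activate/probe/rollback pattern, not nix-agent's implementation: the `converge` function and its three command arguments are hypothetical.

```bash
# Hypothetical convergence step (not nix-agent's actual code).
#   $1 = command that activates the desired system
#   $2 = command that probes health after activation
#   $3 = command that restores the previous system
converge() {
  activate=$1
  health=$2
  rollback=$3
  # If activation itself fails, there is nothing new to roll back.
  if ! $activate; then
    echo "activation-failed"
    return 1
  fi
  # Keep the new system only when the health probe passes.
  if $health; then
    echo "converged"
    return 0
  fi
  $rollback
  echo "rolled-back"
  return 1
}
```

Passing `true` and `false` as stand-in commands exercises each branch: `converge true false true` prints `rolled-back`.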
Main Entrypoints
- workspace flake: flake.nix
- single-node quickstart smoke: nix run .#single-node-quickstart
- portable local proof: nix build .#checks.x86_64-linux.portable-control-plane-regressions
- canonical bare-metal bootstrap smoke: nix run ./nix/test-cluster#cluster -- baremetal-iso
- canonical profile guards: nix build .#checks.x86_64-linux.canonical-profile-eval-guards, nix build .#checks.x86_64-linux.canonical-profile-build-guards
- VM validation harness: nix/test-cluster/README.md
- shared volume notes: coronafs/README.md
- minimal quota-service rationale: creditservice/README.md
- legacy/manual VM launch scripts: baremetal/vm-cluster/README.md
Repository Guide
- docs/README.md: documentation entrypoint
- docs/testing.md: validation path summary
- docs/component-matrix.md: canonical profiles and optional bundles
- docs/storage-benchmarks.md: latest CoronaFS and LightningStor lab numbers
- plans/: design notes and exploration documents
Scope
UltraCloud is centered on reproducible infrastructure behavior rather than polished end-user product surfaces. Some services, such as creditservice, are intentionally minimal reference implementations that prove integration points rather than full products.
Host-level NixOS rollout validation is also expected to stay reproducible. baremetal-iso-e2e is now the full install-path proof. canonical-profile-eval-guards and canonical-profile-build-guards fail fast when supported outputs drift. portable-control-plane-regressions is the non-KVM developer lane: it keeps the main control-plane and rollout boundaries green on TCG-only hosts before the publishable nested-KVM suite is rerun.