Nix-NOS Simplification Plan (2026-04-04)
Summary
nix-nos should not remain a second cluster authoring surface.
Status update:
- ultracloud.cluster is now the only in-repo cluster authoring path
- services.first-boot-automation no longer has a useNixNOS mode
- root flake.nix no longer imports nix-nos
- topology-specific nix-nos files have been removed
- the remaining nix-nos tree is only network/BGP/routing primitives
The right plan is:
- keep ultracloud.cluster as the only cluster source of truth
- keep nix-nos only as a compatibility facade for older topology-driven flows
- eventually shrink nix-nos down to network primitives, or remove it entirely if those primitives are moved into the main Nix module tree
Current State
Today the repo is already halfway through this transition.
- nix/lib/cluster-schema.nix is the actual schema/helper library
- nix/modules/ultracloud-cluster.nix generates:
  - per-node cluster-config.json
  - nix-nos.clusters
  - deployer cluster state
- nix-nos/modules/topology.nix no longer owns its own schema logic; it delegates to cluster-schema.nix
- services.first-boot-automation still has a useNixNOS path and still treats nix-nos.generateClusterConfig as a real config source
So the duplication is smaller than before, but the user-facing model is still confusing because there are still two apparent ways to describe a cluster.
Recommendation
The recommended target is:
- ultracloud.cluster is the only supported cluster authoring API.
- nix-nos is explicitly legacy-compatibility only for topology consumers that have not been migrated yet.
- nix-nos should stop presenting itself as a general cluster definition layer.
- first-boot-automation should stop depending on nix-nos as a primary provider.
This keeps the repo simpler without forcing a big-bang removal.
What Nix-NOS Should Still Own
Only keep the parts that are actually distinct:
- interface/VLAN primitives
- BGP primitives
- static routing primitives
- any truly reusable NOS-style networking submodules
These are valid low-level modules.
What nix-nos should not own anymore:
- whole-cluster source of truth
- bootstrap node selection rules
- cluster-config generation semantics
- host inventory / deployer state generation
Those belong in ultracloud.cluster and cluster-schema.nix.
Target Shape
Primary path
- user writes ultracloud.cluster
- cluster-schema.nix derives:
  - node cluster config
  - deployer cluster state
  - compatibility topology objects if needed
Compatibility path
- nix-nos may still expose clusters and generateClusterConfig
- but they are documented and warned as legacy compatibility only
- ideally they become thin read-only views over cluster-schema.nix, not an authoring API
First boot
services.first-boot-automation should eventually have only these modes:
- use generated UltraCloud cluster config
- use an explicit file path
It should not need a separate useNixNOS mode long-term.
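A minimal sketch of that two-mode shape, written as a NixOS module fragment. The option name clusterConfigFile, the /etc target path, and the assumption that ultracloud.cluster.generated.nodeClusterConfig is a JSON-serializable attribute set are illustrative guesses, not the module's actual interface.

```nix
{ config, lib, pkgs, ... }:

let
  cfg = config.services.first-boot-automation;

  # Assumption: the generated per-node config is a plain attribute set
  # that can be serialized with builtins.toJSON.
  generatedFile = pkgs.writeText "cluster-config.json"
    (builtins.toJSON config.ultracloud.cluster.generated.nodeClusterConfig);
in
{
  # Hypothetical option: null selects the generated UltraCloud config,
  # a path selects an explicit file. There is no nix-nos-backed mode.
  options.services.first-boot-automation.clusterConfigFile = lib.mkOption {
    type = lib.types.nullOr lib.types.path;
    default = null;
    description = "Explicit cluster config file; null uses the generated UltraCloud config.";
  };

  # Hypothetical target path, chosen only for illustration.
  config.environment.etc."ultracloud/cluster-config.json".source =
    if cfg.clusterConfigFile != null then cfg.clusterConfigFile else generatedFile;
}
```

Using a single nullable path option instead of a mode enum keeps the two cases mutually exclusive by construction: either a file is given, or the generated config is used.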
Migration Plan
Phase 1: Freeze
- do not add new functionality to nix-nos.clusters
- mark nix-nos topology usage as legacy in warnings/docs
- keep all schema changes in cluster-schema.nix
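One way the legacy marking could surface at evaluation time is through the standard NixOS warnings option. The fragment below is a sketch only: it assumes nix-nos.clusters is an attribute set that defaults to an empty set, which the plan does not confirm.

```nix
{ config, lib, ... }:

{
  # Assumption: nix-nos.clusters defaults to {} when nobody sets it,
  # so a non-empty value means a topology is defined somewhere.
  warnings = lib.optional (config.nix-nos.clusters != { }) ''
    nix-nos.clusters is legacy compatibility only.
    Author clusters through ultracloud.cluster instead.
  '';
}
```

This check cannot tell a hand-authored topology from one generated by ultracloud-cluster.nix, so a real implementation would need a narrower condition, for example one that looks at where the option was defined.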
Phase 2: Move first-boot off Nix-NOS
- make services.first-boot-automation prefer ultracloud.cluster.generated.nodeClusterConfig
- keep nix-nos only as fallback/compat, not as the preferred path
- stop using useNixNOS in normal tests/configurations
Phase 3: Remove topology authoring role
- deprecate direct authoring of nix-nos.clusters
- remove nix/modules/nix-nos/cluster-config-generator.nix
- collapse any remaining direct topology generation onto cluster-schema.nix
Phase 4: Decide final fate
Choose one:
- keep nix-nos as a small network-primitives library
- or move those network primitives under nix/modules/network/* and delete nix-nos
The first option is lower risk. The second is cleaner.
Recommended Decision
Recommended decision:
- short term: keep nix-nos, but only as a compatibility/network-primitives layer
- medium term: remove nix-nos as a cluster authoring concept
- long term: either rename/rehome the remaining network modules, or delete nix-nos if nothing substantial remains
Immediate Next Steps
- Mark nix-nos.clusters and services.first-boot-automation.useNixNOS as legacy in evaluation warnings.
- Reduce test usage so only one compatibility smoke test still exercises direct nix-nos authoring.
- Change docs/examples to author clusters through ultracloud.cluster only.
- After that, remove the standalone cluster-config-generator.nix path.