# Nix-NOS Simplification Plan (2026-04-04)

## Summary

`nix-nos` should not remain a second cluster authoring surface.

Status update:

- `ultracloud.cluster` is now the only in-repo cluster authoring path
- `services.first-boot-automation` no longer has a `useNixNOS` mode
- root `flake.nix` no longer imports `nix-nos`
- topology-specific `nix-nos` files have been removed
- the remaining `nix-nos` tree is only network/BGP/routing primitives

The right plan is:

- keep `ultracloud.cluster` as the only cluster source of truth
- keep `nix-nos` only as a compatibility facade for older topology-driven flows
- eventually shrink `nix-nos` down to network primitives, or remove it entirely if those primitives are moved into the main Nix module tree

## Current State

Today the repo is already halfway through this transition.

- `nix/lib/cluster-schema.nix` is the actual schema/helper library
- `nix/modules/ultracloud-cluster.nix` generates:
  - per-node `cluster-config.json`
  - `nix-nos.clusters`
  - deployer cluster state
- `nix-nos/modules/topology.nix` no longer owns its own schema logic; it delegates to `cluster-schema.nix`
- `services.first-boot-automation` still has a `useNixNOS` path and still treats `nix-nos.generateClusterConfig` as a real config source

So the duplication is smaller than before, but the user-facing model is still confusing because there are still two apparent ways to describe a cluster.

## Recommendation

The recommended target is:

1. `ultracloud.cluster` is the only supported cluster authoring API.
2. `nix-nos` is explicitly legacy-compatibility only for topology consumers that have not been migrated yet.
3. `nix-nos` should stop presenting itself as a general cluster definition layer.
4. `first-boot-automation` should stop depending on `nix-nos` as a primary provider.

This keeps the repo simpler without forcing a big-bang removal.

## What Nix-NOS Should Still Own

Only keep the parts that are actually distinct:

- interface/VLAN primitives
- BGP primitives
- static routing primitives
- any truly reusable NOS-style networking submodules

These are valid low-level modules.

What `nix-nos` should not own anymore:

- whole-cluster source of truth
- bootstrap node selection rules
- cluster-config generation semantics
- host inventory / deployer state generation

Those belong in `ultracloud.cluster` and `cluster-schema.nix`.

## Target Shape

### Primary path

- user writes `ultracloud.cluster`
- `cluster-schema.nix` derives:
  - node cluster config
  - deployer cluster state
  - compatibility topology objects if needed

### Compatibility path

- `nix-nos` may still expose `clusters` and `generateClusterConfig`
- but they are documented and warned as legacy compatibility only
- ideally they become thin read-only views over `cluster-schema.nix`, not an authoring API

### First boot

`services.first-boot-automation` should eventually have only these modes:

- use generated UltraCloud cluster config
- use an explicit file path

It should not need a separate `useNixNOS` mode long-term.

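The two-mode target for first boot could be sketched as a small pure function. This is only an illustration, not the module's actual logic: the argument names `explicitPath` and `generatedConfig`, the shape of the returned attribute set, and the choice to let an explicit path override the generated config are all assumptions.

```nix
# Hypothetical helper: choose the cluster config source for first boot.
# Argument names and the returned shape are illustrative assumptions,
# not the real interface of `services.first-boot-automation`.
{ explicitPath ? null, generatedConfig ? null }:

if explicitPath != null then
  # Mode: use an explicit file path (assumed to take priority).
  { source = "file"; path = explicitPath; }
else if generatedConfig != null then
  # Mode: use the generated UltraCloud cluster config.
  { source = "ultracloud"; config = generatedConfig; }
else
  # There is deliberately no third `nix-nos` branch in this sketch.
  throw "first-boot-automation: no cluster config source; define ultracloud.cluster or set an explicit file path"
```

With only two branches, a node that has neither source fails at evaluation time instead of silently falling back to a legacy provider.
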
## Migration Plan

### Phase 1: Freeze

- do not add new functionality to `nix-nos.clusters`
- mark `nix-nos` topology usage as legacy in warnings/docs
- keep all schema changes in `cluster-schema.nix`

### Phase 2: Move first-boot off Nix-NOS

- make `services.first-boot-automation` prefer `ultracloud.cluster.generated.nodeClusterConfig`
- keep `nix-nos` only as fallback/compat, not as the preferred path
- stop using `useNixNOS` in normal tests/configurations

### Phase 3: Remove topology authoring role

- deprecate direct authoring of `nix-nos.clusters`
- remove `nix/modules/nix-nos/cluster-config-generator.nix`
- collapse any remaining direct topology generation onto `cluster-schema.nix`

### Phase 4: Decide final fate

Choose one:

- keep `nix-nos` as a small network-primitives library
- or move those network primitives under `nix/modules/network/*` and delete `nix-nos`

The first option is lower risk. The second is cleaner.

## Recommended Decision

Recommended decision:

- short term: keep `nix-nos`, but only as a compatibility/network-primitives layer
- medium term: remove `nix-nos` as a cluster authoring concept
- long term: either rename/rehome the remaining network modules, or delete `nix-nos` if nothing substantial remains

## Immediate Next Steps

1. Mark `nix-nos.clusters` and `services.first-boot-automation.useNixNOS` as legacy in evaluation warnings.
2. Reduce test usage so only one compatibility smoke test still exercises direct `nix-nos` authoring.
3. Change docs/examples to author clusters through `ultracloud.cluster` only.
4. After that, remove the standalone `cluster-config-generator.nix` path.
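
Step 1 could look roughly like the module below, which uses the standard NixOS `warnings` option to flag legacy usage at evaluation time. It is a sketch under assumptions rather than the repo's implementation: it assumes `nix-nos.clusters` is an attribute set keyed by cluster name and that `useNixNOS` is a boolean, and the message wording is made up.

```nix
# Hypothetical module sketch; the option shapes below are assumptions.
{ config, lib, ... }:

let
  # Assumed: `nix-nos.clusters` is an attrset keyed by cluster name.
  # `or` keeps evaluation working when the option is not declared at all.
  nixNosClusters = config.nix-nos.clusters or { };

  # Assumed: `useNixNOS` is a boolean option.
  usesNixNos = config.services.first-boot-automation.useNixNOS or false;
in
{
  # `warnings` is the NixOS option for non-fatal evaluation messages.
  warnings =
    lib.optional (nixNosClusters != { })
      "nix-nos.clusters is legacy compatibility only; author clusters through ultracloud.cluster."
    ++ lib.optional usesNixNos
      "services.first-boot-automation.useNixNOS is legacy; prefer the generated UltraCloud cluster config.";
}
```

Because these are warnings and not assertions, existing compatibility configurations keep building while the legacy paths are still in place, which fits the "no big-bang removal" goal.
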