# Bare Metal / MaaS-like Simplification Plan (2026-04-04)

## Summary

UltraCloud already has many of the right building blocks:

- `Nix` modules and flake outputs for host configuration
- `deployer` for bootstrap, enrollment, and inventory
- `nix-agent` for host OS reconciliation
- `fleet-scheduler` and `node-agent` for native service placement/runtime

The problem is not "missing everything". The problem is that the boundaries are still muddy:

- source of truth is duplicated
- install-time and runtime configuration are mixed together
- registration, inventory, credential issuance, and install-plan rendering are coupled
- bootstrap and scheduling are conceptually separate but still feel entangled in the repo

This document proposes a simpler target architecture for bare metal and MaaS-like provisioning, based on both the current repo and patterns used by existing systems.

## What Existing Systems Consistently Separate

### MAAS

Useful pattern:

- machine lifecycle is explicit
- commissioning, testing, deployment, release, rescue, and broken states are operator-visible
- registration/inventory is not the same thing as workload placement

Relevant docs:

- https://discourse.maas.io/t/about-maas/5511
- https://discourse.maas.io/t/machines-do-the-heavy-lifting/5080

### Ironic and Metal3

Useful pattern:

- enrollment, manageable, available, deploy, clean, rescue are explicit provisioning states
- inspection and cleaning are first-class lifecycle steps
- root device selection is modeled explicitly instead of relying on `/dev/sdX`

Relevant docs:

- https://docs.openstack.org/ironic/latest/install/enrollment.html
- https://book.metal3.io/bmo/automated_cleaning
- https://book.metal3.io/bmo/root_device_hints

### Tinkerbell

Useful pattern:

- hardware inventory, workflow/template, and install worker are separate concepts
- the installer environment is generic
- the workflow engine is distinct from hardware registration

Relevant docs:

- https://tinkerbell.org/docs/services/tink-worker/
- https://tinkerbell.org/docs/v0.22/services/tink-controller/

### Talos and Omni

Useful pattern:

- a minimal boot medium is used only to join management
- machine classes and labels drive config selection
- machine configuration is acquired over an API instead of being hard-coded into per-node install media

Relevant docs:

- https://omni.siderolabs.com/how-to-guides/registering-machines
- https://docs.siderolabs.com/talos/v1.10/overview/what-is-talos

### NixOS deployment tools

Useful pattern:

- installation and host rollout are separate concerns
- unattended install should be repeatable from declarative config
- activation needs timeout, health gate, and rollback semantics

Relevant docs:

- https://github.com/nix-community/nixos-anywhere
- https://github.com/serokell/deploy-rs
- https://colmena.cli.rs/0.4/reference/cli.html

## Current UltraCloud Pain Points

### 1. Source of truth is duplicated

Today the repo has overlapping schema and generation paths:

- `ultracloud.cluster` generates per-node cluster config, `nix-nos` topology, and deployer cluster state
- `nix-nos` still has its own cluster schema and `generateClusterConfig`
- `deployer-types::ClusterStateSpec` is another whole-cluster model on the Rust side

This makes it too easy to author the same concept twice.

Current references:

- `nix/modules/ultracloud-cluster.nix`
- `nix-nos/modules/topology.nix`
- `nix-nos/lib/cluster-config-lib.nix`
- `deployer/crates/deployer-types/src/lib.rs`

### 2. Install-time and runtime configuration are mixed

`NodeConfig` currently contains all of the following:

- hostname and IP
- labels, pool, node class
- services
- Nix profile
- install plan

That is too much for a single object. A bootstrap/install contract should not be the same object as runtime scheduling hints.

Current references:

- `deployer/crates/deployer-types/src/lib.rs`
- `deployer/crates/deployer-server/src/phone_home.rs`

### 3. `phone_home` is carrying too much responsibility

The current flow combines:

- machine identity lookup
- enrollment-rule matching
- node assignment
- inventory summarization
- cluster node record persistence
- SSH/TLS issuance
- install-plan return

This works, but it is difficult to reason about and difficult to evolve.

### 4. ISO bootstrap is still node-path oriented

The generic ISO still falls back to node-specific paths like:

- `nix/nodes/vm-cluster/$NODE_ID/disko.nix`

That prevents profile/class-based provisioning from becoming the main path.

### 5. Host rollout and runtime scheduling are separated in code but not in the mental model

The repo already has:

- `nix-agent` for host OS state
- host deployment reconciliation for writing `desired-system`
- `fleet-scheduler` for native service placement
- `node-agent` for process/container reconcile

These are the right components, but the naming and schema boundaries do not make the split obvious.

## Design Goal

The simplest viable target is not "build all of MAAS". It is:

1. `Nix` is the only authoring surface for static cluster intent.
2. Bootstrap deals only with discovery, assignment, credentials, and install plans.
3. Host rollout is a separate controller/agent path.
4. Service scheduling is entirely downstream of host rollout.
5. BMC and PXE are optional extensions, not required for the base design.

For your current 6-machine, no-BMC environment, this is the right scope.

## Proposed Target Model

### Layer 1: Static model in Nix

Create a single Nix library as the canonical schema. Do not create a fourth schema; promote the existing `cluster-config-lib` into the canonical one.

Recommended file:

- `nix/lib/cluster-schema.nix`

Practical migration:

- move or copy `nix-nos/lib/cluster-config-lib.nix` to `nix/lib/cluster-schema.nix`
- make `ultracloud-cluster.nix` and `nix-nos/modules/topology.nix` thin wrappers over it
- stop adding new schema logic anywhere else

This library should define only stable declarative objects:

- cluster
- networks
- install profiles
- disk policies
- node classes
- pools
- enrollment rules
- nodes
- host deployments
- service policies

From that one schema, generate these artifacts:

- `nixosConfigurations.<node>`
- bootstrap install-plan data
- deployer cluster-state JSON
- test-cluster topology
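
As a sketch of what that single authoring surface could look like — attribute and function names here are illustrative, not an existing schema:

```nix
# Hypothetical shape for nix/lib/cluster-schema.nix -- names are illustrative.
{ lib }:
{
  mkCluster =
    { name
    , networks ? { }
    , installProfiles ? { }
    , diskPolicies ? { }
    , nodeClasses ? { }
    , pools ? { }
    , enrollmentRules ? [ ]
    , nodes ? { }
    , hostDeployments ? { }
    , servicePolicies ? { }
    }:
    let
      cluster = {
        inherit name networks installProfiles diskPolicies nodeClasses
                pools enrollmentRules nodes hostDeployments servicePolicies;
      };
    in
    {
      inherit cluster;
      # Every downstream generator consumes the same object:
      clusterStateJson = builtins.toJSON cluster;  # deployer cluster state
      # nixosConfigurations, install-plan data, and the test-cluster
      # topology would be derived from `cluster` here as well.
    };
}
```

The key property is that the wrappers (`ultracloud-cluster.nix`, `topology.nix`) would only call `mkCluster` and pick outputs, never re-define fields.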

## Recommended Nix Object Split

### `installProfiles`

Purpose:

- reusable OS install targets
- used during discovery/bootstrap

Fields:

- flake attribute or system profile reference
- disk policy reference
- network policy reference
- bootstrap package set / image bundle reference

### `diskPolicies`

Purpose:

- stable root-disk selection
- avoid hardcoding `/dev/sda` or node-specific Disko paths

Fields:

- root device hints
- partition layout
- wipe/cleaning policy

Borrowed directly from Ironic/Metal3 thinking: disk choice must be modeled, not guessed.
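
A disk policy under this split might read like the following sketch; the field names are assumptions for illustration:

```nix
# Hypothetical diskPolicies entry -- field names are illustrative.
{
  diskPolicies.nvme-single = {
    # Root device hints, Ironic/Metal3 style: select by stable traits,
    # never by enumeration order like /dev/sda.
    rootDevice = {
      # Prefer an explicit by-id path when the exact hardware is known...
      byId = null;
      # ...otherwise match traits observed during discovery.
      hints = { rotational = false; minSizeGiB = 200; bus = "nvme"; };
    };
    # Reference a named Disko layout rather than a node-specific file.
    layout = "single-disk-ext4";
    # Cleaning policy: when is the installer allowed to wipe?
    wipe = "if-no-recognized-install";
  };
}
```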

### `nodeClasses`

Purpose:

- describe intended hardware/software role

Fields:

- install profile
- default labels
- runtime capabilities
- minimum hardware traits

### `enrollmentRules`

Purpose:

- match discovered machines to class/pool/labels

Fields:

- selectors on machine-id, MAC, DMI, disk traits, NIC traits
- assigned node class
- assigned pool
- optional hostname/node-id policy
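
For example, an enrollment rule in this shape could look like the sketch below; the selector and assignment fields are illustrative:

```nix
# Hypothetical enrollmentRules entry -- selectors are illustrative.
{
  enrollmentRules = [
    {
      # First matching rule wins; order the list from specific to general.
      match = {
        dmi.productName = "NUC11PAHi5";    # DMI trait from discovery
        nics.anyMacPrefix = "88:ae:dd";    # NIC trait
        disks.minCount = 1;                # disk trait
      };
      assign = {
        nodeClass = "worker";
        pool = "general";
        # Hostname policy derived server-side, not baked into install media.
        hostnamePattern = "worker-%02d";
      };
    }
  ];
}
```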

### `nodes`

Purpose:

- explicit identity for fixed nodes when you want them

Use this for:

- control plane seeds
- gateways
- special hardware

Do not require this for every worker in the generic path.

### `hostDeployments`

Purpose:

- roll out desired host OS state to already-installed machines

This is not bootstrap.

### `servicePolicies`

Purpose:

- runtime placement intent for `fleet-scheduler`

This is not host provisioning.

## Proposed Rust/API Object Split

Replace the current "fat" `NodeConfig` mental model with explicit smaller objects.

### `MachineInventory`

Owned by:

- bootstrap discovery

Contains:

- machine identity
- hardware facts
- last inventory hash
- boot method support
- optional power capability metadata

### `NodeAssignment`

Owned by:

- deployer enrollment logic

Contains:

- stable `node_id`
- hostname
- class
- pool
- labels
- failure domain

### `BootstrapSecrets`

Owned by:

- deployer credential issuer

Contains:

- SSH host key
- TLS cert/key
- bootstrap token or short-lived install token

### `InstallPlan`

Owned by:

- deployer plan renderer

Contains:

- node assignment reference
- install profile reference
- resolved flake attr or system reference
- resolved disk policy or root-device selection
- network bootstrap data
- image/bundle URL

### `DesiredSystem`

Owned by:

- host rollout controller

Contains:

- target system
- activation strategy
- health check
- rollback policy

### `ServiceSpec`

Owned by:

- runtime scheduler

Contains:

- service placement and instance policy only

It should not be returned by bootstrap APIs.
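
In Rust, the split might be sketched roughly as below. Field names follow the object lists above; the types are a sketch, not the real `deployer-types` definitions, and real versions would add serde derives and richer domain types:

```rust
// Hypothetical sketch of the split bootstrap payloads.

/// Stable identity and placement, produced by enrollment logic.
#[derive(Debug, Clone, PartialEq)]
pub struct NodeAssignment {
    pub node_id: String,
    pub hostname: String,
    pub class: String,
    pub pool: String,
    pub labels: Vec<(String, String)>,
    pub failure_domain: Option<String>,
}

/// Credentials issued per install, separate from assignment.
#[derive(Debug, Clone)]
pub struct BootstrapSecrets {
    pub ssh_host_key: String,
    pub tls_cert_pem: String,
    pub tls_key_pem: String,
    pub install_token: String, // short-lived
}

/// Everything the installer needs to lay down the OS -- and nothing else.
#[derive(Debug, Clone)]
pub struct InstallPlan {
    pub assignment_node_id: String,     // reference, not a copy
    pub install_profile: String,
    pub system_flake_attr: String,
    pub root_device: String,            // resolved, e.g. a /dev/disk/by-id path
    pub network_bootstrap: Vec<String>, // rendered network bootstrap data
    pub image_bundle_url: String,
}

/// The full bootstrap response is just the three parts together.
#[derive(Debug, Clone)]
pub struct BootstrapResponse {
    pub assignment: NodeAssignment,
    pub secrets: BootstrapSecrets,
    pub plan: InstallPlan,
}
```

Note what is absent: no services, no labels-for-scheduling semantics beyond the assignment, and no `ServiceSpec` — the bootstrap response stops at the installed OS.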

## Recommended Controller Split

### 1. Deployer server

Keep responsibility limited to:

- discovery
- enrollment / assignment
- inventory storage
- credential issuance
- install-plan rendering

Do not make it the host rollout engine, and do not make it the runtime scheduler.

### 2. Host deployment controller

Make this an explicit first-class component. Today that logic exists in `ultracloud-reconciler hosts`.

Responsibility:

- watch `HostDeployment`
- select nodes
- write `desired-system`
- respect rollout budget and drain policy

Recommendation:

- rename it conceptually to `host-controller`
- keep it separate from `fleet-scheduler`

### 3. `nix-agent`

This should borrow deploy-rs-style semantics:

- activation timeout
- confirmation/health gate
- rollback on failure
- staged reboot handling
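
The activation gate can be sketched as a pure decision function; `health_ok` stands in for a real probe (port check, unit status), and the exact semantics are an assumption modeled on deploy-rs, not current `nix-agent` behavior:

```rust
use std::time::Duration;

/// Outcome of activating a new system generation.
#[derive(Debug, PartialEq)]
pub enum ActivationOutcome {
    Confirmed,
    RolledBack(&'static str),
}

/// deploy-rs-style gate: activation must finish within the timeout and the
/// health check must pass, otherwise the previous generation is restored.
pub fn gate_activation(
    activation_took: Duration,
    timeout: Duration,
    health_ok: bool,
) -> ActivationOutcome {
    if activation_took > timeout {
        return ActivationOutcome::RolledBack("activation timed out");
    }
    if !health_ok {
        return ActivationOutcome::RolledBack("health gate failed");
    }
    // Only now is the new generation marked as the confirmed boot default.
    ActivationOutcome::Confirmed
}
```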

### 4. `fleet-scheduler`

Responsibility:

- service placement only

Do not allow bootstrap/install concerns to leak here.

## Recommended Bootstrap Flow

Keep one generic installer image, but make the protocol explicit.

### Step 1: discover

Installer boots and sends:

- machine identity
- hardware facts
- observed network facts

### Step 2: assign

Deployer resolves:

- class
- pool
- hostname/node-id
- install profile

### Step 3: fetch plan

Installer receives:

- `NodeAssignment`
- `BootstrapSecrets`
- `InstallPlan`

### Step 4: install

Installer:

- fetches source bundle
- resolves disk policy
- runs Disko
- installs NixOS
- reports status

### Step 5: first boot

Installed system starts:

- core static services
- `nix-agent`
- runtime agent only if needed for that class

This is closer to Tinkerbell and Talos than to the current monolithic `node_config` flow, while remaining much smaller than MAAS or Ironic.
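
The steps above can be sketched as an explicit client-side protocol; the trait and its method names are assumptions for illustration, not the current deployer API:

```rust
/// Hypothetical installer-side view of the bootstrap protocol.
pub trait BootstrapApi {
    /// Step 1: report identity + hardware/network facts, get a machine handle.
    fn discover(&self, facts: &str) -> Result<String, String>;
    /// Steps 2-3: the server resolves class/pool/profile and returns the plan.
    fn fetch_plan(&self, machine_id: &str) -> Result<String, String>;
    /// Step 4: report install progress/result back to the deployer.
    fn report(&self, machine_id: &str, status: &str) -> Result<(), String>;
}

/// Installer main loop: each phase is a separate, visible call.
pub fn run_install(api: &dyn BootstrapApi, facts: &str) -> Result<String, String> {
    let machine_id = api.discover(facts)?;   // Step 1
    let plan = api.fetch_plan(&machine_id)?; // Steps 2-3
    // Step 4 would fetch the bundle, resolve disks, run Disko, install NixOS.
    api.report(&machine_id, "installed")?;
    Ok(plan)                                 // Step 5 happens on first boot
}
```

Making the phases separate calls (or at least separate internal functions behind one endpoint) is what unwinds the `phone_home` coupling described earlier.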

## Recommended Lifecycle State Model

Adopt a visible state machine. At minimum:

- `discovered`
- `inspected`
- `commissioned`
- `install-pending`
- `installing`
- `installed`
- `active`
- `draining`
- `reprovisioning`
- `rescue`
- `failed`

Keep these orthogonal to:

- power state
- host rollout state
- runtime service health

This separation is important. MAAS and Ironic both benefit from not collapsing every concern into one state field.
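
A minimal sketch of that state machine, with the happy path plus a few recovery edges; the exact edge set here is an assumption to be refined, the point is that transitions are checked rather than implicit:

```rust
/// Machine lifecycle states, kept orthogonal to power/rollout/service state.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Lifecycle {
    Discovered,
    Inspected,
    Commissioned,
    InstallPending,
    Installing,
    Installed,
    Active,
    Draining,
    Reprovisioning,
    Rescue,
    Failed,
}

/// Whether a lifecycle transition is legal.
pub fn can_transition(from: Lifecycle, to: Lifecycle) -> bool {
    use Lifecycle::*;
    matches!(
        (from, to),
        (Discovered, Inspected)
            | (Inspected, Commissioned)
            | (Commissioned, InstallPending)
            | (InstallPending, Installing)
            | (Installing, Installed)
            | (Installing, Failed)
            | (Installed, Active)
            | (Active, Draining)
            | (Draining, Reprovisioning)
            | (Reprovisioning, InstallPending)
            | (Failed, Reprovisioning)
            | (_, Rescue)          // rescue is reachable from any state
            | (Rescue, Inspected)  // leaving rescue forces re-inspection
    )
}
```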

## Concrete Repo Changes Recommended

### Phase A: schema simplification

1. Promote `nix-nos/lib/cluster-config-lib.nix` into `nix/lib/cluster-schema.nix`.
2. Remove duplicated schema logic from `nix-nos/modules/topology.nix`.
3. Keep `ultracloud-cluster.nix` as an exporter/generator module, not a second schema definition.

### Phase B: bootstrap contract simplification

1. Deprecate `NodeConfig` as the primary bootstrap payload.
2. Introduce separate Rust types for:
   - assignment
   - bootstrap secrets
   - install plan
3. Keep the `phone_home` endpoint if desired, but split the implementation internally into separate phases/functions.

### Phase C: installer simplification

1. Remove node-specific fallback logic from `nix/iso/ultracloud-iso.nix`.
2. Require a resolved install profile or disk policy in the returned install plan.
3. Resolve disk targets using stable hints or explicit by-id paths.

### Phase D: controller clarification

1. Make the host rollout controller a named subsystem.
2. Document `nix-agent` as host OS reconcile only.
3. Document `fleet-scheduler` and `node-agent` as runtime-only.

### Phase E: operator UX

1. Add an inventory/commission view to `deployer-ctl`.
2. Make lifecycle transitions explicit.
3. Add reinstall/rescue flows that work even without BMC.

## What Not To Build Yet

Do not start with:

- a full MAAS clone
- full Ironic feature parity
- mandatory PXE
- mandatory BMC
- scheduler-driven bootstrap for all control-plane services

For the current environment, that would add complexity faster than value.

## Smallest Useful End State For The 6-PC Lab

The smallest useful design is:

- one generic ISO
- hardware discovery
- rule-based assignment to class/pool/profile
- explicit install plan
- stable disk policy
- first-boot `nix-agent`
- host rollout separate from runtime service scheduling

That gives you a MaaS-like system for real hardware without forcing MAAS-scale complexity into the repo.

## Immediate Next Design Tasks

1. Write `nix/lib/cluster-schema.nix` by extracting and renaming the existing cluster library.
2. Redesign the Rust bootstrap payloads around `NodeAssignment`, `BootstrapSecrets`, and `InstallPlan`.
3. Update the ISO to consume only the new install-plan contract.
4. Write a short architecture doc that shows the four control loops:
   - discovery/enrollment
   - installation
   - host rollout
   - runtime scheduling