# Lightscale TODO
This list captures every feature discussed in the design conversation and tracks its
implementation status. Items marked `[x]` are implemented, at least in a minimal form;
unchecked items are pending or stubbed.
## Control plane
- [x] Network creation + enrollment tokens
- [x] Deterministic overlay prefixes (IPv4/IPv6)
- [x] Netmap with node/peer metadata
- [x] Relay config surfaced in netmap (STUN/TURN/stream relay/UDP relay lists)
- [x] Device naming + DNS name metadata in netmap
- [x] Admin approval flow for new devices
- [x] Auth URL onboarding flow (approval link)
- [x] Token revocation endpoint
- [x] ACL / policy rules per network
- [x] Key rotation policy and revocation
- [x] Key transparency / audit log
- [x] TLS pinning / server identity bootstrapping
- [x] Control plane HA with shared DB (multi-server, no single point of failure)
- [x] Client failover across multiple control URLs
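
As an illustration of the "deterministic overlay prefixes" item, here is a minimal sketch of one way to derive stable prefixes from a network ID. It is not taken from the Lightscale code: the function name `overlay_prefixes`, the use of SHA-256, the `100.64.0.0/10` IPv4 pool, and the ULA `fd00::/8` IPv6 pool are all assumptions chosen for the example.

```python
import hashlib
import ipaddress


def overlay_prefixes(network_id: str):
    """Derive a stable (IPv4 /24, IPv6 /48) pair from a network ID.

    Hypothetical scheme: the pools and hash are assumptions for
    illustration, not Lightscale's actual allocation logic.
    """
    digest = hashlib.sha256(network_id.encode("utf-8")).digest()

    # IPv6: ULA prefix byte 0xfd followed by a 40-bit global ID -> /48.
    global_id = int.from_bytes(digest[:5], "big")
    v6_int = (0xFD << 120) | (global_id << 80)
    v6 = ipaddress.IPv6Network((v6_int, 48))

    # IPv4: choose one of the 2**14 /24 subnets inside 100.64.0.0/10.
    index = int.from_bytes(digest[5:7], "big") & 0x3FFF
    base = int(ipaddress.IPv4Address("100.64.0.0"))
    v4 = ipaddress.IPv4Network((base + (index << 8), 24))

    return v4, v6
```

The same network ID always yields the same pair, so any control server can recompute the prefixes without coordination. Two different IDs can still collide in the IPv4 pool, so a real allocator would need a uniqueness check.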
## Data plane
- [x] WireGuard interface bring-up (Linux)
- [x] Userspace WireGuard fallback (boringtun embedded)
- [x] Peer config from netmap (AllowedIPs per peer)
- [x] Basic route application (`wg-up --apply-routes`)
- [x] Basic exit-node acceptance flag (first advertised default route)
- [x] Dynamic peer updates (long-polling or streaming)
- [x] Subnet routing conflict detection and metrics
- [x] Exit node selection and routing policy (multi-exit, per-app, metrics)
- [x] Exit node selection by peer ID/name (single)
- [x] DNS push / resolver integration
- [x] L2 segment support (optional, non-default)
- [x] Subnet router SNAT mode + return-route guidance
## NAT traversal and relay
- [x] STUN client to discover public endpoints (best effort)
- [x] Server-observed endpoint merge (heartbeat listen port)
- [x] Peer probe to trigger NAT traversal (best effort)
- [x] Dynamic endpoint rotation on stale handshakes
- [x] UDP relay (best effort, not TURN/stream relay)
- [x] Stream relay (TCP, DERP-like)
- [x] Stream relay signaling for peer probes
- [x] TURN client (UDP)
- [x] Stream relay integration into dataplane (fallback tunnels)
- [x] IPv6-only server strategy (use IPv6 listen + IPv6 control URLs)
## Multi-network (one client, multiple networks)
- [x] Profile-scoped client state
- [x] Concurrent multi-network routing isolation
- [x] Default no-forwarding between networks (prevent accidental routing)
- [x] Route conflict detection for overlapping subnets
- [x] Route translation / prefix mapping for overlaps
- [x] Per-network DNS suffix + split DNS
- [x] Exit node selection when multiple networks advertise default routes
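
The "route conflict detection for overlapping subnets" item can be sketched as a pairwise overlap check across networks. This is only an illustration under assumed names: `find_route_conflicts` and the input shape (a mapping from network name to advertised CIDRs) are hypothetical and not taken from the Lightscale code.

```python
import ipaddress
from itertools import combinations


def find_route_conflicts(routes):
    """Return overlapping routes advertised by different networks.

    `routes` maps a network name to a list of CIDR strings (assumed shape).
    Each conflict is a tuple (name_a, cidr_a, name_b, cidr_b).
    """
    parsed = [
        (name, ipaddress.ip_network(cidr))
        for name, cidrs in routes.items()
        for cidr in cidrs
    ]
    conflicts = []
    for (name_a, net_a), (name_b, net_b) in combinations(parsed, 2):
        # Only compare routes from different networks and the same IP family.
        if name_a == name_b or net_a.version != net_b.version:
            continue
        if net_a.overlaps(net_b):
            conflicts.append((name_a, str(net_a), name_b, str(net_b)))
    return conflicts
```

For example, a `home` network advertising `192.168.1.0/24` conflicts with an `office` network advertising `192.168.0.0/16`, while `10.0.0.0/8` on `office` does not. The pairwise scan is quadratic in the number of routes, which is acceptable for the handful of subnets a client typically sees.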
## Onboarding and UX
- [x] CLI init/register/heartbeat/netmap/status
- [x] CLI wg-up/wg-down for local testing
- [x] CLI dns export and relay inspection
- [x] CLI to approve devices
- [x] CLI to manage ACLs
- [x] Local UI / agent mode for background operation
## Testing
- [x] NixOS VM lab (2-node fast)
- [x] NixOS VM lab (5-node)
- [x] NAT and firewall scenario tests
- [x] Relay fallback tests
- [x] Multi-network overlap tests
- [x] CLI smoke assertions for `status`/`netmap`/`relay` outputs in lab tests
- [x] Negative-path enrollment tests (invalid/expired/revoked tokens, approval-required flow)
- [x] Key rotation + revoke flow validation (netmap status + peer removal)
- [x] DNS export + resolver integration tests (hosts/json, split DNS)
- [x] IPv6 dataplane connectivity tests (overlay + subnet routes)
- [x] Userspace endpoint refresh/relay fallback regression test coverage
- [x] Scale lab test (8-10 nodes) to validate full-mesh and relay fallback
- [x] Agent restart + state recovery tests (graceful restart, endpoint re-discovery)
- [x] Control plane restart/outage resilience (data plane stays up, netmap recovers)
- [x] Multi-relay server failover (stream relay list with the first entry down)