Refactored flaredb-server and plasmavmc-server to use a unified configuration approach, supporting TOML files, environment variables, and CLI overrides. This completes T027.S0 Config Unification.

Changes include:
- Created dedicated modules for both flaredb-server and plasmavmc-server to define structs.
- Implemented for in both components.
- Modified in flaredb-server to use instead of .
- Modified in plasmavmc-server to add dependency.
- Refactored in both components to load config from TOML/env and apply CLI overrides.
- Extended in plasmavmc-server/src/config.rs to include all relevant Firecracker backend parameters.
- Implemented in plasmavmc/crates/plasmavmc-firecracker/src/lib.rs to construct backend from the unified configuration.
- Updated docs/por/T027-production-hardening/task.yaml to mark S0 as complete and the overall task status as active.
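The layering described in the commit message (TOML file, environment variables, CLI overrides) can be illustrated with a minimal, std-only Rust sketch. It is not the code from this commit: the struct names `ServerConfig` and `PartialConfig`, the field names, the default values, and the `APP_` environment prefix are all hypothetical, and the file layer is modelled as already-parsed key/value pairs rather than real TOML. The precedence of environment variables over file values is also an assumption; the commit only states that CLI overrides are applied last.

```rust
use std::collections::HashMap;

/// Fully resolved settings. All field names and defaults are hypothetical.
#[derive(Debug, Clone, PartialEq)]
pub struct ServerConfig {
    pub listen_addr: String,
    pub data_dir: String,
    pub log_level: String,
}

impl Default for ServerConfig {
    fn default() -> Self {
        ServerConfig {
            listen_addr: "127.0.0.1:8080".to_string(),
            data_dir: "./data".to_string(),
            log_level: "info".to_string(),
        }
    }
}

/// One layer of optional overrides (config file, environment, or CLI).
/// A `None` field means "this layer does not set the value".
#[derive(Debug, Clone, Default, PartialEq)]
pub struct PartialConfig {
    pub listen_addr: Option<String>,
    pub data_dir: Option<String>,
    pub log_level: Option<String>,
}

impl PartialConfig {
    /// Build a layer from flat key/value pairs (standing in for a parsed
    /// config file). Unknown keys are ignored.
    pub fn from_pairs(pairs: &HashMap<String, String>) -> Self {
        PartialConfig {
            listen_addr: pairs.get("listen_addr").cloned(),
            data_dir: pairs.get("data_dir").cloned(),
            log_level: pairs.get("log_level").cloned(),
        }
    }

    /// Build a layer from environment-style variables such as
    /// `APP_LISTEN_ADDR`: variables without the prefix are skipped, the
    /// prefix is stripped, and the remainder is lowercased.
    pub fn from_env<I>(vars: I, prefix: &str) -> Self
    where
        I: IntoIterator<Item = (String, String)>,
    {
        let pairs: HashMap<String, String> = vars
            .into_iter()
            .filter_map(|(k, v)| k.strip_prefix(prefix).map(|rest| (rest.to_lowercase(), v)))
            .collect();
        Self::from_pairs(&pairs)
    }
}

impl ServerConfig {
    /// Overwrite every field that the given layer sets; keep the rest.
    pub fn apply(mut self, layer: PartialConfig) -> Self {
        if let Some(v) = layer.listen_addr {
            self.listen_addr = v;
        }
        if let Some(v) = layer.data_dir {
            self.data_dir = v;
        }
        if let Some(v) = layer.log_level {
            self.log_level = v;
        }
        self
    }
}

/// Resolve the final configuration: defaults, then file, then environment,
/// then CLI, so that later layers win over earlier ones.
pub fn resolve(file: PartialConfig, env: PartialConfig, cli: PartialConfig) -> ServerConfig {
    ServerConfig::default().apply(file).apply(env).apply(cli)
}
```

In a real binary the environment layer would typically be built with `PartialConfig::from_env(std::env::vars(), "APP_")` and the CLI layer from the parsed arguments. Keeping each layer as a struct of `Option` fields makes the precedence explicit and easy to test without touching the process environment.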
73 lines
2.1 KiB
YAML
id: T027
name: Production Hardening
goal: Transform MVP stack into a production-grade, observable, and highly available platform.
status: active
priority: P1
owner: peerB
created: 2025-12-10
depends_on: [T026]
blocks: []

context: |
  With MVP functionality verified (T026), the platform must be hardened for
  production usage. This involves ensuring high availability (HA), comprehensive
  observability (metrics/logs), and security (TLS).

  This task focuses on Non-Functional Requirements (NFRs). Functional gaps
  (deferred P1s) will be handled in T028.

acceptance:
  - All components use a unified configuration approach (clap + config file or env)
  - Full observability stack (Prometheus/Grafana/Loki) operational via NixOS
  - All services exporting metrics and logs to the stack
  - Chainfire and FlareDB verified in 3-node HA cluster
  - TLS enabled for all inter-service communication (optional for internal, required for external)
  - Chaos testing (kill node, verify recovery) passed
  - Ops documentation (Backup/Restore, Upgrade) created

steps:
  - step: S0
    name: Config Unification
    done: All components use unified configuration (clap + config file/env)
    status: complete
    owner: peerB
    priority: P0

  - step: S1
    name: Observability Stack
    done: Prometheus, Grafana, and Loki deployed and scraping targets
    status: pending
    owner: peerB
    priority: P0

  - step: S2
    name: Service Telemetry Integration
    done: All components (Chainfire, FlareDB, IAM, k8shost) dashboards functional
    status: pending
    owner: peerB
    priority: P0

  - step: S3
    name: HA Clustering Verification
    done: 3-node Chainfire/FlareDB cluster survives single node failure
    status: pending
    owner: peerB
    priority: P0

  - step: S4
    name: Security Hardening
    done: mTLS/TLS enabled where appropriate, secrets management verified
    status: pending
    owner: peerB
    priority: P1

  - step: S5
    name: Ops Documentation
    done: Runbooks for common operations (Scale out, Restore, Upgrade)
    status: pending
    owner: peerB
    priority: P1

evidence: []
notes: |
  Separated from functional feature work (T028).