# Architectural Gap Analysis: Compute & Core
**Date:** 2025-12-08
**Scope:** Core Infrastructure (Chainfire, Aegis (IAM), FlareDB) & Application Services (FlashDNS, PlasmaVMC)
## Executive Summary
The platform's core infrastructure ("Data" and "Identity" pillars) is in excellent shape, with implementation matching specifications closely. However, the "Compute" pillar (PlasmaVMC) exhibits a significant architectural deviation from its specification, currently existing as a monolithic prototype rather than the specified distributed control plane/agent model.
## Component Status Matrix
| Component | Role | Specification Status | Implementation Status | Alignment |
|-----------|------|----------------------|-----------------------|-----------|
| **Chainfire** | Cluster KVS | High | High | ✅ Strong |
| **Aegis (IAM)** | Identity | High | High | ✅ Strong |
| **FlareDB** | DBaaS KVS | High | High | ✅ Strong |
| **FlashDNS** | DNS Service | High | High | ✅ Strong |
| **PlasmaVMC** | VM Platform | High | **Low / Prototype** | ❌ **Mismatch** |
## Detailed Findings
### 1. Core Infrastructure (Chainfire, Aegis, FlareDB)
* **Chainfire:** Fully implemented crate structure. Detailed feature gap analysis exists (`chainfire_t003_gap_analysis.md`).
* **Aegis:** Correctly structured with `iam-server`, `iam-authn`, `iam-authz`, etc. Integration with Chainfire/FlareDB backends is present in `main.rs`.
* **FlareDB:** Correctly structured with `flaredb-pd`, `flaredb-server` (Multi-Raft), and reserved namespaces for IAM/Metrics.
### 2. Application Services (FlashDNS)
* **Status:** Excellent.
* **Evidence:** Crate structure matches spec. Integration with Chainfire (storage) and Aegis (auth) is visible in configuration and code.
### 3. Compute Platform (PlasmaVMC) - The Gap
* **Specification:** Describes a distributed system with:
    * **Control Plane:** API, Scheduler, Image management.
    * **Agent:** Runs on compute nodes, manages local hypervisors.
    * **Communication:** gRPC between Control Plane and Agent.
* **Current Implementation:** Monolithic `plasmavmc-server`.
    * The `server` binary directly initializes `HypervisorRegistry` and registers `KvmBackend`/`FireCrackerBackend`.
    * **Missing Crates:**
        * `plasmavmc-agent` (Critical)
        * `plasmavmc-client`
        * `plasmavmc-core` (Scheduler logic)
* **Implication:** The current code cannot support multi-node deployment or scheduling. It effectively runs the control plane *on* the hypervisor node.
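
To make the missing scheduling capability concrete, the following is a minimal sketch of what placement logic in a future `plasmavmc-core` might look like. Every name here (`NodeStatus`, `VmRequest`, `Scheduler`, `report`, `place`) is hypothetical and not taken from the spec or the codebase, and the "most free memory" policy is only an illustrative assumption. The point is that the control plane needs a cluster-wide view fed by per-node agents, which the monolithic server has no way to build.

```rust
use std::collections::HashMap;

/// Capacity an agent might report for its node (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct NodeStatus {
    pub free_vcpus: u32,
    pub free_memory_mib: u64,
}

/// Resources requested for a new VM (hypothetical shape).
#[derive(Debug, Clone)]
pub struct VmRequest {
    pub vcpus: u32,
    pub memory_mib: u64,
}

/// Control-plane view of the cluster: node id -> last reported status.
#[derive(Default)]
pub struct Scheduler {
    nodes: HashMap<String, NodeStatus>,
}

impl Scheduler {
    /// Record a status report from an agent. In the specified design this
    /// would arrive over gRPC; here it is a plain method call.
    pub fn report(&mut self, node_id: &str, status: NodeStatus) {
        self.nodes.insert(node_id.to_string(), status);
    }

    /// Pick the fitting node with the most free memory (ties go to the
    /// lexicographically smaller id) and reserve the requested capacity.
    /// Returns `None` when no node can host the VM.
    pub fn place(&mut self, req: &VmRequest) -> Option<String> {
        let chosen = self
            .nodes
            .iter()
            .filter(|(_, s)| s.free_vcpus >= req.vcpus && s.free_memory_mib >= req.memory_mib)
            .max_by(|(a_id, a), (b_id, b)| {
                a.free_memory_mib
                    .cmp(&b.free_memory_mib)
                    .then_with(|| b_id.cmp(a_id))
            })
            .map(|(id, _)| id.clone())?;

        // Reserve capacity so the next placement sees the updated view.
        let status = self.nodes.get_mut(&chosen)?;
        status.free_vcpus -= req.vcpus;
        status.free_memory_mib -= req.memory_mib;
        Some(chosen)
    }
}
```

With a single node registered, `place` degenerates to a capacity check, which is roughly what the monolithic server does implicitly today.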
## Recommendations
1. **Prioritize PlasmaVMC Refactoring:** The immediate engineering focus should be to split `plasmavmc-server` into:
    * `plasmavmc-server` (Control Plane, Scheduler, API)
    * `plasmavmc-agent` (Node status, Hypervisor control)
2. **Implement Agent Protocol:** Define the gRPC interface between Server and Agent (`agent.proto` is mentioned in the spec but may be missing or unused).
3. **Leverage Existing Foundation:** The `plasmavmc-hypervisor` trait is solid. The `agent` implementation should simply wrap this existing trait, making the refactor straightforward.
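
As a rough illustration of recommendations 2 and 3, the sketch below shows an agent that owns hypervisor backends and dispatches control-plane requests to them. It makes several assumptions: the `Hypervisor` trait is a simplified stand-in for whatever `plasmavmc-hypervisor` actually exposes, the `AgentRequest`/`AgentResponse` enums are hypothetical placeholders for messages that `agent.proto` would define, and `FakeBackend` exists only so the example runs without KVM or Firecracker. The gRPC transport is omitted entirely.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the existing hypervisor abstraction; the real
/// trait in `plasmavmc-hypervisor` is assumed to be richer than this.
pub trait Hypervisor {
    fn name(&self) -> &str;
    fn start_vm(&mut self, vm_id: &str) -> Result<(), String>;
    fn stop_vm(&mut self, vm_id: &str) -> Result<(), String>;
}

/// Hypothetical requests the control plane might send to an agent.
#[derive(Debug, Clone, PartialEq)]
pub enum AgentRequest {
    StartVm { backend: String, vm_id: String },
    StopVm { backend: String, vm_id: String },
    ListBackends,
}

/// Hypothetical replies from the agent.
#[derive(Debug, Clone, PartialEq)]
pub enum AgentResponse {
    Ok,
    Backends(Vec<String>),
    Error(String),
}

/// Node-local agent: owns the backends and dispatches requests to them.
#[derive(Default)]
pub struct Agent {
    backends: HashMap<String, Box<dyn Hypervisor>>,
}

impl Agent {
    /// Register a backend under the name it reports.
    pub fn register(&mut self, backend: Box<dyn Hypervisor>) {
        self.backends.insert(backend.name().to_string(), backend);
    }

    /// Handle one request; a gRPC service method would delegate here.
    pub fn handle(&mut self, req: AgentRequest) -> AgentResponse {
        let result = match req {
            AgentRequest::ListBackends => {
                let mut names: Vec<String> = self.backends.keys().cloned().collect();
                names.sort();
                return AgentResponse::Backends(names);
            }
            AgentRequest::StartVm { backend, vm_id } => self
                .backend_mut(&backend)
                .and_then(|hv| hv.start_vm(&vm_id)),
            AgentRequest::StopVm { backend, vm_id } => self
                .backend_mut(&backend)
                .and_then(|hv| hv.stop_vm(&vm_id)),
        };
        match result {
            Ok(()) => AgentResponse::Ok,
            Err(e) => AgentResponse::Error(e),
        }
    }

    fn backend_mut(&mut self, name: &str) -> Result<&mut Box<dyn Hypervisor>, String> {
        self.backends
            .get_mut(name)
            .ok_or_else(|| format!("unknown backend: {name}"))
    }
}

/// Illustration-only backend that tracks running VMs in memory.
pub struct FakeBackend {
    name: String,
    running: Vec<String>,
}

impl FakeBackend {
    pub fn new(name: &str) -> Self {
        Self { name: name.to_string(), running: Vec::new() }
    }
}

impl Hypervisor for FakeBackend {
    fn name(&self) -> &str {
        &self.name
    }

    fn start_vm(&mut self, vm_id: &str) -> Result<(), String> {
        if self.running.iter().any(|v| v == vm_id) {
            return Err(format!("{vm_id} already running"));
        }
        self.running.push(vm_id.to_string());
        Ok(())
    }

    fn stop_vm(&mut self, vm_id: &str) -> Result<(), String> {
        match self.running.iter().position(|v| v == vm_id) {
            Some(i) => {
                self.running.remove(i);
                Ok(())
            }
            None => Err(format!("{vm_id} not running")),
        }
    }
}
```

In this shape the agent adds only dispatch and error mapping on top of the trait. That is consistent with the expectation that the refactor is mostly a matter of moving registry initialization out of the server binary.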
## Conclusion
The project foundation is solid. The "Data" and "Identity" layers are ready for higher-level integration. The "Compute" layer requires architectural realignment to meet the distributed design goals.