# PlasmaVMC ChainFire Key Schema

- Date: 2025-12-08
- Task: T013 S1
- Status: Design Complete

## Key Layout

### VM Metadata
- Key: `/plasmavmc/vms/{org_id}/{project_id}/{vm_id}`
- Value: JSON-serialized `VirtualMachine` (`plasmavmc_types::VirtualMachine`)
### VM Handle

- Key: `/plasmavmc/handles/{org_id}/{project_id}/{vm_id}`
- Value: JSON-serialized `VmHandle` (`plasmavmc_types::VmHandle`)
### Lock Key (for atomic operations)

- Key: `/plasmavmc/locks/{org_id}/{project_id}/{vm_id}`
- Value: JSON-serialized `LockInfo { timestamp: u64, node_id: String }`
- TTL: 30 seconds (via ChainFire lease)
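The three key families above can be built with a small helper. A std-only sketch; the function names are illustrative (not from `plasmavmc_types`), and all IDs are assumed to be plain strings:

```rust
/// Builds a key under /plasmavmc/vms/ for the given tenant scope.
fn vm_key(org_id: &str, project_id: &str, vm_id: &str) -> String {
    format!("/plasmavmc/vms/{org_id}/{project_id}/{vm_id}")
}

/// Builds the matching key under /plasmavmc/handles/.
fn handle_key(org_id: &str, project_id: &str, vm_id: &str) -> String {
    format!("/plasmavmc/handles/{org_id}/{project_id}/{vm_id}")
}

/// Builds the lock key under /plasmavmc/locks/ used for atomic updates.
fn lock_key(org_id: &str, project_id: &str, vm_id: &str) -> String {
    format!("/plasmavmc/locks/{org_id}/{project_id}/{vm_id}")
}

fn main() {
    let k = vm_key("acme", "web", "123e4567");
    assert_eq!(k, "/plasmavmc/vms/acme/web/123e4567");
    assert_eq!(handle_key("acme", "web", "123e4567"),
               "/plasmavmc/handles/acme/web/123e4567");
    println!("{k}");
}
```

Keeping the three families structurally identical means one tenant prefix (`{org_id}/{project_id}`) can scope scans, ACLs, and deletes across all of them.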
## Key Structure Rationale

- Prefix-based organization: the `/plasmavmc/` namespace isolates PlasmaVMC data
- Tenant scoping: `{org_id}/{project_id}` ensures multi-tenancy
- Resource separation: separate keys for VM metadata and handles enable independent updates
- Lock mechanism: uses ChainFire lease TTLs for distributed locking without manual cleanup
## Serialization

- Format: JSON (via `serde_json`)
- Rationale: human-readable, debuggable, and compatible with the existing `PersistedState` structure
- Alternative considered: bincode (rejected for debuggability)
## Atomic Write Strategy

### Option 1: Transaction-based (Preferred)

Use ChainFire transactions to atomically update VM + handle:
```rust
// Pseudo-code: succeed only if the lock key does not yet exist
// (version 0), then write VM metadata, handle, and lock together.
let txn = TxnRequest {
    compare: vec![Compare {
        key: lock_key,
        result: CompareResult::Equal,
        target: CompareTarget::Version(0), // lock doesn't exist
    }],
    success: vec![
        RequestOp { request: Some(Request::Put(vm_put)) },
        RequestOp { request: Some(Request::Put(handle_put)) },
        RequestOp { request: Some(Request::Put(lock_put)) },
    ],
    failure: vec![],
};
```
### Option 2: Lease-based Locking (Fallback)

- Acquire a lease (30 s TTL)
- Put the lock key with the `lease_id`
- Update VM metadata + handle
- Release the lease (or let it expire)
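Under Option 2, a crashed holder's lock disappears when the lease expires; other nodes only need a staleness check. A std-only sketch of that check (`LockInfo` mirrors the schema above; `lock_expired` is an illustrative helper, not an existing API):

```rust
/// Mirror of the LockInfo value stored under /plasmavmc/locks/...
/// (field names from the schema above; JSON serialization omitted).
struct LockInfo {
    timestamp: u64,  // unix seconds when the lock was taken
    node_id: String, // holder, useful when debugging contention
}

/// True if a lock taken at `lock.timestamp` has outlived `ttl_seconds`.
/// With a ChainFire lease the server expires the key itself; this is
/// the equivalent client-side check for the fallback path.
fn lock_expired(lock: &LockInfo, now_unix: u64, ttl_seconds: u64) -> bool {
    now_unix.saturating_sub(lock.timestamp) > ttl_seconds
}

fn main() {
    let lock = LockInfo { timestamp: 1_000, node_id: "node-a".into() };
    assert!(lock_expired(&lock, 1_031, 30));  // 31 s elapsed, TTL 30 s
    assert!(!lock_expired(&lock, 1_020, 30)); // still within TTL
    println!("holder: {}", lock.node_id);
}
```

`saturating_sub` guards against clock skew making `timestamp` land slightly in the future.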
## Fallback Behavior

### File Fallback Mode

- Trigger: `PLASMAVMC_STORAGE_BACKEND=file` or `PLASMAVMC_CHAINFIRE_ENDPOINT` unset
- Behavior: use the existing file-based persistence (`PLASMAVMC_STATE_PATH`)
- Locking: file-based lockfile (`{state_path}.lock`) with `flock()` or atomic rename
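The atomic-create variant of the file lock can be done with the standard library alone (`flock()` would need a crate such as `nix`). A sketch, with illustrative helper names:

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::{Path, PathBuf};

/// Derives `{state_path}.lock` from the state file path.
fn lock_path(state_path: &Path) -> PathBuf {
    let mut p = state_path.as_os_str().to_owned();
    p.push(".lock");
    PathBuf::from(p)
}

/// Tries to take the lock. create_new(true) fails if the lockfile
/// already exists, and that check-and-create is atomic, so exactly
/// one process wins. Returns Ok(false) if another holder has it.
fn try_lock(state_path: &Path) -> io::Result<bool> {
    match OpenOptions::new()
        .write(true)
        .create_new(true)
        .open(lock_path(state_path))
    {
        Ok(_) => Ok(true),
        Err(e) if e.kind() == io::ErrorKind::AlreadyExists => Ok(false),
        Err(e) => Err(e),
    }
}

/// Releases the lock by removing the lockfile.
fn unlock(state_path: &Path) -> io::Result<()> {
    std::fs::remove_file(lock_path(state_path))
}

fn main() -> io::Result<()> {
    let state = std::env::temp_dir().join("plasmavmc-demo-state.json");
    let _ = unlock(&state); // clear any stale lock from a prior run
    assert!(try_lock(&state)?);  // first acquire succeeds
    assert!(!try_lock(&state)?); // second acquire is refused
    unlock(&state)?;
    assert!(try_lock(&state)?);  // reacquire after release
    unlock(&state)
}
```

Unlike the lease-based lock, a crashed holder leaves a stale lockfile behind, so this variant still needs the TTL-style staleness check or manual cleanup.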
### Migration Path

- On startup, if ChainFire is unavailable and the state file exists, load from the file
- If ChainFire is available, prefer it; migrate file → ChainFire on the first write
- File fallback remains available for development/testing without a ChainFire cluster
## Configuration

### Environment Variables

- `PLASMAVMC_STORAGE_BACKEND`: `chainfire` (default) | `file`
- `PLASMAVMC_CHAINFIRE_ENDPOINT`: ChainFire gRPC endpoint (e.g., `http://127.0.0.1:50051`)
- `PLASMAVMC_STATE_PATH`: file fallback path (default: `/var/run/plasmavmc/state.json`)
- `PLASMAVMC_LOCK_TTL_SECONDS`: lock TTL in seconds (default: 30)
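Resolving these variables with their defaults, including the rule that an unset endpoint forces file mode, might look like this. A std-only sketch; the `StorageConfig` type and `load_config` name are illustrative:

```rust
use std::env;

#[derive(Debug, PartialEq)]
enum Backend { ChainFire, File }

struct StorageConfig {
    backend: Backend,
    chainfire_endpoint: Option<String>,
    state_path: String,
    lock_ttl_seconds: u64,
}

/// Reads the PLASMAVMC_* variables, applying the documented defaults.
fn load_config() -> StorageConfig {
    let endpoint = env::var("PLASMAVMC_CHAINFIRE_ENDPOINT").ok();
    // chainfire is the default backend, but with no endpoint
    // configured we must fall back to file mode.
    let backend = match env::var("PLASMAVMC_STORAGE_BACKEND").as_deref() {
        Ok("file") => Backend::File,
        _ if endpoint.is_none() => Backend::File,
        _ => Backend::ChainFire,
    };
    StorageConfig {
        backend,
        chainfire_endpoint: endpoint,
        state_path: env::var("PLASMAVMC_STATE_PATH")
            .unwrap_or_else(|_| "/var/run/plasmavmc/state.json".into()),
        lock_ttl_seconds: env::var("PLASMAVMC_LOCK_TTL_SECONDS")
            .ok()
            .and_then(|s| s.parse().ok())
            .unwrap_or(30),
    }
}

fn main() {
    let cfg = load_config();
    println!("backend: {:?}, state_path: {}", cfg.backend, cfg.state_path);
}
```

Parsing `PLASMAVMC_LOCK_TTL_SECONDS` with `and_then(parse)` silently falls back to 30 on garbage input; a real implementation may prefer to fail loudly there.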
### Config File (Future)

```toml
[storage]
backend = "chainfire" # or "file"
chainfire_endpoint = "http://127.0.0.1:50051"
state_path = "/var/run/plasmavmc/state.json"
lock_ttl_seconds = 30
```
## Operations

### Create VM

- Generate `vm_id` (UUID)
- Acquire lock (transaction or lease)
- Put VM metadata key
- Put VM handle key
- Release lock
### Update VM

- Acquire lock
- Get current VM (verify it exists)
- Put updated VM metadata
- Put updated handle (if changed)
- Release lock
### Delete VM

- Acquire lock
- Delete VM metadata key
- Delete VM handle key
- Release lock
### Load on Startup

- Scan prefix `/plasmavmc/vms/{org_id}/{project_id}/`
- For each VM key, extract `vm_id`
- Load VM metadata
- Load corresponding handle
- Populate the in-memory `DashMap`
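The `vm_id` extraction in the startup scan is just string surgery on the returned keys. A std-only sketch (a `HashMap` stands in for the `DashMap`, and `vm_ids_from_keys` is an illustrative helper):

```rust
use std::collections::HashMap;

/// Given the keys returned by a prefix scan on
/// /plasmavmc/vms/{org_id}/{project_id}/, extract each vm_id.
/// Keys from other families or deeper paths are skipped.
fn vm_ids_from_keys<'a>(prefix: &str, keys: &[&'a str]) -> Vec<&'a str> {
    keys.iter()
        .filter_map(|k| k.strip_prefix(prefix))
        .filter(|id| !id.is_empty() && !id.contains('/'))
        .collect()
}

fn main() {
    let prefix = "/plasmavmc/vms/acme/web/";
    let keys = [
        "/plasmavmc/vms/acme/web/vm-1",
        "/plasmavmc/vms/acme/web/vm-2",
        "/plasmavmc/handles/acme/web/vm-1", // other family: skipped
    ];
    let ids = vm_ids_from_keys(prefix, &keys);
    assert_eq!(ids, vec!["vm-1", "vm-2"]);

    // Populate the in-memory map (metadata values are placeholders here;
    // the real code would deserialize VirtualMachine/VmHandle JSON).
    let mut vms: HashMap<String, String> = HashMap::new();
    for id in &ids {
        vms.insert(id.to_string(), format!("metadata for {id}"));
    }
    assert_eq!(vms.len(), 2);
}
```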
## Error Handling

- ChainFire unavailable: fall back to file mode (if configured)
- Lock contention: retry with exponential backoff (max 3 retries)
- Serialization error: log and return an error (should not happen)
- Partial write: transaction rollback ensures atomicity
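The lock-contention policy above (max 3 retries, exponential backoff) can be captured in one generic helper. A std-only sketch; the function name and base delay are illustrative:

```rust
use std::thread::sleep;
use std::time::Duration;

/// Runs `op` up to `max_retries + 1` times, sleeping base_ms,
/// 2*base_ms, 4*base_ms, ... between failed attempts.
fn retry_with_backoff<T, E>(
    max_retries: u32,
    base_ms: u64,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(v) => return Ok(v),
            Err(e) if attempt >= max_retries => return Err(e),
            Err(_) => {
                sleep(Duration::from_millis(base_ms << attempt));
                attempt += 1;
            }
        }
    }
}

fn main() {
    let mut calls = 0;
    // Simulated lock contention: fails twice, then succeeds.
    let result = retry_with_backoff(3, 1, || {
        calls += 1;
        if calls < 3 { Err("lock held") } else { Ok(calls) }
    });
    assert_eq!(result, Ok(3));
}
```

Production code would typically add jitter to the delay so contending nodes don't retry in lockstep.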
## Testing Considerations

- Unit tests: mock the ChainFire client
- Integration tests: real ChainFire server (env-gated)
- Fallback tests: disable ChainFire, verify file mode works
- Lock tests: concurrent operations, verify atomicity