# PlasmaVMC ChainFire Key Schema

**Date:** 2025-12-08
**Task:** T013 S1
**Status:** Design Complete

## Key Layout

### VM Metadata
```
Key: /plasmavmc/vms/{org_id}/{project_id}/{vm_id}
Value: JSON-serialized VirtualMachine (plasmavmc_types::VirtualMachine)
```

### VM Handle
```
Key: /plasmavmc/handles/{org_id}/{project_id}/{vm_id}
Value: JSON-serialized VmHandle (plasmavmc_types::VmHandle)
```

### Lock Key (for atomic operations)
```
Key: /plasmavmc/locks/{org_id}/{project_id}/{vm_id}
Value: JSON-serialized LockInfo { timestamp: u64, node_id: String }
TTL: 30 seconds (via ChainFire lease)
```
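
The three key shapes above differ only in their second path segment, so they can be built from one helper. A minimal sketch, assuming IDs never contain `/`; `VmRef` and its methods are hypothetical names, not part of `plasmavmc_types`:

```rust
/// Hypothetical helper identifying one VM within a tenant.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct VmRef {
    pub org_id: String,
    pub project_id: String,
    pub vm_id: String,
}

impl VmRef {
    pub fn new(org_id: &str, project_id: &str, vm_id: &str) -> Self {
        VmRef {
            org_id: org_id.to_string(),
            project_id: project_id.to_string(),
            vm_id: vm_id.to_string(),
        }
    }

    // Shared layout: /plasmavmc/{kind}/{org_id}/{project_id}/{vm_id}
    fn key(&self, kind: &str) -> String {
        format!(
            "/plasmavmc/{}/{}/{}/{}",
            kind, self.org_id, self.project_id, self.vm_id
        )
    }

    /// Key holding the JSON-serialized VirtualMachine.
    pub fn vm_key(&self) -> String {
        self.key("vms")
    }

    /// Key holding the JSON-serialized VmHandle.
    pub fn handle_key(&self) -> String {
        self.key("handles")
    }

    /// Key holding the JSON-serialized LockInfo.
    pub fn lock_key(&self) -> String {
        self.key("locks")
    }
}
```

Keeping the layout in one private function means a later change to the namespace touches a single format string.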

## Key Structure Rationale

1. **Prefix-based organization**: The `/plasmavmc/` namespace isolates PlasmaVMC data from other ChainFire users
2. **Tenant scoping**: `{org_id}/{project_id}` scopes every key to a single tenant, which keeps the layout multi-tenant
3. **Resource separation**: Separate keys for VM metadata and handles enable independent updates
4. **Lock mechanism**: The lock key is attached to a ChainFire lease, so an abandoned lock expires with its TTL and needs no manual cleanup

## Serialization

- **Format**: JSON (via `serde_json`)
- **Rationale**: Human-readable, debuggable, compatible with existing `PersistedState` structure
- **Alternative considered**: bincode (rejected because its binary output is harder to inspect and debug)

## Atomic Write Strategy

### Option 1: Transaction-based (Preferred)
Use ChainFire transactions to atomically update VM + handle:
```rust
// Pseudo-code
let txn = TxnRequest {
    // Guard: proceed only if the lock key does not exist.
    compare: vec![Compare {
        key: lock_key,
        result: CompareResult::Equal,
        target: CompareTarget::Version(0), // Lock doesn't exist
    }],
    // Applied together when the guard holds.
    success: vec![
        RequestOp { request: Some(Request::Put(vm_put)) },
        RequestOp { request: Some(Request::Put(handle_put)) },
        RequestOp { request: Some(Request::Put(lock_put)) },
    ],
    // Nothing is written when the lock is already held.
    failure: vec![],
};
```
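
The pseudo-code above depends on all-or-nothing semantics: either the guard holds and all three puts land, or none do. The following toy in-memory model illustrates that behavior. It is not the ChainFire client; `MemStore` and `txn_put_if_absent` are hypothetical names used only for this sketch.

```rust
use std::collections::HashMap;

/// Toy in-memory stand-in for the key-value store (not the ChainFire client).
#[derive(Default)]
pub struct MemStore {
    data: HashMap<String, String>,
}

impl MemStore {
    /// Applies every put only if `guard_key` is absent; otherwise applies none.
    /// Returns true when the puts were applied.
    pub fn txn_put_if_absent(&mut self, guard_key: &str, puts: &[(String, String)]) -> bool {
        if self.data.contains_key(guard_key) {
            // Guard failed: mirror the empty `failure` branch and write nothing.
            return false;
        }
        for (key, value) in puts {
            self.data.insert(key.clone(), value.clone());
        }
        true
    }

    /// Reads a value back, if present.
    pub fn get(&self, key: &str) -> Option<&str> {
        self.data.get(key).map(String::as_str)
    }
}
```

Because the lock key is one of the puts, a second writer that runs the same guarded transaction before the lock is removed finds the guard false and changes nothing.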

### Option 2: Lease-based Locking (Fallback)
1. Acquire lease (30s TTL)
2. Put lock key with lease_id
3. Update VM + handle
4. Release lease (or let expire)
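
The lease steps above can be modeled without a cluster by tracking an expiry time per lock key. This is a sketch under stated assumptions: time is passed in explicitly as seconds so the behavior is deterministic, and `LeaseLocks`, `acquire`, and `release` are hypothetical names rather than ChainFire lease calls.

```rust
use std::collections::HashMap;

/// Toy lease table: lock key -> (owner node_id, expiry time in seconds).
#[derive(Default)]
pub struct LeaseLocks {
    held: HashMap<String, (String, u64)>,
}

impl LeaseLocks {
    /// Tries to take the lock at time `now`. Succeeds when the key is free
    /// or the previous lease has already expired.
    pub fn acquire(&mut self, key: &str, node_id: &str, now: u64, ttl: u64) -> bool {
        let free = match self.held.get(key) {
            Some((_, expires_at)) => now >= *expires_at,
            None => true,
        };
        if free {
            self.held
                .insert(key.to_string(), (node_id.to_string(), now + ttl));
        }
        free
    }

    /// Releases the lock early, but only for the node that holds it.
    pub fn release(&mut self, key: &str, node_id: &str) -> bool {
        let owned = matches!(self.held.get(key), Some((owner, _)) if owner.as_str() == node_id);
        if owned {
            self.held.remove(key);
        }
        owned
    }
}
```

Unlike Option 1, the VM and handle updates in step 3 happen outside the lock acquisition, so a holder that stalls past its TTL can overlap with the next holder. That is one reason to treat this option as the fallback.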

## Fallback Behavior

### File Fallback Mode
- **Trigger**: `PLASMAVMC_STORAGE_BACKEND=file` or `PLASMAVMC_CHAINFIRE_ENDPOINT` unset
- **Behavior**: Use existing file-based persistence (`PLASMAVMC_STATE_PATH`)
- **Locking**: File-based lockfile (`{state_path}.lock`) with `flock()` or atomic rename
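
The trigger rule above reduces to a small pure function over the two variables' values. A minimal sketch, with two assumptions that the design does not state: an empty endpoint string counts as unset, and any backend value other than `file` is treated as `chainfire`. `Backend` and `select_backend` are hypothetical names.

```rust
/// Storage backends named in this design.
#[derive(Debug, PartialEq, Eq)]
pub enum Backend {
    ChainFire,
    File,
}

/// Picks the backend from the values of PLASMAVMC_STORAGE_BACKEND and
/// PLASMAVMC_CHAINFIRE_ENDPOINT (`None` means the variable is unset).
pub fn select_backend(storage_backend: Option<&str>, chainfire_endpoint: Option<&str>) -> Backend {
    let file_requested = storage_backend == Some("file");
    // Assumption: a blank endpoint is as unusable as a missing one.
    let endpoint_missing = chainfire_endpoint.map_or(true, |e| e.trim().is_empty());
    if file_requested || endpoint_missing {
        Backend::File
    } else {
        Backend::ChainFire
    }
}
```

At startup the arguments would come from `std::env::var(...).ok()`; taking them as parameters keeps the rule testable without touching the process environment.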

### Migration Path
1. On startup, if ChainFire unavailable and file exists, load from file
2. If ChainFire available, prefer ChainFire; migrate file → ChainFire on first write
3. File fallback remains for development/testing without ChainFire cluster
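
The startup decision in steps 1 and 2 can be written as a two-input match. This sketch uses hypothetical names (`StartupPlan`, `plan_startup`), and the `StartEmpty` case is an assumption: the design does not say what happens when neither ChainFire nor a state file is available.

```rust
/// What the server does with persisted state at startup.
#[derive(Debug, PartialEq, Eq)]
pub enum StartupPlan {
    /// Read from ChainFire; copy the state file over on the first write if one exists.
    LoadFromChainFire { migrate_file_on_first_write: bool },
    /// ChainFire is unreachable, so read the existing state file.
    LoadFromFile,
    /// Assumption: with no source available, start with no VMs.
    StartEmpty,
}

/// Chooses the state source from the two facts known at startup.
pub fn plan_startup(chainfire_available: bool, state_file_exists: bool) -> StartupPlan {
    match (chainfire_available, state_file_exists) {
        (true, exists) => StartupPlan::LoadFromChainFire {
            migrate_file_on_first_write: exists,
        },
        (false, true) => StartupPlan::LoadFromFile,
        (false, false) => StartupPlan::StartEmpty,
    }
}
```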

## Configuration

### Environment Variables
- `PLASMAVMC_STORAGE_BACKEND`: `chainfire` (default) | `file`
- `PLASMAVMC_CHAINFIRE_ENDPOINT`: ChainFire gRPC endpoint (e.g., `http://127.0.0.1:50051`)
- `PLASMAVMC_STATE_PATH`: File fallback path (default: `/var/run/plasmavmc/state.json`)
- `PLASMAVMC_LOCK_TTL_SECONDS`: Lock TTL (default: 30)
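
The two variables with documented defaults can be resolved as follows. A minimal sketch: `StorageSettings` and `settings_from` are hypothetical names, and rejecting a non-numeric TTL (rather than silently using the default) is an assumption. The lookup is injected as a closure so the defaults can be exercised without modifying the process environment.

```rust
use std::time::Duration;

/// Resolved values for the file path and lock TTL.
#[derive(Debug)]
pub struct StorageSettings {
    pub state_path: String,
    pub lock_ttl: Duration,
}

/// Builds settings from a variable lookup, applying the documented defaults.
pub fn settings_from<F>(lookup: F) -> Result<StorageSettings, String>
where
    F: Fn(&str) -> Option<String>,
{
    let state_path = lookup("PLASMAVMC_STATE_PATH")
        .unwrap_or_else(|| "/var/run/plasmavmc/state.json".to_string());

    let ttl_secs = match lookup("PLASMAVMC_LOCK_TTL_SECONDS") {
        // Assumption: a malformed value is a configuration error.
        Some(raw) => raw
            .trim()
            .parse::<u64>()
            .map_err(|e| format!("invalid PLASMAVMC_LOCK_TTL_SECONDS {:?}: {}", raw, e))?,
        None => 30,
    };

    Ok(StorageSettings {
        state_path,
        lock_ttl: Duration::from_secs(ttl_secs),
    })
}
```

In the server this would be called as `settings_from(|name| std::env::var(name).ok())`.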

### Config File (Future)
```toml
[storage]
backend = "chainfire" # or "file"
chainfire_endpoint = "http://127.0.0.1:50051"
state_path = "/var/run/plasmavmc/state.json"
lock_ttl_seconds = 30
```

## Operations

### Create VM
1. Generate `vm_id` (UUID)
2. Acquire lock (transaction or lease)
3. Put VM metadata key
4. Put VM handle key
5. Release lock

### Update VM
1. Acquire lock
2. Get current VM (verify exists)
3. Put updated VM metadata
4. Put updated handle (if changed)
5. Release lock

### Delete VM
1. Acquire lock
2. Delete VM metadata key
3. Delete VM handle key
4. Release lock
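
The create, update, and delete sequences above share the same acquire/work/release shape. The sketch below runs them against a toy in-memory map instead of ChainFire. All names (`MemStore`, `with_lock`, and the three operations) are hypothetical; the `id` argument is the `{org_id}/{project_id}/{vm_id}` path fragment, and UUID generation is left to the caller.

```rust
use std::collections::BTreeMap;

/// Toy in-memory stand-in for the key-value store (not the ChainFire client).
#[derive(Default)]
pub struct MemStore {
    data: BTreeMap<String, String>,
}

fn key(kind: &str, id: &str) -> String {
    format!("/plasmavmc/{}/{}", kind, id)
}

impl MemStore {
    /// Acquires the lock key, runs `op`, then releases the lock even if `op` failed.
    fn with_lock<T, F>(&mut self, id: &str, op: F) -> Result<T, String>
    where
        F: FnOnce(&mut BTreeMap<String, String>) -> Result<T, String>,
    {
        let lock = key("locks", id);
        if self.data.contains_key(&lock) {
            return Err(format!("{} is locked", id));
        }
        self.data.insert(lock.clone(), "{}".to_string());
        let result = op(&mut self.data);
        self.data.remove(&lock);
        result
    }

    pub fn create_vm(&mut self, id: &str, vm_json: &str, handle_json: &str) -> Result<(), String> {
        self.with_lock(id, |data| {
            data.insert(key("vms", id), vm_json.to_string());
            data.insert(key("handles", id), handle_json.to_string());
            Ok(())
        })
    }

    pub fn update_vm(&mut self, id: &str, vm_json: &str, handle_json: Option<&str>) -> Result<(), String> {
        self.with_lock(id, |data| {
            // Verify the VM exists before overwriting it.
            if !data.contains_key(&key("vms", id)) {
                return Err(format!("VM {} not found", id));
            }
            data.insert(key("vms", id), vm_json.to_string());
            // The handle is rewritten only when the caller supplies a new one.
            if let Some(handle) = handle_json {
                data.insert(key("handles", id), handle.to_string());
            }
            Ok(())
        })
    }

    pub fn delete_vm(&mut self, id: &str) -> Result<(), String> {
        self.with_lock(id, |data| {
            data.remove(&key("vms", id));
            data.remove(&key("handles", id));
            Ok(())
        })
    }

    pub fn get(&self, full_key: &str) -> Option<&str> {
        self.data.get(full_key).map(String::as_str)
    }
}
```

Routing every operation through one `with_lock` helper keeps the release step from being forgotten on an error path.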

### Load on Startup
1. Scan prefix `/plasmavmc/vms/{org_id}/{project_id}/`
2. For each VM key, extract `vm_id`
3. Load VM metadata
4. Load corresponding handle
5. Populate in-memory DashMap
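
The scan above can be sketched with a sorted map standing in for ChainFire, since keys that share a prefix are contiguous in sorted order. Assumptions: values stay as raw JSON strings, a VM whose handle key is missing loads with `None`, and a plain `HashMap` stands in for the DashMap. `load_tenant` is a hypothetical name.

```rust
use std::collections::{BTreeMap, HashMap};

/// Returns vm_id -> (vm_json, handle_json) for one tenant.
pub fn load_tenant(
    store: &BTreeMap<String, String>,
    org_id: &str,
    project_id: &str,
) -> HashMap<String, (String, Option<String>)> {
    let vm_prefix = format!("/plasmavmc/vms/{}/{}/", org_id, project_id);
    let handle_prefix = format!("/plasmavmc/handles/{}/{}/", org_id, project_id);
    let mut loaded = HashMap::new();

    // Start at the prefix and stop at the first key that no longer matches it.
    let scan = store
        .range(vm_prefix.clone()..)
        .take_while(|(k, _)| k.starts_with(vm_prefix.as_str()));

    for (key, vm_json) in scan {
        // Everything after the prefix is the vm_id.
        let vm_id = &key[vm_prefix.len()..];
        let handle = store.get(&format!("{}{}", handle_prefix, vm_id)).cloned();
        loaded.insert(vm_id.to_string(), (vm_json.clone(), handle));
    }
    loaded
}
```

The trailing `/` in the prefix matters: without it, a scan for project `proj1` would also match keys under `proj10`.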

## Error Handling

- **ChainFire unavailable**: Fall back to file mode (if configured)
- **Lock contention**: Retry with exponential backoff (max 3 retries)
- **Serialization error**: Log and return the error (not expected in normal operation)
- **Partial write**: With the transaction-based strategy (Option 1), the puts commit together or not at all, so no partial state is left behind
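
The lock-contention policy above can be expressed as a generic retry helper. A minimal sketch, assuming "max 3 retries" means up to three additional attempts after the first one and that the delay doubles on each retry; `retry_with_backoff` and the base delay are hypothetical.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Runs `op`; after a failure, waits and retries up to `max_retries` more times,
/// doubling the delay before each new attempt.
pub fn retry_with_backoff<T, E, F>(max_retries: u32, base_delay: Duration, mut op: F) -> Result<T, E>
where
    F: FnMut() -> Result<T, E>,
{
    let mut delay = base_delay;
    let mut retries = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Retry budget exhausted: surface the last error to the caller.
            Err(err) if retries >= max_retries => return Err(err),
            Err(_) => {
                sleep(delay);
                delay *= 2;
                retries += 1;
            }
        }
    }
}
```

With a 50 ms base delay and three retries, a caller waits at most 50 + 100 + 200 ms before the error is returned. Adding random jitter to each delay would reduce the chance of contending nodes retrying in lockstep, but that is beyond what this design specifies.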

## Testing Considerations

- Unit tests: Mock ChainFire client
- Integration tests: Real ChainFire server (env-gated)
- Fallback tests: Disable ChainFire, verify file mode works
- Lock tests: Concurrent operations, verify atomicity
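
A lock test of the kind listed above can be prototyped against a mock before a real cluster is involved. In this sketch a `Mutex`-guarded map plays the role of the mock store, making each check-and-put atomic, and several threads race for one lock key. `count_lock_winners` is a hypothetical helper, not part of the existing test suite.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `workers` threads that all try to claim `lock_key` and
/// returns how many of them succeeded.
pub fn count_lock_winners(workers: usize, lock_key: &str) -> usize {
    let store: Arc<Mutex<HashMap<String, String>>> = Arc::new(Mutex::new(HashMap::new()));

    let handles: Vec<_> = (0..workers)
        .map(|i| {
            let store = Arc::clone(&store);
            let key = lock_key.to_string();
            thread::spawn(move || {
                // The mutex makes "check absent, then put" one atomic step,
                // mirroring the guarded transaction.
                let mut guard = store.lock().unwrap();
                if guard.contains_key(&key) {
                    false
                } else {
                    guard.insert(key, format!("node-{}", i));
                    true
                }
            })
        })
        .collect();

    handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .filter(|won| *won)
        .count()
}
```

The property to assert is that exactly one worker wins regardless of how many contend.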