FlareDB Metadata Adoption Design

Date: 2025-12-08
Task: T020
Status: Design Phase

1. Problem Statement

Current services (LightningSTOR, FlashDNS, FiberLB) and the upcoming NovaNET (T019) use ChainFire (Raft+Gossip) for metadata storage. ChainFire is intended for cluster membership, not general-purpose metadata. FlareDB is the designated DBaaS/Metadata store, offering better scalability and strong consistency (CAS) modes.

2. Gap Analysis

To replace ChainFire with FlareDB, we need:

  1. Delete Operations: ChainFire supports delete(key). FlareDB currently supports only Put/Get/Scan (Raw) and CAS/Get/Scan (Strong). The Raft CasWrite request only inserts or updates; it cannot remove a key.
  2. Prefix Scan: ChainFire has get_prefix(prefix). FlareDB has Scan(start, end), so a client-side wrapper is needed to turn a prefix into a key range.
  3. Atomic Updates: ChainFire uses simple last-writer-wins (LWW) writes or transactions. FlareDB's KvCas service provides CompareAndSwap, which rejects a write made against a stale version and so prevents lost updates to metadata.
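To illustrate the third point, here is a minimal sketch of version-checked writes. `CasStore` is a hypothetical in-memory stand-in, not FlareDB's actual API; the convention that a missing key reads as version 0 and that each successful write bumps the version by one is an assumption made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for a versioned (CAS) store.
#[derive(Default)]
pub struct CasStore {
    data: HashMap<Vec<u8>, (u64, Vec<u8>)>, // key -> (version, value)
}

impl CasStore {
    /// Returns (version, value). Assumption: a missing key reads as version 0.
    pub fn get(&self, key: &[u8]) -> (u64, Option<Vec<u8>>) {
        match self.data.get(key) {
            Some((version, value)) => (*version, Some(value.clone())),
            None => (0, None),
        }
    }

    /// Writes only if the stored version equals `expected`.
    /// Ok(new_version) on success, Err(current_version) on conflict.
    pub fn compare_and_swap(
        &mut self,
        key: &[u8],
        expected: u64,
        value: Vec<u8>,
    ) -> Result<u64, u64> {
        let current = self.data.get(key).map_or(0, |(v, _)| *v);
        if current != expected {
            // Someone else wrote since the caller last read: reject.
            return Err(current);
        }
        self.data.insert(key.to_vec(), (current + 1, value));
        Ok(current + 1)
    }
}
```

With LWW, two writers that both read the same state silently overwrite each other. Here the second writer gets `Err(current_version)` and can re-read and retry.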

3. Protocol Extensions (T020.S2)

3.1 Proto (kvrpc.proto)

Add Delete to KvCas (Strong Consistency):

service KvCas {
  // ...
  rpc CompareAndDelete(CasDeleteRequest) returns (CasDeleteResponse);
}

message CasDeleteRequest {
  bytes key = 1;
  uint64 expected_version = 2; // Required for safe deletion
  string namespace = 3;
}

message CasDeleteResponse {
  bool success = 1;
  uint64 current_version = 2; // If failure
}

Add RawDelete to KvRaw (Eventual Consistency):

service KvRaw {
  // ...
  rpc RawDelete(RawDeleteRequest) returns (RawDeleteResponse);
}

message RawDeleteRequest {
  bytes key = 1;
  string namespace = 2;
}

message RawDeleteResponse {
  bool success = 1;
}

3.2 Raft Request (types.rs)

Add CasDelete and KvDelete to FlareRequest:

pub enum FlareRequest {
    // ...
    KvDelete {
        namespace_id: u32,
        key: Vec<u8>,
        ts: u64,
    },
    CasDelete {
        namespace_id: u32,
        key: Vec<u8>,
        expected_version: u64,
        ts: u64,
    },
}

3.3 State Machine (storage.rs)

Update apply_request to handle deletion:

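  • KvDelete: Remove the key from kv_data unconditionally.
  • CasDelete: Compare expected_version with the key's current version. If they match, remove the entry from cas_data; otherwise leave it in place and return success = false together with current_version.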
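The two delete branches could be sketched as below. This is an illustration under assumptions, not the actual storage.rs: `State`, `DeleteOutcome`, the `HashMap` layout of kv_data and cas_data, and the rule that a missing key has version 0 are all hypothetical. Only the two `FlareRequest` variants come from section 3.2.

```rust
use std::collections::HashMap;

/// Composite key: (namespace_id, user key).
type NsKey = (u32, Vec<u8>);

/// Delete variants from section 3.2; all other variants are omitted here.
pub enum FlareRequest {
    KvDelete { namespace_id: u32, key: Vec<u8>, ts: u64 },
    CasDelete { namespace_id: u32, key: Vec<u8>, expected_version: u64, ts: u64 },
}

/// Hypothetical result type mirroring CasDeleteResponse.
#[derive(Debug, PartialEq)]
pub struct DeleteOutcome {
    pub success: bool,
    pub current_version: u64,
}

/// Hypothetical in-memory state; the real layout is not shown in this document.
#[derive(Default)]
pub struct State {
    pub kv_data: HashMap<NsKey, Vec<u8>>,
    pub cas_data: HashMap<NsKey, (u64, Vec<u8>)>, // (version, value)
}

impl State {
    pub fn apply_request(&mut self, req: FlareRequest) -> DeleteOutcome {
        match req {
            FlareRequest::KvDelete { namespace_id, key, ts: _ } => {
                // Raw delete is unconditional; deleting a missing key is a no-op.
                self.kv_data.remove(&(namespace_id, key));
                DeleteOutcome { success: true, current_version: 0 }
            }
            FlareRequest::CasDelete { namespace_id, key, expected_version, ts: _ } => {
                let ns_key = (namespace_id, key);
                // Assumption: a missing key is treated as version 0.
                let current = self.cas_data.get(&ns_key).map_or(0, |(v, _)| *v);
                let success = current == expected_version;
                if success {
                    self.cas_data.remove(&ns_key);
                }
                DeleteOutcome { success, current_version: current }
            }
        }
    }
}
```

Because the version check and the removal happen inside a single applied Raft entry, every replica reaches the same decision for the same log position.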

4. Client Extensions (RdbClient)

impl RdbClient {
    // Strong Consistency
    pub async fn cas_delete(&mut self, key: Vec<u8>, expected_version: u64) -> Result<bool, Status>;
    
    // Eventual Consistency
    pub async fn raw_delete(&mut self, key: Vec<u8>) -> Result<(), Status>;

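    // Helper: return every pair whose key starts with `prefix`
    pub async fn scan_prefix(&mut self, prefix: Vec<u8>) -> Result<Vec<(Vec<u8>, Vec<u8>)>, Status> {
        // end = the smallest key that sorts after every key starting with
        // `prefix` (prefix "+ 1" in lexicographic byte order).
        let start = prefix.clone();
        let end = calculate_successor(&prefix);
        // Remaining cas_scan arguments are elided in this design.
        self.cas_scan(start, end, ...).await
    }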
}
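The design does not spell out calculate_successor. One possible implementation is sketched below. The `Option` return type is an assumption: it makes the edge case explicit where no upper bound exists (an empty prefix, or a prefix consisting only of 0xFF bytes), and how FlareDB's Scan expresses an unbounded end is not specified here.

```rust
/// Smallest byte string that sorts after every key starting with `prefix`.
/// Returns None when no such bound exists (empty prefix or all bytes 0xFF),
/// in which case the caller has to scan to the end of the keyspace.
pub fn calculate_successor(prefix: &[u8]) -> Option<Vec<u8>> {
    let mut end = prefix.to_vec();
    // Trailing 0xFF bytes cannot be incremented, so drop them first.
    while let Some(&last) = end.last() {
        if last == 0xFF {
            end.pop();
        } else {
            break;
        }
    }
    // Increment the last remaining byte; bail out with None if nothing is left.
    let last = end.last_mut()?;
    *last += 1;
    Some(end)
}
```

For the prefix `/fiberlb/` this yields `/fiberlb0`, since `0` is the byte after `/` in ASCII. Simply appending 0xFF to the prefix instead would miss keys whose next byte is itself 0xFF followed by more data.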

5. Schema Migration

Mapping ChainFire keys to FlareDB keys:

  • Namespace: Use default or service-specific (e.g., fiberlb, novanet).
  • Keys: Keep same hierarchical path structure (e.g., /fiberlb/loadbalancers/...).
  • Values: JSON strings (UTF-8 bytes).

| Service | Key Prefix | FlareDB Namespace | Mode |
|---|---|---|---|
| FiberLB | /fiberlb/ | fiberlb | Strong (CAS) |
| FlashDNS | /flashdns/ | flashdns | Strong (CAS) |
| LightningSTOR | /lightningstor/ | lightningstor | Strong (CAS) |
| NovaNET | /novanet/ | novanet | Strong (CAS) |
| PlasmaVMC | /plasmavmc/ | plasmavmc | Strong (CAS) |
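
Since every key prefix in the mapping equals its namespace name, the namespace can be derived from the key itself. The helper below is a hypothetical sketch of that lookup; `namespace_for_key` is not an existing function, and rejecting unknown services (rather than falling back to a default namespace) is an assumption.

```rust
/// Hypothetical helper: derive the FlareDB namespace from a ChainFire-style
/// key by taking the first path segment and checking it against the mapping.
pub fn namespace_for_key(key: &str) -> Option<&'static str> {
    // Namespaces listed in the service mapping of this section.
    const SERVICES: [&str; 5] =
        ["fiberlb", "flashdns", "lightningstor", "novanet", "plasmavmc"];
    // Keys must look like "/<service>/<rest>".
    let (first, _rest) = key.strip_prefix('/')?.split_once('/')?;
    SERVICES.iter().copied().find(|s| *s == first)
}
```

For example, a key such as `/fiberlb/loadbalancers/...` maps to the `fiberlb` namespace, while a key under an unlisted prefix returns `None`.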

6. Migration Strategy

  1. Implement Delete support (T020.S2).
  2. Create FlareDbMetadataStore implementation in each service alongside ChainFireMetadataStore.
  3. Switch configuration to use FlareDB.
  4. (Optional) Write migration tool to copy ChainFire -> FlareDB.
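
Steps 2 and 4 could be sketched as follows. Everything here is hypothetical: the `MetadataStore` trait, its method names, and `migrate_prefix` are illustrative, and `MemStore` is an in-memory stand-in for both ChainFireMetadataStore and FlareDbMetadataStore. A real implementation would be async and return errors.

```rust
use std::collections::BTreeMap;

/// Hypothetical common interface that both store implementations would share.
pub trait MetadataStore {
    fn put(&mut self, key: &str, value: Vec<u8>);
    fn get(&self, key: &str) -> Option<Vec<u8>>;
    fn delete(&mut self, key: &str);
    fn list_prefix(&self, prefix: &str) -> Vec<(String, Vec<u8>)>;
}

/// In-memory stand-in used for both backends in this sketch.
#[derive(Default)]
pub struct MemStore(BTreeMap<String, Vec<u8>>);

impl MetadataStore for MemStore {
    fn put(&mut self, key: &str, value: Vec<u8>) {
        self.0.insert(key.to_string(), value);
    }
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.0.get(key).cloned()
    }
    fn delete(&mut self, key: &str) {
        self.0.remove(key);
    }
    fn list_prefix(&self, prefix: &str) -> Vec<(String, Vec<u8>)> {
        // Keys are sorted, so start at the prefix and stop at the first mismatch.
        self.0
            .range(prefix.to_string()..)
            .take_while(|(k, _)| k.starts_with(prefix))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect()
    }
}

/// Step 4: copy every entry under `prefix` from the old store to the new one.
/// Returns the number of entries copied.
pub fn migrate_prefix(
    src: &dyn MetadataStore,
    dst: &mut dyn MetadataStore,
    prefix: &str,
) -> usize {
    let entries = src.list_prefix(prefix);
    let copied = entries.len();
    for (key, value) in entries {
        dst.put(&key, value);
    }
    copied
}
```

Keeping both implementations behind one trait means step 3 reduces to choosing which implementation to construct at startup, and the optional migration tool only needs the trait's read and write methods.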