Feature Specification: Core Distributed Architecture (Phase 1)

Feature Branch: 001-distributed-core
Created: 2025-11-30
Status: Draft
Input: User description: "Implement the core architecture of FlareDB based on the design in chat.md..."

User Scenarios & Testing (mandatory)

User Story 1 - Core Storage Engine Verification (Priority: P1)

As a database developer, I need a robust local storage engine that supports both CAS (Compare-And-Swap) and Raw writes, so that I can build distributed logic on top of it.

Why this priority: This is the fundamental layer. Without a working storage engine with correct CAS logic, upper layers cannot function.

Independent Test: Write a Rust unit test using rdb-storage that:

  1. Creates a DB instance.
  2. Performs a raw_put.
  3. Performs a compare_and_swap that succeeds.
  4. Performs a compare_and_swap that fails due to version mismatch.

Acceptance Scenarios:

  1. Given an empty DB, When I raw_put key="k1", val="v1", Then get returns "v1".
  2. Given key="k1" with version 0 (non-existent), When I cas with expected=0, Then the write succeeds and the version increments.
  3. Given key="k1" with version 10, When I cas with expected=5, Then it returns a Conflict error that reports the current version (10).
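The three scenarios above can be sketched against a small in-memory model. This is an illustration only: `MemStore`, `CasError`, `raw_get`, and `current_version` are hypothetical names rather than the rdb-storage API, the store uses hash maps in place of RocksDB, and it assumes a successful CAS bumps the version by exactly 1 (the spec only says the version "increments").

```rust
use std::collections::HashMap;

/// Error returned when the caller's expected version does not match the stored one.
#[derive(Debug, PartialEq, Eq)]
pub enum CasError {
    Conflict { current_version: u64 },
}

/// Hypothetical in-memory stand-in for the storage engine (not the rdb-storage API).
#[derive(Default)]
pub struct MemStore {
    /// Raw key/value pairs, written without any version check.
    raw: HashMap<Vec<u8>, Vec<u8>>,
    /// CAS-managed pairs: key -> (version, value).
    cas: HashMap<Vec<u8>, (u64, Vec<u8>)>,
}

impl MemStore {
    pub fn new() -> Self {
        Self::default()
    }

    /// Unconditional write.
    pub fn raw_put(&mut self, key: &[u8], val: &[u8]) {
        self.raw.insert(key.to_vec(), val.to_vec());
    }

    /// Read back a raw value, if present.
    pub fn raw_get(&self, key: &[u8]) -> Option<&[u8]> {
        self.raw.get(key).map(|v| v.as_slice())
    }

    /// Version 0 means the key does not exist yet.
    pub fn current_version(&self, key: &[u8]) -> u64 {
        self.cas.get(key).map(|(ver, _)| *ver).unwrap_or(0)
    }

    /// Writes `new_val` only if the stored version equals `expected_ver`.
    /// Returns the new version on success (assumption: old version + 1).
    pub fn compare_and_swap(
        &mut self,
        key: &[u8],
        expected_ver: u64,
        new_val: &[u8],
    ) -> Result<u64, CasError> {
        let current = self.current_version(key);
        if current != expected_ver {
            return Err(CasError::Conflict { current_version: current });
        }
        let next = current + 1;
        self.cas.insert(key.to_vec(), (next, new_val.to_vec()));
        Ok(next)
    }
}
```

A real unit test would run the same three checks against an rdb-storage instance backed by a temporary RocksDB directory; reporting the current version in the conflict lets a client retry without an extra read.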

User Story 2 - Basic RPC Transport (Priority: P1)

As a client developer, I want to connect to the server via gRPC and perform basic operations, so that I can verify the communication pipeline.

Why this priority: Validates the network layer (rdb-proto, tonic integration) and the basic server shell.

Independent Test: Start rdb-server and run a minimal rdb-client script that connects and sends a request.

Acceptance Scenarios:

  1. Given a running rdb-server, When rdb-client sends a GetTsoRequest to PD (mocked or real), Then it receives a valid timestamp.
  2. Given a running rdb-server, When rdb-client sends a RawPutRequest, Then the server accepts it and persists the value to disk.

User Story 3 - Placement Driver TSO (Priority: P2)

As a system, I need a source of monotonic timestamps (TSO) from rdb-pd, so that I can order transactions in the future.

Why this priority: Essential for the "Smart Client" architecture and future MVCC/CAS logic.

Independent Test: Run rdb-pd and hammer it with TSO requests from multiple threads.

Acceptance Scenarios:

  1. Given a running rdb-pd, When I request timestamps repeatedly (including from concurrent clients), Then every returned timestamp is unique and each client sees strictly increasing values.
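The monotonicity property can be illustrated with an in-process counter. This is a minimal sketch, assuming a single process and no persistence across restarts: `Tso` and `next` are hypothetical names rather than the rdb-pd implementation, and the real service would be reached over gRPC.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical in-process timestamp oracle (not the rdb-pd implementation).
pub struct Tso {
    /// The last timestamp handed out.
    last: AtomicU64,
}

impl Tso {
    /// Starts the oracle so that the first issued timestamp is `start + 1`.
    pub fn new(start: u64) -> Self {
        Self { last: AtomicU64::new(start) }
    }

    /// Returns a timestamp strictly greater than every timestamp issued before.
    /// `fetch_add` is atomic, so concurrent callers never observe the same value.
    pub fn next(&self) -> u64 {
        self.last.fetch_add(1, Ordering::SeqCst) + 1
    }
}
```

Sharing one `Tso` behind an `Arc` across several threads and collecting every returned value is enough to check both uniqueness and per-thread ordering.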

Requirements (mandatory)

Functional Requirements

  • FR-001: The project MUST be organized as a Cargo Workspace with members: rdb-proto, rdb-storage, rdb-server, rdb-pd, rdb-client.
  • FR-002: rdb-proto MUST define gRPC services (kvrpc.proto, pdpb.proto) covering CAS, Raw Put, and TSO operations.
  • FR-003: rdb-storage MUST wrap RocksDB and expose compare_and_swap(key, expected_ver, new_val) and raw_put(key, val).
  • FR-004: rdb-storage MUST keep version metadata and data in dedicated RocksDB Column Families: default for raw values, cas for CAS values encoded as [u64_be version][bytes value], and raft_log/raft_state for Raft metadata.
  • FR-005: rdb-pd MUST implement a TSO (Timestamp Oracle) service providing unique, monotonic u64 timestamps.
  • FR-006: rdb-server MUST implement the gRPC handlers defined in rdb-proto and delegate to rdb-storage.
  • FR-007: rdb-client MUST provide a Rust API that abstracts the gRPC calls for cas_put, raw_put, and get.
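The cas value layout named in FR-004, [u64_be version][bytes value], can be sketched as a pair of helpers. The function names `encode_cas_value` and `decode_cas_value` are hypothetical; only the byte layout comes from the requirement.

```rust
/// Encodes a CAS record as an 8-byte big-endian version followed by the value bytes.
pub fn encode_cas_value(version: u64, value: &[u8]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(8 + value.len());
    buf.extend_from_slice(&version.to_be_bytes());
    buf.extend_from_slice(value);
    buf
}

/// Splits a stored record back into (version, value).
/// Returns None if the record is too short to hold the 8-byte version prefix.
pub fn decode_cas_value(raw: &[u8]) -> Option<(u64, &[u8])> {
    if raw.len() < 8 {
        return None;
    }
    let mut head = [0u8; 8];
    head.copy_from_slice(&raw[..8]);
    Some((u64::from_be_bytes(head), &raw[8..]))
}
```

A fixed-width prefix lets the version be read without parsing the value, and big-endian encoding makes the byte order of the prefix match numeric order.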

Key Entities

  • Region: A logical range of keys (for future sharding).
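  • Version: A u64 identifying the latest modification of a key (a timestamp or sequence number); version 0 means the key does not exist.
  • TSO: The global Timestamp Oracle, served by rdb-pd, which issues unique, monotonic u64 timestamps.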

Success Criteria (mandatory)

Measurable Outcomes

  • SC-001: Full workspace compiles with cargo build.
  • SC-002: rdb-storage unit tests pass covering CAS success/failure paths.
  • SC-003: An integration script (scripts/verify-core.sh) or an equivalent CI step runs end to end: it starts PD and the server; the client obtains a TSO timestamp, performs RawPut and RawGet (the value must match), then performs one successful CAS and one conflicting CAS; the script exits 0.