---
description: "Task list for Multi-Raft (Static -> Split -> Move)"
---

# Tasks: Multi-Raft (Static -> Split -> Move)

**Input**: Design documents from `/specs/004-multi-raft/`
**Prerequisites**: plan.md (required), spec.md (user stories), research.md, data-model.md, contracts/

**Tests**: Required per constitution; include unit/integration tests for multi-region routing, split, confchange/move.

**Organization**: Tasks are grouped by user story to enable independent implementation and testing.

## Format: `[ID] [P?] [Story] Description`

- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)
- Include exact file paths in descriptions

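To make the task-line format concrete, here is a minimal sketch of a parser for it. The `Task` struct and `parse_task` function are hypothetical names introduced only for illustration, and the sketch assumes the checkbox prefix (`- [X]` or `- [ ]`) that the task lists in this file use.

```rust
/// One parsed task line. Hypothetical type, not part of the project.
#[derive(Debug, PartialEq)]
struct Task {
    done: bool,
    id: String,
    parallel: bool,
    story: Option<String>,
    description: String,
}

/// Parses lines shaped like `- [X] T010 [P] [US1] Description`.
/// Returns None for anything that is not a task line.
fn parse_task(line: &str) -> Option<Task> {
    let rest = line.trim().strip_prefix("- [")?;
    let (mark, rest) = rest.split_once("] ")?;
    let done = match mark {
        "X" | "x" => true,
        " " => false,
        _ => return None,
    };
    let (id, mut rest) = rest.split_once(' ')?;
    if !id.starts_with('T') {
        return None;
    }
    // Optional [P] marker: the task may run in parallel with others.
    let mut parallel = false;
    if let Some(r) = rest.strip_prefix("[P] ") {
        parallel = true;
        rest = r;
    }
    // Optional [USn] marker: the user story the task belongs to.
    let mut story = None;
    if let Some(r) = rest.strip_prefix("[US") {
        let (num, tail) = r.split_once("] ")?;
        story = Some(format!("US{}", num));
        rest = tail;
    }
    Some(Task {
        done,
        id: id.to_string(),
        parallel,
        story,
        description: rest.to_string(),
    })
}
```

Headings and prose lines return `None`, so a caller could filter a whole file with `lines().filter_map(parse_task)`.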
## Phase 1: Setup (Shared Infrastructure)

**Purpose**: Prepare store/container and region-aware routing foundations.

- [X] T001 Add Store container skeleton managing RegionID->Peer map in `rdb-server/src/store.rs`
- [X] T002 Wire RaftService to dispatch by region_id via Store in `rdb-server/src/raft_service.rs`
- [X] T003 Add region-aware KV routing (Key->Region) stub in `rdb-server/src/service.rs`
- [X] T004 Region-prefixed Raft storage keys to isolate logs/hs/conf in `rdb-server/src/raft_storage.rs`
- [X] T005 Update main startup to init Store from PD initial region meta in `rdb-server/src/main.rs`

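As an illustration of T004, the sketch below shows one way region-prefixed keys could isolate each region's Raft log and metadata in a shared key space. The byte layout, tag values, and function names are assumptions made for this example; the actual encoding in `rdb-server/src/raft_storage.rs` may differ. It also assumes that "hs" and "conf" refer to Raft hard state and conf state.

```rust
// Hypothetical tags separating the kinds of data stored per region.
const LOG_TAG: u8 = 0x01;
const HARD_STATE_TAG: u8 = 0x02;
const CONF_STATE_TAG: u8 = 0x03;

/// Prefix shared by every key of one region: b'r' followed by the
/// big-endian region id, so keys sort by region first.
fn region_prefix(region_id: u64) -> Vec<u8> {
    let mut key = Vec::with_capacity(9);
    key.push(b'r');
    key.extend_from_slice(&region_id.to_be_bytes());
    key
}

/// Key for a single-valued metadata record of a region.
fn meta_key(region_id: u64, tag: u8) -> Vec<u8> {
    let mut key = region_prefix(region_id);
    key.push(tag);
    key
}

/// Key for one log entry; big-endian index keeps entries in log order.
fn log_key(region_id: u64, index: u64) -> Vec<u8> {
    let mut key = meta_key(region_id, LOG_TAG);
    key.extend_from_slice(&index.to_be_bytes());
    key
}

fn hard_state_key(region_id: u64) -> Vec<u8> {
    meta_key(region_id, HARD_STATE_TAG)
}

fn conf_state_key(region_id: u64) -> Vec<u8> {
    meta_key(region_id, CONF_STATE_TAG)
}
```

With big-endian encoding, a prefix scan over `region_prefix(id)` touches only that region's data, and log entries come back in index order.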

---

## Phase 2: Foundational (Blocking Prerequisites)

**Purpose**: PD integration and routing validation.

- [X] T006 Add PD client call to fetch initial region metadata in `rdb-proto/src/pdpb.proto` and `rdb-server/src/main.rs`
- [X] T007 Add routing cache (Region range map) with PD heartbeat refresh in `rdb-server/src/service.rs`
- [X] T008 Add multi-region Raft message dispatch tests in `rdb-server/tests/test_multi_region.rs`
- [X] T009 Add KV routing tests for disjoint regions in `rdb-server/tests/test_multi_region.rs`

**Checkpoint**: Multiple regions can start, elect leaders, and route KV without interference.

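The routing cache in T007 can be pictured as an ordered map from region start key to region range. The following is a minimal sketch under stated assumptions: `RegionRange` and `RouteCache` are hypothetical names, ranges are half-open `[start_key, end_key)`, and an empty `end_key` means "unbounded". None of these conventions are confirmed by the project sources.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Hypothetical cached view of one region's key range.
#[derive(Debug, Clone, PartialEq)]
struct RegionRange {
    region_id: u64,
    start_key: Vec<u8>, // inclusive
    end_key: Vec<u8>,   // exclusive; empty means unbounded (assumption)
}

/// Routing cache keyed by region start key.
#[derive(Default)]
struct RouteCache {
    by_start: BTreeMap<Vec<u8>, RegionRange>,
}

impl RouteCache {
    /// Replaces the cached ranges, for example after a PD heartbeat response.
    fn refresh(&mut self, regions: Vec<RegionRange>) {
        self.by_start = regions
            .into_iter()
            .map(|r| (r.start_key.clone(), r))
            .collect();
    }

    /// Finds the region whose range contains `key`, if any.
    fn locate(&self, key: &[u8]) -> Option<u64> {
        // The candidate is the region with the greatest start key <= key.
        let bounds: (Bound<&[u8]>, Bound<&[u8]>) =
            (Bound::Unbounded, Bound::Included(key));
        let (_, region) = self.by_start.range::<[u8], _>(bounds).next_back()?;
        if region.end_key.is_empty() || key < region.end_key.as_slice() {
            Some(region.region_id)
        } else {
            None // key falls into a gap between cached regions
        }
    }
}
```

A `None` result is the natural place to return a structured "unknown key range" error and trigger a cache refresh from PD.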

---

## Phase 3: User Story 1 - PD-Driven Multi-Region Startup (Priority: P1)

**Goal**: Auto-start multiple regions from PD meta; independent read/write per region.

### Tests

- [X] T010 [US1] Integration test: startup with PD returning 2 regions; both elect leaders and accept writes in `rdb-server/tests/test_multi_region.rs`

### Implementation

- [X] T011 [US1] Store registers peers per PD region meta; validation for overlapping ranges in `rdb-server/src/store.rs`
- [X] T012 [US1] KV service uses region router from PD meta to propose to correct peer in `rdb-server/src/service.rs`
- [X] T013 [US1] Structured errors for unknown region/key-range in `rdb-server/src/service.rs`

**Checkpoint**: Two+ regions operate independently with PD-provided meta.

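The overlap validation in T011 can be sketched as a sort followed by a check of adjacent pairs. `RegionMeta`, `RegionError`, and `validate_disjoint` are hypothetical names for this example, and the sketch assumes half-open ranges where an empty `end_key` means "unbounded".

```rust
/// Hypothetical region descriptor as received from PD.
struct RegionMeta {
    id: u64,
    start_key: Vec<u8>, // inclusive
    end_key: Vec<u8>,   // exclusive; empty means unbounded (assumption)
}

/// Hypothetical structured error naming the two conflicting regions.
#[derive(Debug, PartialEq)]
enum RegionError {
    OverlappingRanges { first: u64, second: u64 },
}

/// Returns Ok(()) if no two regions share any key.
fn validate_disjoint(regions: &[RegionMeta]) -> Result<(), RegionError> {
    let mut sorted: Vec<&RegionMeta> = regions.iter().collect();
    sorted.sort_by(|a, b| a.start_key.cmp(&b.start_key));
    // After sorting by start key, any overlap shows up between neighbours.
    for pair in sorted.windows(2) {
        let (left, right) = (pair[0], pair[1]);
        // An unbounded left range overlaps everything that starts after it.
        if left.end_key.is_empty() || left.end_key > right.start_key {
            return Err(RegionError::OverlappingRanges {
                first: left.id,
                second: right.id,
            });
        }
    }
    Ok(())
}
```

Checking only neighbours is sufficient: if region `i` overlaps a later region `j`, then the start of region `i + 1` is at or before the start of `j`, so `i` already overlaps `i + 1`.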

---

## Phase 4: User Story 2 - Region Split (Priority: P1)

**Goal**: Detect size threshold and split online into two regions.

### Tests

- [X] T014 [US2] Split trigger test (approx size over threshold) in `rdb-server/tests/test_split.rs`
- [X] T015 [US2] Post-split routing test: keys before/after split_key go to correct regions in `rdb-server/tests/test_split.rs`

### Implementation

- [X] T016 [US2] Approximate size measurement and threshold check in `rdb-server/src/store.rs`
- [X] T017 [US2] Define/apply Split raft command; update region meta atomically in `rdb-server/src/peer.rs`
- [X] T018 [US2] Create/register new peer for split region and update routing map in `rdb-server/src/store.rs`
- [X] T019 [US2] Persist updated region metadata (start/end keys) in `rdb-server/src/store.rs`

**Checkpoint**: Region splits online; post-split read/write succeeds in both regions.

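The metadata side of a split (T016, T017) can be sketched as a pure function that turns one range into two. All names here are hypothetical, and the choice that the left half keeps the parent's id while the right half receives the new id is an assumption for illustration, not something the task list specifies.

```rust
/// Hypothetical region descriptor with a half-open key range.
#[derive(Debug, Clone, PartialEq)]
struct RegionMeta {
    id: u64,
    start_key: Vec<u8>, // inclusive
    end_key: Vec<u8>,   // exclusive; empty means unbounded (assumption)
}

/// Threshold check on an approximate region size.
fn needs_split(approx_size_bytes: u64, threshold_bytes: u64) -> bool {
    approx_size_bytes > threshold_bytes
}

/// Splits `parent` at `split_key` into (left, right).
/// Returns None if the split key is not strictly inside the parent range.
fn split_region(
    parent: &RegionMeta,
    split_key: &[u8],
    new_id: u64,
) -> Option<(RegionMeta, RegionMeta)> {
    let after_start = split_key > parent.start_key.as_slice();
    let before_end =
        parent.end_key.is_empty() || split_key < parent.end_key.as_slice();
    if !(after_start && before_end) {
        return None;
    }
    // Assumption: the left half keeps the parent's id.
    let left = RegionMeta {
        id: parent.id,
        start_key: parent.start_key.clone(),
        end_key: split_key.to_vec(),
    };
    let right = RegionMeta {
        id: new_id,
        start_key: split_key.to_vec(),
        end_key: parent.end_key.clone(),
    };
    Some((left, right))
}
```

Producing both halves from a single call makes it easier to apply the update atomically: the caller can persist and publish the pair together, or not at all.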

---

## Phase 5: User Story 3 - Region Move (Priority: P2)

**Goal**: Rebalance region replicas via ConfChange (add → catch-up → remove).

### Tests

- [X] T020 [US3] ConfChange add/remove replica test across two stores in `rdb-server/tests/test_confchange.rs`
- [X] T021 [US3] Move scenario: PD directs move, data reachable after move in `rdb-server/tests/test_confchange.rs`

### Implementation

- [X] T022 [US3] Implement ConfChange apply for add/remove node per region in `rdb-server/src/peer.rs`
- [X] T023 [US3] PD heartbeat reporting region list/size and apply PD move directives in `rdb-server/src/store.rs`
- [X] T024 [US3] Snapshot/fast catch-up path for new replica join in `rdb-server/src/peer.rs`

**Checkpoint**: Region can move between stores without data loss; quorum maintained.

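The add → catch-up → remove sequence in the goal above can be modelled as a small state machine. This is only a sketch of the ordering: `MoveStep` and `RegionMove` are hypothetical names, and the real implementation would propose ConfChange entries through Raft rather than mutate a set directly.

```rust
use std::collections::BTreeSet;

/// Steps of a replica move, in the order they must happen.
#[derive(Debug, Clone, Copy, PartialEq)]
enum MoveStep {
    AddReplica,
    WaitCatchUp,
    RemoveReplica,
    Done,
}

/// Hypothetical tracker for moving one replica from `source` to `target`.
struct RegionMove {
    voters: BTreeSet<u64>,
    source: u64,
    target: u64,
    step: MoveStep,
}

impl RegionMove {
    fn new(voters: BTreeSet<u64>, source: u64, target: u64) -> Self {
        RegionMove { voters, source, target, step: MoveStep::AddReplica }
    }

    /// Performs at most one step and returns the step to run next.
    /// `target_caught_up` reports whether the new replica has the data.
    fn advance(&mut self, target_caught_up: bool) -> MoveStep {
        match self.step {
            MoveStep::AddReplica => {
                self.voters.insert(self.target);
                self.step = MoveStep::WaitCatchUp;
            }
            MoveStep::WaitCatchUp => {
                // Never remove the old replica before the new one is ready.
                if target_caught_up {
                    self.step = MoveStep::RemoveReplica;
                }
            }
            MoveStep::RemoveReplica => {
                self.voters.remove(&self.source);
                self.step = MoveStep::Done;
            }
            MoveStep::Done => {}
        }
        self.step
    }
}
```

Because the new replica is added before the old one is removed, the replica count never drops below its starting value during the move, which is what keeps quorum intact.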

---

## Phase 6: Polish & Cross-Cutting Concerns

**Purpose**: Hardening, docs, and verification.

- [X] T025 Update contracts for PD/Region RPCs in `specs/004-multi-raft/contracts/`
- [X] T026 Update data-model for Region/Store/PlacementMeta in `specs/004-multi-raft/data-model.md`
- [X] T027 Quickstart covering multi-region start, split, move flows in `specs/004-multi-raft/quickstart.md`
- [X] T028 Verification script to run multi-region/split/move tests in `scripts/verify-multiraft.sh`
- [ ] T029 [P] Clean up warnings; run `cargo fmt` and `cargo test -p rdb-server --tests` across workspace

---

## Dependencies & Execution Order

- Phase 1 → Phase 2 → US1 → US2 → US3 → Polish
- Split (US2) depends on routing in US1; Move (US3) depends on ConfChange plumbing.

## Parallel Examples

- T008 and T009 can run in parallel after T002/T003/T004 (multi-region dispatch + routing tests).
- T014 and T015 can run in parallel after routing map is in place (post-split tests).
- T020 and T021 can run in parallel once ConfChange scaffolding exists.

## Implementation Strategy

1) Lay Store/routing foundations (Phase 1–2).
2) Deliver US1 (PD-driven multi-region start).
3) Add Split path (US2).
4) Add ConfChange/move path (US3).
5) Polish docs/contracts/verify script.