---
description: "Task list for Multi-Raft (Static -> Split -> Move)"
---
# Tasks: Multi-Raft (Static -> Split -> Move)
**Input**: Design documents from `/specs/004-multi-raft/`
**Prerequisites**: plan.md (required), spec.md (user stories), research.md, data-model.md, contracts/
**Tests**: Required per constitution; include unit/integration tests for multi-region routing, split, confchange/move.
**Organization**: Tasks are grouped by user story to enable independent implementation and testing.
## Format: `[ID] [P?] [Story] Description`
- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)
- Include exact file paths in descriptions
## Phase 1: Setup (Shared Infrastructure)
**Purpose**: Prepare store/container and region-aware routing foundations.
- [X] T001 Add Store container skeleton managing RegionID->Peer map in `rdb-server/src/store.rs`
- [X] T002 Wire RaftService to dispatch by region_id via Store in `rdb-server/src/raft_service.rs`
- [X] T003 Add region-aware KV routing (Key->Region) stub in `rdb-server/src/service.rs`
- [X] T004 Prefix Raft storage keys with the region ID to isolate each region's logs/hs/conf in `rdb-server/src/raft_storage.rs`
- [X] T005 Update main startup to init Store from PD initial region meta in `rdb-server/src/main.rs`
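
The key isolation described in T004 can be sketched as below. This is only an illustration under assumptions, not the layout used in `raft_storage.rs`: the names `RaftKeyKind`, `raft_key`, and `region_prefix`, the tag bytes, and the big-endian encoding are all hypothetical. The point is that every key of a region shares an 8-byte prefix, so two regions never collide and one region's keys can be scanned or deleted by prefix.

```rust
/// Hypothetical key kinds; the task list only says logs/hs/conf are isolated.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum RaftKeyKind {
    HardState,
    ConfState,
    /// Carries the log index.
    LogEntry(u64),
}

/// Builds a key of the assumed form:
/// region_id (8 bytes, big-endian) | tag (1 byte) | log index (8 bytes, big-endian, log entries only).
pub fn raft_key(region_id: u64, kind: RaftKeyKind) -> Vec<u8> {
    let mut key = Vec::with_capacity(17);
    key.extend_from_slice(&region_id.to_be_bytes());
    match kind {
        RaftKeyKind::HardState => key.push(0x01),
        RaftKeyKind::ConfState => key.push(0x02),
        RaftKeyKind::LogEntry(index) => {
            key.push(0x03);
            // Big-endian keeps byte order equal to numeric order of the index.
            key.extend_from_slice(&index.to_be_bytes());
        }
    }
    key
}

/// Prefix shared by every key that belongs to one region.
pub fn region_prefix(region_id: u64) -> [u8; 8] {
    region_id.to_be_bytes()
}
```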
---
## Phase 2: Foundational (Blocking Prerequisites)
**Purpose**: PD integration and routing validation.
- [X] T006 Add PD client call to fetch initial region metadata in `rdb-proto/src/pdpb.proto` and `rdb-server/src/main.rs`
- [X] T007 Add routing cache (Region range map) with PD heartbeat refresh in `rdb-server/src/service.rs`
- [X] T008 Add multi-region Raft message dispatch tests in `rdb-server/tests/test_multi_region.rs`
- [X] T009 Add KV routing tests for disjoint regions in `rdb-server/tests/test_multi_region.rs`
**Checkpoint**: Multiple regions can start, elect leaders, and route KV without interference.
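
One way to picture the routing cache from T007 is an ordered map keyed by each region's start key. The sketch below is an assumption for illustration, not the code in `service.rs`: `RegionRange`, `RouteCache`, `refresh`, and `locate` are hypothetical names, ranges are taken to be half-open `[start_key, end_key)`, and an empty `end_key` is taken to mean "unbounded". A call such as `cache.locate(b"apple")` returns the owning region ID, or `None` when the key falls in a gap.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Hypothetical region descriptor covering [start_key, end_key).
/// An empty end_key is assumed to mean "no upper bound".
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct RegionRange {
    pub region_id: u64,
    pub start_key: Vec<u8>,
    pub end_key: Vec<u8>,
}

/// Routing cache: regions indexed by their start key.
#[derive(Default)]
pub struct RouteCache {
    by_start: BTreeMap<Vec<u8>, RegionRange>,
}

impl RouteCache {
    /// Replaces the cached ranges, e.g. when a PD heartbeat returns fresh metadata.
    pub fn refresh(&mut self, regions: Vec<RegionRange>) {
        self.by_start = regions
            .into_iter()
            .map(|r| (r.start_key.clone(), r))
            .collect();
    }

    /// Returns the ID of the region whose range contains `key`, if any.
    pub fn locate(&self, key: &[u8]) -> Option<u64> {
        // The candidate is the region with the greatest start key <= key.
        let (_, region) = self
            .by_start
            .range::<[u8], _>((Bound::Unbounded, Bound::Included(key)))
            .next_back()?;
        if region.end_key.is_empty() || key < region.end_key.as_slice() {
            Some(region.region_id)
        } else {
            None
        }
    }
}
```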
---
## Phase 3: User Story 1 - PD-Driven Multi-Region Startup (Priority: P1)
**Goal**: Auto-start multiple regions from PD meta; independent read/write per region.
### Tests
- [X] T010 [US1] Integration test: startup with PD returning 2 regions; both elect leaders and accept writes in `rdb-server/tests/test_multi_region.rs`
### Implementation
- [X] T011 [US1] Store registers peers per PD region meta; validation for overlapping ranges in `rdb-server/src/store.rs`
- [X] T012 [US1] KV service uses region router from PD meta to propose to correct peer in `rdb-server/src/service.rs`
- [X] T013 [US1] Structured errors for unknown region/key-range in `rdb-server/src/service.rs`
**Checkpoint**: Two or more regions operate independently with PD-provided meta.
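
The overlap validation in T011 and the structured errors in T013 might look roughly like the following. Everything here is a hypothetical sketch: `RegionMeta`, `RegionMetaError`, and `validate_regions` are illustrative names, and the convention that an empty `end_key` means "unbounded" is an assumption. After sorting by start key, it is enough to compare each region with its immediate successor.

```rust
/// Hypothetical region metadata covering [start_key, end_key).
/// An empty end_key is assumed to mean "no upper bound".
#[derive(Clone, Debug)]
pub struct RegionMeta {
    pub id: u64,
    pub start_key: Vec<u8>,
    pub end_key: Vec<u8>,
}

/// Structured validation errors (illustrative variants).
#[derive(Debug, PartialEq, Eq)]
pub enum RegionMetaError {
    /// start_key is not strictly below a bounded end_key.
    EmptyRange { region_id: u64 },
    /// Two regions claim at least one common key.
    Overlap { first: u64, second: u64 },
}

/// Rejects region sets that contain empty or overlapping ranges.
pub fn validate_regions(regions: &[RegionMeta]) -> Result<(), RegionMetaError> {
    for r in regions {
        if !r.end_key.is_empty() && r.start_key >= r.end_key {
            return Err(RegionMetaError::EmptyRange { region_id: r.id });
        }
    }
    let mut sorted: Vec<&RegionMeta> = regions.iter().collect();
    sorted.sort_by(|a, b| a.start_key.cmp(&b.start_key));
    for pair in sorted.windows(2) {
        let (prev, next) = (pair[0], pair[1]);
        // An unbounded predecessor overlaps every region that sorts after it.
        if prev.end_key.is_empty() || prev.end_key > next.start_key {
            return Err(RegionMetaError::Overlap { first: prev.id, second: next.id });
        }
    }
    Ok(())
}
```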
---
## Phase 4: User Story 2 - Region Split (Priority: P1)
**Goal**: Detect size threshold and split online into two regions.
### Tests
- [X] T014 [US2] Split trigger test (approx size over threshold) in `rdb-server/tests/test_split.rs`
- [X] T015 [US2] Post-split routing test: keys before/after split_key go to correct regions in `rdb-server/tests/test_split.rs`
### Implementation
- [X] T016 [US2] Approximate size measurement and threshold check in `rdb-server/src/store.rs`
- [X] T017 [US2] Define/apply Split raft command; update region meta atomically in `rdb-server/src/peer.rs`
- [X] T018 [US2] Create/register new peer for split region and update routing map in `rdb-server/src/store.rs`
- [X] T019 [US2] Persist updated region metadata (start/end keys) in `rdb-server/src/store.rs`
**Checkpoint**: Region splits online; post-split read/write succeeds in both regions.
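
The split path in T016 to T019 can be illustrated with a small pure function. This is a sketch under assumptions rather than the logic in `peer.rs` or `store.rs`: `RegionMeta`, `should_split`, and `split_region` are hypothetical names, the left half keeping the original region ID is an arbitrary choice, and an empty `end_key` is assumed to mean "unbounded". Returning both descriptors together lets the caller swap them into the routing map in a single step, which is one way to keep the metadata update atomic.

```rust
/// Hypothetical region metadata covering [start_key, end_key).
/// An empty end_key is assumed to mean "no upper bound".
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct RegionMeta {
    pub id: u64,
    pub start_key: Vec<u8>,
    pub end_key: Vec<u8>,
}

/// Threshold check on an approximate region size, in bytes.
pub fn should_split(approx_size_bytes: u64, threshold_bytes: u64) -> bool {
    approx_size_bytes > threshold_bytes
}

/// Splits `region` at `split_key` into [start, split_key) and [split_key, end).
/// Returns None when split_key does not fall strictly inside the region.
pub fn split_region(
    region: &RegionMeta,
    split_key: &[u8],
    new_region_id: u64,
) -> Option<(RegionMeta, RegionMeta)> {
    let after_start = split_key > region.start_key.as_slice();
    let before_end = region.end_key.is_empty() || split_key < region.end_key.as_slice();
    if !(after_start && before_end) {
        return None;
    }
    // Assumption: the left half keeps the original ID, the right half gets the new one.
    let left = RegionMeta {
        id: region.id,
        start_key: region.start_key.clone(),
        end_key: split_key.to_vec(),
    };
    let right = RegionMeta {
        id: new_region_id,
        start_key: split_key.to_vec(),
        end_key: region.end_key.clone(),
    };
    Some((left, right))
}
```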
---
## Phase 5: User Story 3 - Region Move (Priority: P2)
**Goal**: Rebalance region replicas via ConfChange (add → catch-up → remove).
### Tests
- [X] T020 [US3] ConfChange add/remove replica test across two stores in `rdb-server/tests/test_confchange.rs`
- [X] T021 [US3] Move scenario: PD directs move, data reachable after move in `rdb-server/tests/test_confchange.rs`
### Implementation
- [X] T022 [US3] Implement ConfChange apply for add/remove node per region in `rdb-server/src/peer.rs`
- [X] T023 [US3] PD heartbeat reporting region list/size and apply PD move directives in `rdb-server/src/store.rs`
- [X] T024 [US3] Snapshot/fast catch-up path for new replica join in `rdb-server/src/peer.rs`
**Checkpoint**: Region can move between stores without data loss; quorum maintained.
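
The add → catch-up → remove sequence from the US3 goal can be modeled as a small state machine. The sketch below is purely illustrative: `MoveStep` and `MovePlan` are hypothetical names, the replica set is a plain `BTreeSet` of store IDs, and no real Raft ConfChange API is involved. It shows why the replica count never drops below its starting size: the source replica is removed only after the target reports that it has caught up.

```rust
use std::collections::BTreeSet;

/// Steps of a hypothetical region move.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MoveStep {
    AddReplica,
    WaitCatchUp,
    RemoveReplica,
    Done,
}

/// Tracks one move of a region replica between two stores.
pub struct MovePlan {
    pub from_store: u64,
    pub to_store: u64,
    pub step: MoveStep,
}

impl MovePlan {
    pub fn new(from_store: u64, to_store: u64) -> Self {
        MovePlan { from_store, to_store, step: MoveStep::AddReplica }
    }

    /// Performs at most one step and returns the step that comes next.
    /// `target_caught_up` stands in for "the new replica has applied the leader's log".
    pub fn advance(&mut self, replicas: &mut BTreeSet<u64>, target_caught_up: bool) -> MoveStep {
        match self.step {
            MoveStep::AddReplica => {
                replicas.insert(self.to_store);
                self.step = MoveStep::WaitCatchUp;
            }
            MoveStep::WaitCatchUp => {
                // Stay here until the new replica is ready to serve.
                if target_caught_up {
                    self.step = MoveStep::RemoveReplica;
                }
            }
            MoveStep::RemoveReplica => {
                replicas.remove(&self.from_store);
                self.step = MoveStep::Done;
            }
            MoveStep::Done => {}
        }
        self.step
    }
}
```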
---
## Phase 6: Polish & Cross-Cutting Concerns
**Purpose**: Hardening, docs, and verification.
- [X] T025 Update contracts for PD/Region RPCs in `specs/004-multi-raft/contracts/`
- [X] T026 Update data-model for Region/Store/PlacementMeta in `specs/004-multi-raft/data-model.md`
- [X] T027 Quickstart covering multi-region start, split, move flows in `specs/004-multi-raft/quickstart.md`
- [X] T028 Verification script to run multi-region/split/move tests in `scripts/verify-multiraft.sh`
- [ ] T029 [P] Clean up warnings; run `cargo fmt` and `cargo test -p rdb-server --tests` across the workspace
---
## Dependencies & Execution Order
- Phase 1 → Phase 2 → US1 → US2 → US3 → Polish
- Split (US2) depends on routing in US1; Move (US3) depends on ConfChange plumbing.
## Parallel Examples
- T008 and T009 can run in parallel after T002/T003/T004 (multi-region dispatch + routing tests).
- T014 and T015 can run in parallel after routing map is in place (post-split tests).
- T020 and T021 can run in parallel once ConfChange scaffolding exists.
## Implementation Strategy
1) Lay Store/routing foundations (Phases 1-2).
2) Deliver US1 (PD-driven multi-region start).
3) Add Split path (US2).
4) Add ConfChange/move path (US3).
5) Polish docs/contracts/verify script.