id: T046
name: OpenRaft-Style Multi-Raft Core Library
goal: Design and implement tick-driven Raft core with native Multi-Raft support
status: planning
priority: P1
owner: peerA
created: 2025-12-11
depends_on: [T041]
blocks: []

context: |
  **Background:**
  - T041: Custom Raft implementation (async/await, I/O integrated)
  - Need: Unified Raft library for both ChainFire and FlareDB
  - FlareDB requires Multi-Raft for sharding

  **Design Direction (Updated):**
  Use an OpenRaft-style tick-driven design with Multi-Raft support built in from the start.
  Refactor the T041 implementation to separate I/O and adopt the Ready pattern.

  **Key Design Principles:**
  1. **Tick-driven**: The caller invokes tick() externally; actions come back in a Ready struct
  2. **I/O separation**: The Raft core is pure logic; the caller performs all I/O
  3. **Multi-Raft Native**: Designed to manage multiple groups efficiently
  4. **Single/Multi support**: ChainFire (single) and FlareDB (multi) use the same core

acceptance:
  - OpenRaft-style tick-driven API design complete
  - Ready pattern implemented
  - Usable by both ChainFire and FlareDB

steps:
  - step: S1
    name: Requirements Analysis
    done: Document requirements for unified Raft library
    status: complete
    owner: peerA
    priority: P1
    notes: |
      **Core Requirements:**
      1. **Tick-driven**: No internal timers, caller drives time
      2. **Ready pattern**: Return actions instead of executing I/O
      3. **Multi-Raft efficient**: Batch messages, shared tick loop
      4. **Storage abstraction**: Pluggable log/state storage
      5. **Single-Raft compatible**: Easy wrapper for single-group use

  - step: S2
    name: API Design (OpenRaft-style)
    done: Design tick-driven API with Ready pattern
    status: complete
    owner: peerA
    priority: P1
    notes: |
      **Core API Design:**
      ```rust
      // raft-core/src/lib.rs

      /// Pure Raft state machine - no I/O
      pub struct RaftCore<S: Storage> {
          id: NodeId,
          state: RaftState,
          storage: S, // Storage trait, not concrete impl
      }

      impl<S: Storage> RaftCore<S> {
          /// Advance time by one tick
          pub fn tick(&mut self) -> Ready {
              // Check election timeout, heartbeat timeout, etc.
          }

          /// Process incoming message
          pub fn step(&mut self, msg: Message) -> Ready {
              match msg {
                  Message::RequestVote(req) => self.handle_request_vote(req),
                  Message::AppendEntries(req) => self.handle_append_entries(req),
                  // ...
              }
          }

          /// Propose a new entry (client write)
          pub fn propose(&mut self, data: Vec<u8>) -> Ready {
              // Append to log, prepare replication
          }

          /// Notify that Ready actions have been processed
          pub fn advance(&mut self, applied: Applied) {
              // Update internal state based on what was applied
          }
      }

      /// Actions to be executed by caller (I/O layer)
      pub struct Ready {
          /// Messages to send to other nodes
          pub messages: Vec<(NodeId, Message)>,
          /// Entries to persist to log (element type name `LogEntry` is assumed)
          pub entries_to_persist: Vec<LogEntry>,
          /// State to persist (term, voted_for)
          pub hard_state: Option<HardState>,
          /// Committed entries to apply to state machine
          pub committed_entries: Vec<LogEntry>,
          /// Snapshot to apply, if any (type name `Snapshot` is assumed)
          pub snapshot: Option<Snapshot>,
      }

      /// Storage trait - caller provides implementation
      pub trait Storage {
          fn get_hard_state(&self) -> HardState;
          fn get_log_entries(&self, start: u64, end: u64) -> Vec<LogEntry>;
          fn last_index(&self) -> u64;
          /// Term of the entry at `index` (u64 term assumed, matching the index type)
          fn term_at(&self, index: u64) -> Option<u64>;
          // Note: actual persist is done by caller after Ready
      }
      ```

      **Multi-Raft Coordinator:**
      ```rust
      // multi-raft/src/lib.rs
      use std::collections::HashMap;

      pub struct MultiRaft<S: Storage> {
          groups: HashMap<GroupId, RaftCore<S>>,
          router: Router,
      }

      impl<S: Storage> MultiRaft<S> {
          /// Tick all groups, aggregate Ready
          pub fn tick(&mut self) -> MultiReady {
              let mut ready = MultiReady::default();
              for (gid, core) in &mut self.groups {
                  let r = core.tick();
                  ready.merge(*gid, r); // Batch messages to same peer
              }
              ready
          }

          /// Route message to appropriate group; None if the group is unknown
          pub fn step(&mut self, gid: GroupId, msg: Message) -> Option<Ready> {
              Some(self.groups.get_mut(&gid)?.step(msg))
          }
      }

      /// Aggregated Ready with message batching
      #[derive(Default)]
      pub struct MultiReady {
          /// Messages batched by destination: peer -> [(group_id, msg)]
          pub messages: HashMap<NodeId, Vec<(GroupId, Message)>>,
          /// Per-group persistence needs
          pub per_group: HashMap<GroupId, Ready>,
      }
      ```

  - step: S3
    name: Architecture Decision
    done: Select OpenRaft-style architecture
    status: complete
    owner: peerA
    priority: P1
    notes: |
      **DECISION: Option E - OpenRaft-Style from Scratch**

      **Rationale:**
      1. The T041 implementation works, but its integrated I/O makes it a poor fit for Multi-Raft
      2. An OpenRaft-style tick-driven design supports both Single and Multi naturally
      3. Getting the abstraction right from the start makes later extension easier

      **Architecture:**
      ```
      ┌─────────────────────────────────────────────────────┐
      │                      raft-core                      │
      │       (Pure Raft logic, no I/O, tick-driven)        │
      │                                                     │
      │  RaftCore::tick() → Ready                           │
      │  RaftCore::step(msg) → Ready                        │
      │  RaftCore::propose(data) → Ready                    │
      └─────────────────────────────────────────────────────┘
                                 │
                   ┌─────────────┴─────────────┐
                   ▼                           ▼
            ┌─────────────┐           ┌─────────────────┐
            │  chainfire  │           │     flaredb     │
            │  (single)   │           │     (multi)     │
            │             │           │                 │
            │ ┌─────────┐ │           │ ┌─────────────┐ │
            │ │RaftNode │ │           │ │ MultiRaft   │ │
            │ │(wrapper)│ │           │ │ Coordinator │ │
            │ └─────────┘ │           │ └─────────────┘ │
            │      │      │           │       │         │
            │ ┌────┴────┐ │           │ ┌─────┴───────┐ │
            │ │RaftCore │ │           │ │RaftCore x N │ │
            │ └─────────┘ │           │ └─────────────┘ │
            └─────────────┘           └─────────────────┘
      ```

      **vs T041 (current):**
      | Aspect | T041 | T046 (new) |
      |--------|------|------------|
      | I/O | Integrated | Separated (Ready) |
      | Timer | Internal (tokio) | External (tick) |
      | Multi-Raft | Needs wrapper | Native support |
      | Testability | Requires async | Pure sync tests |

  - step: S4
    name: Implementation Plan
    done: Define implementation phases
    status: complete
    owner: peerA
    priority: P1
    notes: |
      **Phase 1: Core Refactor (1 week)**
      - [ ] Extract pure Raft logic from T041 core.rs
      - [ ] Implement Ready pattern (no direct I/O)
      - [ ] Add Storage trait abstraction
      - [ ] tick() / step() / propose() API

      **Phase 2: Single-Raft Wrapper (3 days)**
      - [ ] ChainFire RaftNode wrapper
      - [ ] Async I/O integration (tokio)
      - [ ] Timer management (election/heartbeat)
      - [ ] Migrate ChainFire to new core

      **Phase 3: Multi-Raft Coordinator (1 week)**
      - [ ] MultiRaft struct with group management
      - [ ] Message batching (MultiReady)
      - [ ] Shared tick loop
      - [ ] FlareDB integration

      **Phase 4: Advanced (deferred)**
      - [ ] Shard split/merge
      - [ ] Cross-shard transactions
      - [ ] Snapshot coordination

      **Estimated Total:** 2.5 weeks for Phase 1-3

  - step: S5
    name: T041 Integration Strategy
    done: Plan migration from T041 to new core
    status: complete
    owner: peerA
    priority: P1
    notes: |
      **Migration Strategy:**

      1. **Complete T041 P3** (current)
         - Finish integration tests
         - Validate current impl works

      2. **Extract & Refactor** (T046.P1)
         - Copy T041 core.rs → raft-core/
         - Remove async/I/O, add Ready pattern
         - Keep original T041 as reference

      3. **Parallel Operation** (T046.P2)
         - Feature flag: `openraft-style` vs `legacy`
         - Validate new impl matches old behavior

      4. **Cutover** (T046.P3)
         - Switch ChainFire to new core
         - Remove legacy code

      **Code Reuse from T041:**
      - Election logic: ~200 LOC (RequestVote handling)
      - Log replication: ~250 LOC (AppendEntries)
      - Commit logic: ~150 LOC (advance_commit_index)
      - Total reusable: ~600 LOC (refactor, not rewrite)

evidence:
  - type: design
    date: 2025-12-11
    finding: "Initial hybrid approach (Option D) proposed"
  - type: decision
    date: 2025-12-11
    finding: "User requested OpenRaft-style design; updated to Option E (tick-driven, Multi-Raft native)"
  - type: architecture
    date: 2025-12-11
    finding: "Ready pattern + Storage trait + tick-driven API for unified Single/Multi Raft support"

notes: |
  **Key Insight:**
  The OpenRaft-style tick-driven design:
  - Makes the pure Raft logic testable (no async, no I/O)
  - Lets Multi-Raft message batching fall out naturally
  - Allows ChainFire and FlareDB to share the same core

  **Relationship to T041:**
  - T041: Current custom Raft implementation (for validating behavior)
  - T046: Production refactor (OpenRaft-style)
  - Start the T046 refactor once T041 is complete

  **References:**
  - OpenRaft: https://github.com/databendlabs/openraft
  - TiKV raft-rs: https://github.com/tikv/raft-rs
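
  **Illustrative sketch (not the T046 API):**
  The tick-driven and batching ideas above can be tried out in a few lines of
  synchronous Rust. Everything here is hypothetical and chosen only for
  illustration: `TickCore`, `drive`, `tick_all`, the string message payloads, and
  the use of plain `u64` for node and group ids. It models only an election
  timeout, with no terms, log, or storage.

  ```rust
  use std::collections::BTreeMap;

  /// Actions returned to the caller; the core itself never performs I/O.
  #[derive(Debug, Default, PartialEq)]
  pub struct Ready {
      /// Outbound messages as (destination node id, payload).
      pub messages: Vec<(u64, &'static str)>,
  }

  /// Hypothetical stand-in for a Raft core: only tracks an election timeout.
  pub struct TickCore {
      peers: Vec<u64>,
      election_timeout: u32, // measured in ticks, not wall-clock time
      elapsed: u32,
      is_candidate: bool,
  }

  impl TickCore {
      pub fn new(peers: Vec<u64>, election_timeout: u32) -> Self {
          Self { peers, election_timeout, elapsed: 0, is_candidate: false }
      }

      /// Advance logical time by one tick and report what should be sent.
      pub fn tick(&mut self) -> Ready {
          self.elapsed += 1;
          let mut ready = Ready::default();
          if !self.is_candidate && self.elapsed >= self.election_timeout {
              self.is_candidate = true;
              self.elapsed = 0;
              for &peer in &self.peers {
                  ready.messages.push((peer, "RequestVote"));
              }
          }
          ready
      }
  }

  /// Single-group driver: run `n` ticks and collect what would be sent.
  pub fn drive(core: &mut TickCore, n: u32) -> Vec<(u64, &'static str)> {
      let mut sent = Vec::new();
      for _ in 0..n {
          // A real wrapper would perform network and disk I/O at this point.
          sent.extend(core.tick().messages);
      }
      sent
  }

  /// Multi-group driver: tick every group once and batch the outbound
  /// messages by destination peer as peer -> [(group id, payload)].
  pub fn tick_all(
      groups: &mut BTreeMap<u64, TickCore>,
  ) -> BTreeMap<u64, Vec<(u64, &'static str)>> {
      let mut batched: BTreeMap<u64, Vec<(u64, &'static str)>> = BTreeMap::new();
      for (&gid, core) in groups.iter_mut() {
          for (peer, msg) in core.tick().messages {
              batched.entry(peer).or_default().push((gid, msg));
          }
      }
      batched
  }
  ```

  Because time advances only when the caller invokes `tick()`, a test can step
  through an election timeout deterministically without tokio or real timers,
  which is the testability benefit listed in the T041 comparison table.
  `BTreeMap` is used here only to keep iteration order deterministic in tests.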