photoncloud-monorepo/docs/por/T046-multi-raft-design/task.yaml
id: T046
name: OpenRaft-Style Multi-Raft Core Library
goal: Design and implement a tick-driven Raft core with native Multi-Raft support
status: planning
priority: P1
owner: peerA
created: 2025-12-11
depends_on: [T041]
blocks: []
context: |
**Background:**
- T041: Custom Raft implementation (async/await, I/O integrated)
- Need: Unified Raft library for both ChainFire and FlareDB
- FlareDB requires Multi-Raft for sharding
**Design Direction (Updated):**
Adopt an OpenRaft-style tick-driven design with Multi-Raft support built in from the start.
Refactor the T041 implementation to separate I/O and adopt the Ready pattern.
**Key Design Principles:**
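1. **Tick-driven**: The caller invokes tick() externally, and the core returns actions in a Ready struct
2. **I/O separation**: The Raft core is pure logic; the caller performs all I/O
3. **Multi-Raft native**: Designed to manage multiple groups efficiently
4. **Single/Multi support**: ChainFire (single) and FlareDB (multi) use the same core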
acceptance:
- OpenRaft-style tick-driven API design complete
- Ready pattern implemented
- Usable by both ChainFire and FlareDB
steps:
- step: S1
name: Requirements Analysis
done: Document requirements for unified Raft library
status: complete
owner: peerA
priority: P1
notes: |
**Core Requirements:**
1. **Tick-driven**: No internal timers, caller drives time
2. **Ready pattern**: Return actions instead of executing I/O
3. **Multi-Raft efficient**: Batch messages, shared tick loop
4. **Storage abstraction**: Pluggable log/state storage
5. **Single-Raft compatible**: Easy wrapper for single-group use
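**Sketch (requirement 1, caller-driven time):**
A minimal illustration of what "no internal timers" could look like for the election timeout. `ElectionTimer` and its fields are hypothetical names, not part of T041 or any existing crate, and timeout randomization is left to the caller, which would pass a randomized `timeout_ticks`.

```rust
/// Hypothetical tick-driven election timer: no clock, no threads, no async.
/// The caller decides how much wall-clock time one tick represents.
#[derive(Debug)]
pub struct ElectionTimer {
    elapsed_ticks: u32,
    timeout_ticks: u32,
}

impl ElectionTimer {
    pub fn new(timeout_ticks: u32) -> Self {
        Self { elapsed_ticks: 0, timeout_ticks }
    }

    /// Advance by one tick. Returns true when the election timeout fires;
    /// the counter resets on firing so the next timeout starts from zero.
    pub fn tick(&mut self) -> bool {
        self.elapsed_ticks += 1;
        if self.elapsed_ticks >= self.timeout_ticks {
            self.elapsed_ticks = 0;
            true
        } else {
            false
        }
    }

    /// Call when a valid heartbeat (AppendEntries) arrives from the leader.
    pub fn reset(&mut self) {
        self.elapsed_ticks = 0;
    }
}
```

Because the timer only counts calls, a unit test can reach an election timeout by calling `tick()` in a loop, with no sleeping and no async runtime.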
- step: S2
name: API Design (OpenRaft-style)
done: Design tick-driven API with Ready pattern
status: complete
owner: peerA
priority: P1
notes: |
**Core API Design:**
```rust
// raft-core/src/lib.rs

/// Pure Raft state machine - no I/O
pub struct RaftCore<S: Storage> {
    id: NodeId,
    state: RaftState,
    storage: S, // Storage trait, not concrete impl
}

impl<S: Storage> RaftCore<S> {
    /// Advance time by one tick
    pub fn tick(&mut self) -> Ready {
        // Check election timeout, heartbeat timeout, etc.
        todo!()
    }

    /// Process incoming message
    pub fn step(&mut self, msg: Message) -> Ready {
        match msg {
            Message::RequestVote(req) => self.handle_request_vote(req),
            Message::AppendEntries(req) => self.handle_append_entries(req),
            // ...
        }
    }

    /// Propose a new entry (client write)
    pub fn propose(&mut self, data: Vec<u8>) -> Ready {
        // Append to log, prepare replication
        todo!()
    }

    /// Notify that Ready actions have been processed
    pub fn advance(&mut self, applied: Applied) {
        // Update internal state based on what was applied
        todo!()
    }
}

/// Actions to be executed by caller (I/O layer)
pub struct Ready {
    /// Messages to send to other nodes
    pub messages: Vec<(NodeId, Message)>,
    /// Entries to persist to log
    pub entries_to_persist: Vec<LogEntry>,
    /// State to persist (term, voted_for)
    pub hard_state: Option<HardState>,
    /// Committed entries to apply to state machine
    pub committed_entries: Vec<LogEntry>,
    /// Snapshot to apply (if any)
    pub snapshot: Option<Snapshot>,
}

/// Storage trait - caller provides implementation
pub trait Storage {
    fn get_hard_state(&self) -> HardState;
    fn get_log_entries(&self, start: u64, end: u64) -> Vec<LogEntry>;
    fn last_index(&self) -> u64;
    fn term_at(&self, index: u64) -> Option<u64>;
    // Note: actual persist is done by caller after Ready
}
```
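**Sketch (caller-side Ready handling):**
The API above leaves open how the I/O layer consumes a `Ready`. The following is one conservative ordering, offered as an assumption rather than a settled design: persist durable state first, then send messages, then apply committed entries. The `Io` trait, `RecordingIo`, and the fields of `Applied` are hypothetical, and the other types are simplified stand-ins (snapshots omitted).

```rust
// All types here are simplified stand-ins for the API sketch above.
type NodeId = u64;

#[derive(Debug, Clone, PartialEq)]
pub struct LogEntry { pub index: u64, pub term: u64 }

#[derive(Debug, Clone, PartialEq)]
pub struct HardState { pub term: u64, pub voted_for: Option<NodeId> }

#[derive(Debug, Clone, PartialEq)]
pub struct Message(pub String); // placeholder payload

#[derive(Debug, Default)]
pub struct Ready {
    pub messages: Vec<(NodeId, Message)>,
    pub entries_to_persist: Vec<LogEntry>,
    pub hard_state: Option<HardState>,
    pub committed_entries: Vec<LogEntry>,
}

/// Hypothetical shape for the `Applied` value passed to `advance()`.
#[derive(Debug, PartialEq)]
pub struct Applied {
    pub last_persisted: Option<u64>,
    pub last_applied: Option<u64>,
}

/// Hypothetical I/O boundary owned by the caller, not by the core.
pub trait Io {
    fn persist_hard_state(&mut self, hs: &HardState);
    fn append_entries(&mut self, entries: &[LogEntry]);
    fn send(&mut self, to: NodeId, msg: Message);
    fn apply(&mut self, entry: &LogEntry);
}

/// Execute one Ready and report what was done.
pub fn handle_ready<I: Io>(io: &mut I, ready: Ready) -> Applied {
    let Ready { messages, entries_to_persist, hard_state, committed_entries } = ready;
    // 1. Durable state first, so nothing is promised to peers before it is on disk.
    if let Some(hs) = &hard_state {
        io.persist_hard_state(hs);
    }
    if !entries_to_persist.is_empty() {
        io.append_entries(&entries_to_persist);
    }
    // 2. Outbound messages.
    for (to, msg) in messages {
        io.send(to, msg);
    }
    // 3. Committed entries go to the state machine.
    for entry in &committed_entries {
        io.apply(entry);
    }
    Applied {
        last_persisted: entries_to_persist.last().map(|e| e.index),
        last_applied: committed_entries.last().map(|e| e.index),
    }
}

/// In-memory `Io` that records the order of operations, for tests.
#[derive(Debug, Default)]
pub struct RecordingIo { pub ops: Vec<String> }

impl Io for RecordingIo {
    fn persist_hard_state(&mut self, hs: &HardState) {
        self.ops.push(format!("hard_state:{}", hs.term));
    }
    fn append_entries(&mut self, entries: &[LogEntry]) {
        for e in entries {
            self.ops.push(format!("append:{}", e.index));
        }
    }
    fn send(&mut self, to: NodeId, _msg: Message) {
        self.ops.push(format!("send:{}", to));
    }
    fn apply(&mut self, entry: &LogEntry) {
        self.ops.push(format!("apply:{}", entry.index));
    }
}
```

The returned `Applied` is what the caller would hand to `RaftCore::advance()`. A production driver might overlap sending with persistence on the leader; this sketch keeps the strict order for clarity.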
**Multi-Raft Coordinator:**
```rust
// multi-raft/src/lib.rs
use std::collections::HashMap;

pub struct MultiRaft<S: Storage> {
    groups: HashMap<GroupId, RaftCore<S>>,
    router: Router,
}

impl<S: Storage> MultiRaft<S> {
    /// Tick all groups, aggregate Ready
    pub fn tick(&mut self) -> MultiReady {
        let mut ready = MultiReady::default();
        for (gid, core) in &mut self.groups {
            let r = core.tick();
            ready.merge(*gid, r); // Batch messages to same peer
        }
        ready
    }

    /// Route message to appropriate group; returns None for an unknown group
    pub fn step(&mut self, gid: GroupId, msg: Message) -> Option<Ready> {
        self.groups.get_mut(&gid).map(|core| core.step(msg))
    }
}

/// Aggregated Ready with message batching
#[derive(Default)]
pub struct MultiReady {
    /// Messages batched by destination: peer -> [(group_id, msg)]
    pub messages: HashMap<NodeId, Vec<(GroupId, Message)>>,
    /// Per-group persistence needs
    pub per_group: HashMap<GroupId, Ready>,
}
```
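**Sketch (MultiReady::merge):**
`MultiRaft::tick()` calls `ready.merge(*gid, r)`, but the merge itself is not spelled out. One possible implementation, assuming messages are moved out of each group's `Ready` into the per-peer batch so they are not sent twice. The types are simplified stand-ins and the `entries_to_persist` payload type is a placeholder.

```rust
use std::collections::HashMap;

type NodeId = u64;
type GroupId = u64;

#[derive(Debug, Clone, PartialEq)]
pub struct Message(pub &'static str); // placeholder payload

#[derive(Debug, Default)]
pub struct Ready {
    pub messages: Vec<(NodeId, Message)>,
    pub entries_to_persist: Vec<Vec<u8>>, // stand-in for Vec<LogEntry>
}

#[derive(Debug, Default)]
pub struct MultiReady {
    /// peer -> [(group_id, msg)], so one network send can carry many groups
    pub messages: HashMap<NodeId, Vec<(GroupId, Message)>>,
    /// What remains per group after messages are moved out (persistence work)
    pub per_group: HashMap<GroupId, Ready>,
}

impl MultiReady {
    /// Fold one group's Ready into the aggregate.
    pub fn merge(&mut self, gid: GroupId, mut ready: Ready) {
        // Move messages into the per-peer batch, tagging each with its group.
        for (peer, msg) in ready.messages.drain(..) {
            self.messages.entry(peer).or_default().push((gid, msg));
        }
        // Keep the rest (entries, hard state, ...) keyed by group.
        self.per_group.insert(gid, ready);
    }
}
```

With this shape, the network layer iterates `messages` once per peer and the storage layer iterates `per_group` once per group.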
- step: S3
name: Architecture Decision
done: Select OpenRaft-style architecture
status: complete
owner: peerA
priority: P1
notes: |
**DECISION: Option E - OpenRaft-Style from Scratch**
**Rationale:**
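1. The T041 implementation works, but its integrated I/O makes it a poor fit for Multi-Raft
2. An OpenRaft-style tick-driven design supports both Single and Multi naturally
3. Getting the abstraction right from the start makes later extension easier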
**Architecture:**
```
┌─────────────────────────────────────────────────────┐
│                      raft-core                      │
│       (Pure Raft logic, no I/O, tick-driven)        │
│                                                     │
│   RaftCore::tick() → Ready                          │
│   RaftCore::step(msg) → Ready                       │
│   RaftCore::propose(data) → Ready                   │
└─────────────────────────────────────────────────────┘
             ┌─────────────┴─────────────┐
             ▼                           ▼
      ┌─────────────┐           ┌─────────────────┐
      │  chainfire  │           │     flaredb     │
      │  (single)   │           │     (multi)     │
      │             │           │                 │
      │ ┌─────────┐ │           │ ┌─────────────┐ │
      │ │RaftNode │ │           │ │ MultiRaft   │ │
      │ │(wrapper)│ │           │ │ Coordinator │ │
      │ └─────────┘ │           │ └─────────────┘ │
      │      │      │           │       │         │
      │ ┌────┴────┐ │           │ ┌─────┴───────┐ │
      │ │RaftCore │ │           │ │RaftCore x N │ │
      │ └─────────┘ │           │ └─────────────┘ │
      └─────────────┘           └─────────────────┘
```
**vs T041 (current):**
| Aspect | T041 | T046 (new) |
|--------|------|------------|
| I/O | Integrated | Separated (Ready) |
| Timer | Internal (tokio) | External (tick) |
| Multi-Raft | Needs wrapper | Native support |
| Testability | Requires async | Pure sync tests |
- step: S4
name: Implementation Plan
done: Define implementation phases
status: complete
owner: peerA
priority: P1
notes: |
**Phase 1: Core Refactor (1 week)**
- [ ] Extract pure Raft logic from T041 core.rs
- [ ] Implement Ready pattern (no direct I/O)
- [ ] Add Storage trait abstraction
- [ ] tick() / step() / propose() API
**Phase 2: Single-Raft Wrapper (3 days)**
- [ ] ChainFire RaftNode wrapper
- [ ] Async I/O integration (tokio)
- [ ] Timer management (election/heartbeat)
- [ ] Migrate ChainFire to new core
**Phase 3: Multi-Raft Coordinator (1 week)**
- [ ] MultiRaft struct with group management
- [ ] Message batching (MultiReady)
- [ ] Shared tick loop
- [ ] FlareDB integration
**Phase 4: Advanced (deferred)**
- [ ] Shard split/merge
- [ ] Cross-shard transactions
- [ ] Snapshot coordination
**Estimated Total:** 2.5 weeks for Phase 1-3
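**Sketch (timer management / shared tick loop):**
Phases 2 and 3 both need something that turns wall-clock time into `tick()` calls. A minimal helper, assuming a fixed tick interval; `TickClock` is a hypothetical name. The caller feeds it elapsed time (from a tokio interval, a thread sleep, or a test) and then calls `tick()` that many times on a `RaftCore` or `MultiRaft`.

```rust
use std::time::Duration;

/// Hypothetical helper that converts elapsed wall-clock time into whole ticks.
/// It carries over the remainder so slow or jittery loops do not lose time.
pub struct TickClock {
    tick_interval: Duration,
    carry: Duration,
}

impl TickClock {
    pub fn new(tick_interval: Duration) -> Self {
        assert!(!tick_interval.is_zero(), "tick interval must be non-zero");
        Self { tick_interval, carry: Duration::ZERO }
    }

    /// Feed the time elapsed since the last call; returns how many ticks are due.
    pub fn advance(&mut self, elapsed: Duration) -> u32 {
        self.carry += elapsed;
        let mut ticks = 0;
        while self.carry >= self.tick_interval {
            self.carry -= self.tick_interval;
            ticks += 1;
        }
        ticks
    }
}
```

Keeping this outside the core means ChainFire's single-group wrapper and FlareDB's shared loop can use the same clock, and tests can skip it entirely.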
- step: S5
name: T041 Integration Strategy
done: Plan migration from T041 to new core
status: complete
owner: peerA
priority: P1
notes: |
**Migration Strategy:**
1. **Complete T041 P3** (current)
- Finish integration tests
- Validate current impl works
2. **Extract & Refactor** (T046.P1)
- Copy T041 core.rs → raft-core/
- Remove async/I/O, add Ready pattern
- Keep original T041 as reference
3. **Parallel Operation** (T046.P2)
- Feature flag: `openraft-style` vs `legacy`
- Validate new impl matches old behavior
4. **Cutover** (T046.P3)
- Switch ChainFire to new core
- Remove legacy code
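**Sketch (validating new vs legacy behavior):**
One way the parallel-operation check in step 3 could be approached is a differential test: feed the same input sequence to both implementations and report the first step where their outputs differ. `first_divergence` is a hypothetical helper. The two closures stand in for the legacy and new cores, however they end up being wrapped.

```rust
/// Feed the same inputs to two implementations and return the index of the
/// first input for which their outputs differ, or None if they always agree.
/// Both closures are FnMut so each side can keep its own internal state.
pub fn first_divergence<I, O: PartialEq>(
    inputs: &[I],
    mut legacy: impl FnMut(&I) -> O,
    mut candidate: impl FnMut(&I) -> O,
) -> Option<usize> {
    inputs
        .iter()
        .position(|input| legacy(input) != candidate(input))
}
```

In practice `I` might be an enum of ticks, incoming messages, and proposals, and `O` a normalized view of the resulting actions. Both choices are left open here.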
**Code Reuse from T041:**
- Election logic: ~200 LOC (RequestVote handling)
- Log replication: ~250 LOC (AppendEntries)
- Commit logic: ~150 LOC (advance_commit_index)
- Total reusable: ~600 LOC (refactor, not rewrite)
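**Sketch (commit logic as a pure function):**
To illustrate what "refactor, not rewrite" could mean for the commit logic, here is the standard Raft leader commit rule written without any I/O. The signature is an assumption and is not taken from T041's `advance_commit_index`. `match_index` is assumed to include the leader's own last index, and `term_at` mirrors `Storage::term_at`.

```rust
/// Return the new commit index for a leader.
/// An index is committed once a majority of voters have replicated it
/// and the entry at that index belongs to the leader's current term.
pub fn advance_commit_index(
    match_index: &[u64],
    commit_index: u64,
    current_term: u64,
    term_at: impl Fn(u64) -> Option<u64>,
) -> u64 {
    if match_index.is_empty() {
        return commit_index;
    }
    // Sort descending; the quorum-th largest value is the highest index
    // that a majority of voters have replicated.
    let mut sorted = match_index.to_vec();
    sorted.sort_unstable_by(|a, b| b.cmp(a));
    let quorum = sorted.len() / 2 + 1;
    let candidate = sorted[quorum - 1];
    // Only entries from the current term may be committed by counting replicas.
    if candidate > commit_index && term_at(candidate) == Some(current_term) {
        candidate
    } else {
        commit_index
    }
}
```

Since log terms never decrease with index, checking only the majority-replicated candidate is sufficient: if its term is older than the current term, every lower index is too.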
evidence:
- type: design
date: 2025-12-11
finding: "Initial hybrid approach (Option D) proposed"
- type: decision
date: 2025-12-11
finding: "User requested OpenRaft-style design; updated to Option E (tick-driven, Multi-Raft native)"
- type: architecture
date: 2025-12-11
finding: "Ready pattern + Storage trait + tick-driven API for unified Single/Multi Raft support"
notes: |
**Key Insight:**
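An OpenRaft-style tick-driven design:
- Makes the pure Raft logic testable (no async, no I/O)
- Makes Multi-Raft message batching a natural fit
- Lets ChainFire and FlareDB share the same core
**Relationship to T041:**
- T041: Current custom Raft implementation (for validating behavior)
- T046: Production refactor (OpenRaft-style)
- The T046 refactor starts after T041 is complete
**References:**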
- OpenRaft: https://github.com/databendlabs/openraft
- TiKV raft-rs: https://github.com/tikv/raft-rs (its `RawNode` exposes a tick/step/propose/Ready/advance API similar to the S2 design)