# Research: Raft Core Replication (002-raft-features)

## Decisions

- **Raft library**: Use `raft` (tikv/raft-rs 0.7, prost-codec).
  - *Rationale*: Battle-tested implementation, already wired into the repo; exposes the storage and transport APIs this phase needs.
  - *Alternatives considered*: `openraft` (heavier refactor), custom Raft implementation (too risky and time-consuming).
- **Log/state persistence**: Persist log entries, hard state, and conf state in RocksDB column families (`raft_log`, `raft_state`).
  - *Rationale*: RocksDB is already provisioned and in use; column families match the separation of concerns; durable restart semantics.
  - *Alternatives considered*: in-memory storage (unsafe for recovery), separate files (adds a new IO path with no benefit).
- **Cluster scope**: Fixed 3-node membership for this phase; no dynamic add/remove.
  - *Rationale*: Matches the spec clarification; narrows scope to core replication and recovery; smaller correctness surface.
  - *Alternatives considered*: joint consensus / dynamic membership (out of scope for now).
- **Transport**: Continue using tonic/prost gRPC messages for Raft network exchange.
  - *Rationale*: Existing RaftService in the repo; shared proto tooling; avoids a new protocol surface.
  - *Alternatives considered*: custom TCP/UDP transport (unnecessary for current goals).
- **Testing approach**: Unit tests for storage/persistence; single-node campaign/propose tests; a multi-node integration harness to validate majority commit and follower catch-up.
  - *Rationale*: Aligns with the constitution's Test-First principle; exercises durability and replication behavior.
  - *Alternatives considered*: manual ad-hoc testing (insufficient coverage).
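To make the persistence decision concrete: one common layout for a `raft_log` column family is to key each entry by its log index encoded as a big-endian `u64`, so RocksDB's default lexicographic byte ordering coincides with numeric index order and range scans or truncation stay cheap. The sketch below illustrates that property using a `BTreeMap` as a stand-in for a column family; the key scheme and names are assumptions for illustration, not the repo's actual code:

```rust
use std::collections::BTreeMap;

/// Encode a Raft log index as a big-endian key so byte-wise
/// (lexicographic) ordering equals numeric ordering.
/// (Assumed scheme; the real `raft_log` CF layout may differ.)
fn log_key(index: u64) -> [u8; 8] {
    index.to_be_bytes()
}

fn main() {
    // BTreeMap orders keys byte-wise, like a RocksDB column family.
    let mut raft_log: BTreeMap<[u8; 8], Vec<u8>> = BTreeMap::new();
    for idx in [1u64, 10, 2, 257] {
        raft_log.insert(log_key(idx), format!("entry-{idx}").into_bytes());
    }
    // A byte-ordered scan visits entries in log-index order.
    let order: Vec<u64> = raft_log.keys().map(|k| u64::from_be_bytes(*k)).collect();
    assert_eq!(order, vec![1, 2, 10, 257]);
    println!("{order:?}");
}
```

With little-endian keys the scan order would instead interleave indices (e.g. 257 would sort before 2), which is why big-endian encoding is the usual choice for index-keyed column families.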
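The fixed 3-node scope also pins down the commit arithmetic the integration harness should validate: an entry commits once a strict majority of voters has acknowledged it, so with three nodes two acks suffice and one node may lag or fail. A minimal sketch of that quorum rule (the `quorum` helper is illustrative, not an API from raft-rs):

```rust
/// Quorum size for a cluster of `n` voters: a strict majority.
fn quorum(n: usize) -> usize {
    n / 2 + 1
}

fn main() {
    // Fixed 3-node membership: 2 acks commit an entry,
    // and the cluster tolerates one failed or lagging node.
    assert_eq!(quorum(3), 2);

    // A slow follower can catch up later; commit needs only a majority.
    let acks = 2;
    assert!(acks >= quorum(3));
    println!("quorum(3) = {}", quorum(3));
}
```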