# Feature Specification: Raft Core Replication

**Feature Branch**: `002-raft-features`
**Created**: 2025-12-01
**Status**: Draft
**Input**: User description: "Please cover the Raft-related features."

## Clarifications

### Session 2025-12-01

- Q: Should this phase assume fixed 3-node membership or include dynamic membership? → A: Fixed 3-node, extensible for future scaling.

## User Scenarios & Testing *(mandatory)*

### User Story 1 - Single-Node Raft Baseline (Priority: P1)

As a platform engineer, I want a single-node Raft instance to accept proposals, elect a leader, and persist committed entries so I can validate the log/storage plumbing before scaling out.

**Why this priority**: Establishes correctness of log append/apply and persistence; blocks multi-node rollout.

**Independent Test**: Start one node, trigger self-election, propose an entry, verify it is committed and applied to storage with the expected data.

**Acceptance Scenarios**:

1. **Given** a single node started fresh, **When** it campaigns, **Then** it becomes leader and can accept proposals.
2. **Given** a proposed entry "e1", **When** it commits, **Then** storage contains "e1" and the last index increments by 1.

---

### User Story 2 - Multi-Node Replication (Priority: P1)

As a platform engineer, I want a 3-node Raft cluster to replicate entries to a majority so that writes remain durable under follower failure.

**Why this priority**: Majority replication is the core availability guarantee of Raft.

**Independent Test**: Start 3 nodes, elect a leader, propose an entry; verify the leader and at least one follower store the entry at the same index/term and report commit.

**Acceptance Scenarios**:

1. **Given** a 3-node cluster, **When** a leader is elected, **Then** at least two nodes acknowledge commit for the same index/term.
2. **Given** a committed entry on the leader, **When** one follower is stopped, **Then** the other follower still receives and persists the entry.
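The majority-commit rule behind User Story 2 can be sketched in a few lines: the leader may mark an index committed once a majority of voters have persisted it, which is the median-high match index across the cluster. This is an illustrative sketch (the names `match_index` and `quorum_commit_index` are not from any specific library), and it omits the standard Raft caveat that a leader only commits entries from its own term.

```python
def quorum_commit_index(match_index: dict[str, int]) -> int:
    """Return the highest log index replicated on a majority of voters.

    match_index maps node ID -> highest index known persisted on that node
    (the leader counts itself).
    """
    indices = sorted(match_index.values(), reverse=True)
    majority = len(indices) // 2  # zero-based position of the quorum entry
    return indices[majority]


# 3-node cluster: leader and one follower at index 5, one follower lagging
# at 3. A majority (2 of 3) holds index 5, so index 5 is committable.
assert quorum_commit_index({"n1": 5, "n2": 5, "n3": 3}) == 5
# If only the leader holds index 5, the commit index stays at 3 (FR-005:
# without a majority, an entry must not be marked committed).
assert quorum_commit_index({"n1": 5, "n2": 3, "n3": 3}) == 3
```

The same computation explains why a minority partition cannot advance commit: an isolated node never observes majority match indices beyond what was already committed.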
---

### User Story 3 - Failure and Recovery (Priority: P2)

As an operator, I want a stopped follower to recover and catch up without losing committed data so that the cluster can heal after restarts.

**Why this priority**: Ensures durability across restarts and supports rolling maintenance.

**Independent Test**: Commit an entry, stop a follower, commit another entry, restart the follower; verify it restores state and applies all committed entries.

**Acceptance Scenarios**:

1. **Given** a follower stopped after entry N is committed, **When** the cluster commits entry N+1 while it is down, **Then** on restart the follower installs both entries in order.
2. **Given** divergent logs on restart, **When** the leader sends AppendEntries, **Then** the follower truncates and aligns its log to the leader's, preserving the committed suffix.

---

### Edge Cases

- Leader crash immediately after commit but before followers apply.
- Network partition isolating a minority vs. a majority; the minority must not commit new entries.
- Log holes or conflicting terms on recovery must be reconciled to the leader's log.

## Requirements *(mandatory)*

### Functional Requirements

- **FR-001**: The system MUST support single-node leader election and proposal handling without external coordination.
- **FR-002**: The system MUST replicate log entries to a majority in a 3-node cluster before marking them committed.
- **FR-003**: The system MUST persist log entries, hard state (term, vote), and conf state to durable storage so that restarts preserve committed progress.
- **FR-004**: The system MUST apply committed entries to the underlying storage engine in log order without gaps.
- **FR-005**: The system MUST prevent a node in a minority partition from committing new entries while isolated.
- **FR-006**: On restart, a node MUST reconcile its log with the leader (truncate/append) to match the committed log and reapply missing committed entries.
- **FR-007**: For this phase, the system MUST operate with a fixed 3-node membership (no dynamic add/remove), but the architecture MUST allow future extension to scale out safely.

### Key Entities

- **Peer**: A Raft node with an ID, region scope, in-memory state machine, and access to durable Raft storage.
- **Raft Log Entry**: An indexed record containing a term and opaque command bytes; persisted and replicated.
- **Hard State**: Term, vote, and commit index, persisted to ensure safety across restarts.
- **Conf State**: The voter set defining the quorum for replication.

## Success Criteria *(mandatory)*

### Measurable Outcomes

- **SC-001**: A single node bootstraps and accepts a proposal within 2 seconds, committing it and persisting the entry.
- **SC-002**: In a 3-node cluster, a committed entry is present on at least two nodes within 3 seconds of the proposal.
- **SC-003**: After a follower restart, all previously committed entries are restored and applied in order within 5 seconds of rejoining a healthy leader.
- **SC-004**: During a minority partition, isolated nodes do not advance their commit index or apply uncommitted entries.