# Feature Specification: Raft Core Replication
**Feature Branch**: `002-raft-features`
**Created**: 2025-12-01
**Status**: Draft
**Input**: User description (translated from Japanese): "Please work on the Raft-related features."
## Clarifications
### Session 2025-12-01
- Q: Should this phase assume fixed 3-node membership or include dynamic membership? → A: Fixed 3-node, extensible for future scaling.
## User Scenarios & Testing *(mandatory)*
### User Story 1 - Single-Node Raft Baseline (Priority: P1)
As a platform engineer, I want a single-node Raft instance to accept proposals, elect a leader, and persist committed entries so I can validate the log/storage plumbing before scaling out.
**Why this priority**: Establishes correctness of log append/apply and persistence; blocks multi-node rollout.
**Independent Test**: Start one node, trigger self-election, propose an entry, verify it is committed and applied to storage with the expected data.
**Acceptance Scenarios**:
1. **Given** a single node started fresh, **When** it campaigns, **Then** it becomes leader and can accept proposals.
2. **Given** a proposed entry "e1", **When** it commits, **Then** storage contains "e1" and last index increments by 1.
---
### User Story 2 - Multi-Node Replication (Priority: P1)
As a platform engineer, I want a 3-node Raft cluster to replicate entries to a majority so that writes remain durable under follower failure.
**Why this priority**: Majority replication is the core availability guarantee of Raft.
**Independent Test**: Start 3 nodes, elect a leader, propose an entry; verify leader and at least one follower store the entry at the same index/term and report commit.
**Acceptance Scenarios**:
1. **Given** a 3-node cluster, **When** a leader is elected, **Then** at least two nodes acknowledge commit for the same index/term.
2. **Given** a committed entry on the leader, **When** one follower is stopped, **Then** the other follower still receives and persists the entry.
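The majority-commit rule behind both scenarios can be sketched as a pure function over the leader's per-peer match indexes. The function name and the node IDs below are assumptions for illustration; the quorum arithmetic itself is standard Raft.

```python
# Hedged sketch of the commit rule for a fixed 3-node cluster: an
# index is committed once a majority of match indexes (the leader's
# own included) have reached it.

def majority_commit(match_index: dict) -> int:
    # Rank matched indexes descending; the entry replicated on a
    # majority is the one at position quorum - 1.
    quorum = len(match_index) // 2 + 1
    ranked = sorted(match_index.values(), reverse=True)
    return ranked[quorum - 1]

# Scenario 1: leader and one follower at index 5, one lagging at 3.
print(majority_commit({"n1": 5, "n2": 5, "n3": 3}))  # 5

# Scenario 2: one follower stopped (match 0); the remaining pair
# still forms a majority, so commit advances to 6.
print(majority_commit({"n1": 6, "n2": 6, "n3": 0}))  # 6
```

Note the converse also falls out: an entry present only on the leader (e.g. matches 4, 1, 1) yields commit index 1, which is the safety property FR-005 relies on.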
---
### User Story 3 - Failure and Recovery (Priority: P2)
As an operator, I want a stopped follower to recover and catch up without losing committed data so that the cluster can heal after restarts.
**Why this priority**: Ensures durability across restarts and supports rolling maintenance.
**Independent Test**: Commit an entry, stop a follower, commit another entry, restart the follower; verify it restores state and applies all committed entries.
**Acceptance Scenarios**:
1. **Given** a follower stopped after entry N is committed, **When** the cluster commits entry N+1 while it is down, **Then** on restart the follower installs both entries in order.
2. **Given** divergent logs on restart, **When** the leader sends AppendEntries, **Then** the follower truncates and aligns to the leader's log while preserving the committed suffix.
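The catch-up behavior in both scenarios is the standard AppendEntries consistency check. The following is a minimal sketch under assumed naming (the function and its parameters are not the project's API): reject when the log does not match at `prev_index`/`prev_term`, otherwise truncate any conflicting suffix and append the leader's entries.

```python
# Hypothetical sketch of follower-side AppendEntries handling for
# User Story 3: consistency check, conflict truncation, append, and
# commit-index advance bounded by the local log length.

def handle_append_entries(follower_log, prev_index, prev_term,
                          entries, leader_commit):
    # follower_log: list of (term, command); index i holds log index i+1.
    # Consistency check: the follower must hold an entry at prev_index
    # with prev_term, or it rejects so the leader can back up.
    if prev_index > 0:
        if (len(follower_log) < prev_index
                or follower_log[prev_index - 1][0] != prev_term):
            return False, follower_log, 0
    # Truncate any suffix that conflicts with the leader, then append.
    log = follower_log[:prev_index] + list(entries)
    commit = min(leader_commit, len(log))
    return True, log, commit

# Scenario 1: follower stopped after entry 1; leader ships entry 2.
ok1, log1, c1 = handle_append_entries(
    [(1, "e1")], prev_index=1, prev_term=1,
    entries=[(1, "e2")], leader_commit=2)

# Scenario 2: follower restarts with a divergent uncommitted tail;
# it truncates to the matching prefix and takes the leader's entry.
ok2, log2, c2 = handle_append_entries(
    [(1, "e1"), (2, "stale")], prev_index=1, prev_term=1,
    entries=[(3, "e2")], leader_commit=2)
```

In scenario 1 the follower ends with both entries committed in order; in scenario 2 the stale tail is replaced while the committed prefix `e1` is untouched.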
---
### Edge Cases
- Leader crash immediately after commit but before followers apply.
- Network partition isolating a minority vs. majority; minority must not commit new entries.
- Log holes or conflicting terms on recovery must be reconciled to the leader's log.
## Requirements *(mandatory)*
### Functional Requirements
- **FR-001**: The system MUST support single-node leader election and proposal handling without external coordination.
- **FR-002**: The system MUST replicate log entries to a majority in a 3-node cluster before marking them committed.
- **FR-003**: The system MUST persist log entries, hard state (term, vote), and conf state to durable storage so that restarts preserve committed progress.
- **FR-004**: The system MUST apply committed entries to the underlying storage engine in log order without gaps.
- **FR-005**: The system MUST prevent a node in a minority partition from committing new entries while isolated.
- **FR-006**: On restart, a node MUST reconcile its log with the leader (truncate/append) to match the committed log and reapply missing committed entries.
- **FR-007**: For this phase, operate a fixed 3-node membership (no dynamic add/remove), but architecture must allow future extension to scale out safely.
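FR-005 reduces to quorum arithmetic during elections as well as commits: a node that can reach only a minority of voters can never gather the votes (or acks) needed to make progress. A small illustrative check, with an assumed function name:

```python
# Sketch of FR-005: in a partitioned 3-node cluster, the side that
# can reach only a minority of voters cannot win an election, and
# therefore cannot commit new entries while isolated.

def election_result(reachable_voters: int, cluster_size: int) -> str:
    quorum = cluster_size // 2 + 1
    return "leader" if reachable_voters >= quorum else "candidate"

print(election_result(1, 3))  # candidate (isolated minority side)
print(election_result(2, 3))  # leader    (majority side)
```

The same `quorum = n // 2 + 1` threshold governs FR-002's commit rule, so a single constant decides both safety properties.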
### Key Entities
- **Peer**: A Raft node with ID, region scope, in-memory state machine, and access to durable Raft storage.
- **Raft Log Entry**: Indexed record containing term and opaque command bytes; persisted and replicated.
- **Hard State**: Term, vote, commit index persisted to ensure safety across restarts.
- **Conf State**: Voter set defining the quorum for replication.
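The entities above can be pictured as plain records. This is a hedged sketch only: the field names mirror common Raft terminology (as used by libraries such as raft-rs) and are assumptions, not the project's actual schema.

```python
# Illustrative data shapes for the key entities; names and fields are
# assumptions based on standard Raft vocabulary, not the real types.
from dataclasses import dataclass, field

@dataclass
class Entry:
    index: int       # 1-based position in the replicated log
    term: int        # leader term when the entry was appended
    data: bytes      # opaque command bytes

@dataclass
class HardState:
    term: int = 0    # highest term seen; persisted before replying
    vote: int = 0    # voter granted our vote in `term` (0 = none)
    commit: int = 0  # highest committed index

@dataclass
class ConfState:
    voters: list = field(default_factory=list)  # quorum = majority of voters

@dataclass
class Peer:
    id: int
    region: str                                  # region scope of this peer
    hard_state: HardState = field(default_factory=HardState)
    conf_state: ConfState = field(default_factory=ConfState)
    log: list = field(default_factory=list)      # list[Entry], durable

p = Peer(id=1, region="r1", conf_state=ConfState(voters=[1, 2, 3]))
```

Keeping `HardState` durable before any message is sent is what lets FR-003 and FR-006 hold across restarts.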
## Success Criteria *(mandatory)*
### Measurable Outcomes
- **SC-001**: Single-node bootstraps and accepts a proposal within 2 seconds, committing it and persisting the entry.
- **SC-002**: In a 3-node cluster, a committed entry is present on at least two nodes within 3 seconds of proposal.
- **SC-003**: After a follower restart, all previously committed entries are restored and applied in order within 5 seconds of rejoining a healthy leader.
- **SC-004**: During a minority partition, isolated nodes do not advance commit index or apply uncommitted entries.