photoncloud-monorepo/docs/por/T041-chainfire-cluster-join-fix/openraft-issue.md

OpenRaft GitHub Issue - To Be Filed

Repository: https://github.com/databendlabs/openraft/issues/new


Bug: Assertion failure `upto >= log_id_range.prev` during learner replication

Version

  • openraft: 0.9.21
  • Rust: 1.91.1
  • OS: Linux

Description

When adding a learner to a single-node Raft cluster and attempting to replicate logs, OpenRaft panics with an assertion failure in debug builds. In release builds, the assertion is skipped but the replication hangs indefinitely.

Assertion Location

openraft-0.9.21/src/progress/inflight/mod.rs:178
assertion failed: upto >= log_id_range.prev

Reproduction Steps

  1. Bootstrap a single-node cluster (node 1)
  2. Start a second node configured as a learner (not bootstrapped)
  3. Call add_learner(node_id=2, node=BasicNode::default(), blocking=true) from the leader
  4. The add_learner succeeds
  5. During subsequent replication/heartbeat to the learner, panic occurs

Minimal Reproduction Code

// Assumes a tokio runtime, the `maplit` crate for btreemap!, and the
// surrounding openraft setup (config, network, log_store, sm) elided here.
use std::time::Duration;
use maplit::btreemap;
use openraft::BasicNode;
use tokio::time::sleep;

// Leader node (bootstrapped)
let raft = Raft::new(1, config, network, log_store, sm).await?;
raft.initialize(btreemap! {1 => BasicNode::default()}).await?;

// Wait for leader election
sleep(Duration::from_secs(2)).await;

// Add learner (second node is running but not bootstrapped)
raft.add_learner(2, BasicNode::default(), true).await?; // Succeeds

// Panic occurs here during replication to the learner,
// either during add_learner's blocking wait or a subsequent heartbeat.

Expected Behavior

The learner should receive AppendEntries from the leader and catch up with the log without assertion failures.

Actual Behavior

  • Debug build: Panic with assertion failed: upto >= log_id_range.prev
  • Release build: No panic, but replication hangs indefinitely (the skipped debug assertion leaves the progress state inconsistent, so the learner never catches up)

Feature Flags Tested

  • loosen-follower-log-revert - No effect on this assertion

Analysis

The assertion debug_assert!(upto >= log_id_range.prev) in the ack method validates that acknowledgments are monotonically increasing within the replication window.

The failure suggests that when a new learner is added, the progress tracking state may not be properly initialized, causing the first acknowledgment to violate this invariant.

This appears related to (but different from) the fix in #584/#585, which addressed value > prev in progress/mod.rs. This assertion is in progress/inflight/mod.rs.

Environment

[dependencies]
openraft = { version = "0.9", features = ["serde", "storage-v2", "loosen-follower-log-revert"] }

Additional Context

  • Single-node to multi-node cluster expansion via dynamic membership
  • Learner node has empty log state (never bootstrapped)
  • Leader is already initialized with some log entries