FC1: now asserts HasActiveSession() after address change AND
verifies session_created in log (not just plan_cancelled).
FC4: escalation event detail must exceed 15 chars (it must contain the
proof reason with LSN values, not just "needs_rebuild").
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
P2 tests now force conditions instead of observing them:
FC3: Real WAL scan verified directly — StreamWALEntries transfers
real entries from disk (head=5, transferred=5). Engine planning also
verified (ZeroGap in V1 interim documented).
FC4: ForceFlush advances checkpoint/tail to 20. Replica at 0 is
below tail → NeedsRebuild with proof: "gap_beyond_retention: need
LSN 1 but tail=20". No early return.
FC5: ForceFlush advances checkpoint to 10. Assertive:
- replica at checkpoint=10 → ZeroGap (V1 interim)
- replica at 0 → NeedsRebuild (below tail, not CatchUp)
FC1/FC2: Labeled as integrated engine/storage (control simulated).
New: BlockVol.ForceFlush() — triggers synchronous flusher cycle for
test use. Advances checkpoint + WAL tail deterministically.
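The ForceFlush semantics above can be sketched with a toy flusher
(names and fields are illustrative, not the real BlockVol API): one
synchronous cycle makes everything appended durable, advancing both
the checkpoint LSN and the WAL tail deterministically.

```go
package main

import "fmt"

// Hypothetical mini-flusher: ForceFlush runs one synchronous flusher
// cycle, so tests can advance checkpoint + tail without waiting on
// background flusher timing.
type flusher struct {
	headLSN       uint64 // highest LSN appended to the WAL
	checkpointLSN uint64 // highest LSN flushed to extents
	tailLSN       uint64 // entries at or below tail are reclaimable
}

func (f *flusher) Append(n uint64) { f.headLSN += n }

// ForceFlush: everything appended becomes durable, so checkpoint and
// tail both advance to head in one deterministic step.
func (f *flusher) ForceFlush() {
	f.checkpointLSN = f.headLSN
	f.tailLSN = f.headLSN
}

func main() {
	f := &flusher{}
	f.Append(20)
	f.ForceFlush()
	fmt.Println(f.checkpointLSN, f.tailLSN) // 20 20
}
```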
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
5 failure-class replay tests against real file-backed BlockVol,
exercising the full integrated path:
bridge adapter → v2bridge reader/pinner → engine planner/executor
FC1: Changed-address restart — identity preserved, old plan cancelled,
new session created. Log shows plan_cancelled + session_created.
FC2: Stale epoch after failover — sessions invalidated at old epoch,
new assignment at epoch 2 creates fresh session. Log shows
per-replica invalidation.
FC3: Real catch-up (pre-checkpoint) — engine classifies from real
RetainedHistory, zero-gap in V1 interim (committed=0 before flush).
Documents the V1 limitation explicitly.
FC4: Unrecoverable gap — after flush, if the checkpoint advances, a
replica behind the tail gets NeedsRebuild. Documents that the V1
unit test may not advance the checkpoint (flusher timing).
FC5: Post-checkpoint boundary — replica at checkpoint = zero-gap in
V1 interim. Explicitly documents the catch-up collapse boundary.
go.mod: added replace directives for sw-block engine + bridge modules.
Carry-forward (explicit):
- CommittedLSN = CheckpointLSN (V1 interim)
- FC3/FC4/FC5 limited by flusher not advancing checkpoint in unit tests
- Executor snapshot/full-base/truncate still stubs
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
7 tests in weed/storage/blockvol/v2bridge/bridge_test.go:
Reader (2 tests):
- StatusSnapshot reads real nextLSN, WALCheckpointLSN, flusher state
- HeadLSN advances with real writes
Pinner (2 tests):
- HoldWALRetention: hold tracked, MinWALRetentionFloor reports position,
release clears hold
- HoldRejectsRecycled: validates against real WAL tail
Executor (2 tests):
- StreamWALEntries: real ScanFrom reads WAL entries from disk
- StreamPartialRange: partial range scan works
Stubs (1 test):
- TransferSnapshot/TransferFullBase/TruncateWAL return not-implemented
All tests use createTestVol (1MB file-backed BlockVol with 256KB WAL).
No mock/push adapters — direct real blockvol instances.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Finding 1: WALTailLSN semantic fix
- StatusSnapshot().WALTailLSN now reads super.WALCheckpointLSN (an LSN)
- Was: wal.Tail(), which returns a physical byte offset, not an LSN
- Entries with LSN > WALTailLSN are guaranteed to still be in the WAL
Finding 2: ScanWALEntries replay-source fix
- ScanWALEntries passes super.WALCheckpointLSN as the recycled boundary
- Was: flusher.CheckpointLSN(), which in V1 equals CommittedLSN
- The flusher's live checkpoint may advance in memory, but entries above
the durable superblock checkpoint are still physically in the WAL
- Normal catch-up (replica at 70, committed at 100) now works because
fromLSN=71 > super.WALCheckpointLSN (which is the last persisted
checkpoint, not the live flusher state)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pinner (pinner.go):
- HoldWALRetention: validates startLSN >= current tail, tracks hold
- HoldSnapshot: validates checkpoint exists + trusted
- HoldFullBase: tracks hold by ID
- MinWALRetentionFloor: returns minimum held position across all
WAL/snapshot holds — designed for flusher RetentionFloorFn hookup
- Release functions remove holds from tracking map
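The retention-floor mechanics above can be sketched as follows
(simplified types, not the real v2bridge pinner): each hold pins a
WAL position, and the floor is the minimum across live holds, which
is the value a flusher RetentionFloorFn hookup would consume.

```go
package main

import "fmt"

// Illustrative pinner: holds maps a hold ID to a pinned start LSN.
type pinner struct {
	holds map[string]uint64
}

func (p *pinner) HoldWALRetention(id string, startLSN uint64) {
	p.holds[id] = startLSN
}

func (p *pinner) Release(id string) { delete(p.holds, id) }

// MinWALRetentionFloor returns the lowest pinned LSN and whether any
// hold is live; the flusher must not reclaim WAL entries above it.
func (p *pinner) MinWALRetentionFloor() (uint64, bool) {
	floor, ok := uint64(0), false
	for _, lsn := range p.holds {
		if !ok || lsn < floor {
			floor, ok = lsn, true
		}
	}
	return floor, ok
}

func main() {
	p := &pinner{holds: map[string]uint64{}}
	p.HoldWALRetention("catchup-r1", 70)
	p.HoldWALRetention("snap-r2", 40)
	floor, _ := p.MinWALRetentionFloor()
	fmt.Println(floor) // 40
	p.Release("snap-r2")
	floor, _ = p.MinWALRetentionFloor()
	fmt.Println(floor) // 70
}
```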
Executor (executor.go):
- StreamWALEntries: validates range against real WAL tail/head
(actual ScanFrom integration deferred to network-layer wiring)
- TransferSnapshot/TransferFullBase/TruncateWAL: stubs for P1
Key integration points:
- Pinner reads real StatusSnapshot for validation
- Pinner.MinWALRetentionFloor can wire into flusher.RetentionFloorFn
- Executor validates WAL range availability from real state
Carry-forward:
- Real ScanFrom wiring needs WAL fd + offset (network layer)
- TransferSnapshot/TransferFullBase need extent I/O
- Control intent from confirmed failover (master-side)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CatchUpExecutor.OnStep: optional callback fired between executor-managed
progress steps. Enables deterministic fault injection (epoch bump)
between steps without racing or manual sender calls.
E2_EpochBump_MidExecutorLoop:
- Executor runs 5 progress steps
- OnStep hook bumps epoch after step 1 (after 2 successful steps)
- Executor's own loop detects invalidation at step 2's check
- Resources released by executor's release path (not manual cancel)
- Log shows session_invalidated + exec_resources_released
This closes the remaining FC2 gap: invalidation is now detected
and cleaned up by the executor itself, not by external code.
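The step/hook interplay can be sketched as below (illustrative
types, not the real CatchUpExecutor): the hook fires between steps,
and the executor re-checks validity at the top of each step, so an
epoch bump in the hook is detected by the executor's own loop.

```go
package main

import "fmt"

// Illustrative executor: epoch is the epoch the session was created
// at; currentEpoch is the live value a fault-injection hook may bump.
type executor struct {
	epoch        uint64
	currentEpoch *uint64
	OnStep       func(step int) // fired between progress steps
}

// Execute runs up to n steps; returns how many completed before the
// session was invalidated by an epoch change.
func (e *executor) Execute(n int) int {
	done := 0
	for i := 0; i < n; i++ {
		if *e.currentEpoch != e.epoch {
			// invalidation detected by the executor itself;
			// resource release would happen here
			return done
		}
		done++
		if e.OnStep != nil {
			e.OnStep(i)
		}
	}
	return done
}

func main() {
	cur := uint64(1)
	e := &executor{epoch: 1, currentEpoch: &cur}
	e.OnStep = func(step int) {
		if step == 1 { // bump epoch after two successful steps
			cur = 2
		}
	}
	fmt.Println(e.Execute(5)) // 2
}
```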
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Planner/executor contract:
- RebuildExecutor.Execute() takes no arguments — consumes plan-bound
RebuildSource, RebuildSnapshotLSN, RebuildTargetLSN
- RecoveryPlan binds all rebuild targets at plan time
- Executor cannot re-derive policy from caller-supplied history
Catch-up timing:
- Removed unused completeTick parameter from CatchUpExecutor.Execute
- Per-step ticks synthesized as startTick + stepIndex + 1
- API shape matches implementation
New test: PlanExecuteConsistency_RebuildCannotSwitchSource
- Plans snapshot+tail, then mutates storage history
- Executor succeeds using plan-bound values (not re-derived)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Full-base rebuild resource:
- StorageAdapter.PinFullBase/ReleaseFullBase for full-extent base image
- PlanRebuild full_base branch now acquires FullBasePin
- RecoveryPlan.FullBasePin field, released by ReleasePlan
Session cleanup on resource failure:
- PlanRecovery invalidates session when WAL pin fails
(no dangling live session after failed resource acquisition)
3 new tests:
- PlanRebuild_FullBase_PinsBaseImage: pin acquired + released
- PlanRebuild_FullBase_PinFailure: logged + error
- PlanRecovery_WALPinFailure_CleansUpSession: session invalidated,
sender disconnected (no dangling state)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ProcessAssignment now compares pre/post endpoint state before
logging session_invalidated with "endpoint_changed" reason.
Normal session supersede (same endpoint, assignment_intent) no
longer mislabeled as endpoint change.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Zero-gap completion:
- ExecuteRecovery auto-completes zero-gap sessions (no sender call needed)
- RecoveryResult.FinalState = StateInSync for zero-gap
Epoch transition:
- UpdateSenderEpoch: orchestrator-owned epoch advancement with auto-log
- InvalidateEpoch: per-replica session_invalidated events (not aggregate)
Endpoint-change invalidation:
- ProcessAssignment detects session ID change from endpoint update
- Logs per-replica session_invalidated with "endpoint_changed" reason
All integration tests now use orchestrator exclusively for core lifecycle.
No direct sender API calls for recovery execution in integration tests.
1 new test: EndpointChange_LogsInvalidation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
RecordHandshakeFromHistory and SelectRebuildFromHistory now
return an error instead of panicking on nil history input.
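A minimal sketch of the guard, with simplified types and a
hypothetical return shape (the real signatures differ): validate the
nil history up front and return an error instead of letting a nil
dereference panic inside the planner.

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified stand-in for the engine's retained-history input.
type RetainedHistory struct{ TailLSN, HeadLSN uint64 }

var errNilHistory = errors.New("recovery: nil retained history")

// RecordHandshakeFromHistory (illustrative shape): nil input is an
// error, not a panic.
func RecordHandshakeFromHistory(h *RetainedHistory) (uint64, error) {
	if h == nil {
		return 0, errNilHistory
	}
	return h.HeadLSN, nil
}

func main() {
	_, err := RecordHandshakeFromHistory(nil)
	fmt.Println(err != nil) // true
}
```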
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New file: history.go — RetainedHistory connects recovery decisions
to actual WAL retention state:
- IsRecoverable: checks gap against tail/head boundaries
- MakeHandshakeResult: generates HandshakeResult from retention state
- RebuildSourceDecision: chooses snapshot+tail vs full base from
checkpoint state (trusted vs untrusted)
- ProveRecoverability: generates explicit proof explaining why
recovery is or is not allowed
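The proof generation can be sketched like this (a simplification of
ProveRecoverability with illustrative fields): a replica at R needs
entries (R, committed], and they are available only if R+1 is still
above the WAL tail.

```go
package main

import "fmt"

// Simplified retained-history view: entries at or below TailLSN may
// have been recycled.
type RetainedHistory struct {
	TailLSN      uint64
	CommittedLSN uint64
}

// ProveRecoverability returns an explicit proof string explaining
// why recovery is or is not allowed for a replica at replicaLSN.
func ProveRecoverability(h RetainedHistory, replicaLSN uint64) string {
	switch {
	case replicaLSN >= h.CommittedLSN:
		return "zero_gap"
	case replicaLSN+1 <= h.TailLSN:
		return fmt.Sprintf("gap_beyond_retention: need LSN %d but tail=%d",
			replicaLSN+1, h.TailLSN)
	default:
		return "within_retention"
	}
}

func main() {
	h := RetainedHistory{TailLSN: 20, CommittedLSN: 20}
	fmt.Println(ProveRecoverability(h, 20)) // zero_gap
	fmt.Println(ProveRecoverability(h, 0))  // gap_beyond_retention: need LSN 1 but tail=20
}
```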
14 new tests (recoverability_test.go):
- Recoverable/unrecoverable gap (exact boundary, beyond head)
- Trusted/untrusted/no checkpoint → rebuild source selection
- Handshake from retained history → outcome classification
- Recoverability proofs (zero-gap, ahead, within retention, beyond)
- E2E: two replicas driven by retained history (catch-up + rebuild)
- Truncation required for replica ahead of committed
Engine module at 44 tests (12 + 18 + 14).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entry counting:
- Session.setRange now initializes recoveredTo = startLSN
- RecordCatchUpProgress delta counts only actual catch-up work
(recoveredTo - startLSN), not the replica's pre-existing prefix
Rebuild transfer gate:
- BeginTailReplay requires TransferredTo >= SnapshotLSN
- Prevents tail replay on incomplete base transfer
3 new regression tests:
- BudgetEntries_NonZeroStart_CountsOnlyDelta (30 entries within 50 budget)
- BudgetEntries_NonZeroStart_ExceedsBudget (30 entries exceeds 20 budget)
- Rebuild_PartialTransfer_BlocksTailReplay
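The delta-counting fix can be sketched as below (simplified session,
illustrative names): recoveredTo is initialized to startLSN, so the
budget counts only catch-up work actually performed, not the
replica's pre-existing prefix.

```go
package main

import "fmt"

type session struct {
	startLSN    uint64
	recoveredTo uint64
	budget      uint64 // max entries this session may replay
}

// newSession initializes recoveredTo = startLSN so the first delta
// is zero, not the replica's whole prefix.
func newSession(startLSN, budget uint64) *session {
	return &session{startLSN: startLSN, recoveredTo: startLSN, budget: budget}
}

// RecordCatchUpProgress reports whether the delta of actual catch-up
// work (recoveredTo - startLSN) is still within budget.
func (s *session) RecordCatchUpProgress(to uint64) bool {
	s.recoveredTo = to
	return s.recoveredTo-s.startLSN <= s.budget
}

func main() {
	s := newSession(100, 50)                  // replica already at LSN 100
	fmt.Println(s.RecordCatchUpProgress(130)) // true: 30 entries within 50
	s2 := newSession(100, 20)
	fmt.Println(s2.RecordCatchUpProgress(130)) // false: 30 exceeds 20
}
```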
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Registry is now keyed by stable ReplicaID, not by address.
DataAddr changes preserve sender identity — the core V2 invariant.
Changes:
- ReplicaAssignment{ReplicaID, Endpoint} replaces map[string]Endpoint
- AssignmentIntent.Replicas uses []ReplicaAssignment
- Registry.Reconcile takes []ReplicaAssignment
- Tests use stable IDs ("replica-1", "r1") independent of addresses
New test: ChangedDataAddr_PreservesSenderIdentity
- Same ReplicaID, different DataAddr (10.0.0.1 → 10.0.0.2)
- Sender pointer preserved, session invalidated, new session attached
- This is the exact V1/V1.5 regression that V2 must fix
doc.go: clarified Slice 1 core vs carried-forward files
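The identity-preserving reconcile can be sketched as below
(simplified registry, illustrative fields): senders are keyed by the
stable ReplicaID, so a DataAddr change updates the endpoint on the
same sender rather than creating a new one.

```go
package main

import "fmt"

type ReplicaAssignment struct {
	ReplicaID string
	Endpoint  string
}

type sender struct{ endpoint string }

// registry keys senders by stable ReplicaID, not by address.
type registry struct {
	senders map[string]*sender
}

// Reconcile preserves sender identity across endpoint changes.
func (r *registry) Reconcile(assignments []ReplicaAssignment) {
	for _, a := range assignments {
		if s, ok := r.senders[a.ReplicaID]; ok {
			s.endpoint = a.Endpoint // same sender, new address
		} else {
			r.senders[a.ReplicaID] = &sender{endpoint: a.Endpoint}
		}
	}
}

func main() {
	r := &registry{senders: map[string]*sender{}}
	r.Reconcile([]ReplicaAssignment{{"replica-1", "10.0.0.1:9000"}})
	before := r.senders["replica-1"]
	r.Reconcile([]ReplicaAssignment{{"replica-1", "10.0.0.2:9000"}})
	after := r.senders["replica-1"]
	fmt.Println(before == after, after.endpoint) // true 10.0.0.2:9000
}
```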
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All mutable state on Sender and Session is now unexported:
- Sender.state, .epoch, .endpoint, .session, .stopped → accessors
- Session.id, .phase, .kind, etc. → read-only accessors
- Session() replaced by SessionSnapshot() (returns disconnected copy)
- SessionID() and HasActiveSession() for common queries
- AttachSession returns (sessionID, error) not (*Session, error)
- SupersedeSession returns sessionID not *Session
Budget configuration via SessionOption:
- WithBudget(CatchUpBudget) passed to AttachSession
- No direct field mutation on session from external code
New test: Encapsulation_SnapshotIsReadOnly proves snapshot
mutation does not leak back to sender state.
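The disconnected-snapshot idea can be sketched as below (simplified
types): SessionSnapshot returns a value copy, so callers can inspect
session state without being able to mutate the sender's live session.

```go
package main

import "fmt"

type session struct {
	id    string
	phase string
}

// Sender keeps its session unexported; no direct external mutation.
type Sender struct {
	session session
}

// SessionSnapshot returns a copy disconnected from internal state.
func (s *Sender) SessionSnapshot() session { return s.session }

func main() {
	snd := &Sender{session: session{id: "s1", phase: "catchup"}}
	snap := snd.SessionSnapshot()
	snap.phase = "mutated"         // only the copy changes
	fmt.Println(snd.session.phase) // catchup
}
```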
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Frozen target is now unconditional:
- FrozenTargetLSN field on RecoverySession, set by BeginCatchUp
- RecordCatchUpProgress enforces FrozenTargetLSN regardless of Budget
- Catch-up is always a bounded (R, H0] contract
Rebuild completion exclusivity:
- CompleteSessionByID explicitly rejects SessionRebuild by kind
- Rebuild sessions can ONLY complete via CompleteRebuild
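The frozen-target contract can be sketched as follows (simplified
session, illustrative names): BeginCatchUp freezes the target H0, and
progress beyond it is rejected even with no budget configured, so
catch-up is always a bounded (R, H0] range.

```go
package main

import (
	"errors"
	"fmt"
)

type session struct {
	frozenTargetLSN uint64
	recoveredTo     uint64
}

// BeginCatchUp freezes the catch-up target at session start.
func (s *session) BeginCatchUp(target uint64) { s.frozenTargetLSN = target }

// RecordCatchUpProgress enforces the frozen target unconditionally.
func (s *session) RecordCatchUpProgress(to uint64) error {
	if to > s.frozenTargetLSN {
		return errors.New("progress beyond frozen target")
	}
	s.recoveredTo = to
	return nil
}

func main() {
	s := &session{}
	s.BeginCatchUp(100)
	fmt.Println(s.RecordCatchUpProgress(100)) // <nil>
	fmt.Println(s.RecordCatchUpProgress(101)) // progress beyond frozen target
}
```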
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
IsRecoverable now verifies three conditions:
- startExclusive >= tailLSN (not recycled)
- endInclusive <= headLSN (within WAL)
- all LSNs in range exist contiguously (no holes)
StateAt now uses base snapshot captured during AdvanceTail:
- returns nil for LSNs before snapshot boundary (unreconstructable)
- correctly includes block state from recycled entries via snapshot
5 new tests: end-beyond-head, missing entries, state after tail
advance, nil before snapshot, block last written before tail.
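The three-condition check can be sketched as below (simplified
signature with an `exists` probe standing in for the real history
lookup): the range (startExclusive, endInclusive] must sit above the
tail, within the head, and contain no holes.

```go
package main

import "fmt"

// isRecoverable checks the three conditions from the commit:
// start not recycled, end within the WAL, and range contiguity.
func isRecoverable(startExclusive, endInclusive, tailLSN, headLSN uint64,
	exists func(lsn uint64) bool) bool {
	if startExclusive < tailLSN { // entries at/below tail may be recycled
		return false
	}
	if endInclusive > headLSN { // beyond what the WAL holds
		return false
	}
	for lsn := startExclusive + 1; lsn <= endInclusive; lsn++ {
		if !exists(lsn) { // hole in the retained range
			return false
		}
	}
	return true
}

func main() {
	have := map[uint64]bool{6: true, 7: true, 9: true} // LSN 8 missing
	exists := func(lsn uint64) bool { return have[lsn] }
	fmt.Println(isRecoverable(5, 7, 5, 10, exists)) // true
	fmt.Println(isRecoverable(5, 9, 5, 10, exists)) // false: hole at 8
}
```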
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
AllocateBlockVolumeResponse used bs.ListenAddr() to derive replica
addresses. When the VS binds to ":port" (no explicit IP), host
resolved to empty string, producing ":dataPort" as the replica
address. This ":port" propagated through master assignments to both
primary and replica sides.
Now canonicalizes empty/wildcard host using PreferredOutboundIP()
before constructing replication addresses. Also exported
PreferredOutboundIP for use by the server package.
This is the source fix — all downstream paths (heartbeat, API
response, assignment) inherit the canonical address.
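The canonicalization step can be sketched as below (`canonicalize`
and its outbound-IP parameter are illustrative stand-ins for the
PreferredOutboundIP-based fix): an empty or wildcard host in the
listen address is replaced before replica addresses are built.

```go
package main

import (
	"fmt"
	"net"
)

// canonicalize substitutes the host's outbound IP when the listen
// address has an empty or wildcard host, so downstream paths never
// see a ":port" address.
func canonicalize(listenAddr, outboundIP string) string {
	host, port, err := net.SplitHostPort(listenAddr)
	if err != nil {
		return listenAddr
	}
	if host == "" || host == "0.0.0.0" || host == "::" {
		host = outboundIP
	}
	return net.JoinHostPort(host, port)
}

func main() {
	fmt.Println(canonicalize(":8080", "10.0.0.5"))            // 10.0.0.5:8080
	fmt.Println(canonicalize("0.0.0.0:8080", "10.0.0.5"))     // 10.0.0.5:8080
	fmt.Println(canonicalize("192.168.1.2:8080", "10.0.0.5")) // 192.168.1.2:8080
}
```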
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
setupReplicaReceiver now reads back canonical addresses from
the ReplicaReceiver (which applies CP13-2 canonicalization)
instead of storing raw assignment addresses in replStates.
This fixes the API-level leak where replica_data_addr showed
":port" instead of "ip:port" in /block/volumes responses,
even though the engine-level CP13-2 fix was working.
New BlockVol.ReplicaReceiverAddr() returns canonical addresses
from the running receiver. Falls back to assignment addresses
if receiver didn't report.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
rebuildFullExtent updated superblock.WALCheckpointLSN but not the
flusher's internal checkpointLSN. NewReplicaReceiver then read
stale 0 from flusher.CheckpointLSN(), causing post-rebuild
flushedLSN to be wrong.
Added Flusher.SetCheckpointLSN() and call it after rebuild
superblock persist. TestRebuild_PostRebuild_FlushedLSN_IsCheckpoint
flips FAIL→PASS.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The test used createSyncAllPair(t) but discarded the replica
return value, leaving the volume file open. On Windows this
caused TempDir cleanup failure. All 7 CP13-1 baseline FAILs
now PASS.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds per-replica state reporting in heartbeat so master can identify
which specific replica needs rebuild, not just a volume-level boolean.
New ReplicaShipperStatus{DataAddr, State, FlushedLSN} type reported
via ReplicaShipperStates field on BlockVolumeInfoMessage. Populated
from ShipperGroup.ShipperStates() on each heartbeat. Scales to RF=3+.
V1 constraints (explicit):
- NeedsRebuild cleared only by control-plane reassignment (no local exit)
- Post-rebuild replica re-enters as Disconnected/bootstrap, not InSync
- flushedLSN = checkpointLSN after rebuild (durable baseline only)
4 new tests: heartbeat per-replica state, NeedsRebuild reporting,
rebuild-complete-reenters-InSync (full cycle), epoch mismatch abort.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Flusher now holds WAL entries needed by recoverable replicas.
Both AdvanceTail (physical space) and checkpointLSN (scan gate)
are gated by the minimum flushed LSN across catch-up-eligible
replicas.
New methods on ShipperGroup:
- MinRecoverableFlushedLSN() (uint64, bool): pure read, returns
min flushed LSN across InSync/Degraded/Disconnected/CatchingUp
replicas with known progress. Excludes NeedsRebuild.
- EvaluateRetentionBudgets(timeout): separate mutation step,
escalates replicas that exceed walRetentionTimeout (5m default)
to NeedsRebuild, releasing their WAL hold.
Flusher integration: evaluates budgets then queries floor on each
flush cycle. If floor < maxLSN, holds both checkpoint and tail.
Extent writes proceed normally (reads work), only WAL reclaim
is deferred.
LastContactTime on WALShipper: updated on barrier success, handshake
success, and catch-up completion. Not updated on Ship, which is only
an unacknowledged TCP write. This avoids misclassifying
idle-but-healthy replicas as stale.
CP13-6 ships with timeout budget only. walRetentionMaxBytes
is deferred (documented as partial slice).
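The floor query can be sketched as below (simplified states and a
free function instead of the ShipperGroup method): the floor is the
minimum flushed LSN across catch-up-eligible replicas with known
progress, and NeedsRebuild replicas are excluded so they cannot pin
the WAL indefinitely.

```go
package main

import "fmt"

type replicaState int

const (
	inSync replicaState = iota
	catchingUp
	disconnected
	needsRebuild
)

type shipper struct {
	state      replicaState
	flushedLSN uint64
	known      bool // is flushed progress known?
}

// minRecoverableFlushedLSN is a pure read: min flushed LSN across
// eligible replicas; second return is false when no replica counts.
func minRecoverableFlushedLSN(group []shipper) (uint64, bool) {
	floor, ok := uint64(0), false
	for _, s := range group {
		if s.state == needsRebuild || !s.known {
			continue // rebuild candidates do not hold the WAL
		}
		if !ok || s.flushedLSN < floor {
			floor, ok = s.flushedLSN, true
		}
	}
	return floor, ok
}

func main() {
	group := []shipper{
		{state: inSync, flushedLSN: 90, known: true},
		{state: catchingUp, flushedLSN: 40, known: true},
		{state: needsRebuild, flushedLSN: 5, known: true}, // excluded
	}
	floor, _ := minRecoverableFlushedLSN(group)
	fmt.Println(floor) // 40
}
```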
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Concurrent WriteLBA/Trim calls could deliver WAL entries to replicas
out of LSN order: two goroutines allocate LSN 4 and 5 concurrently,
but LSN 5 could reach the replica first via ShipAll, causing the
replica to reject it as an LSN gap.
shipMu now wraps nextLSN.Add + wal.Append + ShipAll in both
WriteLBA and Trim, guaranteeing LSN-ordered delivery to replicas
under concurrent writers.
The dirty map update and WAL pressure check happen after shipMu
is released — they don't need ordering guarantees.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
doReconnectAndCatchUp() now uses the replicaFlushedLSN returned by
the reconnect handshake as the catch-up start point, not the
shipper's stale cached value. The replica may have less durable
progress than the shipper last knew.
ReplicaReceiver initialization: flushedLSN now set from the
volume's checkpoint LSN (durable by definition), not nextLSN
(which includes unflushed entries). receivedLSN still uses
nextLSN-1 since those entries are in the WAL buffer even if
not yet synced.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>