feat: replica-aware WAL retention (CP13-6)
Flusher now holds WAL entries needed by recoverable replicas. Both AdvanceTail (physical space) and checkpointLSN (scan gate) are gated by the minimum flushed LSN across catch-up-eligible replicas.

New methods on ShipperGroup:
- MinRecoverableFlushedLSN() (uint64, bool): pure read, returns min flushed LSN across InSync/Degraded/Disconnected/CatchingUp replicas with known progress. Excludes NeedsRebuild.
- EvaluateRetentionBudgets(timeout): separate mutation step, escalates replicas that exceed walRetentionTimeout (5m default) to NeedsRebuild, releasing their WAL hold.

Flusher integration: evaluates budgets, then queries the floor on each flush cycle. If floor < maxLSN, holds both checkpoint and tail. Extent writes proceed normally (reads work); only WAL reclaim is deferred.

LastContactTime on WALShipper: updated on barrier success, handshake success, and catch-up completion. Not updated on Ship (TCP write only). This avoids misclassifying idle-but-healthy replicas.

CP13-6 ships with the timeout budget only. walRetentionMaxBytes is deferred (documented as partial slice).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

feature/sw-block
4 changed files with 136 additions and 15 deletions
weed/storage/blockvol/blockvol.go       | 28
weed/storage/blockvol/flusher.go        | 54
weed/storage/blockvol/shipper_group.go  | 53
weed/storage/blockvol/wal_shipper.go    | 16