Tree:
512bb5bcf6
add-ec-vacuum
add-filer-iam-grpc
add-iam-grpc-management
add_fasthttp_client
add_remote_storage
adding-message-queue-integration-tests
adjust-fsck-cutoff-default
admin/csrf-s3tables
allow-no-role-arn
also-delete-parent-directory-if-empty
avoid_releasing_temp_file_on_write
cautious-dinosaur
changing-to-zap
coderabbitai/autofix/fafd849
codex-rust-volume-server-bootstrap
codex/8712-directory-marker-content-type
codex/admin-oidc-auth-ui
codex/cache-iam-policy-engines
codex/ec-repair-worker
codex/erasure-coding-shard-distribution
codex/list-object-versions-newest-first
codex/s3tables-maint-lifecycle-parity
codex/s3tables-maint-planner-multispec
codex/s3tables-maintenance-designs
collect-public-metrics
copilot/fix-helm-chart-installation
copilot/fix-s3-object-tagging-issue
copilot/make-renew-interval-configurable
copilot/make-renew-interval-configurable-again
copilot/sub-pr-7677
create-table-snapshot-api-design
data_query_pushdown
dependabot/maven/other/java/client/com.google.protobuf-protobuf-java-3.25.5
dependabot/maven/other/java/examples/org.apache.hadoop-hadoop-common-3.4.0
detect-and-plan-ec-tasks
do-not-retry-if-error-is-NotFound
ec-disk-type-support
enhance-erasure-coding
expand-the-s3-PutObject-permission-to-the-multipart-permissions
fasthttp
feat/mount-showSystemEntries
feature-8113-storage-class-disk-routing
feature/mini-port-detection
feature/modernize-s3-tests
feature/s3-multi-cert-support
feature/s3tables-improvements-and-spark-tests
feature/sra-uds-handler
feature/sw-block
filer1_maintenance_branch
fix-8303-s3-lifecycle-ttl-assign
fix-GetObjectLockConfigurationHandler
fix-bucket-name-case-7910
fix-helm-fromtoml-compatibility
fix-mount-http-parallelism
fix-mount-read-throughput-7504
fix-pr-7909
fix-s3-configure-consistency
fix-s3-object-tagging-issue-7589
fix-sts-session-token-7941
fix-versioning-listing-only
fix/8712-directory-markers-content-type
fix/iceberg-stage-create-semantics
fix/lock-table-shared-lock-precedence
fix/mount-cache-consistency
fix/object-lock-delete-enforcement
fix/plugin-ui-remove-scheduler-settings
fix/s3-conditional-headers-toctou-race
fix/s3-delete-directory-marker-non-empty
fix/sts-body-preservation
fix/subscribe-metadata-slow-consumer-blocked
fix/windows-test-file-cleanup
ftp
gh-pages
has-weed-sql-command
iam-multi-file-migration
iam-permissions-and-api
improve-fuse-mount
improve-fuse-mount2
lifecycle/pr1-evaluator
logrus
master
mercury-dance
message_send
moored-spoon
mount2
mq-subscribe
mq2
nfs-cookie-prefix-list-fixes
optimize-delete-lookups
original_weed_mount
plugin-system-phase1
plugin-ui-enhancements-restored
pr-7412
pr/7984
pr/8140
pr/8680
raft-dual-write
random_access_file
refactor-needle-read-operations
refactor-volume-write
remote_overlay
remove-claude-ci
remove-implicit-directory-handling
revert-5134-patch-1
revert-5819-patch-1
revert-6434-bugfix-missing-s3-audit
s3-remote-cache-singleflight
s3-select
s3tables-by-claude
scheduler-sequential-iteration
sub
tcp_read
test-reverting-lock-table
test_udp
testing
testing-sdx-generation
tikv
track-mount-e2e
upgrade-versions-to-4.00
volume_buffered_writes
worker-execute-ec-tasks
0.72
0.72.release
0.73
0.74
0.75
0.76
0.77
0.90
0.91
0.92
0.93
0.94
0.95
0.96
0.97
0.98
0.99
1.00
1.01
1.02
1.03
1.04
1.05
1.06
1.07
1.08
1.09
1.10
1.11
1.12
1.14
1.15
1.16
1.17
1.18
1.19
1.20
1.21
1.22
1.23
1.24
1.25
1.26
1.27
1.28
1.29
1.30
1.31
1.32
1.33
1.34
1.35
1.36
1.37
1.38
1.40
1.41
1.42
1.43
1.44
1.45
1.46
1.47
1.48
1.49
1.50
1.51
1.52
1.53
1.54
1.55
1.56
1.57
1.58
1.59
1.60
1.61
1.61RC
1.62
1.63
1.64
1.65
1.66
1.67
1.68
1.69
1.70
1.71
1.72
1.73
1.74
1.75
1.76
1.77
1.78
1.79
1.80
1.81
1.82
1.83
1.84
1.85
1.86
1.87
1.88
1.90
1.91
1.92
1.93
1.94
1.95
1.96
1.97
1.98
1.99
1;70
2.00
2.01
2.02
2.03
2.04
2.05
2.06
2.07
2.08
2.09
2.10
2.11
2.12
2.13
2.14
2.15
2.16
2.17
2.18
2.19
2.20
2.21
2.22
2.23
2.24
2.25
2.26
2.27
2.28
2.29
2.30
2.31
2.32
2.33
2.34
2.35
2.36
2.37
2.38
2.39
2.40
2.41
2.42
2.43
2.47
2.48
2.49
2.50
2.51
2.52
2.53
2.54
2.55
2.56
2.57
2.58
2.59
2.60
2.61
2.62
2.63
2.64
2.65
2.66
2.67
2.68
2.69
2.70
2.71
2.72
2.73
2.74
2.75
2.76
2.77
2.78
2.79
2.80
2.81
2.82
2.83
2.84
2.85
2.86
2.87
2.88
2.89
2.90
2.91
2.92
2.93
2.94
2.95
2.96
2.97
2.98
2.99
3.00
3.01
3.02
3.03
3.04
3.05
3.06
3.07
3.08
3.09
3.10
3.11
3.12
3.13
3.14
3.15
3.16
3.18
3.19
3.20
3.21
3.22
3.23
3.24
3.25
3.26
3.27
3.28
3.29
3.30
3.31
3.32
3.33
3.34
3.35
3.36
3.37
3.38
3.39
3.40
3.41
3.42
3.43
3.44
3.45
3.46
3.47
3.48
3.50
3.51
3.52
3.53
3.54
3.55
3.56
3.57
3.58
3.59
3.60
3.61
3.62
3.63
3.64
3.65
3.66
3.67
3.68
3.69
3.71
3.72
3.73
3.74
3.75
3.76
3.77
3.78
3.79
3.80
3.81
3.82
3.83
3.84
3.85
3.86
3.87
3.88
3.89
3.90
3.91
3.92
3.93
3.94
3.95
3.96
3.97
3.98
3.99
4.00
4.01
4.02
4.03
4.04
4.05
4.06
4.07
4.08
4.09
4.12
4.13
4.15
4.16
4.17
dev
helm-3.65.1
v0.69
v0.70beta
v3.33
${ noResults }
13 Commits (512bb5bcf6178e1663dcafb9c75637cdb1c25205)
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
512bb5bcf6 |
fix: orchestrator owns full catch-up contract (budget + truncation)
CompleteCatchUp now integrates: - BeginCatchUp with start tick (freezes target) - RecordCatchUpProgress (skips if already converged, e.g., truncation-only) - CheckBudget at completion tick (escalates to NeedsRebuild + logs) - RecordTruncation before completion (logs truncation_recorded) - Logs causal reason for every rejection/escalation CatchUpOptions: StartTick/CompleteTick (separate) + TruncateLSN. 3 new orchestrator-level tests: - ReplicaAhead_TruncateViaOrchestrator: truncation through entry path - ReplicaAhead_NoTruncate_CompletionRejected: logs completion_rejected - BudgetEscalation_ViaOrchestrator: budget violation → NeedsRebuild + logs Observability tests relabeled as sender-level (not entry-path). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
2 days ago |
|
|
adaff8ddb3 |
fix: only log endpoint_changed when endpoint actually changed
ProcessAssignment now compares pre/post endpoint state before logging session_invalidated with "endpoint_changed" reason. Normal session supersede (same endpoint, assignment_intent) no longer mislabeled as endpoint change. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
2 days ago |
|
|
5cdee4a011 |
fix: orchestrator owns zero-gap completion and per-replica invalidation logging
Zero-gap completion: - ExecuteRecovery auto-completes zero-gap sessions (no sender call needed) - RecoveryResult.FinalState = StateInSync for zero-gap Epoch transition: - UpdateSenderEpoch: orchestrator-owned epoch advancement with auto-log - InvalidateEpoch: per-replica session_invalidated events (not aggregate) Endpoint-change invalidation: - ProcessAssignment detects session ID change from endpoint update - Logs per-replica session_invalidated with "endpoint_changed" reason All integration tests now use orchestrator exclusively for core lifecycle. No direct sender API calls for recovery execution in integration tests. 1 new test: EndpointChange_LogsInvalidation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
3 days ago |
|
|
47238df0d7 |
fix: add RecoveryOrchestrator as real integrated entry path
New: orchestrator.go — RecoveryOrchestrator drives recovery lifecycle from assignment through execution to completion/escalation: - ProcessAssignment: reconcile + session creation + auto-log - ExecuteRecovery: connect → handshake from RetainedHistory → outcome - CompleteCatchUp: begin catch-up → progress → complete + auto-log - CompleteRebuild: connect → handshake → history-driven source → transfer → tail replay → complete + auto-log - InvalidateEpoch: invalidate stale sessions + auto-log All integration tests rewritten to use orchestrator as entry path. No direct sender API calls in recovery lifecycle. SessionSnapshot now includes: TruncateRequired/ToLSN/Recorded, RebuildSource, RebuildPhase. RecoveryLog is auto-populated by orchestrator at every transition. 7 integration tests via orchestrator: - ChangedAddress, NeedsRebuild→Rebuild, EpochBump, MultiReplica - Observability: session snapshot, rebuild snapshot, auto-populated log Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
3 days ago |
|
|
7436b3b79c |
feat: add integration closure and observability (Phase 05 Slice 4)
New files: - observe.go: RegistryStatus, SenderStatus, RecoveryLog for debugging - integration_test.go: V2-boundary integration tests through real engine entry path Observability: - Registry.Status() returns full snapshot: per-sender state, session snapshots, counts by category (InSync, Recovering, Rebuilding) - RecoveryLog: append-only event log for recovery lifecycle debugging Integration tests (6): - ChangedAddress_FullFlow: initial recovery → address change → sender preserved → new session → recovery with proof - NeedsRebuild_ThenRebuildAssignment: catch-up fails → NeedsRebuild → rebuild assignment → history-driven source → InSync - EpochBump_DuringRecovery: mid-recovery epoch bump → old session rejected → new assignment at new epoch → InSync - MultiReplica_MixedOutcomes: 3 replicas, 3 outcomes via RetainedHistory proofs, registry status verified - RegistryStatus_Snapshot: observability snapshot structure - RecoveryLog: event recording and filtering Engine module at 54 tests (12 + 18 + 18 + 6). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
3 days ago |
|
|
4d06622c01 |
fix: add nil check for RetainedHistory in sender APIs
RecordHandshakeFromHistory and SelectRebuildFromHistory now return an error instead of panicking on nil history input. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
3 days ago |
|
|
cc8c529962 |
fix: connect recovery decisions to RetainedHistory, fix rebuild source
RetainedHistory as engine input: - RecordHandshakeFromHistory: sender-level API consuming RetainedHistory directly, returns RecoverabilityProof alongside outcome - SelectRebuildFromHistory: sender-level API consuming RetainedHistory for rebuild-source decision RebuildSourceDecision soundness: - Now requires BOTH trusted checkpoint AND replayable tail (CheckpointLSN >= TailLSN and CommittedLSN <= HeadLSN) - Trusted checkpoint with unreplayable tail falls back to full_base 4 new tests: - TrustedCheckpoint_UnreplayableTail (the regression case) - SenderDriven_CatchUp (history → proof → outcome → complete) - SenderDriven_Rebuild_SnapshotTail (history → source → rebuild) - SenderDriven_Rebuild_FallsBackToFullBase (unreplayable tail) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
3 days ago |
|
|
ff7ea41099 |
feat: add engine data/recoverability core (Phase 05 Slice 3)
New file: history.go — RetainedHistory connects recovery decisions to actual WAL retention state: - IsRecoverable: checks gap against tail/head boundaries - MakeHandshakeResult: generates HandshakeResult from retention state - RebuildSourceDecision: chooses snapshot+tail vs full base from checkpoint state (trusted vs untrusted) - ProveRecoverability: generates explicit proof explaining why recovery is or is not allowed 14 new tests (recoverability_test.go): - Recoverable/unrecoverable gap (exact boundary, beyond head) - Trusted/untrusted/no checkpoint → rebuild source selection - Handshake from retained history → outcome classification - Recoverability proofs (zero-gap, ahead, within retention, beyond) - E2E: two replicas driven by retained history (catch-up + rebuild) - Truncation required for replica ahead of committed Engine module at 44 tests (12 + 18 + 14). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
3 days ago |
|
|
368a956aee |
fix: correct catch-up entry counting and rebuild transfer gate
Entry counting: - Session.setRange now initializes recoveredTo = startLSN - RecordCatchUpProgress delta counts only actual catch-up work (recoveredTo - startLSN), not the replica's pre-existing prefix Rebuild transfer gate: - BeginTailReplay requires TransferredTo >= SnapshotLSN - Prevents tail replay on incomplete base transfer 3 new regression tests: - BudgetEntries_NonZeroStart_CountsOnlyDelta (30 entries within 50 budget) - BudgetEntries_NonZeroStart_ExceedsBudget (30 entries exceeds 20 budget) - Rebuild_PartialTransfer_BlocksTailReplay Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
3 days ago |
|
|
930de4ba78 |
feat: add Slice 2 recovery execution tests (Phase 05)
15 new engine-level recovery execution tests: - Zero-gap / catch-up / needs-rebuild branching (3 tests) - Stale execution rejection during active recovery (2 tests) - Bounded catch-up: frozen target, duration, entries, stall (5 tests) - Completion before convergence rejected - Rebuild exclusivity: catch-up APIs excluded (1 test) - Rebuild lifecycle: snapshot+tail, full base, stale ID (3 tests) - Assignment-driven recovery flow Engine module now at 27 tests (12 Slice 1 + 15 Slice 2). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
3 days ago |
|
|
61e9408261 |
fix: separate stable ReplicaID from Endpoint in registry
Registry is now keyed by stable ReplicaID, not by address.
DataAddr changes preserve sender identity — the core V2 invariant.
Changes:
- ReplicaAssignment{ReplicaID, Endpoint} replaces map[string]Endpoint
- AssignmentIntent.Replicas uses []ReplicaAssignment
- Registry.Reconcile takes []ReplicaAssignment
- Tests use stable IDs ("replica-1", "r1") independent of addresses
New test: ChangedDataAddr_PreservesSenderIdentity
- Same ReplicaID, different DataAddr (10.0.0.1 → 10.0.0.2)
- Sender pointer preserved, session invalidated, new session attached
- This is the exact V1/V1.5 regression that V2 must fix
doc.go: clarified Slice 1 core vs carried-forward files
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
3 days ago |
|
|
bb24b4b039 |
fix: encapsulate engine sender/session authority state
All mutable state on Sender and Session is now unexported: - Sender.state, .epoch, .endpoint, .session, .stopped → accessors - Session.id, .phase, .kind, etc. → read-only accessors - Session() replaced by SessionSnapshot() (returns disconnected copy) - SessionID() and HasActiveSession() for common queries - AttachSession returns (sessionID, error) not (*Session, error) - SupersedeSession returns sessionID not *Session Budget configuration via SessionOption: - WithBudget(CatchUpBudget) passed to AttachSession - No direct field mutation on session from external code New test: Encapsulation_SnapshotIsReadOnly proves snapshot mutation does not leak back to sender state. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
3 days ago |
|
|
20d70f9fb6 |
feat: add V2 engine replication core (Phase 05 Slice 1)
Creates sw-block/engine/replication/ — the real V2 engine ownership core, promoted from sw-block/prototype/enginev2/ with all accepted invariants. Files: - types.go: Endpoint, ReplicaState, SessionKind, SessionPhase, FSM transitions - sender.go: per-replica Sender with full execution + rebuild APIs - session.go: Session with identity, phases, frozen target, truncation, budget - registry.go: Registry with reconcile + assignment intent + epoch invalidation - budget.go: CatchUpBudget (duration, entries, stall detection) - rebuild.go: RebuildState FSM (snapshot+tail vs full base) - outcome.go: HandshakeResult + ClassifyRecoveryOutcome Tests (ownership_test.go, 13 tests): - Changed-address invalidation (A10) - Stale session ID rejected at all APIs (A3) - Stale completion after supersede (A3) - Epoch bump invalidates all sessions (A3) - Stale assignment epoch rejected - Rebuild exclusivity (catch-up APIs rejected) - Rebuild full lifecycle - Frozen target rejects chase (A5) - Budget violation escalates (A5) - E2E: 3 replicas, 3 outcomes Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
3 days ago |