Tree:
2ff4a07544
add-ec-vacuum
add-filer-iam-grpc
add-iam-grpc-management
add_fasthttp_client
add_remote_storage
adding-message-queue-integration-tests
adjust-fsck-cutoff-default
admin/csrf-s3tables
allow-no-role-arn
also-delete-parent-directory-if-empty
avoid_releasing_temp_file_on_write
changing-to-zap
codex-rust-volume-server-bootstrap
codex/admin-oidc-auth-ui
codex/cache-iam-policy-engines
codex/ec-repair-worker
codex/erasure-coding-shard-distribution
codex/list-object-versions-newest-first
collect-public-metrics
copilot/fix-helm-chart-installation
copilot/fix-s3-object-tagging-issue
copilot/make-renew-interval-configurable
copilot/make-renew-interval-configurable-again
copilot/sub-pr-7677
create-table-snapshot-api-design
data_query_pushdown
dependabot/maven/other/java/client/com.google.protobuf-protobuf-java-3.25.5
dependabot/maven/other/java/examples/org.apache.hadoop-hadoop-common-3.4.0
detect-and-plan-ec-tasks
do-not-retry-if-error-is-NotFound
ec-disk-type-support
enhance-erasure-coding
expand-the-s3-PutObject-permission-to-the-multipart-permissions
fasthttp
feature-8113-storage-class-disk-routing
feature/iceberg-data-compaction
feature/mini-port-detection
feature/modernize-s3-tests
feature/s3-multi-cert-support
feature/s3tables-improvements-and-spark-tests
feature/sra-uds-handler
feature/sw-block
filer1_maintenance_branch
fix-8303-s3-lifecycle-ttl-assign
fix-GetObjectLockConfigurationHandler
fix-bucket-name-case-7910
fix-helm-fromtoml-compatibility
fix-mount-http-parallelism
fix-mount-read-throughput-7504
fix-pr-7909
fix-s3-configure-consistency
fix-s3-object-tagging-issue-7589
fix-sts-session-token-7941
fix-versioning-listing-only
fix/iceberg-stage-create-semantics
fix/mount-cache-consistency
fix/object-lock-delete-enforcement
fix/plugin-ui-remove-scheduler-settings
fix/sts-body-preservation
fix/windows-test-file-cleanup
ftp
gh-pages
has-weed-sql-command
iam-multi-file-migration
iam-permissions-and-api
improve-fuse-mount
improve-fuse-mount2
logrus
master
message_send
mount2
mq-subscribe
mq2
nfs-cookie-prefix-list-fixes
optimize-delete-lookups
original_weed_mount
plugin-system-phase1
plugin-ui-enhancements-restored
pr-7412
pr/7984
pr/8140
raft-dual-write
random_access_file
refactor-needle-read-operations
refactor-volume-write
remote_overlay
remove-implicit-directory-handling
revert-5134-patch-1
revert-5819-patch-1
revert-6434-bugfix-missing-s3-audit
rust-volume-server
s3-remote-cache-singleflight
s3-select
s3tables-by-claude
scheduler-sequential-iteration
sub
tcp_read
test-reverting-lock-table
test_udp
testing
testing-sdx-generation
tikv
track-mount-e2e
upgrade-versions-to-4.00
volume_buffered_writes
worker-execute-ec-tasks
0.72
0.72.release
0.73
0.74
0.75
0.76
0.77
0.90
0.91
0.92
0.93
0.94
0.95
0.96
0.97
0.98
0.99
1.00
1.01
1.02
1.03
1.04
1.05
1.06
1.07
1.08
1.09
1.10
1.11
1.12
1.14
1.15
1.16
1.17
1.18
1.19
1.20
1.21
1.22
1.23
1.24
1.25
1.26
1.27
1.28
1.29
1.30
1.31
1.32
1.33
1.34
1.35
1.36
1.37
1.38
1.40
1.41
1.42
1.43
1.44
1.45
1.46
1.47
1.48
1.49
1.50
1.51
1.52
1.53
1.54
1.55
1.56
1.57
1.58
1.59
1.60
1.61
1.61RC
1.62
1.63
1.64
1.65
1.66
1.67
1.68
1.69
1.70
1.71
1.72
1.73
1.74
1.75
1.76
1.77
1.78
1.79
1.80
1.81
1.82
1.83
1.84
1.85
1.86
1.87
1.88
1.90
1.91
1.92
1.93
1.94
1.95
1.96
1.97
1.98
1.99
1;70
2.00
2.01
2.02
2.03
2.04
2.05
2.06
2.07
2.08
2.09
2.10
2.11
2.12
2.13
2.14
2.15
2.16
2.17
2.18
2.19
2.20
2.21
2.22
2.23
2.24
2.25
2.26
2.27
2.28
2.29
2.30
2.31
2.32
2.33
2.34
2.35
2.36
2.37
2.38
2.39
2.40
2.41
2.42
2.43
2.47
2.48
2.49
2.50
2.51
2.52
2.53
2.54
2.55
2.56
2.57
2.58
2.59
2.60
2.61
2.62
2.63
2.64
2.65
2.66
2.67
2.68
2.69
2.70
2.71
2.72
2.73
2.74
2.75
2.76
2.77
2.78
2.79
2.80
2.81
2.82
2.83
2.84
2.85
2.86
2.87
2.88
2.89
2.90
2.91
2.92
2.93
2.94
2.95
2.96
2.97
2.98
2.99
3.00
3.01
3.02
3.03
3.04
3.05
3.06
3.07
3.08
3.09
3.10
3.11
3.12
3.13
3.14
3.15
3.16
3.18
3.19
3.20
3.21
3.22
3.23
3.24
3.25
3.26
3.27
3.28
3.29
3.30
3.31
3.32
3.33
3.34
3.35
3.36
3.37
3.38
3.39
3.40
3.41
3.42
3.43
3.44
3.45
3.46
3.47
3.48
3.50
3.51
3.52
3.53
3.54
3.55
3.56
3.57
3.58
3.59
3.60
3.61
3.62
3.63
3.64
3.65
3.66
3.67
3.68
3.69
3.71
3.72
3.73
3.74
3.75
3.76
3.77
3.78
3.79
3.80
3.81
3.82
3.83
3.84
3.85
3.86
3.87
3.88
3.89
3.90
3.91
3.92
3.93
3.94
3.95
3.96
3.97
3.98
3.99
4.00
4.01
4.02
4.03
4.04
4.05
4.06
4.07
4.08
4.09
4.12
4.13
4.15
4.16
4.17
dev
helm-3.65.1
v0.69
v0.70beta
v3.33
${ noResults }
13101 Commits (2ff4a0754472b16e6d67e1ea43190780f7c025f4)
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
2ff4a07544
|
Reduce task logger glog noise and remove per-write fsync (#8603)
* Reduce task logger noise: stop duplicating every log entry to glog and stderr Every task log entry was being tripled: written to the task log file, forwarded to glog (which writes to /tmp by default with no rotation), and echoed to stderr. This caused glog files to fill /tmp on long-running workers. - Remove INFO/DEBUG forwarding to glog (only ERROR/WARNING remain) - Remove stderr echo of every log line - Remove fsync on every single log write (unnecessary for log files) * Fix glog call depth for correct source file attribution The call stack is: caller → Error() → log() → writeLogEntry() → glog.ErrorDepth(), so depth=4 is needed for glog to report the original caller's file and line number. |
3 days ago |
|
|
1df6821ec6
|
Fix topologySpreadConstraints key in sftp-deployment.yaml (#8600)
|
3 days ago |
|
|
4a5243886a |
4.17
|
3 days ago |
|
|
e1e4c9437a
|
fix(s3api): ListObjects with trailing-slash prefix matches sibling directories (#8599)
fix(s3api): ListObjects with trailing-slash prefix returns wrong results When ListObjectsV2 is called with a prefix ending in "/" (e.g., "foo/"), normalizePrefixMarker strips the trailing slash and splits into dir="parent" and prefix="foo". The filer then lists entries matching prefix "foo", which returns both directory "foo" and "foo1000". The prefixEndsOnDelimiter guard correctly identifies directory "foo" as the target and recurses into it, but then resets the guard to false. The loop continues and incorrectly recurses into "foo1000" as well, causing the listing to return objects from unrelated directories. Fix: after recursing into the exact directory targeted by the trailing-slash prefix, return immediately from the listing loop. There is no reason to process sibling entries since the original prefix specifically targeted one directory. |
3 days ago |
|
|
f950a941e3
|
Fix trust policy validation for specific AWS user principals (#8597)
* Add tests for AWS user principal in AssumeRole trust policies Add test cases that verify trust policy validation when using specific AWS user principals (e.g., "arn:aws:iam::000000000000:user/backend") in the Principal field of trust policies for AssumeRole. Covers single user, multiple users (array), wildcard, and plain string principal formats. These tests demonstrate the bug reported in #8588 where specific user principals always fail validation. * Populate RequestContext in ValidateTrustPolicyForPrincipal ValidateTrustPolicyForPrincipal was creating an EvaluationContext with a nil RequestContext. The policy engine's principal matching logic looks up "aws:PrincipalArn" in RequestContext for non-wildcard principals, so specific user ARNs like "arn:aws:iam::000000000000:user/backend" always failed to match, while wildcard "*" worked because it short-circuits before the lookup. Populate RequestContext with both "principal" and "aws:PrincipalArn" keys, consistent with how IsActionAllowed already does it. Fixes #8588 * Remove GitHub discussion URL from source code comments * Add specific error message assertions in trust policy tests |
4 days ago |
|
|
ac579c1746
|
Fix plugin configuration tab layout overflow (#8596)
Fix plugin configuration tab layout overflow (#8587) Remove h-100 from Job Scheduling Settings card, which caused it to stretch to 100% of the row height and push the Next Run card below the row boundary, overflowing into the Detection Results section. |
4 days ago |
|
|
0a5c5ed4ce
|
Persist S3 bucket counter metrics across idle periods (#8595)
* Stop deleting counter metrics during bucket TTL cleanup Counter metrics (traffic bytes, request counts, object counts) are monotonically increasing by design. Deleting them after 10 minutes of bucket inactivity causes them to vanish from /metrics output and reset to zero when traffic resumes, breaking Prometheus rate()/increase() queries and making historical traffic reporting impossible. Only delete gauges and histograms in the TTL cleanup loop, as these represent current state and are safely re-populated on next activity. Fixes https://github.com/seaweedfs/seaweedfs/issues/8521 * Clean up all bucket metrics on bucket deletion Add DeleteBucketMetrics() to delete all metrics (including counters) for a bucket when it is explicitly deleted. This prevents unbounded label cardinality from accumulating for buckets that no longer exist. Called from DeleteBucketHandler after successful bucket deletion. * Reduce mutex scope in bucket metrics TTL sweep Collect expired bucket names under the lock, then release before calling DeletePartialMatch on Prometheus metrics. This prevents RecordBucketActiveTime from blocking during the expensive cleanup. |
4 days ago |
|
|
0a2dac1e56 |
Reduce mutex scope in bucket metrics TTL sweep
Collect expired bucket names under the lock, then release before calling DeletePartialMatch on Prometheus metrics. This prevents RecordBucketActiveTime from blocking during the expensive cleanup. |
4 days ago |
|
|
737116e83c |
fix port probing
|
4 days ago |
|
|
b20eae697e |
Merge branch 'master' of https://github.com/seaweedfs/seaweedfs
|
4 days ago |
|
|
07f3f5eec5 |
remove worker links
|
4 days ago |
|
|
47cad59c70
|
Remove misleading Workers sub-menu items from admin sidebar (#8594)
* Remove misleading Workers sub-menu items from admin sidebar The sidebar sub-items (Job Detection, Job Queue, Job Execution, Configuration) always navigated to the first job type's tabs (typically EC Encoding) rather than showing cross-job-type views. This was confusing as noted in #8590. Since the in-page tabs already provide this navigation, remove the redundant sidebar sub-items and keep only the top-level Workers link. Fixes #8590 * Update layout_templ.go |
4 days ago |
|
|
b17e2b411a
|
Add dynamic timeouts to plugin worker vacuum gRPC calls (#8593)
* add dynamic timeouts to plugin worker vacuum gRPC calls All vacuum gRPC calls used context.Background() with no deadline, so the plugin scheduler's execution timeout could kill a job while a large volume compact was still in progress. Use volume-size-scaled timeouts matching the topology vacuum approach: 3 min/GB for compact, 1 min/GB for check, commit, and cleanup. Fixes #8591 * scale scheduler execution timeout by volume size The scheduler's per-job execution timeout (default 240s) would kill vacuum jobs on large volumes before they finish. Three changes: 1. Vacuum detection now includes estimated_runtime_seconds in job proposals, computed as 5 min/GB of volume size. 2. The scheduler checks for estimated_runtime_seconds in job parameters and uses it as the execution timeout when larger than the default — a generic mechanism any handler can use. 3. Vacuum task gRPC calls now use the passed-in ctx as parent instead of context.Background(), so scheduler cancellation propagates to in-flight RPCs. * extend job type runtime when proposals need more time The JobTypeMaxRuntime (default 30 min) wraps both detection and execution. Its context is the parent of all per-job execution contexts, so even with per-job estimated_runtime_seconds, jobCtx would cancel everything when it expires. After detection, scan proposals for the maximum estimated_runtime_seconds. If any proposal needs more time than the remaining JobTypeMaxRuntime, create a new execution context with enough headroom. This lets large vacuum jobs complete without being killed by the job type deadline while still respecting the configured limit for normal-sized jobs. * log missing volume size metric, remove dead minimum runtime guard Add a debug log in vacuumTimeout when t.volumeSize is 0 so operators can investigate why metrics are missing for a volume. Remove the unreachable estimatedRuntimeSeconds < 180 check in buildVacuumProposal — volumeSizeGB always >= 1 (due to +1 floor), so estimatedRuntimeSeconds is always >= 300. * cap estimated runtime and fix status check context - Cap maxEstimatedRuntime and per-job timeout overrides to 8 hours to prevent unbounded timeouts from bad metrics. - Check execCtx.Err() instead of jobCtx.Err() for status reporting, since dispatch runs under execCtx which may have a longer deadline. A successful dispatch under execCtx was misreported as "timeout" when jobCtx had expired. |
4 days ago |
|
|
4c88fbfd5e
|
Fix nil pointer crash during concurrent vacuum compaction (#8592)
* check for nil needle map before compaction sync
When CommitCompact runs concurrently, it sets v.nm = nil under
dataFileAccessLock. CompactByIndex does not hold that lock, so
v.nm.Sync() can hit a nil pointer. Add an early nil check to
return an error instead of crashing.
Fixes #8591
* guard copyDataBasedOnIndexFile size check against nil needle map
The post-compaction size validation at line 538 accesses
v.nm.ContentSize() and v.nm.DeletedSize(). If CommitCompact has
concurrently set v.nm to nil, this causes a SIGSEGV. Skip the
validation when v.nm is nil since the actual data copy uses local
needle maps (oldNm/newNm) and is unaffected.
Fixes #8591
* use atomic.Bool for compaction flags to prevent concurrent vacuum races
The isCompacting and isCommitCompacting flags were plain bools
read and written from multiple goroutines without synchronization.
This allowed concurrent vacuums on the same volume to pass the
guard checks and run simultaneously, leading to the nil pointer
crash. Using atomic.Bool with CompareAndSwap ensures only one
compaction or commit can run per volume at a time.
Fixes #8591
* use go-version-file in CI workflows instead of hardcoded versions
Use go-version-file: 'go.mod' so CI automatically picks up the Go
version from go.mod, avoiding future version drift. Reordered
checkout before setup-go in go.yml and e2e.yml so go.mod is
available. Removed the now-unused GO_VERSION env vars.
* capture v.nm locally in CompactByIndex to close TOCTOU race
A bare nil check on v.nm followed by v.nm.Sync() has a race window
where CommitCompact can set v.nm = nil between the two. Snapshot
the pointer into a local variable so the nil check and Sync operate
on the same reference.
* add dynamic timeouts to plugin worker vacuum gRPC calls
All vacuum gRPC calls used context.Background() with no deadline,
so the plugin scheduler's execution timeout could kill a job while
a large volume compact was still in progress. Use volume-size-scaled
timeouts matching the topology vacuum approach: 3 min/GB for compact,
1 min/GB for check, commit, and cleanup.
Fixes #8591
* Revert "add dynamic timeouts to plugin worker vacuum gRPC calls"
This reverts commit
|
4 days ago |
|
|
d4d2e511ed |
for mini, default to bind all
|
5 days ago |
|
|
3d9f7f6f81 |
go 1.25
|
5 days ago |
|
|
d89a78d9e3 |
reduce logs
|
5 days ago |
|
|
00000ec006 |
Update s3_buckets_templ.go
|
5 days ago |
|
|
1bd7a98a4a |
simplify plugin scheduler: remove configurable IdleSleepSeconds, use constant 61s
The SchedulerConfig struct and its persistence/API were unnecessary indirection. Replace with a simple constant (reduced from 613s to 61s) so the scheduler re-checks for detectable job types promptly after going idle, improving the clean-install experience. |
5 days ago |
|
|
8ad58e7002 |
4.16
|
5 days ago |
|
|
f220328ae4 |
test: assert ReadFileStatusCount in batch execution test
Verify that pre-delete verification called ReadVolumeFileStatus on both source and target for each volume move. |
5 days ago |
|
|
cf3693651c |
fix: add IdxFileSize check to pre-delete volume verification
The verification step checked DatFileSize and FileCount but not IdxFileSize, leaving a gap in the copy validation before source deletion. |
5 days ago |
|
|
5f85bf5e8a
|
Batch volume balance: run multiple moves per job (#8561)
* proto: add BalanceMoveSpec and batch fields to BalanceTaskParams Add BalanceMoveSpec message for encoding individual volume moves, and max_concurrent_moves + repeated moves fields to BalanceTaskParams to support batching multiple volume moves in a single job. * balance handler: add batch execution with concurrent volume moves Refactor Execute() into executeSingleMove() (backward compatible) and executeBatchMoves() which runs multiple volume moves concurrently using a semaphore-bounded goroutine pool. When BalanceTaskParams.Moves is populated, the batch path is taken; otherwise the single-move path. Includes aggregate progress reporting across concurrent moves, per-move error collection, and partial failure support. * balance handler: add batch config fields to Descriptor and worker config Add max_concurrent_moves and batch_size fields to the worker config form and deriveBalanceWorkerConfig(). These control how many volume moves run concurrently within a batch job and the maximum batch size. * balance handler: group detection proposals into batch jobs When batch_size > 1, the Detect method groups detection results into batch proposals where each proposal encodes multiple BalanceMoveSpec entries in BalanceTaskParams.Moves. Single-result batches fall back to the existing single-move proposal format for backward compatibility. * admin UI: add volume balance execution plan and batch badge Add renderBalanceExecutionPlan() for rich rendering of volume balance jobs in the job detail modal. Single-move jobs show source/target/volume info; batch jobs show a moves table with all volume moves. Add batch badge (e.g., "5 moves") next to job type in the execution jobs table when the job has batch=true label. * Update plugin_templ.go * fix: detection algorithm uses greedy target instead of divergent topology scores The detection loop tracked effective volume counts via an adjustments map, but createBalanceTask independently called planBalanceDestination which used the topology's LoadCount — a separate, unadjusted source of truth. This divergence caused multiple moves to pile onto the same server. Changes: - Add resolveBalanceDestination to resolve the detection loop's greedy target (minServer) rather than independently picking a destination - Add oscillation guard: stop when max-min <= 1 since no single move can improve the balance beyond that point - Track unseeded destinations: if a target server wasn't in the initial serverVolumeCounts, add it so subsequent iterations include it - Add TestDetection_UnseededDestinationDoesNotOverload * fix: handler force_move propagation, partial failure, deterministic dedupe - Propagate ForceMove from outer BalanceTaskParams to individual move TaskParams so batch moves respect the force_move flag - Fix partial failure: mark job successful if at least one move succeeded (succeeded > 0 || failed == 0) to avoid re-running already-completed moves on retry - Use SHA-256 hash for deterministic dedupe key fallback instead of time.Now().UnixNano() which is non-deterministic - Remove unused successDetails variable - Extract maxProposalStringLength constant to replace magic number 200 * admin UI: use template literals in balance execution plan rendering * fix: integration test handles batch proposals from batched detection With batch_size=20, all moves are grouped into a single proposal containing BalanceParams.Moves instead of top-level Sources/Targets. Update assertions to handle both batch and single-move proposal formats. * fix: verify volume size on target before deleting source during balance Add a pre-delete safety check that reads the volume file status on both source and target, then compares .dat file size and file count. If they don't match, the move is aborted — leaving the source intact rather than risking irreversible data loss. Also removes the redundant mountVolume call since VolumeCopy already mounts the volume on the target server. * fix: clamp maxConcurrent, serialize progress sends, validate config as int64 - Clamp maxConcurrentMoves to defaultMaxConcurrentMoves before creating the semaphore so a stale or malicious job cannot request unbounded concurrent volume moves - Extend progressMu to cover sender.SendProgress calls since the underlying gRPC stream is not safe for concurrent writes - Perform bounds checks on max_concurrent_moves and batch_size in int64 space before casting to int, avoiding potential overflow on 32-bit * fix: check disk capacity in resolveBalanceDestination Skip disks where VolumeCount >= MaxVolumeCount so the detection loop does not propose moves to a full disk that would fail at execution time. * test: rename unseeded destination test to match actual behavior The test exercises a server with 0 volumes that IS seeded from topology (matching disk type), not an unseeded destination. Rename to TestDetection_ZeroVolumeServerIncludedInBalance and fix comments. * test: tighten integration test to assert exactly one batch proposal With default batch_size=20, all moves should be grouped into a single batch proposal. Assert len(proposals)==1 and require BalanceParams with Moves, removing the legacy single-move else branch. * fix: propagate ctx to RPCs and restore source writability on abort - All helper methods (markVolumeReadonly, copyVolume, tailVolume, readVolumeFileStatus, deleteVolume) now accept a context parameter instead of using context.Background(), so Execute's ctx propagates cancellation and timeouts into every volume server RPC - Add deferred cleanup that restores the source volume to writable if any step after markVolumeReadonly fails, preventing the source from being left permanently readonly on abort - Add markVolumeWritable helper using VolumeMarkWritableRequest * fix: deep-copy protobuf messages in test recording sender Use proto.Clone in recordingExecutionSender to store immutable snapshots of JobProgressUpdate and JobCompleted, preventing assertions from observing mutations if the handler reuses message pointers. * fix: add VolumeMarkWritable and ReadVolumeFileStatus to fake volume server The balance task now calls ReadVolumeFileStatus for pre-delete verification and VolumeMarkWritable to restore writability on abort. Add both RPCs to the test fake, and drop the mountCalls assertion since BalanceTask no longer calls VolumeMount directly (VolumeCopy handles it). * fix: use maxConcurrentMovesLimit (50) for clamp, not defaultMaxConcurrentMoves defaultMaxConcurrentMoves (5) is the fallback when the field is unset, not an upper bound. Clamping to it silently overrides valid config values like 10/20/50. Introduce maxConcurrentMovesLimit (50) matching the descriptor's MaxValue and clamp to that instead. * fix: cancel batch moves on progress stream failure Derive a cancellable batchCtx from the caller's ctx. If sender.SendProgress returns an error (client disconnect, context cancelled), capture it, skip further sends, and cancel batchCtx so in-flight moves abort via their propagated context rather than running blind to completion. * fix: bound cleanup timeout and validate batch move fields - Use a 30-second timeout for the deferred markVolumeWritable cleanup instead of context.Background() which can block indefinitely if the volume server is unreachable - Validate required fields (VolumeID, SourceNode, TargetNode) before appending moves to a batch proposal, skipping invalid entries - Fall back to a single-move proposal when filtering leaves only one valid move in a batch * fix: cancel task execution on SendProgress stream failure All handler progress callbacks previously ignored SendProgress errors, allowing tasks to continue executing after the client disconnected. Now each handler creates a derived cancellable context and cancels it on the first SendProgress error, stopping the in-flight task promptly. Handlers fixed: erasure_coding, vacuum, volume_balance (single-move), and admin_script (breaks command loop on send failure). * fix: validate batch moves before scheduling in executeBatchMoves Reject empty batches, enforce a hard upper bound (100 moves), and filter out nil or incomplete move specs (missing source/target/volume) before allocating progress tracking and launching goroutines. * test: add batch balance execution integration test Tests the batch move path with 3 volumes, max concurrency 2, using fake volume servers. Verifies all moves complete with correct readonly, copy, tail, and delete RPC counts. * test: add MarkWritableCount and ReadFileStatusCount accessors Expose the markWritableCalls and readFileStatusCalls counters on the fake volume server, following the existing MarkReadonlyCount pattern. * fix: oscillation guard uses global effective counts for heterogeneous capacity The oscillation guard (max-min <= 1) previously used maxServer/minServer which are determined by utilization ratio. With heterogeneous capacity, maxServer by utilization can have fewer raw volumes than minServer, producing a negative diff and incorrectly triggering the guard. Now scans all servers' effective counts to find the true global max/min volume counts, so the guard works correctly regardless of whether utilization-based or raw-count balancing is used. * fix: admin script handler breaks outer loop on SendProgress failure The break on SendProgress error inside the shell.Commands scan only exited the inner loop, letting the outer command loop continue executing commands on a broken stream. Use a sendBroken flag to propagate the break to the outer execCommands loop. |
5 days ago |
|
|
b991acf634
|
fix: paginate bucket listing in Admin UI to show all buckets (#8585)
* fix: paginate bucket listing in Admin UI to show all buckets The Admin UI's GetS3Buckets() had a hardcoded Limit of 1000 in the ListEntries request, causing the Total Buckets count to cap at 1000 even when more buckets exist. This adds pagination to iterate through all buckets by continuing from the last entry name when a full page is returned. Fixes seaweedfs/seaweedfs#8564 * feat: add server-side pagination and sorting to S3 buckets page Add pagination controls, page size selector, and sortable column headers to the Admin UI's Object Store buckets page, following the same pattern used by the Cluster Volumes page. This ensures the UI remains responsive with thousands of buckets. - Add CurrentPage, TotalPages, PageSize, SortBy, SortOrder to S3BucketsData - Accept page/pageSize/sortBy/sortOrder query params in ShowS3Buckets handler - Sort buckets by name, owner, created, objects, logical/physical size - Paginate results server-side (default 100 per page) - Add pagination nav, page size dropdown, and sort indicators to template * Update s3_buckets_templ.go * Update object_store_users_templ.go * fix: use errors.Is(err, io.EOF) instead of string comparison Replace brittle err.Error() == "EOF" string comparison with idiomatic errors.Is(err, io.EOF) for checking stream end in bucket listing. * fix: address PR review findings for bucket pagination - Clamp page to totalPages when page exceeds total, preventing empty results with misleading pagination state - Fix sort comparator to use explicit ascending/descending comparisons with a name tie-breaker, satisfying strict weak ordering for sort.Slice - Capture SnapshotTsNs from first ListEntries response and pass it to subsequent requests for consistent pagination across pages - Replace non-focusable <th onclick> sort headers with <a> tags and reuse getSortIcon, matching the cluster_volumes accessibility pattern - Change exportBucketList() to fetch all buckets from /api/s3/buckets instead of scraping DOM rows (which now only contain the current page) |
5 days ago |
|
|
02d3e3195c |
Update object_store_users_templ.go
|
5 days ago |
|
|
470075dd90
|
admin/balance: fix Max Volumes display and balancer source selection (#8583)
* admin: fix Max Volumes column always showing 0 GetClusterVolumeServers() computed DiskCapacity from diskInfo.MaxVolumeCount but never populated the MaxVolumes field on the VolumeServer struct, causing the column to always display 0. * balance: use utilization ratio for source server selection The balancer selected the source server (to move volumes FROM) by raw volume count. In clusters with heterogeneous MaxVolumeCount settings, the server with the highest capacity naturally holds the most volumes and was always picked as the source, even when it had the lowest utilization ratio. Change source selection and imbalance calculation to use utilization ratio (effectiveCount / maxVolumeCount) so servers are compared by how full they are relative to their capacity, not by absolute volume count. This matches how destination scoring already works via calculateBalanceScore(). |
5 days ago |
|
|
f8b7357350
|
weed/server: fix dropped error (#8584)
* weed/server: fix dropped error * Removed the redundant check. --------- Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com> Co-authored-by: Chris Lu <chris.lu@gmail.com> |
5 days ago |
|
|
e1c4faba38
|
build(deps): bump org.apache.zookeeper:zookeeper from 3.9.4 to 3.9.5 in /test/java/spark (#8580)
* build(deps): bump org.apache.zookeeper:zookeeper in /test/java/spark Bumps org.apache.zookeeper:zookeeper from 3.9.4 to 3.9.5. --- updated-dependencies: - dependency-name: org.apache.zookeeper:zookeeper dependency-version: 3.9.5 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> * fix: use go-version-file instead of hardcoded Go version in CI workflows The hardcoded go-version '1.24' is too old for go.mod which requires go >= 1.25.0, causing build failures in Spark integration tests. --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Chris Lu <chris.lu@gmail.com> |
5 days ago |
|
|
6c7fe87a72
|
helm: add s3.tlsSecret for custom S3 HTTPS certificate (#8582)
* helm: add s3.tlsSecret to allow custom TLS certificate for S3 HTTPS endpoint Allow users to specify an external Kubernetes TLS secret for the S3 HTTPS endpoint instead of using the internal self-signed client certificate. This enables using publicly trusted certificates (e.g. from Let's Encrypt) so S3 clients don't need to trust the internal CA. The new s3.tlsSecret value is supported in the standalone S3 gateway, filer with embedded S3, and all-in-one deployment templates. Closes #8581 * refactor: extract S3 TLS helpers to reduce duplication Move repeated S3 TLS cert/key logic into shared helper templates (seaweedfs.s3.tlsArgs, seaweedfs.s3.tlsVolumeMount, seaweedfs.s3.tlsVolume) in _helpers.tpl, and use them across all three deployment templates. * helm: add allInOne.s3.trafficDistribution support Add the missing allInOne.s3.trafficDistribution branch to the seaweedfs.trafficDistribution helper and wire it into the all-in-one service template, mirroring the existing s3-service.yaml behavior. PreferClose is auto-converted to PreferSameZone on k8s >=1.35. * fix: scope S3 TLS mounts to S3-enabled pods and simplify trafficDistribution helper - Wrap S3 TLS volume/volumeMount includes in allInOne.s3.enabled and filer.s3.enabled guards so the custom TLS secret is only mounted when S3 is actually enabled in that deployment mode. - Refactor seaweedfs.trafficDistribution helper to accept an explicit value+Capabilities dict instead of walking multiple .Values paths, making each call site responsible for passing its own setting. |
5 days ago |
|
|
b3d32fe73b |
fix go version
|
5 days ago |
|
|
f439c84d01
|
build(deps): bump github.com/aws/aws-sdk-go-v2 from 1.41.1 to 1.41.3 (#8576)
Bumps [github.com/aws/aws-sdk-go-v2](https://github.com/aws/aws-sdk-go-v2) from 1.41.1 to 1.41.3. - [Release notes](https://github.com/aws/aws-sdk-go-v2/releases) - [Commits](https://github.com/aws/aws-sdk-go-v2/compare/v1.41.1...v1.41.3) --- updated-dependencies: - dependency-name: github.com/aws/aws-sdk-go-v2 dependency-version: 1.41.3 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com> |
5 days ago |
|
|
89f1096c0e |
Update ec-integration.yml
|
5 days ago |
|
|
6dab90472b
|
admin: fix access key creation UX (#8579)
* admin: remove misleading "secret key only shown once" warning
The access key details modal already allows viewing both the access key
and secret key at any time, so the warning about the secret key only
being displayed once is incorrect and misleading.
* admin: allow specifying custom access key and secret key
Add optional access_key and secret_key fields to the create access key
API. When provided, the specified keys are used instead of generating
random ones. The UI now shows a form with optional fields when creating
a new key, with a note that leaving them blank auto-generates keys.
* admin: check access key uniqueness before creating
Access keys must be globally unique across all users since S3 auth
looks them up in a single global map. Add an explicit check using
GetUserByAccessKey before creating, so the user gets a clear error
("access key is already in use") rather than a generic store error.
* Update object_store_users_templ.go
* admin: address review feedback for access key creation
Handler:
- Use decodeJSONBody/newJSONMaxReader instead of raw json.Decode to
enforce request size limits and handle malformed JSON properly
- Return 409 Conflict for duplicate access keys, 400 Bad Request for
validation errors, instead of generic 500
Backend:
- Validate access key length (4-128 chars) and secret key length
(8-128 chars) when user-provided
Frontend:
- Extract resetCreateKeyForm() helper to avoid duplicated cleanup logic
- Wire resetCreateKeyForm to accessKeysModal hidden.bs.modal event so
form state is always cleared when modal is dismissed
- Change secret key input to type="password" with a visibility toggle
* admin: guard against nil request and handle GetUserByAccessKey errors
- Add nil check for the CreateAccessKeyRequest pointer before
dereferencing, defaulting to an empty request (auto-generate both
keys).
- Handle non-"not found" errors from GetUserByAccessKey explicitly
instead of silently proceeding, so store errors (e.g. db connection
failures) surface rather than being swallowed.
* Update object_store_users_templ.go
* admin: fix access key uniqueness check with gRPC store
GetUserByAccessKey returns a gRPC NotFound status error (not the
sentinel credential.ErrAccessKeyNotFound) when using the gRPC store,
causing the uniqueness check to fail with a spurious error.
Treat the lookup as best-effort: only reject when a user is found
(err == nil). Any error (not-found via any store, connectivity issues)
falls through to the store's own CreateAccessKey which enforces
uniqueness definitively.
* admin: fix error handling and input validation for access key creation
Backend:
- Remove access key value from the duplicate-key error message to avoid
logging the caller-supplied identifier.
Handler:
- Handle empty POST body (io.EOF) as a valid request that auto-generates
both keys, instead of rejecting it as malformed JSON.
- Return 404 for "not found" errors (e.g. non-existent user) instead of
collapsing them into a 500.
Frontend:
- Add minlength/maxlength attributes matching backend constraints
(access key 4-128, secret key 8-128).
- Call reportValidity() before submitting so invalid lengths are caught
client-side without a round trip.
* admin: use sentinel errors and fix GetUserByAccessKey error handling
Backend (user_management.go):
- Define sentinel errors (ErrAccessKeyInUse, ErrUserNotFound,
ErrInvalidInput) and wrap them in returned errors so callers can use
errors.Is.
- Handle GetUserByAccessKey errors properly: check the sentinel
credential.ErrAccessKeyNotFound first, then fall back to string
matching for stores (gRPC) that return non-sentinel not-found errors.
Surface unexpected errors instead of silently proceeding.
Handler (user_handlers.go):
- Replace fragile strings.Contains error matching with errors.Is
against the new dash sentinels.
Frontend (object_store_users.templ):
- Add double-submit guard (isCreatingKey flag + button disabling) to
prevent duplicate access key creation requests.
|
5 days ago |
|
|
a00d38d8d4
|
build(deps): bump go.mongodb.org/mongo-driver from 1.17.6 to 1.17.9 (#8575)
Bumps [go.mongodb.org/mongo-driver](https://github.com/mongodb/mongo-go-driver) from 1.17.6 to 1.17.9. - [Release notes](https://github.com/mongodb/mongo-go-driver/releases) - [Commits](https://github.com/mongodb/mongo-go-driver/compare/v1.17.6...v1.17.9) --- updated-dependencies: - dependency-name: go.mongodb.org/mongo-driver dependency-version: 1.17.9 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com> |
5 days ago |
|
|
f8d783f80e
|
fix: ListObjectVersions interleave Version and DeleteMarker in sort order (#8567)
* fix: ListObjectVersions interleave Version and DeleteMarker in sort order Go's default xml.Marshal serializes struct fields in definition order, causing all <Version> elements to appear before all <DeleteMarker> elements. The S3 API contract requires these elements to be interleaved in the correct global sort order (by key ascending, then newest version first within each key). This broke clients that validate version list ordering within a single key — an older Version would appear before a newer DeleteMarker for the same object. Fix: Replace the separate Versions/DeleteMarkers/CommonPrefixes arrays with a single Entries []VersionListEntry slice. Each VersionListEntry uses a per-element MarshalXML that outputs the correct XML tag name (<Version>, <DeleteMarker>, or <CommonPrefixes>) based on which field is populated. Since the entries are already in their correct sorted order from buildSortedCombinedList, the XML output is automatically interleaved correctly. Also removes the unused ListObjectVersionsResult struct. Note: The reporter also mentioned a cross-key timestamp ordering issue when paginating with max-keys=1, but that is correct S3 behavior — ListObjectVersions sorts by key name (ascending), not by timestamp. Different keys having non-monotonic timestamps is expected. * test: add CommonPrefixes XML marshaling coverage for ListObjectVersions * fix: validate VersionListEntry has exactly one field set in MarshalXML Return an error instead of silently emitting an empty <Version> element when no field (or multiple fields) are populated. Also clean up the misleading xml:"Version" struct tag on the Entries field. |
5 days ago |
|
|
120d38176f
|
build(deps): bump golang.org/x/sys from 0.41.0 to 0.42.0 (#8573)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.41.0 to 0.42.0. - [Commits](https://github.com/golang/sys/compare/v0.41.0...v0.42.0) --- updated-dependencies: - dependency-name: golang.org/x/sys dependency-version: 0.42.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com> |
5 days ago |
|
|
55bce53953 |
reduce logs
|
5 days ago |
|
|
992db11d2b
|
iam: add IAM group management (#8560)
* iam: add Group message to protobuf schema Add Group message (name, members, policy_names, disabled) and add groups field to S3ApiConfiguration for IAM group management support (issue #7742). * iam: add group CRUD to CredentialStore interface and all backends Add group management methods (CreateGroup, GetGroup, DeleteGroup, ListGroups, UpdateGroup) to the CredentialStore interface with implementations for memory, filer_etc, postgres, and grpc stores. Wire group loading/saving into filer_etc LoadConfiguration and SaveConfiguration. * iam: add group IAM response types Add XML response types for group management IAM actions: CreateGroup, DeleteGroup, GetGroup, ListGroups, AddUserToGroup, RemoveUserFromGroup, AttachGroupPolicy, DetachGroupPolicy, ListAttachedGroupPolicies, ListGroupsForUser. * iam: add group management handlers to embedded IAM API Add CreateGroup, DeleteGroup, GetGroup, ListGroups, AddUserToGroup, RemoveUserFromGroup, AttachGroupPolicy, DetachGroupPolicy, ListAttachedGroupPolicies, and ListGroupsForUser handlers with dispatch in ExecuteAction. * iam: add group management handlers to standalone IAM API Add group handlers (CreateGroup, DeleteGroup, GetGroup, ListGroups, AddUserToGroup, RemoveUserFromGroup, AttachGroupPolicy, DetachGroupPolicy, ListAttachedGroupPolicies, ListGroupsForUser) and wire into DoActions dispatch. Also add helper functions for user/policy side effects. * iam: integrate group policies into authorization Add groups and userGroups reverse index to IdentityAccessManagement. Populate both maps during ReplaceS3ApiConfiguration and MergeS3ApiConfiguration. Modify evaluateIAMPolicies to evaluate policies from user's enabled groups in addition to user policies. Update VerifyActionPermission to consider group policies when checking hasAttachedPolicies. * iam: add group side effects on user deletion and rename When a user is deleted, remove them from all groups they belong to. When a user is renamed, update group membership references. Applied to both embedded and standalone IAM handlers. * iam: watch /etc/iam/groups directory for config changes Add groups directory to the filer subscription watcher so group file changes trigger IAM configuration reloads. * admin: add group management page to admin UI Add groups page with CRUD operations, member management, policy attachment, and enable/disable toggle. Register routes in admin handlers and add Groups entry to sidebar navigation. * test: add IAM group management integration tests Add comprehensive integration tests for group CRUD, membership, policy attachment, policy enforcement, disabled group behavior, user deletion side effects, and multi-group membership. Add "group" test type to CI matrix in s3-iam-tests workflow. * iam: address PR review comments for group management - Fix XSS vulnerability in groups.templ: replace innerHTML string concatenation with DOM APIs (createElement/textContent) for rendering member and policy lists - Use userGroups reverse index in embedded IAM ListGroupsForUser for O(1) lookup instead of iterating all groups - Add buildUserGroupsIndex helper in standalone IAM handlers; use it in ListGroupsForUser and removeUserFromAllGroups for efficient lookup - Add note about gRPC store load-modify-save race condition limitation * iam: add defensive copies, validation, and XSS fixes for group management - Memory store: clone groups on store/retrieve to prevent mutation - Admin dash: deep copy groups before mutation, validate user/policy exists - HTTP handlers: translate credential errors to proper HTTP status codes, use *bool for Enabled field to distinguish missing vs false - Groups templ: use data attributes + event delegation instead of inline onclick for XSS safety, prevent stale async responses * iam: add explicit group methods to PropagatingCredentialStore Add CreateGroup, GetGroup, DeleteGroup, ListGroups, and UpdateGroup methods instead of relying on embedded interface fallthrough. Group changes propagate via filer subscription so no RPC propagation needed. * iam: detect postgres unique constraint violation and add groups index Return ErrGroupAlreadyExists when INSERT hits SQLState 23505 instead of a generic error. Add index on groups(disabled) for filtered queries. * iam: add Marker field to group list response types Add Marker string field to GetGroupResult, ListGroupsResult, ListAttachedGroupPoliciesResult, and ListGroupsForUserResult to match AWS IAM pagination response format. * iam: check group attachment before policy deletion Reject DeletePolicy if the policy is attached to any group, matching AWS IAM behavior. Add PolicyArn to ListAttachedGroupPolicies response. * iam: include group policies in IAM authorization Merge policy names from user's enabled groups into the IAMIdentity used for authorization, so group-attached policies are evaluated alongside user-attached policies. * iam: check for name collision before renaming user in UpdateUser Scan identities and inline policies for newUserName before mutating, returning EntityAlreadyExists if a collision is found. Reuse the already-loaded policies instead of loading them again inside the loop. * test: use t.Cleanup for bucket cleanup in group policy test * iam: wrap ErrUserNotInGroup sentinel in RemoveGroupMember error Wrap credential.ErrUserNotInGroup so errors.Is works in groupErrorToHTTPStatus, returning proper 400 instead of 500. * admin: regenerate groups_templ.go with XSS-safe data attributes Regenerated from groups.templ which uses data-group-name attributes instead of inline onclick with string interpolation. * iam: add input validation and persist groups during migration - Validate nil/empty group name in CreateGroup and UpdateGroup - Save groups in migrateToMultiFile so they survive legacy migration * admin: use groupErrorToHTTPStatus in GetGroupMembers and GetGroupPolicies * iam: short-circuit UpdateUser when newUserName equals current name * iam: require empty PolicyNames before group deletion Reject DeleteGroup when group has attached policies, matching the existing members check. Also fix GetGroup error handling in DeletePolicy to only skip ErrGroupNotFound, not all errors. * ci: add weed/pb/** to S3 IAM test trigger paths * test: replace time.Sleep with require.Eventually for propagation waits Use polling with timeout instead of fixed sleeps to reduce flakiness in integration tests waiting for IAM policy propagation. * fix: use credentialManager.GetPolicy for AttachGroupPolicy validation Policies created via CreatePolicy through credentialManager are stored in the credential store, not in s3cfg.Policies (which only has static config policies). Change AttachGroupPolicy to use credentialManager.GetPolicy() for policy existence validation. * feat: add UpdateGroup handler to embedded IAM API Add UpdateGroup action to enable/disable groups and rename groups via the IAM API. This is a SeaweedFS extension (not in AWS SDK) used by tests to toggle group disabled status. * fix: authenticate raw IAM API calls in group tests The embedded IAM endpoint rejects anonymous requests. Replace callIAMAPI with callIAMAPIAuthenticated that uses JWT bearer token authentication via the test framework. * feat: add UpdateGroup handler to standalone IAM API Mirror the embedded IAM UpdateGroup handler in the standalone IAM API for parity. * fix: add omitempty to Marker XML tags in group responses Non-truncated responses should not emit an empty <Marker/> element. * fix: distinguish backend errors from missing policies in AttachGroupPolicy Return ServiceFailure for credential manager errors instead of masking them as NoSuchEntity. Also switch ListGroupsForUser to use s3cfg.Groups instead of in-memory reverse index to avoid stale data. Add duplicate name check to UpdateGroup rename. * fix: standalone IAM AttachGroupPolicy uses persisted policy store Check managed policies from GetPolicies() instead of s3cfg.Policies so dynamically created policies are found. Also add duplicate name check to UpdateGroup rename. * fix: rollback inline policies on UpdateUser PutPolicies failure If PutPolicies fails after moving inline policies to the new username, restore both the identity name and the inline policies map to their original state to avoid a partial-write window. * fix: correct test cleanup ordering for group tests Replace scattered defers with single ordered t.Cleanup in each test to ensure resources are torn down in reverse-creation order: remove membership, detach policies, delete access keys, delete users, delete groups, delete policies. Move bucket cleanup to parent test scope and delete objects before bucket. * fix: move identity nil check before map lookup and refine hasAttachedPolicies Move the nil check on identity before accessing identity.Name to prevent panic. Also refine hasAttachedPolicies to only consider groups that are enabled and have actual policies attached, so membership in a no-policy group doesn't incorrectly trigger IAM authorization. * fix: fail group reload on unreadable or corrupt group files Return errors instead of logging and continuing when group files cannot be read or unmarshaled. This prevents silently applying a partial IAM config with missing group memberships or policies. * fix: use errors.Is for sql.ErrNoRows comparison in postgres group store * docs: explain why group methods skip propagateChange Group changes propagate to S3 servers via filer subscription (watching /etc/iam/groups/) rather than gRPC RPCs, since there are no group-specific RPCs in the S3 cache protocol. * fix: remove unused policyNameFromArn and strings import * fix: update service account ParentUser on user rename When renaming a user via UpdateUser, also update ParentUser references in service accounts to prevent them from becoming orphaned after the next configuration reload. * fix: wrap DetachGroupPolicy error with ErrPolicyNotAttached sentinel Use credential.ErrPolicyNotAttached so groupErrorToHTTPStatus maps it to 400 instead of falling back to 500. * fix: use admin S3 client for bucket cleanup in enforcement test The user S3 client may lack permissions by cleanup time since the user is removed from the group in an earlier subtest. Use the admin S3 client to ensure bucket and object cleanup always succeeds. * fix: add nil guard for group param in propagating store log calls Prevent potential nil dereference when logging group.Name in CreateGroup and UpdateGroup of PropagatingCredentialStore. * fix: validate Disabled field in UpdateGroup handlers Reject values other than "true" or "false" with InvalidInputException instead of silently treating them as false. * fix: seed mergedGroups from existing groups in MergeS3ApiConfiguration Previously the merge started with empty group maps, dropping any static-file groups. Now seeds from existing iam.groups before overlaying dynamic config, and builds the reverse index after merging to avoid stale entries from overridden groups. * fix: use errors.Is for filer_pb.ErrNotFound comparison in group loading Replace direct equality (==) with errors.Is() to correctly match wrapped errors, consistent with the rest of the codebase. * fix: add ErrUserNotFound and ErrPolicyNotFound to groupErrorToHTTPStatus Map these sentinel errors to 404 so AddGroupMember and AttachGroupPolicy return proper HTTP status codes. * fix: log cleanup errors in group integration tests Replace fire-and-forget cleanup calls with error-checked versions that log failures via t.Logf for debugging visibility. * fix: prevent duplicate group test runs in CI matrix The basic lane's -run "TestIAM" regex also matched TestIAMGroup* tests, causing them to run in both the basic and group lanes. Replace with explicit test function names. * fix: add GIN index on groups.members JSONB for membership lookups Without this index, ListGroupsForUser and membership queries require full table scans on the groups table. * fix: handle cross-directory moves in IAM config subscription When a file is moved out of an IAM directory (e.g., /etc/iam/groups), the dir variable was overwritten with NewParentPath, causing the source directory change to be missed. Now also notifies handlers about the source directory for cross-directory moves. * fix: validate members/policies before deleting group in admin handler AdminServer.DeleteGroup now checks for attached members and policies before delegating to credentialManager, matching the IAM handler guards. * fix: merge groups by name instead of blind append during filer load Match the identity loader's merge behavior: find existing group by name and replace, only append when no match exists. Prevents duplicates when legacy and multi-file configs overlap. * fix: check DeleteEntry response error when cleaning obsolete group files Capture and log resp.Error from filer DeleteEntry calls during group file cleanup, matching the pattern used in deleteGroupFile. * fix: verify source user exists before no-op check in UpdateUser Reorder UpdateUser to find the source identity first and return NoSuchEntityException if not found, before checking if the rename is a no-op. Previously a non-existent user renamed to itself would incorrectly return success. * fix: update service account parent refs on user rename in embedded IAM The embedded IAM UpdateUser handler updated group membership but not service account ParentUser fields, unlike the standalone handler. * fix: replay source-side events for all handlers on cross-dir moves Pass nil newEntry to bucket, IAM, and circuit-breaker handlers for the source directory during cross-directory moves, so all watchers can clear caches for the moved-away resource. * fix: don't seed mergedGroups from existing iam.groups in merge Groups are always dynamic (from filer), never static (from s3.config). Seeding from iam.groups caused stale deleted groups to persist. Now only uses config.Groups from the dynamic filer config. * fix: add deferred user cleanup in TestIAMGroupUserDeletionSideEffect Register t.Cleanup for the created user so it gets cleaned up even if the test fails before the inline DeleteUser call. * fix: assert UpdateGroup HTTP status in disabled group tests Add require.Equal checks for 200 status after UpdateGroup calls so the test fails immediately on API errors rather than relying on the subsequent Eventually timeout. * fix: trim whitespace from group name in filer store operations Trim leading/trailing whitespace from group.Name before validation in CreateGroup and UpdateGroup to prevent whitespace-only filenames. Also merge groups by name during multi-file load to prevent duplicates. * fix: add nil/empty group validation in gRPC store Guard CreateGroup and UpdateGroup against nil group or empty name to prevent panics and invalid persistence. * fix: add nil/empty group validation in postgres store Guard CreateGroup and UpdateGroup against nil group or empty name to prevent panics from nil member access and empty-name row inserts. * fix: add name collision check in embedded IAM UpdateUser The embedded IAM handler renamed users without checking if the target name already existed, unlike the standalone handler. * fix: add ErrGroupNotEmpty sentinel and map to HTTP 409 AdminServer.DeleteGroup now wraps conflict errors with ErrGroupNotEmpty, and groupErrorToHTTPStatus maps it to 409 Conflict instead of 500. * fix: use appropriate error message in GetGroupDetails based on status Return "Group not found" only for 404, use "Failed to retrieve group" for other error statuses instead of always saying "Group not found". * fix: use backend-normalized group.Name in CreateGroup response After credentialManager.CreateGroup may normalize the name (e.g., trim whitespace), use group.Name instead of the raw input for the returned GroupData to ensure consistency. * fix: add nil/empty group validation in memory store Guard CreateGroup and UpdateGroup against nil group or empty name to prevent panics from nil pointer dereference on map access. * fix: reorder embedded IAM UpdateUser to verify source first Find the source identity before checking for collisions, matching the standalone handler's logic. Previously a non-existent user renamed to an existing name would get EntityAlreadyExists instead of NoSuchEntity. * fix: handle same-directory renames in metadata subscription Replay a delete event for the old entry name during same-directory renames so handlers like onBucketMetadataChange can clean up stale state for the old name. * fix: abort GetGroups on non-ErrGroupNotFound errors Only skip groups that return ErrGroupNotFound. Other errors (e.g., transient backend failures) now abort the handler and return the error to the caller instead of silently producing partial results. * fix: add aria-label and title to icon-only group action buttons Add accessible labels to View and Delete buttons so screen readers and tooltips provide meaningful context. * fix: validate group name in saveGroup to prevent invalid filenames Trim whitespace and reject empty names before writing group JSON files, preventing creation of files like ".json". * fix: add /etc/iam/groups to filer subscription watched directories The groups directory was missing from the watched directories list, so S3 servers in a cluster would not detect group changes made by other servers via filer. The onIamConfigChange handler already had code to handle group directory changes but it was never triggered. * add direct gRPC propagation for group changes to S3 servers Groups now have the same dual propagation as identities and policies: direct gRPC push via propagateChange + async filer subscription. - Add PutGroup/RemoveGroup proto messages and RPCs - Add PutGroup/RemoveGroup in-memory cache methods on IAM - Add PutGroup/RemoveGroup gRPC server handlers - Update PropagatingCredentialStore to call propagateChange on group mutations * reduce log verbosity for config load summary Change ReplaceS3ApiConfiguration log from Infof to V(1).Infof to avoid noisy output on every config reload. * admin: show user groups in view and edit user modals - Add Groups field to UserDetails and populate from credential manager - Show groups as badges in user details view modal - Add group management to edit user modal: display current groups, add to group via dropdown, remove from group via badge x button * fix: remove duplicate showAlert that broke modal-alerts.js admin.js defined showAlert(type, message) which overwrote the modal-alerts.js version showAlert(message, type), causing broken unstyled alert boxes. Remove the duplicate and swap all callers in admin.js to use the correct (message, type) argument order. * fix: unwrap groups API response in edit user modal The /api/groups endpoint returns {"groups": [...]}, not a bare array. * Update object_store_users_templ.go * test: assert AccessDenied error code in group denial tests Replace plain assert.Error checks with awserr.Error type assertion and AccessDenied code verification, matching the pattern used in other IAM integration tests. * fix: propagate GetGroups errors in ShowGroups handler getGroupsPageData was swallowing errors and returning an empty page with 200 status. Now returns the error so ShowGroups can respond with a proper error status. * fix: reject AttachGroupPolicy when credential manager is nil Previously skipped policy existence validation when credentialManager was nil, allowing attachment of nonexistent policies. Now returns a ServiceFailureException error. * fix: preserve groups during partial MergeS3ApiConfiguration updates UpsertIdentity calls MergeS3ApiConfiguration with a partial config containing only the updated identity (nil Groups). This was wiping all in-memory group state. Now only replaces groups when config.Groups is non-nil (full config reload). * fix: propagate errors from group lookup in GetObjectStoreUserDetails ListGroups and GetGroup errors were silently ignored, potentially showing incomplete group data in the UI. * fix: use DOM APIs for group badge remove button to prevent XSS Replace innerHTML with onclick string interpolation with DOM createElement + addEventListener pattern. Also add aria-label and title to the add-to-group button. * fix: snapshot group policies under RLock to prevent concurrent map access evaluateIAMPolicies was copying the map reference via groupMap := iam.groups under RLock then iterating after RUnlock, while PutGroup mutates the map in-place. Now copies the needed policy names into a slice while holding the lock. * fix: add nil IAM check to PutGroup and RemoveGroup gRPC handlers Match the nil guard pattern used by PutPolicy/DeletePolicy to prevent nil pointer dereference when IAM is not initialized. |
5 days ago |
|
|
115dcb5ada
|
build(deps): bump github.com/prometheus/procfs from 0.19.2 to 0.20.1 (#8578)
Bumps [github.com/prometheus/procfs](https://github.com/prometheus/procfs) from 0.19.2 to 0.20.1. - [Release notes](https://github.com/prometheus/procfs/releases) - [Commits](https://github.com/prometheus/procfs/compare/v0.19.2...v0.20.1) --- updated-dependencies: - dependency-name: github.com/prometheus/procfs dependency-version: 0.20.1 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> |
5 days ago |
|
|
7be2d1ecfb
|
build(deps): bump github.com/getsentry/sentry-go from 0.42.0 to 0.43.0 (#8577)
Bumps [github.com/getsentry/sentry-go](https://github.com/getsentry/sentry-go) from 0.42.0 to 0.43.0. - [Release notes](https://github.com/getsentry/sentry-go/releases) - [Changelog](https://github.com/getsentry/sentry-go/blob/master/CHANGELOG.md) - [Commits](https://github.com/getsentry/sentry-go/compare/v0.42.0...v0.43.0) --- updated-dependencies: - dependency-name: github.com/getsentry/sentry-go dependency-version: 0.43.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> |
5 days ago |
|
|
1272612bbd
|
build(deps): bump docker/setup-qemu-action from 3 to 4 (#8574)
Bumps [docker/setup-qemu-action](https://github.com/docker/setup-qemu-action) from 3 to 4. - [Release notes](https://github.com/docker/setup-qemu-action/releases) - [Commits](https://github.com/docker/setup-qemu-action/compare/v3...v4) --- updated-dependencies: - dependency-name: docker/setup-qemu-action dependency-version: '4' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> |
5 days ago |
|
|
e568d85a5c
|
build(deps): bump docker/build-push-action from 6 to 7 (#8572)
Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 6 to 7. - [Release notes](https://github.com/docker/build-push-action/releases) - [Commits](https://github.com/docker/build-push-action/compare/v6...v7) --- updated-dependencies: - dependency-name: docker/build-push-action dependency-version: '7' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> |
5 days ago |
|
|
f79ba1eb37
|
build(deps): bump docker/login-action from 3 to 4 (#8569)
Bumps [docker/login-action](https://github.com/docker/login-action) from 3 to 4. - [Release notes](https://github.com/docker/login-action/releases) - [Commits](https://github.com/docker/login-action/compare/v3...v4) --- updated-dependencies: - dependency-name: docker/login-action dependency-version: '4' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> |
5 days ago |
|
|
b132232895
|
build(deps): bump docker/setup-buildx-action from 3 to 4 (#8570)
Bumps [docker/setup-buildx-action](https://github.com/docker/setup-buildx-action) from 3 to 4. - [Release notes](https://github.com/docker/setup-buildx-action/releases) - [Commits](https://github.com/docker/setup-buildx-action/compare/v3...v4) --- updated-dependencies: - dependency-name: docker/setup-buildx-action dependency-version: '4' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> |
5 days ago |
|
|
d765ff50e6
|
build(deps): bump actions/dependency-review-action from 4.8.3 to 4.9.0 (#8571)
Bumps [actions/dependency-review-action](https://github.com/actions/dependency-review-action) from 4.8.3 to 4.9.0.
- [Release notes](https://github.com/actions/dependency-review-action/releases)
- [Commits](
|
5 days ago |
|
|
bff084ff6a |
update go version
|
5 days ago |
|
|
78a3441b30
|
fix: volume balance detection returns multiple tasks per run (#8559)
* fix: volume balance detection now returns multiple tasks per run (#8551) Previously, detectForDiskType() returned at most 1 balance task per disk type, making the MaxJobsPerDetection setting ineffective. The detection loop now iterates within each disk type, planning multiple moves until the imbalance drops below threshold or maxResults is reached. Effective volume counts are adjusted after each planned move so the algorithm correctly re-evaluates which server is overloaded. * fix: factor pending tasks into destination scoring and use UnixNano for task IDs - Use UnixNano instead of Unix for task IDs to avoid collisions when multiple tasks are created within the same second - Adjust calculateBalanceScore to include LoadCount (pending + assigned tasks) in the utilization estimate, so the destination picker avoids stacking multiple planned moves onto the same target disk * test: add comprehensive balance detection tests for complex scenarios Cover multi-server convergence, max-server shifting, destination spreading, pre-existing pending task skipping, no-duplicate-volume invariant, and parameterized convergence verification across different cluster shapes and thresholds. * fix: address PR review findings in balance detection - hasMore flag: compute from len(results) >= maxResults so the scheduler knows more pages may exist, matching vacuum/EC handler pattern - Exhausted server fallthrough: when no eligible volumes remain on the current maxServer (all have pending tasks) or destination planning fails, mark the server as exhausted and continue to the next overloaded server instead of stopping the entire detection loop - Return canonical destination server ID directly from createBalanceTask instead of resolving via findServerIDByAddress, eliminating the fragile address→ID lookup for adjustment tracking - Fix bestScore sentinel: use math.Inf(-1) instead of -1.0 so disks with negative scores (high pending load, same rack/DC) are still selected as the best available destination - Add TestDetection_ExhaustedServerFallsThrough covering the scenario where the top server's volumes are all blocked by pre-existing tasks * test: fix computeEffectiveCounts and add len guard in no-duplicate test - computeEffectiveCounts now takes a servers slice to seed counts for all known servers (including empty ones) and uses an address→ID map from the topology spec instead of scanning metrics, so destination servers with zero initial volumes are tracked correctly - TestDetection_NoDuplicateVolumesAcrossIterations now asserts len > 1 before checking duplicates, so the test actually fails if Detection regresses to returning a single task * fix: remove redundant HasAnyTask check in createBalanceTask The HasAnyTask check in createBalanceTask duplicated the same check already performed in detectForDiskType's volume selection loop. Since detection runs single-threaded (MaxDetectionConcurrency: 1), no race can occur between the two points. * fix: consistent hasMore pattern and remove double-counted LoadCount in scoring - Adopt vacuum_handler's hasMore pattern: over-fetch by 1, check len > maxResults, and truncate — consistent truncation semantics - Remove direct LoadCount penalty in calculateBalanceScore since LoadCount is already factored into effectiveVolumeCount for utilization scoring; bump utilization weight from 40 to 50 to compensate for the removed 10-point load penalty * fix: handle zero maxResults as no-cap, emit trace after trim, seed empty servers - When MaxResults is 0 (omitted), treat as no explicit cap instead of defaulting to 1; only apply the +1 over-fetch probe when caller supplies a positive limit - Move decision trace emission after hasMore/trim so the trace accurately reflects the returned proposals - Seed serverVolumeCounts from ActiveTopology so servers that have a matching disk type but zero volumes are included in the imbalance calculation and MinServerCount check * fix: nil-guard clusterInfo, uncap legacy DetectionFunc, deterministic disk type order - Add early nil guard for clusterInfo in Detection to prevent panics in downstream helpers (detectForDiskType, createBalanceTask) - Change register.go DetectionFunc wrapper from maxResults=1 to 0 (no cap) so the legacy code path returns all detected tasks - Sort disk type keys before iteration so results are deterministic when maxResults spans multiple disk types (HDD/SSD) * fix: don't over-fetch in stateful detection to avoid orphaned pending tasks Detection registers planned moves in ActiveTopology via AddPendingTask, so requesting maxResults+1 would create an extra pending task that gets discarded during trim. Use len(results) >= maxResults as the hasMore signal instead, which is correct since Detection already caps internally. * fix: return explicit truncated flag from Detection instead of approximating Detection now returns (results, truncated, error) where truncated is true only when the loop stopped because it hit maxResults, not when it ran out of work naturally. This eliminates false hasMore signals when detection happens to produce exactly maxResults results by resolving the imbalance. * cleanup: simplify detection logic and remove redundancies - Remove redundant clusterInfo nil check in detectForDiskType since Detection already guards against nil clusterInfo - Remove adjustments loop for destination servers not in serverVolumeCounts — topology seeding ensures all servers with matching disk type are already present - Merge two-loop min/max calculation into a single loop: min across all servers, max only among non-exhausted servers - Replace magic number 100 with len(metrics) for minC initialization in convergence test * fix: accurate truncation flag, deterministic server order, indexed volume lookup - Track balanced flag to distinguish "hit maxResults cap" from "cluster balanced at exactly maxResults" — truncated is only true when there's genuinely more work to do - Sort servers for deterministic iteration and tie-breaking when multiple servers have equal volume counts - Pre-index volumes by server with per-server cursors to avoid O(maxResults * volumes) rescanning on each iteration - Add truncation flag assertions to RespectsMaxResults test: true when capped, false when detection finishes naturally * fix: seed trace server counts from ActiveTopology to match detection logic The decision trace was building serverVolumeCounts only from metrics, missing zero-volume servers seeded from ActiveTopology by Detection. This could cause the trace to report wrong server counts, incorrect imbalance ratios, or spurious "too few servers" messages. Pass activeTopology into the trace function and seed server counts the same way Detection does. * fix: don't exhaust server on per-volume planning failure, sort volumes by ID - When createBalanceTask returns nil, continue to the next volume on the same server instead of marking the entire server as exhausted. The failure may be volume-specific (not found in topology, pending task registration failed) and other volumes on the server may still be viable candidates. - Sort each server's volume slice by VolumeID after pre-indexing so volume selection is fully deterministic regardless of input order. * fix: use require instead of assert to prevent nil dereference panic in CORS test The test used assert.NoError (non-fatal) for GetBucketCors, then immediately accessed getResp.CORSRules. When the API returns an error, getResp is nil causing a panic. Switch to require.NoError/NotNil/Len so the test stops before dereferencing a nil response. * fix: deterministic disk tie-breaking and stronger pre-existing task test - Sort available disks by NodeID then DiskID before scoring so destination selection is deterministic when two disks score equally - Add task count bounds assertion to SkipsPreExistingPendingTasks test: with 15 of 20 volumes already having pending tasks, at most 5 new tasks should be created and at least 1 (imbalance still exists) * fix: seed adjustments from existing pending/assigned tasks to prevent over-scheduling Detection now calls ActiveTopology.GetTaskServerAdjustments() to initialize the adjustments map with source/destination deltas from existing pending and assigned balance tasks. This ensures effectiveCounts reflects in-flight moves, preventing the algorithm from planning additional moves in the same direction when prior moves already address the imbalance. Added GetTaskServerAdjustments(taskType) to ActiveTopology which iterates pending and assigned tasks, decrementing source servers and incrementing destination servers for the given task type. |
6 days ago |
|
|
2ec0a67ee3
|
master: return 503/Unavailable during topology warmup after leader change (#8529)
* master: return 503/Unavailable during topology warmup after leader change After a master restart or leader change, the topology is empty until volume servers reconnect and send heartbeats. During this warmup window (3 heartbeat intervals = 15 seconds), volume lookups that fail now return 503 Service Unavailable (HTTP) or gRPC Unavailable instead of 404 Not Found, signaling clients to retry with other masters. * master: skip warmup 503 on fresh start and single-master setups - Check MaxVolumeId > 0 to distinguish restart from fresh start (MaxVolumeId is Raft-persisted, so 0 means no prior data) - Check peer count > 1 so single-master deployments aren't affected (no point suggesting "retry with other masters" if there are none) * master: address review feedback and block assigns during warmup - Protect LastLeaderChangeTime with dedicated mutex (fix data race) - Extract warmup multiplier as WarmupPulseMultiplier constant - Derive Retry-After header from pulse config instead of hardcoding - Only trigger warmup 503 for "not found" errors, not parse errors - Return nil response (not partial) on gRPC Unavailable - Add doc comments to IsWarmingUp, getter/setter, WarmupDuration - Block volume assign requests (HTTP and gRPC) during warmup, since the topology is incomplete and assignments would be unreliable - Skip warmup behavior for single-master setups (no peers to retry) * master: apply warmup to all setups, skip only on fresh start Single-master restarts still have an empty topology until heartbeats arrive, so warmup protection should apply there too. The only case to skip is a fresh cluster start (MaxVolumeId == 0), which already has no volumes to look up. - Remove GetMasterCount() > 1 guard from all warmup checks - Remove now-unused GetMasterCount helper - Update error messages to "topology is still loading" (not "retry with other masters" which doesn't apply to single-master) * master: add client-side retry on Unavailable for lookup and assign The server-side 503/Unavailable during warmup needs client cooperation. Previously, LookupVolumeIds and Assign would immediately propagate the error without retry. Now both paths retry with exponential backoff (1s -> 1.5s -> ... up to 6s) when receiving Unavailable, respecting context cancellation. This covers the warmup window where the master's topology is still loading after a restart or leader change. * master: seed warmup timestamp in legacy raft path at setup The legacy raft path only set lastLeaderChangeTime inside the event listener callback, which could fire after IsLeader() was already observed as true in SetRaftServer. Seed the timestamp at setup time (matching the hashicorp path) so IsWarmingUp() is active immediately. * master: fix assign retry loop to cover full warmup window The retry loop used waitTime <= maxWaitTime as a stop condition, causing it to give up after ~13s while warmup lasts 15s. Now cap each individual sleep at maxWaitTime but keep retrying until the context is cancelled. * master: preserve gRPC status in lookup retry and fix retry window Return the raw gRPC error instead of wrapping with fmt.Errorf so status.FromError() can extract the status code. Use proper gRPC status check (codes.Unavailable) instead of string matching. Also cap individual sleep at maxWaitTime while retrying until ctx is done. * master: use gRPC status code instead of string matching in assign retry Use status.FromError/codes.Unavailable instead of brittle strings.Contains for detecting retriable gRPC errors in the assign retry loop. * master: use remaining warmup duration for Retry-After header Set Retry-After to the remaining warmup time instead of the full warmup duration, so clients don't wait longer than necessary. * master: reset ret.Replicas before populating from assign response Clear Replicas slice before appending to prevent duplicate entries when the assign response is retried or when alternative requests are attempted. * master: add unit tests for warmup retry behavior Test that Assign() and LookupVolumeIds() retry on codes.Unavailable and stop promptly when the context is cancelled. * master: record leader change time before initialization work Move SetLastLeaderChangeTime() to fire immediately when the leader change event is received, before DoBarrier(), EnsureTopologyId(), and updatePeers(), so the warmup clock starts at the true moment of leadership transition. * master: use topology warmup duration in volume growth wait loop Replace hardcoded constants.VolumePulsePeriod * 2 with topo.IsWarmingUp() and topo.WarmupDuration() so the growth wait stays in sync with the configured warmup window. Remove unused constants import. * master: resolve master before creating RPC timeout context Move GetMaster() call before context.WithTimeout() so master resolution blocking doesn't consume the gRPC call timeout. * master: use NotFound flag instead of string matching for volume lookup Add a NotFound field to LookupResult and set it in findVolumeLocation when a volume is genuinely missing. Update HTTP and gRPC warmup checks to use this flag instead of strings.Contains on the error message. * master: bound assign retry loop to 30s for deadline-free contexts Without a context deadline, the Unavailable retry loop could spin forever. Add a maxRetryDuration of 30s so the loop gives up even when no context deadline is set. * master: strengthen assign retry cancellation test Verify the retry loop actually retried (callCount > 1) and that the returned error is context.DeadlineExceeded, not just any error. * master: extract shared retry-with-backoff utility Add util.RetryWithBackoff for context-aware, bounded retry with exponential backoff. Refactor both Assign() and LookupVolumeIds() to use it instead of duplicating the retry/sleep/backoff logic. * master: cap waitTime in RetryWithBackoff to prevent unbounded growth Cap the backoff waitTime at maxWaitTime so it doesn't grow indefinitely in long-running retry scenarios. * master: only return Unavailable during warmup when all lookups failed For batched LookupVolume requests, return partial results when some volumes are found. Only return codes.Unavailable when no volumes were successfully resolved, so clients benefit from partial results instead of retrying unnecessarily. * master: set retriable error message in 503 response body When returning 503 during warmup, replace the "not found" error in the JSON body with "service warming up, please retry" so clients don't treat it as a permanent error. * master: guard empty master address in LookupVolumeIds If GetMaster() returns empty (no master found or ctx cancelled), return an appropriate error instead of dialing an empty address. Returns ctx.Err() if context is done, otherwise codes.Unavailable to trigger retry. * master: add comprehensive tests for RetryWithBackoff Test success after retries, non-retryable error handling, context cancellation, and maxDuration cap with context.Background(). * master: enforce hard maxDuration bound in RetryWithBackoff Use a deadline instead of elapsed-time check so the last sleep is capped to remaining time. This prevents the total retry duration from overshooting maxDuration by up to one full backoff interval. * master: respect fresh-start bypass in RemainingWarmupDuration Check IsWarmingUp() first (which returns false when MaxVolumeId==0) so RemainingWarmupDuration returns 0 on fresh clusters. * master: round up Retry-After seconds to avoid underestimating Use math.Ceil so fractional remaining seconds (e.g. 1.9s) round up to the next integer (2) instead of flooring down (1). * master: tighten batch lookup warmup to all-NotFound only Only return codes.Unavailable when every requested volume ID was a transient not-found. Mixed cases with non-NotFound errors now return the response with per-volume error details preserved. * master: reduce retry log noise and fix timer leak Lower per-attempt retry log from V(0) to V(1) to reduce noise during warmup. Replace time.After with time.NewTimer to avoid lingering timers when context is cancelled. * master: add per-attempt timeout for assign RPC Use a 10s per-attempt timeout so a single slow RPC can't consume the entire 30s retry budget when ctx has no deadline. * master: share single 30s retry deadline across assign request entries The Assign() function iterates over primary and fallback requests, previously giving each its own 30s RetryWithBackoff budget. With a primary + fallback, the total could reach 60s. Compute one deadline up front and pass the remaining budget to each RetryWithBackoff call so the entire Assign() call stays within a single 30s cap. * master: strengthen context-cancel test with DeadlineExceeded and retry assertions Assert errors.Is(err, context.DeadlineExceeded) to verify the error is specifically from the context deadline, and check callCount > 1 to prove retries actually occurred before cancellation. Mirrors the pattern used in TestAssignStopsOnContextCancel. * master: bound GetMaster with per-attempt timeout in LookupVolumeIds GetMaster() calls WaitUntilConnected() which can block indefinitely if no master is available. Previously it used the outer ctx, so a slow master resolution could consume the entire RetryWithBackoff budget in a single attempt. Move the per-attempt timeoutCtx creation before the GetMaster call so both master resolution and the gRPC LookupVolume RPC share one grpcTimeout-bounded attempt. * master: use deadline-aware context for assign retry budget The shared 30s deadline only limited RetryWithBackoff's internal wall-clock tracking, but per-attempt contexts were still derived from the original ctx and could run for up to 10s even when the budget was nearly exhausted. Create a deadlineCtx from the computed deadline and derive both RetryWithBackoff and per-attempt timeouts from it so all operations honor the shared 30s cap. * master: skip warmup gate for empty lookup requests When VolumeOrFileIds is empty, notFoundCount == len(req.VolumeOrFileIds) is 0 == 0 which is true, causing empty lookup batches during warmup to return codes.Unavailable and be retried endlessly. Add a len(req.VolumeOrFileIds) > 0 guard so empty requests pass through. * master: validate request fields before warmup gate in Assign Move Replication and Ttl parsing before the IsWarmingUp() check so invalid inputs get a proper validation error instead of being masked by codes.Unavailable during warmup. Pure syntactic validation does not depend on topology state and should run first. * master: check deadline and context before starting retry attempt RetryWithBackoff only checked the deadline and context after an attempt completed or during the sleep select. If the deadline expired or context was canceled during sleep, the next iteration would still call operation() before detecting it. Add pre-operation checks so no new attempt starts after the budget is exhausted. * master: always return ctx.Err() on context cancellation in RetryWithBackoff When ctx.Err() is non-nil, the pre-operation check was returning lastErr instead of ctx.Err(). This broke callers checking errors.Is(err, context.DeadlineExceeded) and contradicted the documented contract. Always return ctx.Err() so the cancellation reason is properly surfaced. * master: handle warmup errors in StreamAssign without killing the stream StreamAssign was returning codes.Unavailable errors from Assign directly, which terminates the gRPC stream and breaks pooled connections. Instead, return transient errors as in-band error responses so the stream survives warmup periods. Also reset assignClient in doAssign on Send/Recv failures so a broken stream doesn't leave the proxy permanently dead. * master: wait for warmup before slot search in findAndGrow findEmptySlotsForOneVolume was called before the warmup wait loop, selecting slots from an incomplete topology. Move the warmup wait before slot search so volume placement uses the fully warmed-up topology with all servers registered. * master: add Retry-After header to /dir/assign warmup response The /dir/lookup handler already sets Retry-After during warmup but /dir/assign did not, leaving HTTP clients without guidance on when to retry. Add the same header using RemainingWarmupDuration(). * master: only seed warmup timestamp on leader at startup SetLastLeaderChangeTime was called unconditionally for both leader and follower nodes. Followers don't need warmup state, and the leader change event listener handles real elections. Move the seed into the IsLeader() block so only the startup leader gets warmup initialized. * master: preserve codes.Unavailable for StreamAssign warmup errors in doAssign StreamAssign returns transient warmup errors as in-band AssignResponse.Error messages. doAssign was converting these to plain fmt.Errorf, losing the codes.Unavailable classification needed for the caller's retry logic. Detect warmup error messages and wrap them as status.Error(codes.Unavailable) so RetryWithBackoff can retry. |
6 days ago |
|
|
0647f66bb5
|
filer.sync: add exponential backoff on unexpected EOF during replication (#8557)
* filer.sync: add exponential backoff on unexpected EOF during replication When the source volume server drops connections under high traffic, filer.sync retries aggressively (every 1-6s), hammering the already overloaded source. This adds a longer exponential backoff (10s to 2min) specifically for "unexpected EOF" errors, reducing pressure on the source while still retrying indefinitely until success. Also adds more logging throughout the replication path: - Log source URL and error at V(0) when ReadPart or io.ReadAll fails - Log content-length and byte counts at V(4) on success - Log backoff duration in retry messages Fixes #8542 * filer.sync: extract backoff helper and fix 2-minute cap - Extract nextEofBackoff() and isEofError() helpers to deduplicate the backoff logic between fetchAndWrite and uploadManifestChunk - Fix the cap: previously 80s would double to 160s and pass the < 2min check uncapped. Now doubles first, then clamps to 2min. * filer.sync: log source URL instead of empty upload URL on read errors UploadUrl is not populated until after the reader is consumed, so the V(0) and V(4) logs were printing an empty string. Add SourceUrl field to UploadOption and populate it from the HTTP response in fetchAndWrite. * filer.sync: guard isEofError against nil error * filer.sync: use errors.Is for EOF detection, fix log wording - Replace broad substring matching ("read input", "unexpected EOF") with errors.Is(err, io.ErrUnexpectedEOF) and errors.Is(err, io.EOF) so only actual EOF errors trigger the longer backoff - Fix awkward log phrasing: "interrupted replicate" → "interrupted while replicating" * filer.sync: remove EOF backoff from uploadManifestChunk uploadManifestChunk reads from an in-memory bytes.Reader, so any EOF errors there are from the destination side, not a broken source stream. The long source-oriented backoff is inappropriate; let RetryUntil handle destination retries at its normal cadence. --------- Co-authored-by: Copilot <copilot@github.com> |
6 days ago |
|
|
ba66411337 |
Update plugin_templ.go
|
6 days ago |