Tree: 5acb4578ab
add-ec-vacuum
add-filer-iam-grpc
add-iam-grpc-management
add_fasthttp_client
add_remote_storage
adding-message-queue-integration-tests
adjust-fsck-cutoff-default
admin/csrf-s3tables
allow-no-role-arn
also-delete-parent-directory-if-empty
avoid_releasing_temp_file_on_write
changing-to-zap
codex-rust-volume-server-bootstrap
codex/admin-oidc-auth-ui
codex/cache-iam-policy-engines
codex/ec-repair-worker
codex/erasure-coding-shard-distribution
codex/list-object-versions-newest-first
codex/s3tables-maint-lifecycle-parity
codex/s3tables-maint-planner-multispec
codex/s3tables-maintenance-designs
collect-public-metrics
copilot/fix-helm-chart-installation
copilot/fix-s3-object-tagging-issue
copilot/make-renew-interval-configurable
copilot/make-renew-interval-configurable-again
copilot/sub-pr-7677
create-table-snapshot-api-design
data_query_pushdown
dependabot/maven/other/java/client/com.google.protobuf-protobuf-java-3.25.5
dependabot/maven/other/java/examples/org.apache.hadoop-hadoop-common-3.4.0
detect-and-plan-ec-tasks
do-not-retry-if-error-is-NotFound
ec-disk-type-support
enhance-erasure-coding
expand-the-s3-PutObject-permission-to-the-multipart-permissions
fasthttp
feature-8113-storage-class-disk-routing
feature/mini-port-detection
feature/modernize-s3-tests
feature/s3-multi-cert-support
feature/s3tables-improvements-and-spark-tests
feature/sra-uds-handler
feature/sw-block
filer1_maintenance_branch
fix-8303-s3-lifecycle-ttl-assign
fix-GetObjectLockConfigurationHandler
fix-bucket-name-case-7910
fix-helm-fromtoml-compatibility
fix-mount-http-parallelism
fix-mount-read-throughput-7504
fix-pr-7909
fix-s3-configure-consistency
fix-s3-object-tagging-issue-7589
fix-sts-session-token-7941
fix-versioning-listing-only
fix/iceberg-stage-create-semantics
fix/lock-table-shared-lock-precedence
fix/mount-cache-consistency
fix/object-lock-delete-enforcement
fix/plugin-ui-remove-scheduler-settings
fix/sts-body-preservation
fix/windows-test-file-cleanup
ftp
gh-pages
has-weed-sql-command
iam-multi-file-migration
iam-permissions-and-api
improve-fuse-mount
improve-fuse-mount2
logrus
master
message_send
mount2
mq-subscribe
mq2
nfs-cookie-prefix-list-fixes
optimize-delete-lookups
original_weed_mount
plugin-system-phase1
plugin-ui-enhancements-restored
pr-7412
pr/7984
pr/8140
pr/8680
raft-dual-write
random_access_file
refactor-needle-read-operations
refactor-volume-write
remote_overlay
remove-implicit-directory-handling
revert-5134-patch-1
revert-5819-patch-1
revert-6434-bugfix-missing-s3-audit
rust-volume-server
s3-remote-cache-singleflight
s3-select
s3tables-by-claude
scheduler-sequential-iteration
sub
tcp_read
test-reverting-lock-table
test_udp
testing
testing-sdx-generation
tikv
track-mount-e2e
upgrade-versions-to-4.00
volume_buffered_writes
worker-execute-ec-tasks
0.72
0.72.release
0.73
0.74
0.75
0.76
0.77
0.90
0.91
0.92
0.93
0.94
0.95
0.96
0.97
0.98
0.99
1.00
1.01
1.02
1.03
1.04
1.05
1.06
1.07
1.08
1.09
1.10
1.11
1.12
1.14
1.15
1.16
1.17
1.18
1.19
1.20
1.21
1.22
1.23
1.24
1.25
1.26
1.27
1.28
1.29
1.30
1.31
1.32
1.33
1.34
1.35
1.36
1.37
1.38
1.40
1.41
1.42
1.43
1.44
1.45
1.46
1.47
1.48
1.49
1.50
1.51
1.52
1.53
1.54
1.55
1.56
1.57
1.58
1.59
1.60
1.61
1.61RC
1.62
1.63
1.64
1.65
1.66
1.67
1.68
1.69
1.70
1.71
1.72
1.73
1.74
1.75
1.76
1.77
1.78
1.79
1.80
1.81
1.82
1.83
1.84
1.85
1.86
1.87
1.88
1.90
1.91
1.92
1.93
1.94
1.95
1.96
1.97
1.98
1.99
1;70
2.00
2.01
2.02
2.03
2.04
2.05
2.06
2.07
2.08
2.09
2.10
2.11
2.12
2.13
2.14
2.15
2.16
2.17
2.18
2.19
2.20
2.21
2.22
2.23
2.24
2.25
2.26
2.27
2.28
2.29
2.30
2.31
2.32
2.33
2.34
2.35
2.36
2.37
2.38
2.39
2.40
2.41
2.42
2.43
2.47
2.48
2.49
2.50
2.51
2.52
2.53
2.54
2.55
2.56
2.57
2.58
2.59
2.60
2.61
2.62
2.63
2.64
2.65
2.66
2.67
2.68
2.69
2.70
2.71
2.72
2.73
2.74
2.75
2.76
2.77
2.78
2.79
2.80
2.81
2.82
2.83
2.84
2.85
2.86
2.87
2.88
2.89
2.90
2.91
2.92
2.93
2.94
2.95
2.96
2.97
2.98
2.99
3.00
3.01
3.02
3.03
3.04
3.05
3.06
3.07
3.08
3.09
3.10
3.11
3.12
3.13
3.14
3.15
3.16
3.18
3.19
3.20
3.21
3.22
3.23
3.24
3.25
3.26
3.27
3.28
3.29
3.30
3.31
3.32
3.33
3.34
3.35
3.36
3.37
3.38
3.39
3.40
3.41
3.42
3.43
3.44
3.45
3.46
3.47
3.48
3.50
3.51
3.52
3.53
3.54
3.55
3.56
3.57
3.58
3.59
3.60
3.61
3.62
3.63
3.64
3.65
3.66
3.67
3.68
3.69
3.71
3.72
3.73
3.74
3.75
3.76
3.77
3.78
3.79
3.80
3.81
3.82
3.83
3.84
3.85
3.86
3.87
3.88
3.89
3.90
3.91
3.92
3.93
3.94
3.95
3.96
3.97
3.98
3.99
4.00
4.01
4.02
4.03
4.04
4.05
4.06
4.07
4.08
4.09
4.12
4.13
4.15
4.16
4.17
dev
helm-3.65.1
v0.69
v0.70beta
v3.33
8451 Commits (5acb4578abef9601bde1e764cc1da2a540c8b84c)

5acb4578ab | Fix ec.rebuild failing on unrepairable volumes instead of skipping (#8632)
* Fix ec.rebuild failing on unrepairable volumes instead of skipping them When an EC volume has fewer shards than DataShardsCount, ec.rebuild would return an error and abort the entire operation. Now it logs a warning and continues rebuilding the remaining volumes. Fixes #8630 * Remove duplicate volume ID in unrepairable log message --------- Co-authored-by: Copilot <copilot@github.com> |
5 days ago

2f51a94416 | feat(vacuum): add volume state and location filters to vacuum handler (#8625)
* feat(vacuum): add volume state, location, and enhanced collection filters Align the vacuum handler's admin config with the balance handler by adding: - volume_state filter (ALL/ACTIVE/FULL) to scope vacuum to writable or read-only volumes - data_center_filter, rack_filter, node_filter to scope vacuum to specific infrastructure locations - Enhanced collection_filter description matching the balance handler's ALL_COLLECTIONS/EACH_COLLECTION/regex modes The new filters reuse filterMetricsByVolumeState() and filterMetricsByLocation() already defined in the same package. * use wildcard matchers for DC/rack/node filters Replace exact-match and CSV set lookups with wildcard matching from util/wildcard package. Patterns like "dc*", "rack-1?", or "node-a*" are now supported in all location filter fields for both balance and vacuum handlers. * add nil guard in filterMetricsByLocation |
6 days ago
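
The location filters described above accept shell-style wildcards such as "dc*" or "rack-1?". The commit relies on SeaweedFS's util/wildcard package, whose exact API is not shown here, so the sketch below only illustrates the general idea by translating such a pattern into a regular expression.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// matchWildcard reports whether name matches a shell-style pattern where
// '*' matches any run of characters and '?' matches a single character.
// This is a stand-in for the matching helper the commit refers to; the
// real util/wildcard package may differ.
func matchWildcard(pattern, name string) bool {
	var b strings.Builder
	b.WriteString("^")
	for _, r := range pattern {
		switch r {
		case '*':
			b.WriteString(".*")
		case '?':
			b.WriteString(".")
		default:
			b.WriteString(regexp.QuoteMeta(string(r)))
		}
	}
	b.WriteString("$")
	re, err := regexp.Compile(b.String())
	if err != nil {
		return false
	}
	return re.MatchString(name)
}

func main() {
	fmt.Println(matchWildcard("dc*", "dc1"))         // true
	fmt.Println(matchWildcard("rack-1?", "rack-12")) // true
	fmt.Println(matchWildcard("node-a*", "node-b1")) // false
}
```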

6fc0489dd8 | feat(plugin): make page tabs and sub-tabs addressable by URLs (#8626)
* feat(plugin): make page tabs and sub-tabs addressable by URLs Update the plugin page so that clicking tabs and sub-tabs pushes browser history via history.pushState(), enabling bookmarkable URLs, browser back/forward navigation, and shareable links. URL mapping: - /plugin → Overview tab - /plugin/configuration → Configuration sub-tab - /plugin/detection → Job Detection sub-tab - /plugin/queue → Job Queue sub-tab - /plugin/execution → Job Execution sub-tab Job-type-specific URLs use the ?job= query parameter (e.g., /plugin/configuration?job=vacuum) so that a specific job type tab is pre-selected on page load. Changes: - Add initialJob parameter to Plugin() template and handler - Extract ?job= query param in renderPluginPage handler - Add buildPluginURL/updateURL helpers in JavaScript - Push history state on top-tab, sub-tab, and job-type clicks - Listen for popstate to restore tab state on back/forward - Replace initial history entry on page load via replaceState * make popstate handler async with proper error handling Await loadDescriptorAndConfig so data loading completes before rendering dependent views. Log errors instead of silently swallowing them. |
6 days ago

baae672b6f | feat: auto-disable master vacuum when plugin worker is active (#8624)
* feat: auto-disable master vacuum when plugin vacuum worker is active When a vacuum-capable plugin worker connects to the admin server, the admin server calls DisableVacuum on the master to prevent the automatic scheduled vacuum from conflicting with the plugin worker's vacuum. When the worker disconnects, EnableVacuum is called to restore the default behavior. A safety net in the topology refresh loop re-enables vacuum if the admin server disconnects without cleanup. * rename isAdminServerConnected to isAdminServerConnectedFunc * add 5s timeout to DisableVacuum/EnableVacuum gRPC calls Prevents the monitor goroutine from blocking indefinitely if the master is unresponsive. * track plugin ownership of vacuum disable to avoid overriding operator - Add vacuumDisabledByPlugin flag to Topology, set when DisableVacuum is called while admin server is connected (i.e., by plugin monitor) - Safety net only re-enables vacuum when it was disabled by plugin, not when an operator intentionally disabled it via shell command - EnableVacuum clears the plugin flag * extract syncVacuumState for testability, add fake toggler tests Extract the single sync step into syncVacuumState() with a vacuumToggler interface. Add TestSyncVacuumState with a fake toggler that verifies disable/enable calls on state transitions. * use atomic.Bool for isDisableVacuum and vacuumDisabledByPlugin Both fields are written by gRPC handlers and read by the vacuum goroutine, causing a data race. Use atomic.Bool with Store/Load for thread-safe access. * use explicit by_plugin field instead of connection heuristic Add by_plugin bool to DisableVacuumRequest proto so the caller declares intent explicitly. The admin server monitor sets it to true; shell commands leave it false. This prevents an operator's intentional disable from being auto-reversed by the safety net. * use setter for admin server callback instead of function parameter Move isAdminServerConnected from StartRefreshWritableVolumes parameter to Topology.SetAdminServerConnectedFunc() setter. Keeps the function signature stable and decouples the topology layer from the admin server concept. * suppress repeated log messages on persistent sync failures Add retrying parameter to syncVacuumState so the initial state transition is logged at V(0) but subsequent retries of the same transition are silent until the call succeeds. * clear plugin ownership flag on manual DisableVacuum Prevents stale plugin flag from causing incorrect auto-enable when an operator manually disables vacuum after a plugin had previously disabled it. * add by_plugin to EnableVacuumRequest for symmetric ownership tracking Plugin-driven EnableVacuum now only re-enables if the plugin was the one that disabled it. If an operator manually disabled vacuum after the plugin, the plugin's EnableVacuum is a no-op. This prevents the plugin monitor from overriding operator intent on worker disconnect. * use cancellable context for monitorVacuumWorker goroutine Replace context.Background() with a cancellable context stored as bgCancel on AdminServer. Shutdown() calls bgCancel() so monitorVacuumWorker exits cleanly via ctx.Done(). * track operator and plugin vacuum disables independently Replace single isDisableVacuum flag with two independent flags: vacuumDisabledByOperator and vacuumDisabledByPlugin. Each caller only flips its own flag. The effective disabled state is the OR of both. This prevents a plugin connect/disconnect cycle from overriding an operator's manual disable, and vice versa. 
* fix safety net to clear plugin flag, not operator flag The safety net should call EnableVacuumByPlugin() to clear only the plugin disable flag when the admin server disconnects. The previous call to EnableVacuum() incorrectly cleared the operator flag instead. |
6 days ago
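
The commit above extracts the enable/disable decision into a small, testable step driven by independently owned flags. A minimal sketch of that shape follows; the vacuumToggler and syncVacuumState names come from the commit message, but the surrounding fields and wiring are assumptions.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// vacuumToggler abstracts the master's enable/disable vacuum RPCs so the
// sync logic can be unit-tested with a fake.
type vacuumToggler interface {
	DisableVacuum() error
	EnableVacuum() error
}

type vacuumSync struct {
	workerConnected  atomic.Bool // set when a vacuum-capable plugin worker is connected
	disabledByPlugin atomic.Bool // tracks whether this monitor disabled master vacuum
}

// syncVacuumState performs one reconciliation step: disable master vacuum
// while a plugin vacuum worker is connected, re-enable it when the worker
// goes away, and only undo a disable that this monitor itself performed.
func (s *vacuumSync) syncVacuumState(t vacuumToggler) error {
	connected := s.workerConnected.Load()
	switch {
	case connected && !s.disabledByPlugin.Load():
		if err := t.DisableVacuum(); err != nil {
			return err
		}
		s.disabledByPlugin.Store(true)
	case !connected && s.disabledByPlugin.Load():
		if err := t.EnableVacuum(); err != nil {
			return err
		}
		s.disabledByPlugin.Store(false)
	}
	return nil
}

type fakeToggler struct{ disabled bool }

func (f *fakeToggler) DisableVacuum() error { f.disabled = true; return nil }
func (f *fakeToggler) EnableVacuum() error  { f.disabled = false; return nil }

func main() {
	s := &vacuumSync{}
	f := &fakeToggler{}
	s.workerConnected.Store(true)
	_ = s.syncVacuumState(f)
	fmt.Println("vacuum disabled:", f.disabled) // true while the worker is connected
	s.workerConnected.Store(false)
	_ = s.syncVacuumState(f)
	fmt.Println("vacuum disabled:", f.disabled) // false again after disconnect
}
```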

89ccb6d825 | use constants
6 days ago

f48725a31d | add more tests
6 days ago

8056b702ba | feat(balance): replica placement validation for volume moves (#8622)
* feat(balance): add replica placement validation for volume moves When the volume balance detection proposes moving a volume, validate that the move does not violate the volume's replication policy (e.g., ReplicaPlacement=010 requires replicas on different racks). If the preferred destination violates the policy, fall back to score-based planning; if that also violates, skip the volume entirely. - Add ReplicaLocation type and VolumeReplicaMap to ClusterInfo - Build replica map from all volumes before collection filtering - Port placement validation logic from command_volume_fix_replication.go - Thread replica map through collectVolumeMetrics call chain - Add IsGoodMove check in createBalanceTask before destination use * address PR review: extract validation closure, add defensive checks - Extract validateMove closure to eliminate duplicated ReplicaLocation construction and IsGoodMove calls - Add defensive check for empty replica map entries (len(replicas) == 0) - Add bounds check for int-to-byte cast on ExpectedReplicas (0-255) * address nitpick: rp test helper accepts *testing.T and fails on error Prevents silent failures from typos in replica placement codes. * address review: add composite replica placement tests (011, 110) Test multi-constraint placement policies where both rack and DC rules must be satisfied simultaneously. * address review: use struct keys instead of string concatenation Replace string-concatenated map keys with typed rackKey/nodeKey structs to eliminate allocations and avoid ambiguity if IDs contain spaces. * address review: simplify bounds check, log fallback error, guard source - Remove unreachable ExpectedReplicas < 0 branch (outer condition already guarantees > 0), fold bounds check into single condition - Log error from planBalanceDestination in replica validation fallback - Return false from IsGoodMove when sourceNodeID not found in existing replicas (inconsistent cluster state) * address review: use slices.Contains instead of hand-rolled helpers Replace isAmongDC and isAmongRack with slices.Contains from the standard library, reducing boilerplate. |
6 days ago

47ddf05d95 | feat(plugin): DC/rack/node filtering for volume balance (#8621)
* feat(plugin): add DC/rack/node filtering for volume balance detection Add scoping filters so balance detection can be limited to specific data centers, racks, or nodes. Filters are applied both at the metrics level (in the handler) and at the topology seeding level (in detection) to ensure only the targeted infrastructure participates in balancing. * address PR review: use set lookups, deduplicate test helpers, add target checks * address review: assert non-empty tasks in filter tests Prevent vacuous test passes by requiring len(tasks) > 0 before checking source/target exclusions. * address review: enforce filter scope in fallback, clarify DC filter - Thread allowedServers into createBalanceTask so the fallback planner cannot produce out-of-scope targets when DC/rack/node filters are active - Update data_center_filter description to clarify single-DC usage * address review: centralize parseCSVSet, fix filter scope leak, iterate all targets - Extract ParseCSVSet to shared weed/worker/tasks/util package, remove duplicates from detection.go and volume_balance_handler.go - Fix metric accumulation re-introducing filtered-out servers by only counting metrics for servers that passed DC/rack/node filters - Trim DataCenterFilter before matching to handle trailing spaces - Iterate all task.TypedParams.Targets in filter tests, not just [0] * remove useless descriptor string test |
6 days ago
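
The review feedback above centralizes a small CSV-to-set helper for the DC/rack/node filter fields. A sketch of what such a helper might look like; the real one lives in weed/worker/tasks/util and may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// parseCSVSet turns a comma-separated filter string such as "dc1, dc2"
// into a set for O(1) membership checks. An empty input yields an empty
// set, which callers can treat as "no filtering".
func parseCSVSet(csv string) map[string]struct{} {
	set := make(map[string]struct{})
	for _, part := range strings.Split(csv, ",") {
		if p := strings.TrimSpace(part); p != "" {
			set[p] = struct{}{}
		}
	}
	return set
}

func main() {
	allowed := parseCSVSet("dc1, dc2")
	_, ok := allowed["dc2"]
	fmt.Println(len(allowed), ok) // 2 true
}
```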

00ce1c6eba | feat(plugin): enhanced collection filtering for volume balance (#8620)
* feat(plugin): enhanced collection filtering for volume balance Replace wildcard matching with three collection filter modes: - ALL_COLLECTIONS (default): treat all volumes as one pool - EACH_COLLECTION: run detection separately per collection - Regex pattern: filter volumes by matching collection names The EACH_COLLECTION mode extracts distinct collections from metrics and calls Detection() per collection, sharing the maxResults budget and clusterInfo (with ActiveTopology) across all calls. * address PR review: fix wildcard→regexp replacement, optimize EACH_COLLECTION * address nitpick: fail fast on config errors (invalid regex) Add configError type so invalid collection_filter regex returns immediately instead of retrying across all masters with the same bad config. Transient errors still retry. * address review: constants, unbounded maxResults, wildcard compat - Define collectionFilterAll/collectionFilterEach constants to eliminate magic strings across handler and metrics code - Fix EACH_COLLECTION budget loop to treat maxResults <= 0 as unbounded, matching Detection's existing semantics - Treat "*" as ALL_COLLECTIONS for backward compat with wildcard * address review: nil guard in EACH_COLLECTION grouping loop * remove useless descriptor string test |
6 days ago

577a8459c9 | fix(mount): return dropped error (#8623)
6 days ago

34fe289f32 | feat(balance): add volume state filter (ALL/ACTIVE/FULL) (#8619)
* feat(balance): add volume state filter (ALL/ACTIVE/FULL) Add a volume_state admin config field to the plugin worker volume balance handler, matching the shell's -volumeBy flag. This allows filtering volumes by state before balance detection: - ALL (default): consider all volumes - ACTIVE: only writable volumes below the size limit (FullnessRatio < 1.01) - FULL: only read-only volumes above the size limit (FullnessRatio >= 1.01) The 1.01 threshold mirrors the shell's thresholdVolumeSize constant. * address PR review: use enum/select widget, switch-based filter, nil safety - Change volume_state field from string/text to enum/select with dropdown options (ALL, ACTIVE, FULL) - Refactor filterMetricsByVolumeState to use switch with predicate function for clearer extensibility - Add nil-check guard to prevent panic on nil metric elements - Add TestFilterMetricsByVolumeState_NilElement regression test |
6 days ago
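
A compact illustration of the switch-based, predicate-driven filter with the 1.01 fullness threshold and nil guard that the commit describes; the metric type here is a trimmed stand-in, not the real structure.

```go
package main

import "fmt"

// volumeMetric is a trimmed stand-in for the real metric type; only the
// fullness ratio matters for this filter.
type volumeMetric struct {
	ID            uint32
	FullnessRatio float64
}

const thresholdFullness = 1.01 // mirrors the shell's size-limit threshold

// filterMetricsByVolumeState keeps only volumes matching the requested
// state: ALL passes everything through, ACTIVE keeps writable volumes
// below the size limit, FULL keeps read-only volumes at or above it.
// Nil entries are skipped defensively.
func filterMetricsByVolumeState(metrics []*volumeMetric, state string) []*volumeMetric {
	var keep func(*volumeMetric) bool
	switch state {
	case "ACTIVE":
		keep = func(m *volumeMetric) bool { return m.FullnessRatio < thresholdFullness }
	case "FULL":
		keep = func(m *volumeMetric) bool { return m.FullnessRatio >= thresholdFullness }
	default: // "ALL" or unset
		return metrics
	}
	out := make([]*volumeMetric, 0, len(metrics))
	for _, m := range metrics {
		if m == nil {
			continue
		}
		if keep(m) {
			out = append(out, m)
		}
	}
	return out
}

func main() {
	metrics := []*volumeMetric{{ID: 1, FullnessRatio: 0.4}, nil, {ID: 2, FullnessRatio: 1.2}}
	fmt.Println(len(filterMetricsByVolumeState(metrics, "ACTIVE"))) // 1
	fmt.Println(len(filterMetricsByVolumeState(metrics, "FULL")))   // 1
}
```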

f3c5ba3cd6 | feat(filer): add lazy directory listing for remote mounts (#8615)
* feat(filer): add lazy directory listing for remote mounts Directory listings on remote mounts previously only queried the local filer store. With lazy mounts the listing was empty; with eager mounts it went stale over time. Add on-demand directory listing that fetches from remote and caches results with a 5-minute TTL: - Add `ListDirectory` to `RemoteStorageClient` interface (delimiter-based, single-level listing, separate from recursive `Traverse`) - Implement in S3, GCS, and Azure backends using each platform's hierarchical listing API - Add `maybeLazyListFromRemote` to filer: before each directory listing, check if the directory is under a remote mount with an expired cache, fetch from remote, persist entries to the local store, then let existing listing logic run on the populated store - Use singleflight to deduplicate concurrent requests for the same directory - Skip local-only entries (no RemoteEntry) to avoid overwriting unsynced uploads - Errors are logged and swallowed (availability over consistency) * refactor: extract xattr key to constant xattrRemoteListingSyncedAt * feat: make listing cache TTL configurable per mount via listing_cache_ttl_seconds Add listing_cache_ttl_seconds field to RemoteStorageLocation protobuf. When 0 (default), lazy directory listing is disabled for that mount. When >0, enables on-demand directory listing with the specified TTL. Expose as -listingCacheTTL flag on remote.mount command. * refactor: address review feedback for lazy directory listing - Add context.Context to ListDirectory interface and all implementations - Capture startTime before remote call for accurate TTL tracking - Simplify S3 ListDirectory using ListObjectsV2PagesWithContext - Make maybeLazyListFromRemote return void (errors always swallowed) - Remove redundant trailing-slash path manipulation in caller - Update tests to match new signatures * When an existing entry has Remote != nil, we should merge remote metadata into it rather than replacing it. * fix(gcs): wrap ListDirectory iterator error with context The raw iterator error was returned without bucket/path context, making it harder to debug. Wrap it consistently with the S3 pattern. * fix(s3): guard against nil pointer dereference in Traverse and ListDirectory Some S3-compatible backends may return nil for LastModified, Size, or ETag fields. Check for nil before dereferencing to prevent panics. * fix(filer): remove blanket 2-minute timeout from lazy listing context Individual SDK operations (S3, GCS, Azure) already have per-request timeouts and retry policies. The blanket timeout could cut off large directory listings mid-operation even though individual pages were succeeding. * fix(filer): preserve trace context in lazy listing with WithoutCancel Use context.WithoutCancel(ctx) instead of context.Background() so trace/span values from the incoming request are retained for distributed tracing, while still decoupling cancellation. 
* fix(filer): use Store.FindEntry for internal lookups, add Uid/Gid to files, fix updateDirectoryListingSyncedAt - Use f.Store.FindEntry instead of f.FindEntry for staleness check and child lookups to avoid unnecessary lazy-fetch overhead - Set OS_UID/OS_GID on new file entries for consistency with directories - In updateDirectoryListingSyncedAt, use Store.UpdateEntry for existing directories instead of CreateEntry to avoid deleteChunksIfNotNew and NotifyUpdateEvent side effects * fix(filer): distinguish not-found from store errors in lazy listing Previously, any error from Store.FindEntry was treated as "not found," which could cause entry recreation/overwrite on transient DB failures. Now check for filer_pb.ErrNotFound explicitly and skip entries or bail out on real store errors. * refactor(filer): use errors.Is for ErrNotFound comparisons |
7 days ago
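
Two of the mechanisms above, the per-directory TTL and singleflight deduplication of concurrent refreshes, can be sketched independently of the filer. The names and the fetch callback below are illustrative; the real implementation persists entries to the filer store and merges remote metadata rather than returning strings.

```go
package main

import (
	"fmt"
	"sync"
	"time"

	"golang.org/x/sync/singleflight"
)

// lazyLister refreshes a directory from the remote backend at most once
// per TTL, and collapses concurrent refreshes of the same directory into
// a single remote call via singleflight.
type lazyLister struct {
	ttl         time.Duration
	group       singleflight.Group
	mu          sync.Mutex
	lastSynced  map[string]time.Time
	fetchRemote func(dir string) ([]string, error) // remote ListDirectory stand-in
}

func (l *lazyLister) maybeListFromRemote(dir string) error {
	l.mu.Lock()
	fresh := time.Since(l.lastSynced[dir]) < l.ttl
	l.mu.Unlock()
	if fresh {
		return nil // cached listing is still valid
	}
	// Concurrent callers for the same directory share one remote fetch.
	_, err, _ := l.group.Do(dir, func() (interface{}, error) {
		start := time.Now() // capture before the remote call for accurate TTL tracking
		entries, err := l.fetchRemote(dir)
		if err != nil {
			return nil, err
		}
		_ = entries // ... persist entries to the local store here ...
		l.mu.Lock()
		l.lastSynced[dir] = start
		l.mu.Unlock()
		return nil, nil
	})
	return err
}

func main() {
	l := &lazyLister{
		ttl:        5 * time.Minute,
		lastSynced: map[string]time.Time{},
		fetchRemote: func(dir string) ([]string, error) {
			fmt.Println("remote list:", dir)
			return []string{"a.txt", "b.txt"}, nil
		},
	}
	_ = l.maybeListFromRemote("/buckets/data")
	_ = l.maybeListFromRemote("/buckets/data") // served from cache, no remote call
}
```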

a6774f0e01 | add git commit hash on admin ui
7 days ago

0e570d6a8f | feat(remote.mount): add -metadataStrategy flag to control metadata caching (#8568)
* feat(remote): add -noSync flag to skip upfront metadata pull on mount Made-with: Cursor * refactor(remote): split mount setup from metadata sync Extract ensureMountDirectory for create/validate; call pullMetadata directly when sync is needed. Caller controls sync step for -noSync. Made-with: Cursor * fix(remote): validate mount root when -noSync so bad bucket/creds fail fast When -noSync is used, perform a cheap remote check (ListBuckets and verify bucket exists) instead of skipping all remote I/O. Invalid buckets or credentials now fail at mount time. Made-with: Cursor * test(remote): add TestRemoteMountNoSync for -noSync mount and persisted mapping Made-with: Cursor * test(remote): assert no upfront metadata after -noSync mount After remote.mount -noSync, run fs.ls on the mount dir and assert empty listing so the test fails if pullMetadata was invoked eagerly. Made-with: Cursor * fix(remote): propagate non-ErrNotFound lookup errors in ensureMountDirectory Return lookupErr immediately for any LookupDirectoryEntry failure that is not filer_pb.ErrNotFound, so only the not-found case creates the entry and other lookup failures are reported to the caller. Made-with: Cursor * fix(remote): use errors.Is for ErrNotFound in ensureMountDirectory Replace fragile strings.Contains(lookupErr.Error(), ...) with errors.Is(lookupErr, filer_pb.ErrNotFound) before calling CreateEntry. Made-with: Cursor * fix(remote): use LookupEntry so ErrNotFound is recognised after gRPC Raw gRPC LookupDirectoryEntry returns a status error, not the sentinel, so errors.Is(lookupErr, filer_pb.ErrNotFound) was always false. Use filer_pb.LookupEntry which normalises not-found to ErrNotFound so the mount directory is created when missing. Made-with: Cursor * test(remote): ignore weed shell banner in TestRemoteMountNoSync fs.ls count Exclude master/filer and prompt lines from entry count so the assertion checks only actual fs.ls output for empty -noSync mount. Made-with: Cursor * fix(remote.mount): use 0755 for mount dir, document bucket-less early return Made-with: Cursor * feat(remote.mount): replace -noSync with -metadataStrategy=lazy|eager - Add -metadataStrategy flag (eager default, lazy skips upfront metadata pull) - Accept lazy/eager case-insensitively; reject invalid values with clear error - Rename TestRemoteMountNoSync to TestRemoteMountMetadataStrategyLazy - Add TestRemoteMountMetadataStrategyEager and TestRemoteMountMetadataStrategyInvalid Made-with: Cursor * fix(remote.mount): validate strategy and remote before creating mount directory Move strategy validation and validateMountRoot (lazy path) before ensureMountDirectory so that invalid strategies or bad bucket/credentials fail without leaving orphaned directory entries in the filer. * refactor(remote.mount): remove unused remote param from ensureMountDirectory The remote *RemoteStorageLocation parameter was left over from the old syncMetadata signature. Only remoteConf.Name is used inside the function. * doc(remote.mount): add TODO for HeadBucket-style validation validateMountRoot currently lists all buckets to verify one exists. Note the need for a targeted BucketExists method in the interface. * refactor(remote.mount): use MetadataStrategy type and constants Replace raw string comparisons with a MetadataStrategy type and MetadataStrategyEager/MetadataStrategyLazy constants for clarity and compile-time safety. 
* refactor(remote.mount): rename MetadataStrategy to MetadataCacheStrategy More precisely describes the purpose: controlling how metadata is cached from the remote, not metadata handling in general. * fix(remote.mount): remove validateMountRoot from lazy path Lazy mount's purpose is to skip remote I/O. Validating via ListBuckets contradicts that, especially on accounts with many buckets. Invalid buckets or credentials will surface on first lazy access instead. * fix(test): handle shell exit 0 in TestRemoteMountMetadataStrategyInvalid The weed shell process exits with code 0 even when individual commands fail — errors appear in stdout. Check output instead of requiring a non-nil error. * test(remote.mount): remove metadataStrategy shell integration tests These tests only verify string output from a shell process that always exits 0 — they cannot meaningfully validate eager vs lazy behavior without a real remote backend. --------- Co-authored-by: Chris Lu <chris.lu@gmail.com> |
1 week ago

146a090754 | filer: propagate lazy metadata deletes to remote mounts (#8522)
* filer: propagate lazy metadata deletes to remote mounts Delete operations now call the remote backend for mounted remote-only entries before removing filer metadata, keeping remote state aligned and preserving retry semantics on remote failures. Made-with: Cursor * filer: harden remote delete metadata recovery Persist remote-delete metadata pendings so local entry removal can be retried after failures, and return explicit errors when remote client resolution fails to prevent silent local-only deletes. Made-with: Cursor * filer: streamline remote delete client lookup and logging Avoid a redundant mount trie traversal by resolving the remote client directly from the matched mount location, and add parity logging for successful remote directory deletions. Made-with: Cursor * filer: harden pending remote metadata deletion flow Retry pending-marker writes before local delete, fail closed when marking cannot be persisted, and start remote pending reconciliation only after the filer store is initialised to avoid nil store access. Made-with: Cursor * filer: avoid lazy fetch in pending metadata reconciliation Use a local-only entry lookup during pending remote metadata reconciliation so cache misses do not trigger remote lazy fetches. Made-with: Cursor * filer: serialise concurrent index read-modify-write in pending metadata deletion Add remoteMetadataDeletionIndexMu to Filer and acquire it for the full read→mutate→commit sequence in markRemoteMetadataDeletionPending and clearRemoteMetadataDeletionPending, preventing concurrent goroutines from overwriting each other's index updates. Made-with: Cursor * filer: start remote deletion reconciliation loop in NewFiler Move the background goroutine for pending remote metadata deletion reconciliation from SetStore (where it was gated by sync.Once) to NewFiler alongside the existing loopProcessingDeletion goroutine. The sync.Once approach was problematic: it buried a goroutine launch as a side effect of a setter, was unrecoverable if the goroutine panicked, could race with store initialisation, and coupled its lifecycle to unrelated shutdown machinery. The existing nil-store guard in reconcilePendingRemoteMetadataDeletions handles the window before SetStore is called. * filer: skip remote delete for replicated deletes from other filers When isFromOtherCluster is true the delete was already propagated to the remote backend by the originating filer. Repeating the remote delete on every replica doubles API calls, and a transient remote failure on the replica would block local metadata cleanup — leaving filers inconsistent. * filer: skip pending marking for directory remote deletes Directory remote deletes are idempotent and do not need the pending/reconcile machinery that was designed for file deletes where the local metadata delete might fail after the remote object is already removed. * filer: propagate remote deletes for children in recursive folder deletion doBatchDeleteFolderMetaAndData iterated child files but only called NotifyUpdateEvent and collected chunks — it never called maybeDeleteFromRemote for individual children. This left orphaned objects in the remote backend when a directory containing remote-only files was recursively deleted. Also fix isFromOtherCluster being hardcoded to false in the recursive call to doBatchDeleteFolderMetaAndData for subdirectories. 
* filer: simplify pending remote deletion tracking to single index key Replace the double-bookkeeping scheme (individual KV entry per path + newline-delimited index key) with a single index key that stores paths directly. This removes the per-path KV writes/deletes, the base64 encoding round-trip, and the transaction overhead that was only needed to keep the two representations in sync. * filer: address review feedback on remote deletion flow - Distinguish missing remote config from client initialization failure in maybeDeleteFromRemote error messages. - Use a detached context (30s timeout) for pending-mark and pending-clear KV writes so they survive request cancellation after the remote object has already been deleted. - Emit NotifyUpdateEvent in reconcilePendingRemoteMetadataDeletions after a successful retry deletion so downstream watchers and replicas learn about the eventual metadata removal. * filer: remove background reconciliation for pending remote deletions The pending-mark/reconciliation machinery (KV index, mutex, background loop, detached contexts) handled the narrow case where the remote object was deleted but the subsequent local metadata delete failed. The client already receives the error and can retry — on retry the remote not-found is treated as success and the local delete proceeds normally. The added complexity (and new edge cases around NotifyUpdateEvent, multi-filer consistency during reconciliation, and context lifetime) is not justified for a transient store failure the caller already handles. Remove: loopProcessingRemoteMetadataDeletionPending, reconcilePendingRemoteMetadataDeletions, markRemoteMetadataDeletionPending, clearRemoteMetadataDeletionPending, listPendingRemoteMetadataDeletionPaths, encodePendingRemoteMetadataDeletionIndex, FindEntryLocal, and all associated constants, fields, and test infrastructure. * filer: fix test stubs and add early exit on child remote delete error - Refactor stubFilerStore to release lock before invoking callbacks and propagate callback errors, preventing potential deadlocks in tests - Implement ListDirectoryPrefixedEntries with proper prefix filtering instead of delegating to the unfiltered ListDirectoryEntries - Add continue after setting err on child remote delete failure in doBatchDeleteFolderMetaAndData to skip further processing of the failed entry * filer: propagate child remote delete error instead of silently continuing Replace `continue` with early `break` when maybeDeleteFromRemote fails for a child entry during recursive folder deletion. The previous `continue` skipped the error check at the end of the loop body, so a subsequent successful entry would overwrite err and the remote delete error was silently lost. Now the loop breaks, the existing error check returns the error, and NotifyUpdateEvent / chunk collection are correctly skipped for the failed entry. * filer: delete remote file when entry has Remote pointer, not only when remote-only Replace IsInRemoteOnly() guard with entry.Remote == nil check in maybeDeleteFromRemote. IsInRemoteOnly() requires zero local chunks and RemoteSize > 0, which incorrectly skips remote deletion for cached files (local chunks exist) and zero-byte remote objects (RemoteSize 0). The correct condition is whether the entry has a remote backing object at all. --------- Co-authored-by: Chris Lu <chris.lu@gmail.com> |
1 week ago

92a76fc1a2 | fix(filer): limit concurrent proxy reads per volume server (#8608)
* fix(filer): limit concurrent proxy reads per volume server Add a per-volume-server semaphore (default 16) to proxyToVolumeServer to prevent replication bursts from overwhelming individual volume servers with hundreds of concurrent connections, which causes them to drop connections with "unexpected EOF". Excess requests queue up and respect the client's context, returning 503 if the client disconnects while waiting. Also log io.CopyBuffer errors that were previously silently discarded. * Apply suggestion from @gemini-code-assist[bot] Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * fix(filer): use non-blocking release for proxy semaphore Prevents a goroutine from blocking forever if releaseProxySemaphore is ever called without a matching acquire. * test(filer): clean up proxySemaphores entries in all proxy tests --------- Co-authored-by: Copilot <copilot@github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> |
1 week ago
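
The per-volume-server limit described above amounts to a counting semaphore whose acquire respects the client's context and whose release never blocks. A self-contained sketch follows, with the default of 16 slots replaced by a small number so the 503 path is visible; the type and method names are assumptions.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// proxyLimiter bounds concurrent proxy reads per volume server using a
// buffered channel as a counting semaphore. Waiting respects the client's
// context, so a disconnected client gives up its slot request.
type proxyLimiter struct {
	mu    sync.Mutex
	slots map[string]chan struct{}
	limit int // e.g. 16 concurrent reads per volume server
}

func (p *proxyLimiter) semaphore(server string) chan struct{} {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.slots[server] == nil {
		p.slots[server] = make(chan struct{}, p.limit)
	}
	return p.slots[server]
}

// acquire blocks until a slot is free or the caller's context is done.
func (p *proxyLimiter) acquire(ctx context.Context, server string) error {
	select {
	case p.semaphore(server) <- struct{}{}:
		return nil
	case <-ctx.Done():
		return ctx.Err() // caller maps this to 503
	}
}

// release is non-blocking so an unmatched release can never hang a goroutine.
func (p *proxyLimiter) release(server string) {
	select {
	case <-p.semaphore(server):
	default:
	}
}

func main() {
	p := &proxyLimiter{slots: map[string]chan struct{}{}, limit: 2}
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	for i := 0; i < 3; i++ {
		if err := p.acquire(ctx, "vol1:8080"); err != nil {
			fmt.Println("would return 503:", err)
			continue
		}
		fmt.Println("proxying request", i)
	}
	p.release("vol1:8080")
}
```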

b665c329bc | fix(replication): resume partial chunk reads on EOF instead of re-downloading (#8607)
* fix(replication): resume partial chunk reads on EOF instead of re-downloading When replicating chunks and the source connection drops mid-transfer, accumulate the bytes already received and retry with a Range header to fetch only the remaining bytes. This avoids re-downloading potentially large chunks from scratch on each retry, reducing load on busy source servers and speeding up recovery. * test(replication): add tests for downloadWithRange including gzip partial reads Tests cover: - No offset (no Range header sent) - With offset (Range header verified) - Content-Disposition filename extraction - Partial read + resume: server drops connection mid-transfer, client resumes with Range from the offset of received bytes - Gzip partial read + resume: first response is gzip-encoded (Go auto- decompresses), connection drops, resume request gets decompressed data (Go doesn't add Accept-Encoding when Range is set, so the server decompresses), combined bytes match original * fix(replication): address PR review comments - Consolidate downloadWithRange into DownloadFile with optional offset parameter (variadic), eliminating code duplication (DRY) - Validate HTTP response status: require 206 + correct Content-Range when offset > 0, reject when server ignores Range header - Use if/else for fullData assignment for clarity - Add test for rejected Range (server returns 200 instead of 206) * refactor(replication): remove unused ReplicationSource interface The interface was never referenced and its signature didn't match the actual FilerSource.ReadPart method. --------- Co-authored-by: Copilot <copilot@github.com> |
1 week ago
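
The resume-on-EOF behavior boils down to retrying with a Range header that starts at the number of bytes already received, and insisting on a 206 response whenever an offset is used. A simplified sketch under those assumptions; the real DownloadFile also validates Content-Range, extracts filenames, and copes with gzip.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

// downloadResumable fetches a URL and, if the connection drops mid-body,
// retries with a Range header starting at the bytes already received, so
// a large chunk is not re-downloaded from scratch.
func downloadResumable(url string, maxRetries int) ([]byte, error) {
	var data []byte
	for attempt := 0; attempt <= maxRetries; attempt++ {
		req, err := http.NewRequest(http.MethodGet, url, nil)
		if err != nil {
			return nil, err
		}
		if len(data) > 0 {
			req.Header.Set("Range", fmt.Sprintf("bytes=%d-", len(data)))
		}
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			return nil, err
		}
		if len(data) == 0 && resp.StatusCode != http.StatusOK {
			resp.Body.Close()
			return nil, fmt.Errorf("unexpected status: %s", resp.Status)
		}
		if len(data) > 0 && resp.StatusCode != http.StatusPartialContent {
			resp.Body.Close()
			return nil, fmt.Errorf("server ignored Range header: %s", resp.Status)
		}
		part, readErr := io.ReadAll(resp.Body)
		resp.Body.Close()
		data = append(data, part...)
		if readErr == nil {
			return data, nil // full body received
		}
		// Unexpected EOF mid-transfer: loop and resume from len(data).
	}
	return nil, fmt.Errorf("download did not complete after %d retries", maxRetries)
}

func main() {
	if data, err := downloadResumable("http://example.com/", 2); err == nil {
		fmt.Println("downloaded", len(data), "bytes")
	}
}
```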

e4a77b8b16 | feat(admin): support env var and security.toml for credentials (#8606)
* feat(security): add [admin] section to security.toml scaffold Add admin credential fields (user, password, readonly.user, readonly.password) to security.toml. Via viper's WEED_ env prefix and AutomaticEnv(), these are automatically overridable as WEED_ADMIN_USER, WEED_ADMIN_PASSWORD, etc. Ref: https://github.com/seaweedfs/seaweedfs/discussions/8586 * feat(admin): support env var and security.toml fallbacks for credentials Add applyViperFallback() to read admin credentials from security.toml / WEED_* environment variables when CLI flags are not explicitly set. This allows systems like NixOS to pass secrets via env vars instead of CLI flags, which appear in process listings. Precedence: CLI flag > env var / security.toml > default value. Also change -adminUser default from "admin" to "" so that credentials are fully opt-in. Ref: https://github.com/seaweedfs/seaweedfs/discussions/8586 * feat(helm): use WEED_ env vars for admin credentials instead of CLI flags Rename SEAWEEDFS_ADMIN_USER/PASSWORD to WEED_ADMIN_USER/PASSWORD so viper picks them up natively. Remove -adminUser/-adminPassword shell expansion from command args since the Go binary now reads these directly via viper. * docs(admin): document env var and security.toml credential support Add environment variable mapping table, security.toml example, and precedence rules to the admin README. * style(security): use nested [admin.readonly] table in security.toml Use a nested TOML table instead of dotted keys for the readonly credentials. More idiomatic and easier to read; no change in how Viper parses it. * fix(admin): use util.GetViper() for env var support and fix README example applyViperFallback() was using viper.GetString() directly, which bypasses the WEED_ env prefix and AutomaticEnv setup that only happens in util.GetViper(). Switch to util.GetViper().GetString() so WEED_ADMIN_* environment variables are actually picked up. Also fix the README example to include WEED_ADMIN_USER alongside WEED_ADMIN_PASSWORD, since runAdmin() rejects an empty username when a password is set. * fix(admin): restore default adminUser to "admin" Defaulting adminUser to "" broke the common flow of setting only WEED_ADMIN_PASSWORD — runAdmin() rejects an empty username when a password is set. Restore "admin" as the default so that setting only the password works out of the box. * docs(admin): align README security.toml example with scaffold format Use nested [admin.readonly] table instead of flat dotted keys to match the format in weed/command/scaffold/security.toml. * docs(admin): remove README.md in favor of wiki page Admin documentation lives at the wiki (Admin-UI.md). Remove the in-repo README to avoid maintaining duplicate docs. --------- Co-authored-by: Copilot <copilot@github.com> |
1 week ago
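
The precedence rule above (explicit CLI flag, then WEED_* environment variable or security.toml, then built-in default) can be expressed with flag.Visit plus viper. This is a simplified stand-in for SeaweedFS's util.GetViper() setup, not the actual code.

```go
package main

import (
	"flag"
	"fmt"
	"strings"

	"github.com/spf13/viper"
)

// applyViperFallback fills a credential only when the CLI flag was left at
// its default, so precedence is: explicit CLI flag > WEED_* environment
// variable or security.toml > built-in default.
func applyViperFallback(explicitlySet map[string]bool, flagName, key string, target *string) {
	if explicitlySet[flagName] {
		return // operator passed the flag; leave it alone
	}
	if v := viper.GetString(key); v != "" {
		*target = v
	}
}

func main() {
	adminUser := flag.String("adminUser", "admin", "admin UI username")
	adminPassword := flag.String("adminPassword", "", "admin UI password")
	flag.Parse()

	// Track which flags the operator actually set on the command line.
	explicitlySet := map[string]bool{}
	flag.Visit(func(f *flag.Flag) { explicitlySet[f.Name] = true })

	// With this setup, the key "admin.password" maps to WEED_ADMIN_PASSWORD.
	viper.SetEnvPrefix("weed")
	viper.SetEnvKeyReplacer(strings.NewReplacer(".", "_"))
	viper.AutomaticEnv()

	applyViperFallback(explicitlySet, "adminUser", "admin.user", adminUser)
	applyViperFallback(explicitlySet, "adminPassword", "admin.password", adminPassword)
	fmt.Println("user:", *adminUser, "password set:", *adminPassword != "")
}
```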

013362d2d3 | fix(shell): show planned size in fs.mergeVolumes log to clarify size limit check (#8553)
The log message was comparing against the planned size of the destination volume (including volumes already planned to merge into it) but only displaying the raw volume size, making the output confusing when the displayed sizes clearly didn't add up to exceed the limit.
1 week ago

8ac4caf930 | fix(s3api): return no-encryption instead of error when bucket metadata is missing
When getEncryptionConfiguration encounters a not-found error (e.g., during bucket recreation after a partial delete), return ErrNoSuchBucketEncryptionConfiguration instead of ErrInternalError. This prevents uploads from failing with 500 errors during recovery.
1 week ago

ab85f46529 | fix(s3api): clear negative cache in autoCreateBucket when bucket exists
When autoCreateBucket finds the bucket already exists, remove it from the negative cache so subsequent requests don't unnecessarily trigger another auto-create attempt.
1 week ago

5208c7c727 | fix(s3api): improve PutBucketHandler comment for orphaned collection recovery
Clarify the comment and log message for the case where a collection exists but the bucket directory is missing, explaining the root cause (partial deletion) more precisely.
1 week ago

12b360f499 | fix(s3api): delete bucket directory before collection to prevent inconsistent state
Reorder DeleteBucketHandler to remove the bucket directory first, then delete the collection. If collection deletion fails, the bucket is still effectively deleted and can be recreated. Previously, if directory deletion succeeded but collection deletion failed, the bucket was left in an unrecoverable state.
1 week ago

d1a631123f | fix(s3api): allow bucket recreation when orphaned collection exists (#8605)
* fix(s3api): allow bucket recreation when orphaned collection exists (#8601) When a bucket is deleted, its filer directory is removed but the underlying collection/volumes may not be fully cleaned up yet. If the bucket is immediately recreated, PutBucketHandler was returning ErrBucketAlreadyExists due to the orphaned collection, blocking bucket recreation and causing subsequent uploads to fail with InternalError. Allow bucket creation to proceed when a collection exists without a corresponding bucket directory, since this is a transient orphaned state from a previous deletion. * fix(s3api): handle concurrent bucket creation race in mkdir On mkdir failure, re-check whether the bucket directory now exists and return BucketAlreadyExists instead of InternalError when another request created the bucket concurrently. |
1 week ago

b799650357 | fix(shell): set LastLocalSyncTsNs in remote.copy.local so remote.uncache works (#8604)
remote.uncache checks LastLocalSyncTsNs to determine if a file has been synced to remote. remote.copy.local was not setting this field, leaving it at 0, which caused uncache to skip all files uploaded via remote.copy.local. Fixes #8602
1 week ago

2ff4a07544 | Reduce task logger glog noise and remove per-write fsync (#8603)
* Reduce task logger noise: stop duplicating every log entry to glog and stderr Every task log entry was being tripled: written to the task log file, forwarded to glog (which writes to /tmp by default with no rotation), and echoed to stderr. This caused glog files to fill /tmp on long-running workers. - Remove INFO/DEBUG forwarding to glog (only ERROR/WARNING remain) - Remove stderr echo of every log line - Remove fsync on every single log write (unnecessary for log files) * Fix glog call depth for correct source file attribution The call stack is: caller → Error() → log() → writeLogEntry() → glog.ErrorDepth(), so depth=4 is needed for glog to report the original caller's file and line number. |
1 week ago

4a5243886a | 4.17
1 week ago

e1e4c9437a | fix(s3api): ListObjects with trailing-slash prefix matches sibling directories (#8599)
fix(s3api): ListObjects with trailing-slash prefix returns wrong results When ListObjectsV2 is called with a prefix ending in "/" (e.g., "foo/"), normalizePrefixMarker strips the trailing slash and splits into dir="parent" and prefix="foo". The filer then lists entries matching prefix "foo", which returns both directory "foo" and "foo1000". The prefixEndsOnDelimiter guard correctly identifies directory "foo" as the target and recurses into it, but then resets the guard to false. The loop continues and incorrectly recurses into "foo1000" as well, causing the listing to return objects from unrelated directories. Fix: after recursing into the exact directory targeted by the trailing-slash prefix, return immediately from the listing loop. There is no reason to process sibling entries since the original prefix specifically targeted one directory. |
1 week ago

f950a941e3 | Fix trust policy validation for specific AWS user principals (#8597)
* Add tests for AWS user principal in AssumeRole trust policies Add test cases that verify trust policy validation when using specific AWS user principals (e.g., "arn:aws:iam::000000000000:user/backend") in the Principal field of trust policies for AssumeRole. Covers single user, multiple users (array), wildcard, and plain string principal formats. These tests demonstrate the bug reported in #8588 where specific user principals always fail validation. * Populate RequestContext in ValidateTrustPolicyForPrincipal ValidateTrustPolicyForPrincipal was creating an EvaluationContext with a nil RequestContext. The policy engine's principal matching logic looks up "aws:PrincipalArn" in RequestContext for non-wildcard principals, so specific user ARNs like "arn:aws:iam::000000000000:user/backend" always failed to match, while wildcard "*" worked because it short-circuits before the lookup. Populate RequestContext with both "principal" and "aws:PrincipalArn" keys, consistent with how IsActionAllowed already does it. Fixes #8588 * Remove GitHub discussion URL from source code comments * Add specific error message assertions in trust policy tests |
1 week ago
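
The fix hinges on the policy engine looking up "aws:PrincipalArn" in the request context for non-wildcard principals. A toy illustration of building such a context follows; the struct and key layout are assumptions, not the real IAM types.

```go
package main

import "fmt"

// evaluationContext is a trimmed stand-in for the policy engine's
// evaluation context; only the fields relevant to the fix are shown.
type evaluationContext struct {
	Principal      string
	Action         string
	RequestContext map[string]interface{}
}

// buildTrustPolicyContext mirrors the fix: non-wildcard principal matching
// looks up "aws:PrincipalArn" in RequestContext, so it must be populated
// alongside the plain "principal" key or specific user ARNs never match.
func buildTrustPolicyContext(principalArn string) evaluationContext {
	return evaluationContext{
		Principal: principalArn,
		Action:    "sts:AssumeRole",
		RequestContext: map[string]interface{}{
			"principal":        principalArn,
			"aws:PrincipalArn": principalArn,
		},
	}
}

func main() {
	ctx := buildTrustPolicyContext("arn:aws:iam::000000000000:user/backend")
	fmt.Println(ctx.RequestContext["aws:PrincipalArn"])
}
```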

ac579c1746 | Fix plugin configuration tab layout overflow (#8596)
Fix plugin configuration tab layout overflow (#8587) Remove h-100 from Job Scheduling Settings card, which caused it to stretch to 100% of the row height and push the Next Run card below the row boundary, overflowing into the Detection Results section.
1 week ago

0a5c5ed4ce | Persist S3 bucket counter metrics across idle periods (#8595)
* Stop deleting counter metrics during bucket TTL cleanup Counter metrics (traffic bytes, request counts, object counts) are monotonically increasing by design. Deleting them after 10 minutes of bucket inactivity causes them to vanish from /metrics output and reset to zero when traffic resumes, breaking Prometheus rate()/increase() queries and making historical traffic reporting impossible. Only delete gauges and histograms in the TTL cleanup loop, as these represent current state and are safely re-populated on next activity. Fixes https://github.com/seaweedfs/seaweedfs/issues/8521 * Clean up all bucket metrics on bucket deletion Add DeleteBucketMetrics() to delete all metrics (including counters) for a bucket when it is explicitly deleted. This prevents unbounded label cardinality from accumulating for buckets that no longer exist. Called from DeleteBucketHandler after successful bucket deletion. * Reduce mutex scope in bucket metrics TTL sweep Collect expired bucket names under the lock, then release before calling DeletePartialMatch on Prometheus metrics. This prevents RecordBucketActiveTime from blocking during the expensive cleanup. |
1 week ago
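
Two of the points above, sweeping only gauges after the idle TTL and collecting expired bucket names before releasing the lock, can be shown with the Prometheus client's DeletePartialMatch. Metric and field names below are illustrative, not the actual SeaweedFS stats code.

```go
package main

import (
	"fmt"
	"sync"
	"time"

	"github.com/prometheus/client_golang/prometheus"
)

// Gauges represent current state and are cheap to repopulate, so only they
// (and histograms) are swept after the idle TTL; counters are left alone so
// Prometheus rate()/increase() queries keep working.
var bucketObjectGauge = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{Name: "s3_bucket_objects", Help: "objects per bucket"},
	[]string{"bucket"},
)

type bucketTracker struct {
	mu       sync.Mutex
	lastSeen map[string]time.Time
}

// sweepIdleBuckets collects expired bucket names while holding the lock,
// then releases it before the relatively expensive DeletePartialMatch
// calls, so hot-path activity recording is never blocked by the sweep.
func (t *bucketTracker) sweepIdleBuckets(ttl time.Duration) {
	t.mu.Lock()
	var expired []string
	for bucket, seen := range t.lastSeen {
		if time.Since(seen) > ttl {
			expired = append(expired, bucket)
			delete(t.lastSeen, bucket)
		}
	}
	t.mu.Unlock()

	for _, bucket := range expired {
		bucketObjectGauge.DeletePartialMatch(prometheus.Labels{"bucket": bucket})
	}
}

func main() {
	t := &bucketTracker{lastSeen: map[string]time.Time{
		"old-bucket": time.Now().Add(-20 * time.Minute),
		"hot-bucket": time.Now(),
	}}
	t.sweepIdleBuckets(10 * time.Minute)
	fmt.Println("tracked buckets left:", len(t.lastSeen)) // 1
}
```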

0a2dac1e56 | Reduce mutex scope in bucket metrics TTL sweep
Collect expired bucket names under the lock, then release before calling DeletePartialMatch on Prometheus metrics. This prevents RecordBucketActiveTime from blocking during the expensive cleanup.
1 week ago

737116e83c | fix port probing
1 week ago

07f3f5eec5 | remove worker links
1 week ago

47cad59c70 | Remove misleading Workers sub-menu items from admin sidebar (#8594)
* Remove misleading Workers sub-menu items from admin sidebar The sidebar sub-items (Job Detection, Job Queue, Job Execution, Configuration) always navigated to the first job type's tabs (typically EC Encoding) rather than showing cross-job-type views. This was confusing as noted in #8590. Since the in-page tabs already provide this navigation, remove the redundant sidebar sub-items and keep only the top-level Workers link. Fixes #8590 * Update layout_templ.go |
1 week ago

b17e2b411a | Add dynamic timeouts to plugin worker vacuum gRPC calls (#8593)
* add dynamic timeouts to plugin worker vacuum gRPC calls All vacuum gRPC calls used context.Background() with no deadline, so the plugin scheduler's execution timeout could kill a job while a large volume compact was still in progress. Use volume-size-scaled timeouts matching the topology vacuum approach: 3 min/GB for compact, 1 min/GB for check, commit, and cleanup. Fixes #8591 * scale scheduler execution timeout by volume size The scheduler's per-job execution timeout (default 240s) would kill vacuum jobs on large volumes before they finish. Three changes: 1. Vacuum detection now includes estimated_runtime_seconds in job proposals, computed as 5 min/GB of volume size. 2. The scheduler checks for estimated_runtime_seconds in job parameters and uses it as the execution timeout when larger than the default — a generic mechanism any handler can use. 3. Vacuum task gRPC calls now use the passed-in ctx as parent instead of context.Background(), so scheduler cancellation propagates to in-flight RPCs. * extend job type runtime when proposals need more time The JobTypeMaxRuntime (default 30 min) wraps both detection and execution. Its context is the parent of all per-job execution contexts, so even with per-job estimated_runtime_seconds, jobCtx would cancel everything when it expires. After detection, scan proposals for the maximum estimated_runtime_seconds. If any proposal needs more time than the remaining JobTypeMaxRuntime, create a new execution context with enough headroom. This lets large vacuum jobs complete without being killed by the job type deadline while still respecting the configured limit for normal-sized jobs. * log missing volume size metric, remove dead minimum runtime guard Add a debug log in vacuumTimeout when t.volumeSize is 0 so operators can investigate why metrics are missing for a volume. Remove the unreachable estimatedRuntimeSeconds < 180 check in buildVacuumProposal — volumeSizeGB always >= 1 (due to +1 floor), so estimatedRuntimeSeconds is always >= 300. * cap estimated runtime and fix status check context - Cap maxEstimatedRuntime and per-job timeout overrides to 8 hours to prevent unbounded timeouts from bad metrics. - Check execCtx.Err() instead of jobCtx.Err() for status reporting, since dispatch runs under execCtx which may have a longer deadline. A successful dispatch under execCtx was misreported as "timeout" when jobCtx had expired. |
1 week ago
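
The size-scaled deadlines described above (3 min/GB for compact, 1 min/GB for the other steps, with an upper cap) reduce to a small helper. The constants below mirror the commit message, but the function shape and fallback value are assumptions.

```go
package main

import (
	"fmt"
	"time"
)

const (
	gib              = 1 << 30
	maxVacuumTimeout = 8 * time.Hour    // cap so bad metrics cannot produce unbounded deadlines
	compactPerGB     = 3 * time.Minute  // compact is the slowest step
	otherStepPerGB   = 1 * time.Minute  // check, commit, cleanup
	fallbackTimeout  = 30 * time.Minute // used when the volume size metric is missing
)

// vacuumTimeout scales a vacuum step's gRPC deadline with volume size,
// mirroring the commit's description of volume-size-scaled timeouts.
func vacuumTimeout(volumeSizeBytes uint64, compact bool) time.Duration {
	if volumeSizeBytes == 0 {
		return fallbackTimeout // operators should investigate the missing metric
	}
	sizeGB := volumeSizeBytes/gib + 1 // floor of 1 GB so small volumes get a sane minimum
	perGB := otherStepPerGB
	if compact {
		perGB = compactPerGB
	}
	timeout := time.Duration(sizeGB) * perGB
	if timeout > maxVacuumTimeout {
		timeout = maxVacuumTimeout
	}
	return timeout
}

func main() {
	fmt.Println(vacuumTimeout(40*gib, true))  // 2h3m0s for a 40 GiB compact
	fmt.Println(vacuumTimeout(40*gib, false)) // 41m0s for check/commit/cleanup
}
```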

4c88fbfd5e | Fix nil pointer crash during concurrent vacuum compaction (#8592)
* check for nil needle map before compaction sync
When CommitCompact runs concurrently, it sets v.nm = nil under
dataFileAccessLock. CompactByIndex does not hold that lock, so
v.nm.Sync() can hit a nil pointer. Add an early nil check to
return an error instead of crashing.
Fixes #8591
* guard copyDataBasedOnIndexFile size check against nil needle map
The post-compaction size validation at line 538 accesses
v.nm.ContentSize() and v.nm.DeletedSize(). If CommitCompact has
concurrently set v.nm to nil, this causes a SIGSEGV. Skip the
validation when v.nm is nil since the actual data copy uses local
needle maps (oldNm/newNm) and is unaffected.
Fixes #8591
* use atomic.Bool for compaction flags to prevent concurrent vacuum races
The isCompacting and isCommitCompacting flags were plain bools
read and written from multiple goroutines without synchronization.
This allowed concurrent vacuums on the same volume to pass the
guard checks and run simultaneously, leading to the nil pointer
crash. Using atomic.Bool with CompareAndSwap ensures only one
compaction or commit can run per volume at a time.
Fixes #8591
* use go-version-file in CI workflows instead of hardcoded versions
Use go-version-file: 'go.mod' so CI automatically picks up the Go
version from go.mod, avoiding future version drift. Reordered
checkout before setup-go in go.yml and e2e.yml so go.mod is
available. Removed the now-unused GO_VERSION env vars.
* capture v.nm locally in CompactByIndex to close TOCTOU race
A bare nil check on v.nm followed by v.nm.Sync() has a race window
where CommitCompact can set v.nm = nil between the two. Snapshot
the pointer into a local variable so the nil check and Sync operate
on the same reference.
* add dynamic timeouts to plugin worker vacuum gRPC calls
All vacuum gRPC calls used context.Background() with no deadline,
so the plugin scheduler's execution timeout could kill a job while
a large volume compact was still in progress. Use volume-size-scaled
timeouts matching the topology vacuum approach: 3 min/GB for compact,
1 min/GB for check, commit, and cleanup.
Fixes #8591
* Revert "add dynamic timeouts to plugin worker vacuum gRPC calls"
This reverts commit
1 week ago
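
The core of the fix is twofold: snapshot the needle-map pointer once so the nil check and the Sync call see the same value, and use CompareAndSwap so only one commit can run per volume. The sketch below uses atomic.Pointer to stay self-contained; the real code keeps plain pointer fields guarded by the volume's locks.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// needleMap is a minimal stand-in for the volume's in-memory index.
type needleMap struct{}

func (nm *needleMap) Sync() error { return nil }

type volume struct {
	nm                 atomic.Pointer[needleMap]
	isCommitCompacting atomic.Bool
}

// compactByIndex snapshots the needle map pointer once, so the nil check
// and the Sync call operate on the same reference even if a concurrent
// commit clears it in between (the TOCTOU window the fix closes).
func (v *volume) compactByIndex() error {
	nm := v.nm.Load()
	if nm == nil {
		return errors.New("volume is being committed, skip compaction")
	}
	return nm.Sync()
}

// commitCompact uses CompareAndSwap so only one commit can run per volume;
// a concurrent caller sees false and backs off instead of racing.
func (v *volume) commitCompact() error {
	if !v.isCommitCompacting.CompareAndSwap(false, true) {
		return errors.New("commit compaction already in progress")
	}
	defer v.isCommitCompacting.Store(false)
	v.nm.Store(nil) // the real code rebuilds and reassigns the index here
	return nil
}

func main() {
	v := &volume{}
	v.nm.Store(&needleMap{})
	fmt.Println(v.compactByIndex()) // <nil>
	fmt.Println(v.commitCompact())  // <nil>
	fmt.Println(v.compactByIndex()) // volume is being committed, skip compaction
}
```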

d4d2e511ed | for mini, default to bind all
1 week ago

d89a78d9e3 | reduce logs
1 week ago

00000ec006 | Update s3_buckets_templ.go
1 week ago

1bd7a98a4a | simplify plugin scheduler: remove configurable IdleSleepSeconds, use constant 61s
The SchedulerConfig struct and its persistence/API were unnecessary indirection. Replace with a simple constant (reduced from 613s to 61s) so the scheduler re-checks for detectable job types promptly after going idle, improving the clean-install experience.
1 week ago

8ad58e7002 | 4.16
1 week ago

cf3693651c | fix: add IdxFileSize check to pre-delete volume verification
The verification step checked DatFileSize and FileCount but not IdxFileSize, leaving a gap in the copy validation before source deletion.
1 week ago

5f85bf5e8a | Batch volume balance: run multiple moves per job (#8561)
* proto: add BalanceMoveSpec and batch fields to BalanceTaskParams Add BalanceMoveSpec message for encoding individual volume moves, and max_concurrent_moves + repeated moves fields to BalanceTaskParams to support batching multiple volume moves in a single job. * balance handler: add batch execution with concurrent volume moves Refactor Execute() into executeSingleMove() (backward compatible) and executeBatchMoves() which runs multiple volume moves concurrently using a semaphore-bounded goroutine pool. When BalanceTaskParams.Moves is populated, the batch path is taken; otherwise the single-move path. Includes aggregate progress reporting across concurrent moves, per-move error collection, and partial failure support. * balance handler: add batch config fields to Descriptor and worker config Add max_concurrent_moves and batch_size fields to the worker config form and deriveBalanceWorkerConfig(). These control how many volume moves run concurrently within a batch job and the maximum batch size. * balance handler: group detection proposals into batch jobs When batch_size > 1, the Detect method groups detection results into batch proposals where each proposal encodes multiple BalanceMoveSpec entries in BalanceTaskParams.Moves. Single-result batches fall back to the existing single-move proposal format for backward compatibility. * admin UI: add volume balance execution plan and batch badge Add renderBalanceExecutionPlan() for rich rendering of volume balance jobs in the job detail modal. Single-move jobs show source/target/volume info; batch jobs show a moves table with all volume moves. Add batch badge (e.g., "5 moves") next to job type in the execution jobs table when the job has batch=true label. * Update plugin_templ.go * fix: detection algorithm uses greedy target instead of divergent topology scores The detection loop tracked effective volume counts via an adjustments map, but createBalanceTask independently called planBalanceDestination which used the topology's LoadCount — a separate, unadjusted source of truth. This divergence caused multiple moves to pile onto the same server. Changes: - Add resolveBalanceDestination to resolve the detection loop's greedy target (minServer) rather than independently picking a destination - Add oscillation guard: stop when max-min <= 1 since no single move can improve the balance beyond that point - Track unseeded destinations: if a target server wasn't in the initial serverVolumeCounts, add it so subsequent iterations include it - Add TestDetection_UnseededDestinationDoesNotOverload * fix: handler force_move propagation, partial failure, deterministic dedupe - Propagate ForceMove from outer BalanceTaskParams to individual move TaskParams so batch moves respect the force_move flag - Fix partial failure: mark job successful if at least one move succeeded (succeeded > 0 || failed == 0) to avoid re-running already-completed moves on retry - Use SHA-256 hash for deterministic dedupe key fallback instead of time.Now().UnixNano() which is non-deterministic - Remove unused successDetails variable - Extract maxProposalStringLength constant to replace magic number 200 * admin UI: use template literals in balance execution plan rendering * fix: integration test handles batch proposals from batched detection With batch_size=20, all moves are grouped into a single proposal containing BalanceParams.Moves instead of top-level Sources/Targets. Update assertions to handle both batch and single-move proposal formats. 
* fix: verify volume size on target before deleting source during balance Add a pre-delete safety check that reads the volume file status on both source and target, then compares .dat file size and file count. If they don't match, the move is aborted — leaving the source intact rather than risking irreversible data loss. Also removes the redundant mountVolume call since VolumeCopy already mounts the volume on the target server. * fix: clamp maxConcurrent, serialize progress sends, validate config as int64 - Clamp maxConcurrentMoves to defaultMaxConcurrentMoves before creating the semaphore so a stale or malicious job cannot request unbounded concurrent volume moves - Extend progressMu to cover sender.SendProgress calls since the underlying gRPC stream is not safe for concurrent writes - Perform bounds checks on max_concurrent_moves and batch_size in int64 space before casting to int, avoiding potential overflow on 32-bit * fix: check disk capacity in resolveBalanceDestination Skip disks where VolumeCount >= MaxVolumeCount so the detection loop does not propose moves to a full disk that would fail at execution time. * test: rename unseeded destination test to match actual behavior The test exercises a server with 0 volumes that IS seeded from topology (matching disk type), not an unseeded destination. Rename to TestDetection_ZeroVolumeServerIncludedInBalance and fix comments. * test: tighten integration test to assert exactly one batch proposal With default batch_size=20, all moves should be grouped into a single batch proposal. Assert len(proposals)==1 and require BalanceParams with Moves, removing the legacy single-move else branch. * fix: propagate ctx to RPCs and restore source writability on abort - All helper methods (markVolumeReadonly, copyVolume, tailVolume, readVolumeFileStatus, deleteVolume) now accept a context parameter instead of using context.Background(), so Execute's ctx propagates cancellation and timeouts into every volume server RPC - Add deferred cleanup that restores the source volume to writable if any step after markVolumeReadonly fails, preventing the source from being left permanently readonly on abort - Add markVolumeWritable helper using VolumeMarkWritableRequest * fix: deep-copy protobuf messages in test recording sender Use proto.Clone in recordingExecutionSender to store immutable snapshots of JobProgressUpdate and JobCompleted, preventing assertions from observing mutations if the handler reuses message pointers. * fix: add VolumeMarkWritable and ReadVolumeFileStatus to fake volume server The balance task now calls ReadVolumeFileStatus for pre-delete verification and VolumeMarkWritable to restore writability on abort. Add both RPCs to the test fake, and drop the mountCalls assertion since BalanceTask no longer calls VolumeMount directly (VolumeCopy handles it). * fix: use maxConcurrentMovesLimit (50) for clamp, not defaultMaxConcurrentMoves defaultMaxConcurrentMoves (5) is the fallback when the field is unset, not an upper bound. Clamping to it silently overrides valid config values like 10/20/50. Introduce maxConcurrentMovesLimit (50) matching the descriptor's MaxValue and clamp to that instead. * fix: cancel batch moves on progress stream failure Derive a cancellable batchCtx from the caller's ctx. If sender.SendProgress returns an error (client disconnect, context cancelled), capture it, skip further sends, and cancel batchCtx so in-flight moves abort via their propagated context rather than running blind to completion. 
* fix: bound cleanup timeout and validate batch move fields
- Use a 30-second timeout for the deferred markVolumeWritable cleanup instead of context.Background(), which can block indefinitely if the volume server is unreachable
- Validate required fields (VolumeID, SourceNode, TargetNode) before appending moves to a batch proposal, skipping invalid entries
- Fall back to a single-move proposal when filtering leaves only one valid move in a batch
* fix: cancel task execution on SendProgress stream failure
All handler progress callbacks previously ignored SendProgress errors, allowing tasks to continue executing after the client disconnected. Now each handler creates a derived cancellable context and cancels it on the first SendProgress error, stopping the in-flight task promptly (see the sketch after this list). Handlers fixed: erasure_coding, vacuum, volume_balance (single-move), and admin_script (breaks the command loop on send failure).
* fix: validate batch moves before scheduling in executeBatchMoves
Reject empty batches, enforce a hard upper bound (100 moves), and filter out nil or incomplete move specs (missing source/target/volume) before allocating progress tracking and launching goroutines.
* test: add batch balance execution integration test
Tests the batch move path with 3 volumes and max concurrency 2, using fake volume servers. Verifies that all moves complete with the correct readonly, copy, tail, and delete RPC counts.
* test: add MarkWritableCount and ReadFileStatusCount accessors
Expose the markWritableCalls and readFileStatusCalls counters on the fake volume server, following the existing MarkReadonlyCount pattern.
* fix: oscillation guard uses global effective counts for heterogeneous capacity
The oscillation guard (max-min <= 1) previously used maxServer/minServer, which are determined by utilization ratio. With heterogeneous capacity, maxServer by utilization can have fewer raw volumes than minServer, producing a negative diff and incorrectly triggering the guard. Now the guard scans all servers' effective counts to find the true global max/min volume counts, so it works correctly regardless of whether utilization-based or raw-count balancing is used.
* fix: admin script handler breaks outer loop on SendProgress failure
The break on SendProgress error inside the shell.Commands scan only exited the inner loop, letting the outer command loop continue executing commands on a broken stream. Use a sendBroken flag to propagate the break to the outer execCommands loop.
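The semaphore-bounded goroutine pool described in "add batch execution with concurrent volume moves" can be sketched as follows. `moveSpec` and the `executeOneMove` callback are illustrative stand-ins, not the actual SeaweedFS types; the real handler layers progress reporting and dedupe on top of this.

```go
// Sketch of a semaphore-bounded goroutine pool for batch volume moves.
package main

import (
	"context"
	"fmt"
	"sync"
)

// moveSpec is a hypothetical stand-in for a single volume move.
type moveSpec struct {
	VolumeID   uint32
	SourceNode string
	TargetNode string
}

func executeBatch(ctx context.Context, moves []moveSpec, maxConcurrent int,
	executeOneMove func(context.Context, moveSpec) error) []error {

	if maxConcurrent < 1 {
		maxConcurrent = 1
	}
	sem := make(chan struct{}, maxConcurrent) // bounds the number of in-flight moves
	errs := make([]error, len(moves))         // per-move error collection
	var wg sync.WaitGroup

	for i, m := range moves {
		wg.Add(1)
		go func(i int, m moveSpec) {
			defer wg.Done()
			select {
			case sem <- struct{}{}: // acquire a slot
			case <-ctx.Done(): // batch cancelled before this move started
				errs[i] = ctx.Err()
				return
			}
			defer func() { <-sem }() // release the slot
			errs[i] = executeOneMove(ctx, m)
		}(i, m)
	}
	wg.Wait()
	return errs
}

func main() {
	moves := []moveSpec{{1, "srcA", "dstB"}, {2, "srcA", "dstC"}, {3, "srcB", "dstC"}}
	errs := executeBatch(context.Background(), moves, 2,
		func(ctx context.Context, m moveSpec) error {
			fmt.Printf("moving volume %d from %s to %s\n", m.VolumeID, m.SourceNode, m.TargetNode)
			return nil
		})
	fmt.Println("errors:", errs)
}
```

Partial failure then falls out of the per-move error slice: the job can be marked successful when at least one move succeeded, as the commit describes.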
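A minimal sketch of the pre-delete safety check, assuming a hypothetical `statusReader` in place of the real ReadVolumeFileStatus RPC helper: compare the .dat size and file count on source and target, and abort (keeping the source) on any mismatch.

```go
// Sketch of the pre-delete verification before removing the source copy.
package main

import (
	"context"
	"fmt"
)

// volumeFileStatus and statusReader are hypothetical stand-ins for the
// ReadVolumeFileStatus RPC used by the real balance task.
type volumeFileStatus struct {
	DatFileSize uint64
	FileCount   uint64
}

type statusReader func(ctx context.Context, node string, volumeID uint32) (volumeFileStatus, error)

func verifyBeforeDelete(ctx context.Context, read statusReader, volumeID uint32, source, target string) error {
	src, err := read(ctx, source, volumeID)
	if err != nil {
		return fmt.Errorf("read source status: %w", err)
	}
	dst, err := read(ctx, target, volumeID)
	if err != nil {
		return fmt.Errorf("read target status: %w", err)
	}
	if src.DatFileSize != dst.DatFileSize || src.FileCount != dst.FileCount {
		// Abort the move and keep the source intact rather than risk data loss.
		return fmt.Errorf("volume %d mismatch: source %d bytes/%d files, target %d bytes/%d files",
			volumeID, src.DatFileSize, src.FileCount, dst.DatFileSize, dst.FileCount)
	}
	return nil // sizes and file counts match, safe to delete the source copy
}

func main() {
	fake := func(ctx context.Context, node string, id uint32) (volumeFileStatus, error) {
		return volumeFileStatus{DatFileSize: 1 << 20, FileCount: 42}, nil
	}
	fmt.Println(verifyBeforeDelete(context.Background(), fake, 7, "src:8080", "dst:8080"))
}
```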
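The cancel-on-send-failure and bounded-cleanup fixes combine roughly as below. This is a sketch under assumed names (`sendProgress`, `runMoves`, `restoreWritable`); the real handlers wire these to the gRPC progress stream and volume-server RPCs.

```go
// Sketch: derive a cancellable context for the batch, serialize progress sends,
// cancel on the first send failure, and bound the deferred writability cleanup.
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

func runBatch(ctx context.Context,
	sendProgress func(percent float32) error,
	runMoves func(ctx context.Context) error,
	restoreWritable func(ctx context.Context) error) error {

	batchCtx, cancel := context.WithCancel(ctx)
	defer cancel()

	var progressMu sync.Mutex // the progress stream is not safe for concurrent writes
	sendBroken := false
	report := func(p float32) {
		progressMu.Lock()
		defer progressMu.Unlock()
		if sendBroken {
			return
		}
		if err := sendProgress(p); err != nil {
			sendBroken = true
			cancel() // abort in-flight moves instead of running blind to completion
		}
	}

	defer func() {
		// Bounded cleanup: don't hang forever if the volume server is unreachable.
		cleanupCtx, cancelCleanup := context.WithTimeout(context.Background(), 30*time.Second)
		defer cancelCleanup()
		_ = restoreWritable(cleanupCtx)
	}()

	report(0)
	err := runMoves(batchCtx)
	report(100)
	return err
}

func main() {
	err := runBatch(context.Background(),
		func(p float32) error { fmt.Println("progress:", p); return nil },
		func(ctx context.Context) error { return nil },
		func(ctx context.Context) error { fmt.Println("restore writable"); return nil })
	fmt.Println("err:", err)
}
```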
1 week ago

b991acf634
fix: paginate bucket listing in Admin UI to show all buckets (#8585)
* fix: paginate bucket listing in Admin UI to show all buckets
The Admin UI's GetS3Buckets() had a hardcoded Limit of 1000 in the ListEntries request, causing the Total Buckets count to cap at 1000 even when more buckets exist. This adds pagination to iterate through all buckets by continuing from the last entry name when a full page is returned (see the sketch after this list). Fixes seaweedfs/seaweedfs#8564
* feat: add server-side pagination and sorting to S3 buckets page
Add pagination controls, a page size selector, and sortable column headers to the Admin UI's Object Store buckets page, following the same pattern used by the Cluster Volumes page. This keeps the UI responsive with thousands of buckets.
- Add CurrentPage, TotalPages, PageSize, SortBy, SortOrder to S3BucketsData
- Accept page/pageSize/sortBy/sortOrder query params in the ShowS3Buckets handler
- Sort buckets by name, owner, created, objects, logical/physical size
- Paginate results server-side (default 100 per page)
- Add pagination nav, page size dropdown, and sort indicators to the template
* Update s3_buckets_templ.go
* Update object_store_users_templ.go
* fix: use errors.Is(err, io.EOF) instead of string comparison
Replace the brittle err.Error() == "EOF" string comparison with the idiomatic errors.Is(err, io.EOF) for checking stream end in bucket listing.
* fix: address PR review findings for bucket pagination
- Clamp page to totalPages when page exceeds the total, preventing empty results with misleading pagination state
- Fix the sort comparator to use explicit ascending/descending comparisons with a name tie-breaker, satisfying strict weak ordering for sort.Slice
- Capture SnapshotTsNs from the first ListEntries response and pass it to subsequent requests for consistent pagination across pages
- Replace non-focusable <th onclick> sort headers with <a> tags and reuse getSortIcon, matching the cluster_volumes accessibility pattern
- Change exportBucketList() to fetch all buckets from /api/s3/buckets instead of scraping DOM rows (which now only contain the current page)
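A minimal sketch of the pagination loop from the first item, assuming a hypothetical `listPage` callback in place of the filer ListEntries gRPC call: fetch full pages and continue from the last entry name until a short page is returned.

```go
// Sketch of listing all buckets by paging through full pages of entries.
package main

import "fmt"

const pageLimit = 1000

// collectAll keeps requesting pages until a short page signals the end.
// listPage is a hypothetical stand-in returning up to limit names after startFrom.
func collectAll(listPage func(startFrom string, limit int) []string) []string {
	var all []string
	startFrom := ""
	for {
		page := listPage(startFrom, pageLimit)
		all = append(all, page...)
		if len(page) < pageLimit {
			return all // short page: no more entries
		}
		startFrom = page[len(page)-1] // continue from the last entry name
	}
}

func main() {
	// Fake lister over 2500 sorted bucket names to exercise the loop.
	names := make([]string, 2500)
	for i := range names {
		names[i] = fmt.Sprintf("bucket-%04d", i)
	}
	list := func(startFrom string, limit int) []string {
		start := 0
		for startFrom != "" && start < len(names) && names[start] <= startFrom {
			start++
		}
		end := start + limit
		if end > len(names) {
			end = len(names)
		}
		return names[start:end]
	}
	fmt.Println("total buckets:", len(collectAll(list))) // 2500, not capped at 1000
}
```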
1 week ago

02d3e3195c
Update object_store_users_templ.go
1 week ago

470075dd90
admin/balance: fix Max Volumes display and balancer source selection (#8583)
* admin: fix Max Volumes column always showing 0
GetClusterVolumeServers() computed DiskCapacity from diskInfo.MaxVolumeCount but never populated the MaxVolumes field on the VolumeServer struct, causing the column to always display 0.
* balance: use utilization ratio for source server selection
The balancer selected the source server (to move volumes FROM) by raw volume count. In clusters with heterogeneous MaxVolumeCount settings, the server with the highest capacity naturally holds the most volumes and was always picked as the source, even when it had the lowest utilization ratio. Change source selection and imbalance calculation to use the utilization ratio (effectiveCount / maxVolumeCount) so servers are compared by how full they are relative to their capacity, not by absolute volume count. This matches how destination scoring already works via calculateBalanceScore(). A minimal illustration follows.
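A small illustration of the source-selection change, with hypothetical `serverStats` values: with heterogeneous capacity, the server with the most volumes is not necessarily the one picked once the ratio of volumes to MaxVolumeCount is used.

```go
// Sketch: pick the balance source by utilization ratio, not raw volume count.
package main

import "fmt"

// serverStats is an illustrative stand-in, not the SeaweedFS topology type.
type serverStats struct {
	Name           string
	VolumeCount    int
	MaxVolumeCount int
}

func pickSource(servers []serverStats) string {
	best, bestRatio := "", -1.0
	for _, s := range servers {
		if s.MaxVolumeCount == 0 {
			continue
		}
		ratio := float64(s.VolumeCount) / float64(s.MaxVolumeCount)
		if ratio > bestRatio {
			best, bestRatio = s.Name, ratio
		}
	}
	return best
}

func main() {
	servers := []serverStats{
		{"big", 120, 400},  // most volumes, but only 30% full
		{"small", 90, 100}, // fewer volumes, but 90% full
	}
	// By raw count "big" would be chosen; by utilization ratio "small" is the source.
	fmt.Println("source:", pickSource(servers))
}
```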
1 week ago

f8b7357350
weed/server: fix dropped error (#8584)
* weed/server: fix dropped error
* Removed the redundant check.
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
Co-authored-by: Chris Lu <chris.lu@gmail.com>
1 week ago

6dab90472b
admin: fix access key creation UX (#8579)
* admin: remove misleading "secret key only shown once" warning
The access key details modal already allows viewing both the access key
and secret key at any time, so the warning about the secret key only
being displayed once is incorrect and misleading.
* admin: allow specifying custom access key and secret key
Add optional access_key and secret_key fields to the create access key
API. When provided, the specified keys are used instead of generating
random ones. The UI now shows a form with optional fields when creating
a new key, with a note that leaving them blank auto-generates keys.
* admin: check access key uniqueness before creating
Access keys must be globally unique across all users since S3 auth
looks them up in a single global map. Add an explicit check using
GetUserByAccessKey before creating, so the user gets a clear error
("access key is already in use") rather than a generic store error.
* Update object_store_users_templ.go
* admin: address review feedback for access key creation
Handler:
- Use decodeJSONBody/newJSONMaxReader instead of raw json.Decode to
enforce request size limits and handle malformed JSON properly
- Return 409 Conflict for duplicate access keys, 400 Bad Request for
validation errors, instead of generic 500
Backend:
- Validate access key length (4-128 chars) and secret key length
(8-128 chars) when user-provided
Frontend:
- Extract resetCreateKeyForm() helper to avoid duplicated cleanup logic
- Wire resetCreateKeyForm to accessKeysModal hidden.bs.modal event so
form state is always cleared when modal is dismissed
- Change secret key input to type="password" with a visibility toggle
* admin: guard against nil request and handle GetUserByAccessKey errors
- Add nil check for the CreateAccessKeyRequest pointer before
dereferencing, defaulting to an empty request (auto-generate both
keys).
- Handle non-"not found" errors from GetUserByAccessKey explicitly
instead of silently proceeding, so store errors (e.g. db connection
failures) surface rather than being swallowed.
* Update object_store_users_templ.go
* admin: fix access key uniqueness check with gRPC store
GetUserByAccessKey returns a gRPC NotFound status error (not the
sentinel credential.ErrAccessKeyNotFound) when using the gRPC store,
causing the uniqueness check to fail with a spurious error.
Treat the lookup as best-effort: only reject when a user is found
(err == nil). Any error (not-found via any store, connectivity issues)
falls through to the store's own CreateAccessKey which enforces
uniqueness definitively.
* admin: fix error handling and input validation for access key creation
Backend:
- Remove access key value from the duplicate-key error message to avoid
logging the caller-supplied identifier.
Handler:
- Handle empty POST body (io.EOF) as a valid request that auto-generates
both keys, instead of rejecting it as malformed JSON.
- Return 404 for "not found" errors (e.g. non-existent user) instead of
collapsing them into a 500.
Frontend:
- Add minlength/maxlength attributes matching backend constraints
(access key 4-128, secret key 8-128).
- Call reportValidity() before submitting so invalid lengths are caught
client-side without a round trip.
* admin: use sentinel errors and fix GetUserByAccessKey error handling
Backend (user_management.go):
- Define sentinel errors (ErrAccessKeyInUse, ErrUserNotFound,
ErrInvalidInput) and wrap them in returned errors so callers can use
errors.Is.
- Handle GetUserByAccessKey errors properly: check the sentinel
credential.ErrAccessKeyNotFound first, then fall back to string
matching for stores (gRPC) that return non-sentinel not-found errors.
Surface unexpected errors instead of silently proceeding.
Handler (user_handlers.go):
- Replace fragile strings.Contains error matching with errors.Is
against the newly defined sentinels (see the status-mapping sketch after this list).
Frontend (object_store_users.templ):
- Add double-submit guard (isCreatingKey flag + button disabling) to
prevent duplicate access key creation requests.
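The sentinel-error pattern from the last item can be sketched as below. The error variables mirror the names the commit mentions (ErrAccessKeyInUse, ErrUserNotFound, ErrInvalidInput), but `createAccessKey` and `statusFor` are simplified illustrations, not the actual SeaweedFS backend or handler code.

```go
// Sketch: backend wraps typed sentinel errors; handler maps them to HTTP
// statuses with errors.Is instead of fragile strings.Contains matching.
package main

import (
	"errors"
	"fmt"
	"net/http"
)

var (
	ErrAccessKeyInUse = errors.New("access key is already in use")
	ErrUserNotFound   = errors.New("user not found")
	ErrInvalidInput   = errors.New("invalid input")
)

// createAccessKey stands in for the backend call; it wraps sentinels so
// callers can use errors.Is regardless of the extra context added.
func createAccessKey(user, accessKey string) error {
	if len(accessKey) > 0 && (len(accessKey) < 4 || len(accessKey) > 128) {
		return fmt.Errorf("access key length must be 4-128 chars: %w", ErrInvalidInput)
	}
	if user == "" {
		return fmt.Errorf("create access key: %w", ErrUserNotFound)
	}
	return nil
}

// statusFor maps backend errors to HTTP status codes in the handler.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrAccessKeyInUse):
		return http.StatusConflict // 409 for duplicate access keys
	case errors.Is(err, ErrUserNotFound):
		return http.StatusNotFound // 404 for a non-existent user
	case errors.Is(err, ErrInvalidInput):
		return http.StatusBadRequest // 400 for validation errors
	default:
		return http.StatusInternalServerError // 500 for unexpected store errors
	}
}

func main() {
	fmt.Println(statusFor(createAccessKey("alice", "ok"))) // 400: key shorter than 4 chars
	fmt.Println(statusFor(createAccessKey("alice", "")))   // 200: empty key, auto-generate
}
```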
1 week ago

f8d783f80e
fix: ListObjectVersions interleave Version and DeleteMarker in sort order (#8567)
* fix: ListObjectVersions interleave Version and DeleteMarker in sort order
Go's default xml.Marshal serializes struct fields in definition order, causing all <Version> elements to appear before all <DeleteMarker> elements. The S3 API contract requires these elements to be interleaved in the correct global sort order (by key ascending, then newest version first within each key). This broke clients that validate version list ordering within a single key: an older Version would appear before a newer DeleteMarker for the same object.
Fix: Replace the separate Versions/DeleteMarkers/CommonPrefixes arrays with a single Entries []VersionListEntry slice. Each VersionListEntry uses a per-element MarshalXML that outputs the correct XML tag name (<Version>, <DeleteMarker>, or <CommonPrefixes>) based on which field is populated. Since the entries are already in their correct sorted order from buildSortedCombinedList, the XML output is automatically interleaved correctly (see the sketch below). Also removes the unused ListObjectVersionsResult struct.
Note: The reporter also mentioned a cross-key timestamp ordering issue when paginating with max-keys=1, but that is correct S3 behavior: ListObjectVersions sorts by key name (ascending), not by timestamp. Different keys having non-monotonic timestamps is expected.
* test: add CommonPrefixes XML marshaling coverage for ListObjectVersions
* fix: validate VersionListEntry has exactly one field set in MarshalXML
Return an error instead of silently emitting an empty <Version> element when no field (or multiple fields) is populated. Also clean up the misleading xml:"Version" struct tag on the Entries field.
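A minimal sketch of the interleaving approach (omitting CommonPrefixes for brevity), with simplified stand-in types rather than the actual S3 API structs: each entry implements MarshalXML and chooses its own element name, so the encoder emits <Version> and <DeleteMarker> in whatever order the entries are already sorted.

```go
// Sketch: a single Entries slice whose elements pick their own XML tag,
// producing interleaved <Version> and <DeleteMarker> elements.
package main

import (
	"encoding/xml"
	"fmt"
	"os"
)

// versionInfo is a simplified stand-in for the S3 version/delete-marker fields.
type versionInfo struct {
	Key       string `xml:"Key"`
	VersionId string `xml:"VersionId"`
}

type versionListEntry struct {
	Version      *versionInfo
	DeleteMarker *versionInfo
}

// MarshalXML emits <Version> or <DeleteMarker> depending on which field is set,
// and errors out unless exactly one of them is populated.
func (e versionListEntry) MarshalXML(enc *xml.Encoder, start xml.StartElement) error {
	switch {
	case e.Version != nil && e.DeleteMarker == nil:
		return enc.EncodeElement(e.Version, xml.StartElement{Name: xml.Name{Local: "Version"}})
	case e.DeleteMarker != nil && e.Version == nil:
		return enc.EncodeElement(e.DeleteMarker, xml.StartElement{Name: xml.Name{Local: "DeleteMarker"}})
	default:
		return fmt.Errorf("entry must have exactly one of Version or DeleteMarker set")
	}
}

type listVersionsResult struct {
	XMLName xml.Name `xml:"ListVersionsResult"`
	Entries []versionListEntry
}

func main() {
	// Entries are already in global sort order: key ascending, newest first per key.
	result := listVersionsResult{Entries: []versionListEntry{
		{DeleteMarker: &versionInfo{Key: "a.txt", VersionId: "v2"}},
		{Version: &versionInfo{Key: "a.txt", VersionId: "v1"}},
		{Version: &versionInfo{Key: "b.txt", VersionId: "v1"}},
	}}
	enc := xml.NewEncoder(os.Stdout)
	enc.Indent("", "  ")
	_ = enc.Encode(result)
	fmt.Println()
}
```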
1 week ago |