seaweedfs

Commit Graph

Author	SHA1	Message	Date
Chris Lu	51ec0d2122	fix(remote_gateway): prevent double-versioning when syncing to versioned central bucket (#8710 ) * fix(remote_gateway): prevent double-versioning when syncing to versioned central bucket When a file is uploaded to a versioned bucket on edge, SeaweedFS stores it internally as {object}.versions/v_{versionId}. The remote_gateway was syncing this internal path directly to the central S3 endpoint. When central's bucket also has versioning enabled, this caused central to apply its own versioning on top, producing corrupt paths like: object.versions/v_{edgeId}.versions/v_{centralId} Fix: rewrite internal .versions/v_{id} paths to the original S3 object key before uploading to the remote. Skip version file delete/update events that are internal bookkeeping. Fixes https://github.com/seaweedfs/seaweedfs/discussions/8481#discussioncomment-16209342 * fix(remote_gateway): propagate delete markers to remote as deletions Delete markers are zero-content version entries (ExtDeleteMarkerKey=true) created by S3 DELETE on a versioned bucket. Previously they were silently dropped by the HasData() filter, so deletions on edge never reached central. Now: detect delete markers before the HasData check, rewrite the .versions path to the original S3 key, and issue client.DeleteFile() on the remote. * fix(remote_gateway): tighten isVersionedPath to avoid false positives Address PR review feedback: - Add isDir parameter to isVersionedPath so it only matches the exact internal shapes: directories whose name ends with .versions (isDir=true), and files with the v_ prefix inside a .versions parent (isDir=false). Previously the function was too broad and could match user-created paths like "my.versions/data.txt". - Update all 4 call sites to pass the entry's IsDirectory field. - Rename TestVersionedDirectoryNotFilteredByHasData to TestVersionsDirectoryFilteredByHasData so the name reflects the actual assertion (directories ARE filtered by HasData). - Expand TestIsVersionedPath with isDir cases and false-positive checks. * fix(remote_gateway): persist sync marker after delete-marker propagation The delete-marker branch was calling client.DeleteFile() and returning without updating the local entry, making event replay re-issue the remote delete. Now call updateLocalEntry after a successful DeleteFile to stamp the delete-marker entry with a RemoteEntry, matching the pattern used by the normal create path. * refactor(remote_gateway): extract syncDeleteMarker and fix root path edge case - Extract syncDeleteMarker() shared helper used by both bucketed and mounted-dir event processors, replacing the duplicated delete + persist local marker logic. - Fix rewriteVersionedSourcePath for root-level objects: when lastSlash is 0 (e.g. "/file.xml.versions"), return "/" as the parent dir instead of an empty string. - The strings.Contains(dir, ".versions/") condition flagged in review was already removed in a prior commit that tightened isVersionedPath. * fix(remote_gateway): skip updateLocalEntry for versioned path rewrites After rewriting a .versions/v_{id} path to the logical S3 key and uploading, the code was calling updateLocalEntry on the original v_* entry, stamping it with a RemoteEntry for the logical key. This is semantically wrong: the logical object has no filer entry in versioned buckets, and the internal v_* entry should not carry a RemoteEntry for a different path. Skip updateLocalEntry when the path was rewritten from a versioned source. Replay safety is preserved because S3 PutObject is idempotent. * fix(remote_gateway): scope versioning checks to /buckets/ namespace isVersionedPath and rewriteVersionedSourcePath could wrongly match paths in non-bucket mounts (e.g. /mnt/remote/file.xml.versions). Add the same /buckets/ prefix guard used by isMultipartUploadDir so the .versions / v_ logic only applies within the bucket namespace.	1 day ago
Chris Lu	15f4a97029	fix: improve raft leader election reliability and failover speed (#8692 ) * fix: clear raft vote state file on non-resume startup The seaweedfs/raft library v1.1.7 added a persistent `state` file for currentTerm and votedFor. When RaftResumeState=false (the default), the log, conf, and snapshot directories are cleared but this state file was not. On repeated restarts, different masters accumulate divergent terms, causing AppendEntries rejections and preventing leader election. Fixes #8690 * fix: recover TopologyId from snapshot before clearing raft state When RaftResumeState=false clears log/conf/snapshot, the TopologyId (used for license validation) was lost. Now extract it from the latest snapshot before cleanup and restore it on the topology. Both seaweedfs/raft and hashicorp/raft paths are handled, with a shared recoverTopologyIdFromState helper in raft_common.go. * fix: stagger multi-master bootstrap delay by peer index Previously all masters used a fixed 1500ms delay before the bootstrap check. Now the delay is proportional to the peer's sorted index with randomization (matching the hashicorp raft path), giving the designated bootstrap node (peer 0) a head start while later peers wait for gRPC servers to be ready. Also adds diagnostic logging showing why DoJoinCommand was or wasn't called, making leader election issues easier to diagnose from logs. * fix: skip unreachable masters during leader reconnection When a master leader goes down, non-leader masters still redirect clients to the stale leader address. The masterClient would follow these redirects, fail, and retry — wasting round-trips each cycle. Now tryAllMasters tracks which masters failed within a cycle and skips redirects pointing to them, reducing log spam and connection overhead during leader failover. * fix: take snapshot after TopologyId generation for recovery After generating a new TopologyId on the leader, immediately take a raft snapshot so the ID can be recovered from the snapshot on future restarts with RaftResumeState=false. Without this, short-lived clusters would lose the TopologyId on restart since no automatic snapshot had been taken yet. * test: add multi-master raft failover integration tests Integration test framework and 5 test scenarios for 3-node master clusters: - TestLeaderConsistencyAcrossNodes: all nodes agree on leader and TopologyId - TestLeaderDownAndRecoverQuickly: leader stops, new leader elected, old leader rejoins as follower - TestLeaderDownSlowRecover: leader gone for extended period, cluster continues with 2/3 quorum - TestTwoMastersDownAndRestart: quorum lost (2/3 down), recovered when both restart - TestAllMastersDownAndRestart: full cluster restart, leader elected, all nodes agree on TopologyId * fix: address PR review comments - peerIndex: return -1 (not 0) when self not found, add warning log - recoverTopologyIdFromSnapshot: defer dir.Close() - tests: check GetTopologyId errors instead of discarding them * fix: address review comments on failover tests - Assert no leader after quorum loss (was only logging) - Verify follower cs.Leader matches expected leader via ServerAddress.ToHttpAddress() comparison - Check GetTopologyId error in TestTwoMastersDownAndRestart	2 days ago
Weihao Jiang	01987bcafd	Make `weed-fuse` compatible with systemd-based mount (#6814 ) * Make weed-fuse compatible with systemd-mount series * fix: add missing type annotation on skipAutofs param in FreeBSD build The parameter was declared without a type, causing a compile error on FreeBSD. * fix: guard hasAutofs nil dereference and make FsName conditional on autofs mode - Check option.hasAutofs for nil before dereferencing to prevent panic when RunMount is called without the flag initialized. - Only set FsName to "fuse" when autofs mode is active; otherwise preserve the descriptive server:path name for mount/df output. - Fix typo: recogize -> recognize. * fix: consistent error handling for autofs option and log ignored _netdev - Replace panic with fmt.Fprintf+return false for autofs parse errors, matching the pattern used by other fuse option parsers. - Log when _netdev option is silently stripped to aid debugging. --------- Co-authored-by: Chris Lu <chris.lu@gmail.com>	3 days ago
Chris Lu	81369b8a83	improve: large file sync throughput for remote.cache and filer.sync (#8676 ) * improve large file sync throughput for remote.cache and filer.sync Three main throughput improvements: 1. Adaptive chunk sizing for remote.cache: targets ~32 chunks per file instead of always starting at 5MB. A 500MB file now uses ~16MB chunks (32 chunks) instead of 5MB chunks (100 chunks), reducing per-chunk overhead (volume assign, gRPC call, needle write) by 3x. 2. Configurable concurrency at every layer: - remote.cache chunk concurrency: -chunkConcurrency flag (default 8) - remote.cache S3 download concurrency: -downloadConcurrency flag (default raised from 1 to 5 per chunk) - filer.sync chunk concurrency: -chunkConcurrency flag (default 32) 3. S3 multipart download concurrency raised from 1 to 5: the S3 manager downloader was using Concurrency=1, serializing all part downloads within each chunk. This alone can 5x per-chunk download speed. The concurrency values flow through the gRPC request chain: shell command → CacheRemoteObjectToLocalClusterRequest → FetchAndWriteNeedleRequest → S3 downloader Zero values in the request mean "use server defaults", maintaining full backward compatibility with existing callers. Ref #8481 * fix: use full maxMB for chunk size cap and remove loop guard Address review feedback: - Use full maxMB instead of maxMB/2 for maxChunkSize to avoid unnecessarily limiting chunk size for very large files. - Remove chunkSize < maxChunkSize guard from the safety loop so it can always grow past maxChunkSize when needed to stay under 1000 chunks (e.g., extremely large files with small maxMB). * address review feedback: help text, validation, naming, docs - Fix help text for -chunkConcurrency and -downloadConcurrency flags to say "0 = server default" instead of advertising specific numeric defaults that could drift from the server implementation. - Validate chunkConcurrency and downloadConcurrency are within int32 range before narrowing, returning a user-facing error if out of range. - Rename ReadRemoteErr to readRemoteErr to follow Go naming conventions. - Add doc comment to SetChunkConcurrency noting it must be called during initialization before replication goroutines start. - Replace doubling loop in chunk size safety check with direct ceil(remoteSize/1000) computation to guarantee the 1000-chunk cap. * address Copilot review: clamp concurrency, fix chunk count, clarify proto docs - Use ceiling division for chunk count check to avoid overcounting when file size is an exact multiple of chunk size. - Clamp chunkConcurrency (max 1024) and downloadConcurrency (max 1024 at filer, max 64 at volume server) to prevent excessive goroutines. - Always use ReadFileWithConcurrency when the client supports it, falling back to the implementation's default when value is 0. - Clarify proto comments that download_concurrency only applies when the remote storage client supports it (currently S3). - Include specific server defaults in help text (e.g., "0 = server default 8") so users see the actual values in -h output. * fix data race on executionErr and use %w for error wrapping - Protect concurrent writes to executionErr in remote.cache worker goroutines with a sync.Mutex to eliminate the data race. - Use %w instead of %v in volume_grpc_remote.go error formatting to preserve the error chain for errors.Is/errors.As callers.	3 days ago
Chris Lu	e8914ac879	feat(admin): add -urlPrefix flag for subdirectory deployment (#8670 ) Allow the admin server to run behind a reverse proxy under a subdirectory by adding a -urlPrefix flag (e.g. -urlPrefix=/seaweedfs). Closes #8646	5 days ago
Chris Lu	8cde3d4486	Add data file compaction to iceberg maintenance (Phase 2) (#8503 ) * Add iceberg_maintenance plugin worker handler (Phase 1) Implement automated Iceberg table maintenance as a new plugin worker job type. The handler scans S3 table buckets for tables needing maintenance and executes operations in the correct Iceberg order: expire snapshots, remove orphan files, and rewrite manifests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add data file compaction to iceberg maintenance handler (Phase 2) Implement bin-packing compaction for small Parquet data files: - Enumerate data files from manifests, group by partition - Merge small files using parquet-go (read rows, write merged output) - Create new manifest with ADDED/DELETED/EXISTING entries - Commit new snapshot with compaction metadata Add 'compact' operation to maintenance order (runs before expire_snapshots), configurable via target_file_size_bytes and min_input_files thresholds. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix memory exhaustion in mergeParquetFiles by processing files sequentially Previously all source Parquet files were loaded into memory simultaneously, risking OOM when a compaction bin contained many small files. Now each file is loaded, its rows are streamed into the output writer, and its data is released before the next file is loaded — keeping peak memory proportional to one input file plus the output buffer. * Validate bucket/namespace/table names against path traversal Reject names containing '..', '/', or '\' in Execute to prevent directory traversal via crafted job parameters. * Add filer address failover in iceberg maintenance handler Try each filer address from cluster context in order instead of only using the first one. This improves resilience when the primary filer is temporarily unreachable. * Add separate MinManifestsToRewrite config for manifest rewrite threshold The rewrite_manifests operation was reusing MinInputFiles (meant for compaction bin file counts) as its manifest count threshold. Add a dedicated MinManifestsToRewrite field with its own config UI section and default value (5) so the two thresholds can be tuned independently. * Fix risky mtime fallback in orphan removal that could delete new files When entry.Attributes is nil, mtime defaulted to Unix epoch (1970), which would always be older than the safety threshold, causing the file to be treated as eligible for deletion. Skip entries with nil Attributes instead, matching the safer logic in operations.go. * Fix undefined function references in iceberg_maintenance_handler.go Use the exported function names (ShouldSkipDetectionByInterval, BuildDetectorActivity, BuildExecutorActivity) matching their definitions in vacuum_handler.go. * Remove duplicated iceberg maintenance handler in favor of iceberg/ subpackage The IcebergMaintenanceHandler and its compaction code in the parent pluginworker package duplicated the logic already present in the iceberg/ subpackage (which self-registers via init()). The old code lacked stale-plan guards, proper path normalization, CAS-based xattr updates, and error-returning parseOperations. Since the registry pattern (default "all") makes the old handler unreachable, remove it entirely. All functionality is provided by iceberg.Handler with the reviewed improvements. * Fix MinManifestsToRewrite clamping to match UI minimum of 2 The clamp reset values below 2 to the default of 5, contradicting the UI's advertised MinValue of 2. Clamp to 2 instead. * Sort entries by size descending in splitOversizedBin for better packing Entries were processed in insertion order which is non-deterministic from map iteration. Sorting largest-first before the splitting loop improves bin packing efficiency by filling bins more evenly. * Add context cancellation check to drainReader loop The row-streaming loop in drainReader did not check ctx between iterations, making long compaction merges uncancellable. Check ctx.Done() at the top of each iteration. * Fix splitOversizedBin to always respect targetSize limit The minFiles check in the split condition allowed bins to grow past targetSize when they had fewer than minFiles entries, defeating the OOM protection. Now bins always split at targetSize, and a trailing runt with fewer than minFiles entries is merged into the previous bin. * Add integration tests for iceberg table maintenance plugin worker Tests start a real weed mini cluster, create S3 buckets and Iceberg table metadata via filer gRPC, then exercise the iceberg.Handler operations (ExpireSnapshots, RemoveOrphans, RewriteManifests) against the live filer. A full maintenance cycle test runs all operations in sequence and verifies metadata consistency. Also adds exported method wrappers (testing_api.go) so the integration test package can call the unexported handler methods. * Fix splitOversizedBin dropping files and add source path to drainReader errors The runt-merge step could leave leading bins with fewer than minFiles entries (e.g. [80,80,10,10] with targetSize=100, minFiles=2 would drop the first 80-byte file). Replace the filter-based approach with an iterative merge that folds any sub-minFiles bin into its smallest neighbor, preserving all eligible files. Also add the source file path to drainReader error messages so callers can identify which Parquet file caused a read/write failure. * Harden integration test error handling - s3put: fail immediately on HTTP 4xx/5xx instead of logging and continuing - lookupEntry: distinguish NotFound (return nil) from unexpected RPC errors (fail the test) - writeOrphan and orphan creation in FullMaintenanceCycle: check CreateEntryResponse.Error in addition to the RPC error * go fmt --------- Co-authored-by: Copilot <copilot@github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	6 days ago
Chris Lu	47799a5b4f	fix tests	6 days ago
Chris Lu	1f2014568f	fix(mini): use "all" job type for plugin worker (#8634 ) The mini command previously hardcoded a list of specific job types (vacuum, volume_balance, erasure_coding, admin_script). Use the "all" category instead so that newly registered handlers are automatically picked up without requiring changes to the mini command.	6 days ago
Chris Lu	a838661b83	feat(plugin): EC shard balance handler for plugin worker (#8629 ) * feat(ec_balance): add TaskTypeECBalance constant and protobuf definitions Add the ec_balance task type constant to both topology and worker type systems. Define EcBalanceTaskParams, EcShardMoveSpec, and EcBalanceTaskConfig protobuf messages for EC shard balance operations. * feat(ec_balance): add configuration for EC shard balance task Config includes imbalance threshold, min server count, collection filter, disk type, and preferred tags for tag-aware placement. * feat(ec_balance): add multi-phase EC shard balance detection algorithm Implements four detection phases adapted from the ec.balance shell command: 1. Duplicate shard detection and removal proposals 2. Cross-rack shard distribution balancing 3. Within-rack node-level shard balancing 4. Global shard count equalization across nodes Detection is side-effect-free: it builds an EC topology view from ActiveTopology and generates move proposals without executing them. * feat(ec_balance): add EC shard move task execution Implements the shard move sequence using the same VolumeEcShardsCopy, VolumeEcShardsMount, VolumeEcShardsUnmount, and VolumeEcShardsDelete RPCs as the shell ec.balance command. Supports both regular shard moves and dedup-phase deletions (unmount+delete without copy). * feat(ec_balance): add task registration and scheduling Register EC balance task definition with auto-config update support. Scheduling respects max concurrent limits and worker capabilities. * feat(ec_balance): add plugin handler for EC shard balance Implements the full plugin handler with detection, execution, admin and worker config forms, proposal building, and decision trace reporting. Supports collection/DC/disk type filtering, preferred tag placement, and configurable detection intervals. Auto-registered via init() with the handler registry. * test(ec_balance): add tests for detection algorithm and plugin handler Detection tests cover: duplicate shard detection, cross-rack imbalance, within-rack imbalance, global rebalancing, topology building, collection filtering, and edge cases. Handler tests cover: config derivation with clamping, proposal building, protobuf encode/decode round-trip, fallback parameter decoding, capability, and config policy round-trip. * fix(ec_balance): address PR review feedback and fix CI test failure - Update TestWorkerDefaultJobTypes to expect 6 handlers (was 5) - Extract threshold constants (ecBalanceMinImbalanceThreshold, etc.) to eliminate magic numbers in Descriptor and config derivation - Remove duplicate ShardIdsToUint32 helper (use erasure_coding package) - Add bounds checks for int64→int/uint32 conversions to fix CodeQL integer conversion warnings * fix(ec_balance): address code review findings storage_impact.go: - Add TaskTypeECBalance case returning shard-level reservation (ShardSlots: -1/+1) instead of falling through to default which incorrectly reserves a full volume slot on target. detection.go: - Use dc:rack composite key to avoid cross-DC rack name collisions. Only create rack entries after confirming node has matching disks. - Add exceedsImbalanceThreshold check to cross-rack, within-rack, and global phases so trivial skews below the configured threshold are ignored. Dedup phase always runs since duplicates are errors. - Reserve destination capacity after each planned move (decrement destNode.freeSlots, update rackShardCount/nodeShardCount) to prevent overbooking the same destination. - Skip nodes with freeSlots <= 0 when selecting minNode in global balance to avoid proposing moves to full nodes. - Include loop index and source/target node IDs in TaskID to guarantee uniqueness across moves with the same volumeID/shardID. ec_balance_handler.go: - Fail fast with error when shard_id is absent in fallback parameter decoding instead of silently defaulting to shard 0. ec_balance_task.go: - Delegate GetProgress() to BaseTask.GetProgress() so progress updates from ReportProgressWithStage are visible to callers. - Add fail-fast guard rejecting multiple sources/targets until batch execution is implemented. Findings verified but not changed (matches existing codebase pattern in vacuum/balance/erasure_coding handlers): - register.go globalTaskDef.Config race: same unsynchronized pattern in all 4 task packages. - CreateTask using generated ID: same fmt.Sprintf pattern in all 4 task packages. * fix(ec_balance): harden parameter decoding, progress tracking, and validation ec_balance_handler.go (decodeECBalanceTaskParams): - Validate execution-critical fields (Sources[0].Node, ShardIds, Targets[0].Node, ShardIds) after protobuf deserialization. - Require source_disk_id and target_disk_id in legacy fallback path so Targets[0].DiskId is populated for VolumeEcShardsCopyRequest. - All error messages reference decodeECBalanceTaskParams and the specific missing field (TaskParams, shard_id, Targets[0].DiskId, EcBalanceTaskParams) for debuggability. ec_balance_task.go: - Track progress in ECBalanceTask.progress field, updated via reportProgress() helper called before ReportProgressWithStage(), so GetProgress() returns real stage progress instead of stale 0. - Validate: require exactly 1 source and 1 target (mirrors Execute guard), require ShardIds on both, with error messages referencing ECBalanceTask.Validate and the specific field. * fix(ec_balance): fix dedup execution path, stale topology, collection filter, timeout, and dedupeKey detection.go: - Dedup moves now set target=source so isDedupPhase() triggers the unmount+delete-only execution path instead of attempting a copy. - Apply moves to in-memory topology between phases via applyMovesToTopology() so subsequent phases see updated shard placement and don't conflict with already-planned moves. - detectGlobalImbalance now accepts allowedVids and filters both shard counting and shard selection to respect CollectionFilter. ec_balance_task.go: - Apply EcBalanceTaskParams.TimeoutSeconds to the context via context.WithTimeout so all RPC operations respect the configured timeout instead of hanging indefinitely. ec_balance_handler.go: - Include source node ID in dedupeKey so dedup deletions from different source nodes for the same shard aren't collapsed. - Clamp minServerCountRaw and minIntervalRaw lower bounds on int64 before narrowing to int, preventing undefined overflow on 32-bit. * fix(ec_balance): log warning before cancelling on progress send failure Log the error, job ID, job type, progress percentage, and stage before calling execCancel() in the progress callback so failed progress sends are diagnosable instead of silently cancelling.	6 days ago
Chris Lu	e4a77b8b16	feat(admin): support env var and security.toml for credentials (#8606 ) * feat(security): add [admin] section to security.toml scaffold Add admin credential fields (user, password, readonly.user, readonly.password) to security.toml. Via viper's WEED_ env prefix and AutomaticEnv(), these are automatically overridable as WEED_ADMIN_USER, WEED_ADMIN_PASSWORD, etc. Ref: https://github.com/seaweedfs/seaweedfs/discussions/8586 * feat(admin): support env var and security.toml fallbacks for credentials Add applyViperFallback() to read admin credentials from security.toml / WEED_* environment variables when CLI flags are not explicitly set. This allows systems like NixOS to pass secrets via env vars instead of CLI flags, which appear in process listings. Precedence: CLI flag > env var / security.toml > default value. Also change -adminUser default from "admin" to "" so that credentials are fully opt-in. Ref: https://github.com/seaweedfs/seaweedfs/discussions/8586 * feat(helm): use WEED_ env vars for admin credentials instead of CLI flags Rename SEAWEEDFS_ADMIN_USER/PASSWORD to WEED_ADMIN_USER/PASSWORD so viper picks them up natively. Remove -adminUser/-adminPassword shell expansion from command args since the Go binary now reads these directly via viper. * docs(admin): document env var and security.toml credential support Add environment variable mapping table, security.toml example, and precedence rules to the admin README. * style(security): use nested [admin.readonly] table in security.toml Use a nested TOML table instead of dotted keys for the readonly credentials. More idiomatic and easier to read; no change in how Viper parses it. * fix(admin): use util.GetViper() for env var support and fix README example applyViperFallback() was using viper.GetString() directly, which bypasses the WEED_ env prefix and AutomaticEnv setup that only happens in util.GetViper(). Switch to util.GetViper().GetString() so WEED_ADMIN_* environment variables are actually picked up. Also fix the README example to include WEED_ADMIN_USER alongside WEED_ADMIN_PASSWORD, since runAdmin() rejects an empty username when a password is set. * fix(admin): restore default adminUser to "admin" Defaulting adminUser to "" broke the common flow of setting only WEED_ADMIN_PASSWORD — runAdmin() rejects an empty username when a password is set. Restore "admin" as the default so that setting only the password works out of the box. * docs(admin): align README security.toml example with scaffold format Use nested [admin.readonly] table instead of flat dotted keys to match the format in weed/command/scaffold/security.toml. * docs(admin): remove README.md in favor of wiki page Admin documentation lives at the wiki (Admin-UI.md). Remove the in-repo README to avoid maintaining duplicate docs. --------- Co-authored-by: Copilot <copilot@github.com>	1 week ago
Chris Lu	737116e83c	fix port probing	1 week ago
Chris Lu	d4d2e511ed	for mini, default to bind all	2 weeks ago
Chris Lu	587c24ec89	plugin worker: support job type categories (all, default, heavy) (#8547 ) * plugin worker: add handler registry with job categories Introduce a self-registration pattern for plugin worker job handlers. Each handler can register itself via init() with a HandlerFactory that declares its job type, category (default/heavy), CLI aliases, and a builder function. ResolveHandlerFactories accepts a mix of category names ("all", "default", "heavy") and explicit job type names/aliases, returning the matching factories. This enables workers to be configured by resource profile rather than requiring explicit job type enumeration. * plugin worker: register all handlers via init() Each job handler now self-registers into the global handler registry with its canonical job type, category, CLI aliases, and build function: - vacuum: category=default - volume_balance: category=default - admin_script: category=default - erasure_coding: category=heavy - iceberg_maintenance: category=heavy Adding a new job type now only requires adding the init() call in the handler file itself — no other files need to be touched. * plugin worker: replace hardcoded job type switch with registry Remove buildPluginWorkerHandler, parsePluginWorkerJobTypes, and canonicalPluginWorkerJobType from worker_runtime.go. The simplified buildPluginWorkerHandlers now delegates to pluginworker.ResolveHandlerFactories, which resolves category names ("all", "default", "heavy") and explicit job type names/aliases. The default job type is changed from an explicit list to "all", so new handlers registered via init() are automatically picked up. Update all tests to use the new API. * plugin worker: update CLI help text for job categories Update the -jobType flag description and command examples to document category support (all, default, heavy) alongside explicit job type names. * plugin worker: address review feedback - Add CategoryAll constant; use typed constants in tokenAsCategory - Pre-allocate result slice in ResolveHandlerFactories - Add vacuum aliases (vol.vacuum, volume.vacuum) - List alias examples (ec, balance, iceberg) in -jobType flag help - Create handlers aggregator package for subpackage blank imports so new handler subpackages only need to be added in one place - Make category tests relationship-based (subset/union checks) instead of asserting exact handler counts - Add clarifying comments to worker_test.go and mini_plugin_test.go listing expected handler names next to count assertions --------- Co-authored-by: Copilot <copilot@github.com>	2 weeks ago
Chris Lu	72c2c7ef8b	Add iceberg_maintenance plugin worker handler (Phase 1) (#8501 ) * Add iceberg_maintenance plugin worker handler (Phase 1) Implement automated Iceberg table maintenance as a new plugin worker job type. The handler scans S3 table buckets for tables needing maintenance and executes operations in the correct Iceberg order: expire snapshots, remove orphan files, and rewrite manifests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix unsafe int64→int narrowing for MaxSnapshotsToKeep Use int64(wouldKeep) instead of int(config.MaxSnapshotsToKeep) to avoid potential truncation on 32-bit platforms (CodeQL high severity). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix unsafe int64→int narrowing for MinInputFiles Use int64(len(manifests)) instead of int(config.MinInputFiles) to avoid potential truncation on 32-bit platforms (CodeQL high severity). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix unsafe int64→int narrowing for MaxCommitRetries Clamp MaxCommitRetries to [1,20] range and keep as int64 throughout the retry loop to avoid truncation on 32-bit platforms (CodeQL high severity). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Sort snapshots explicitly by timestamp in expireSnapshots The previous logic relied on implicit ordering of the snapshot list. Now explicitly sorts snapshots by timestamp descending (most recent first) and uses a simpler keep-count loop: keep the first MaxSnapshotsToKeep newest snapshots plus the current snapshot unconditionally, then expire the rest that exceed the retention window. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Handle errors properly in listFilerEntries Previously all errors from ListEntries and Recv were silently swallowed. Now: treat "not found" errors as empty directory, propagate other ListEntries errors, and check for io.EOF explicitly on Recv instead of breaking on any error. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix overly broad HasSuffix check in orphan detection The bare strings.HasSuffix(ref, entry.Name) could match files with similar suffixes (e.g. "123.avro" matching "snap-123.avro"). Replaced with exact relPath match and a "/"-prefixed suffix check to avoid false positives. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Replace fmt.Sscanf with strconv.Atoi in extractMetadataVersion strconv.Atoi is more explicit and less fragile than fmt.Sscanf for parsing a simple integer from a trimmed string. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Recursively traverse directories for orphan file detection The orphan cleanup only listed a single directory level under data/ and metadata/, skipping IsDirectory entries. Partitioned Iceberg tables store data files in nested partition directories (e.g. data/region=us-east/file.parquet) which were never evaluated. Add walkFilerEntries helper that recursively descends into subdirectories, and use it in removeOrphans so all nested files are considered for orphan checks. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix manifest path drift from double time.Now() calls rewriteManifests called time.Now().UnixMilli() twice: once for the path embedded in WriteManifest and once for the filename passed to saveFilerFile. These timestamps would differ, causing the manifest's internal path reference to not match the actual saved filename. Compute the filename once and reuse it for both WriteManifest and saveFilerFile so they always reference the same path. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add TestManifestRewritePathConsistency test Verifies that WriteManifest returns a ManifestFile whose FilePath() matches the path passed in, and that path.Base() of that path matches the filename used for saveFilerFile. This validates the single- timestamp pattern used in rewriteManifests produces consistent paths. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Make parseOperations return error on unknown operations Previously parseOperations silently dropped unknown operation names and could return an empty list. Now validates inputs against the canonical set and returns a clear error if any unknown operation is specified. Updated Execute to surface the error instead of proceeding with an empty operation list. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Use gRPC status codes instead of string matching in listFilerEntries Replace brittle strings.Contains(err.Error(), "not found") check with status.Code(err) == codes.NotFound for proper gRPC error handling. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add stale-plan guard in commit closures for expireSnapshots and rewriteManifests Both operations plan outside the commit mutation using a snapshot ID captured from the initial metadata read. If the table head advances concurrently, the mutation would create a snapshot parented to the wrong head or remove snapshots based on a stale view. Add a guard inside each mutation closure that verifies currentMeta.CurrentSnapshot().SnapshotID still matches the planned snapshot ID. If it differs, return errStalePlan which propagates immediately (not retried, since the plan itself is invalid). Also fix rewriteManifests to derive SequenceNumber from the fresh metadata (cs.SequenceNumber) instead of the captured currentSnap. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add compare-and-swap to updateTableMetadataXattr updateTableMetadataXattr previously re-read the entry but did not verify the metadataVersion matched what commitWithRetry had loaded. A concurrent update could be silently clobbered. Now accepts expectedVersion parameter and compares it against the stored metadataVersion before writing. Returns errMetadataVersionConflict on mismatch, which commitWithRetry treats as retryable (deletes the staged metadata file and retries with fresh state). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Export shared plugin worker helpers for use by sub-packages Export ShouldSkipDetectionByInterval, BuildExecutorActivity, and BuildDetectorActivity so the iceberg sub-package can reuse them without duplicating logic. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Refactor iceberg maintenance handler into weed/plugin/worker/iceberg package Split the 1432-line iceberg_maintenance_handler.go into focused files in a new iceberg sub-package: handler.go, config.go, detection.go, operations.go, filer_io.go, and compact.go (Phase 2 data compaction). Key changes: - Rename types to drop stutter (IcebergMaintenanceHandler → Handler, etc.) - Fix loadFileByIcebergPath to preserve nested directory paths via normalizeIcebergPath instead of path.Base which dropped subdirectories - Check SendProgress errors instead of discarding them - Add stale-plan guard to compactDataFiles commitWithRetry closure - Add "compact" operation to parseOperations canonical order - Duplicate readStringConfig/readInt64Config helpers (~20 lines) - Update worker_runtime.go to import new iceberg sub-package Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Remove iceberg_maintenance from default plugin worker job types Iceberg maintenance is not yet ready to be enabled by default. Workers can still opt in by explicitly listing iceberg_maintenance in their job types configuration. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Clamp config values to safe minimums in ParseConfig Prevents misconfiguration by enforcing minimum values using the default constants for all config fields. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Harden filer I/O: path helpers, strict CAS guard, path traversal prevention - Use path.Dir/path.Base instead of strings.SplitN in loadCurrentMetadata - Make CAS guard error on missing or unparseable metadataVersion - Add path.Clean and traversal validation in loadFileByIcebergPath Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix compact: single snapshot ID, oversized bin splitting, ensureFilerDir - Use single newSnapID for all manifest entries in a compaction run - Add splitOversizedBin to break bins exceeding targetSize - Make ensureFilerDir only create on NotFound, propagate other errors Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add wildcard filters, scan limit, and context cancellation to table scanning - Use wildcard matchers (, ?) for bucket/namespace/table filters - Add limit parameter to scanTablesForMaintenance for early termination - Add ctx.Done() checks in bucket and namespace scan loops - Update filter UI descriptions and placeholders for wildcard support Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Remove dead detection interval check and validate namespace parameter - Remove ineffective ShouldSkipDetectionByInterval call with hardcoded 0 - Add namespace to required parameter validation in Execute Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Improve operations: exponential backoff, orphan matching, full file cleanup - Use exponential backoff (50ms, 100ms, 200ms, ...) in commitWithRetry - Use normalizeIcebergPath for orphan matching instead of fragile suffix check - Add collectSnapshotFiles to traverse manifest lists → manifests → data files - Delete all unreferenced files after expiring snapshots, not just manifest lists - Refactor removeOrphans to reuse collectSnapshotFiles Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * iceberg: fix ensureFilerDir to handle filer_pb.ErrNotFound sentinel filer_pb.LookupEntry converts gRPC NotFound errors to filer_pb.ErrNotFound (a plain sentinel), so status.Code() never returns codes.NotFound for that error. This caused ensureFilerDir to return an error instead of creating the directory when it didn't exist. * iceberg: clean up orphaned artifacts when compaction commit fails Track all files written during compaction (merged data files, manifest, manifest list) and delete them if the commit or any subsequent write step fails, preventing orphaned files from accumulating in the filer. * iceberg: derive tablePath from namespace/tableName when empty An empty table_path parameter would be passed to maintenance operations unchecked. Default it to path.Join(namespace, tableName) when not provided. * iceberg: make collectSnapshotFiles return error on read/parse failure Previously, errors reading manifests were logged and skipped, returning a partial reference set. This could cause incorrect delete decisions during snapshot expiration or orphan cleanup. Now the function returns an error and all callers abort when reference data is incomplete. * iceberg: include active metadata file in removeOrphans referenced set The metadataFileName returned by loadCurrentMetadata was discarded, so the active metadata file could be incorrectly treated as an orphan and deleted. Capture it and add it to the referencedFiles map. * iceberg: only retry commitWithRetry on metadata version conflicts Previously all errors from updateTableMetadataXattr triggered retries. Now only errMetadataVersionConflict causes retry; other errors (permissions, transport, malformed xattr) fail immediately. * iceberg: respect req.Limit in fakeFilerServer.ListEntries mock The mock ListEntries ignored the Limit field, so tests couldn't exercise pagination. Now it stops streaming once Limit entries have been sent. * iceberg: validate parquet schema compatibility before merging files mergeParquetFiles now compares each source file's schema against the first file's schema and aborts with a clear error if they differ, instead of blindly writing rows that could panic or produce corrupt output. * iceberg: normalize empty JobType to canonical jobType in Execute events When request.Job.JobType is empty, status events and completion messages were emitted with a blank job type. Derive a canonical value early and use it consistently in all outbound events. * iceberg: log warning on unexpected config value types in read helpers readStringConfig and readInt64Config now log a V(1) warning when they encounter an unhandled ConfigValue kind, aiding debugging of unexpected config types that silently fall back to defaults. * worker: add iceberg_maintenance to default plugin worker job types Workers using the default job types list didn't advertise the iceberg_maintenance handler despite the handler and canonical name being registered. Add it so workers pick up the handler by default. * iceberg: use defer and detached context for compaction artifact cleanup The cleanup closure used the job context which could already be canceled, and was not called on ctx.Done() early exits. Switch to a deferred cleanup with a detached context (30s timeout) so artifact deletion completes on all exit paths including context cancellation. * iceberg: use proportional jitter in commitWithRetry backoff Fixed 25ms max jitter becomes insignificant at higher retry attempts. Use 0-20% of the current backoff value instead so jitter scales with the exponential delay. * iceberg: add malformed filename cases to extractMetadataVersion test Cover edge cases like "invalid.metadata.json", "metadata.json", "", and "v.metadata.json" to ensure the function returns 0 for unparseable inputs. * iceberg: fail compaction on manifest read errors and skip delete manifests Previously, unreadable manifests were silently skipped during compaction, which could drop live files from the entry set. Now manifest read/parse errors are returned as fatal errors. Also abort compaction when delete manifests exist since the compactor does not apply deletes — carrying them through unchanged could produce incorrect results. * iceberg: use table-relative path for active metadata file in orphan scan metadataFileName was stored as a basename (e.g. "v1.metadata.json") but the orphan scanner matches against table-relative paths like "metadata/v1.metadata.json". Prefix with "metadata/" so the active metadata file is correctly recognized as referenced. * iceberg: fix MetadataBuilderFromBase location to use metadata file path The second argument to MetadataBuilderFromBase records the previous metadata file in the metadata log. Using meta.Location() (the table root) was incorrect — it must be the actual metadata file path so old metadata files can be tracked and eventually cleaned up. * iceberg: update metadataLocation and versionToken in xattr on commit updateTableMetadataXattr was only updating metadataVersion, modifiedAt, and fullMetadata but not metadataLocation or versionToken. This left catalog state inconsistent after maintenance commits — the metadataLocation still pointed to the old metadata file and the versionToken was stale. Add a newMetadataLocation parameter and regenerate the versionToken on every commit, matching the S3 Tables handler behavior. * iceberg: group manifest entries by partition spec in rewriteManifests rewriteManifests was writing all entries into a single manifest using the table's current partition spec. For spec-evolved tables where manifests reference different partition specs, this produces an invalid manifest. Group entries by the source manifest's PartitionSpecID and write one merged manifest per spec, looking up each spec from the table's PartitionSpecs list. * iceberg: remove dead code loop for non-data manifests in compaction The early abort guard at the top of compactDataFiles already ensures no delete manifests are present. The loop that copied non-data manifests into allManifests was unreachable dead code. * iceberg: use JSON encoding in partitionKey for unambiguous grouping partitionKey used fmt.Sprintf("%d=%v") joined by commas, which produces ambiguous keys when partition values contain commas or '='. Use json.Marshal for values and NUL byte as separator to eliminate collisions. * iceberg: precompute normalized reference set in removeOrphans The orphan check was O(files × refs) because it normalized each reference path inside the per-file loop. Precompute the normalized set once for O(1) lookups per candidate file. * iceberg: add artifact cleanup to rewriteManifests on commit failure rewriteManifests writes merged manifests and a manifest list to the filer before committing but did not clean them up on failure. Add the same deferred cleanup pattern used by compactDataFiles: track written artifacts and delete them if the commit does not succeed. * iceberg: pass isDeleteData=true in deleteFilerFile deleteFilerFile called DoRemove with isDeleteData=false, which only removed filer metadata and left chunk data behind on volume servers. All other data-file deletion callers in the codebase pass true. * iceberg: clean up test: remove unused snapID, simplify TestDetectWithFakeFiler Remove unused snapID variable and eliminate the unnecessary second fake filer + entry copy in TestDetectWithFakeFiler by capturing the client from the first startFakeFiler call. * fix: update TestWorkerDefaultJobTypes to expect 5 job types The test expected 4 default job types but iceberg_maintenance was added as a 5th default in a previous commit. * iceberg: document client-side CAS TOCTOU limitation in updateTableMetadataXattr Add a note explaining the race window where two workers can both pass the version check and race at UpdateEntry. The proper fix requires server-side precondition support on UpdateEntryRequest. * iceberg: remove unused sender variable in TestFullExecuteFlow * iceberg: abort compaction when multiple partition specs are present The compactor writes all entries into a single manifest using the current partition spec, which is invalid for spec-evolved tables. Detect multiple PartitionSpecIDs and skip compaction until per-spec compaction is implemented. * iceberg: validate tablePath to prevent directory traversal Sanitize the table_path parameter with path.Clean and verify it matches the expected namespace/tableName prefix to prevent path traversal attacks via crafted job parameters. * iceberg: cap retry backoff at 5s and make it context-aware The exponential backoff could grow unbounded and blocked on time.Sleep ignoring context cancellation. Cap at 5s and use a timer with select on ctx.Done so retries respect cancellation. * iceberg: write manifest list with new snapshot identity in rewriteManifests The manifest list was written with the old snapshot's ID and sequence number, but the new snapshot created afterwards used a different identity. Compute newSnapshotID and newSeqNum before writing manifests and the manifest list so all artifacts are consistent. * ec: also remove .vif file in removeEcVolumeFiles removeEcVolumeFiles cleaned up .ecx, .ecj, and shard files but not the .vif volume info file, leaving it orphaned. The .vif file lives in the data directory alongside shard files. The directory handling for index vs data files was already correct: .ecx/.ecj are removed from IdxDirectory and shard files from Directory, matching how NewEcVolume loads them. Revert "ec: also remove .vif file in removeEcVolumeFiles" This reverts commit `acc82449e1`. * iceberg: skip orphan entries with nil Attributes instead of defaulting to epoch When entry.Attributes is nil, mtime defaulted to Unix epoch (1970), making unknown-age entries appear ancient and eligible for deletion. Skip these entries instead to avoid deleting files whose age cannot be determined. * iceberg: use unique metadata filenames to prevent concurrent write clobbering Add timestamp nonce to metadata filenames (e.g. v3-1709766000.metadata.json) so concurrent writers stage to distinct files. Update extractMetadataVersion to strip the nonce suffix, and loadCurrentMetadata to read the actual filename from the metadataLocation xattr field. * iceberg: defer artifact tracking until data file builder succeeds Move the writtenArtifacts append to after NewDataFileBuilder succeeds, so a failed builder doesn't leave a stale entry for an already-deleted file in the cleanup list. * iceberg: use detached context for metadata file cleanup Use context.WithTimeout(context.Background(), 10s) when deleting staged metadata files after CAS failure, so cleanup runs even if the original request context is canceled. * test: update default job types count to include iceberg_maintenance * iceberg: use parquet.EqualNodes for structural schema comparison Replace String()-based schema comparison with parquet.EqualNodes which correctly compares types, repetition levels, and logical types. * iceberg: add nonce-suffixed filename cases to TestExtractMetadataVersion * test: assert iceberg_maintenance is present in default job types * iceberg: validate operations config early in Detect Call parseOperations in Detect so typos in the operations config fail fast before emitting proposals, matching the validation already done in Execute. * iceberg: detect chunked files in loadFileByIcebergPath Return an explicit error when a file has chunks but no inline content, rather than silently returning empty data. Data files uploaded via S3 are stored as chunks, so compaction would otherwise produce corrupt merged files. --------- Co-authored-by: Copilot <copilot@github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2 weeks ago
Chris Lu	45ce18266a	Disable master maintenance scripts when admin server runs (#8499 ) * Disable master maintenance scripts when admin server runs * Stop defaulting master maintenance scripts * Apply suggestion from @gemini-code-assist[bot] Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * Apply suggestion from @gemini-code-assist[bot] Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * Clarify master scripts are disabled by default * Skip master maintenance scripts when admin server is connected * Restore default master maintenance scripts * Document admin server skip for master maintenance scripts --------- Co-authored-by: Copilot <copilot@github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2 weeks ago
Chris Lu	e1e5b4a8a6	add admin script worker (#8491 ) * admin: add plugin lock coordination * shell: allow bypassing lock checks * plugin worker: add admin script handler * mini: include admin_script in plugin defaults * admin script UI: drop name and enlarge text * admin script: add default script * admin_script: make run interval configurable * plugin: gate other jobs during admin_script runs * plugin: use last completed admin_script run * admin: backfill plugin config defaults * templ Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com> * comparable to default version Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com> * default to run Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com> * format Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com> * shell: respect pre-set noLock for fix.replication * shell: add force no-lock mode for admin scripts * volume balance worker already exists Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com> * admin: expose scheduler status JSON * shell: add sleep command * shell: restrict sleep syntax * Revert "shell: respect pre-set noLock for fix.replication" This reverts commit `2b14e8b826`. * templ Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com> * fix import Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com> * less logs Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com> * Reduce master client logs on canceled contexts * Update mini default job type count --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	3 weeks ago
Chris Lu	f5c35240be	Add volume dir tags and EC placement priority (#8472 ) * Add volume dir tags to topology Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add preferred tag config for EC Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Prioritize EC destinations by tags Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add EC placement planner tag tests Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Refactor EC placement tests to reuse buildActiveTopology Remove buildActiveTopologyWithDiskTags helper function and consolidate tag setup inline in test cases. Tests now use UpdateTopology to apply tags after topology creation, reusing the existing buildActiveTopology function rather than duplicating its logic. All tag scenario tests pass: - TestECPlacementPlannerPrefersTaggedDisks - TestECPlacementPlannerFallsBackWhenTagsInsufficient Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Consolidate normalizeTagList into shared util package Extract normalizeTagList from three locations (volume.go, detection.go, erasure_coding_handler.go) into new weed/util/tag.go as exported NormalizeTagList function. Replace all duplicate implementations with imports and calls to util.NormalizeTagList. This improves code reuse and maintainability by centralizing tag normalization logic. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add PreferredTags to EC config persistence Add preferred_tags field to ErasureCodingTaskConfig protobuf with field number 5. Update GetConfigSpec to include preferred_tags field in the UI configuration schema. Add PreferredTags to ToTaskPolicy to serialize config to protobuf. Add PreferredTags to FromTaskPolicy to deserialize from protobuf with defensive copy to prevent external mutation. This allows EC preferred tags to be persisted and restored across worker restarts. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add defensive copy for Tags slice in DiskLocation Copy the incoming tags slice in NewDiskLocation instead of storing by reference. This prevents external callers from mutating the DiskLocation.Tags slice after construction, improving encapsulation and preventing unexpected changes to disk metadata. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add doc comment to buildCandidateSets method Document the tiered candidate selection and fallback behavior. Explain that for a planner with preferredTags, it accumulates disks matching each tag in order into progressively larger tiers, emits a candidate set once a tier reaches shardsNeeded, and finally falls back to the full candidates set if preferred-tag tiers are insufficient. This clarifies the intended semantics for future maintainers. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Apply final PR review fixes 1. Update parseVolumeTags to replicate single tag entry to all folders instead of leaving some folders with nil tags. This prevents nil pointer dereferences when processing folders without explicit tags. 2. Add defensive copy in ToTaskPolicy for PreferredTags slice to match the pattern used in FromTaskPolicy, preventing external mutation of the returned TaskPolicy. 3. Add clarifying comment in buildCandidateSets explaining that the shardsNeeded <= 0 branch is a defensive check for direct callers, since selectDestinations guarantees shardsNeeded > 0. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix nil pointer dereference in parseVolumeTags Ensure all folder tags are initialized to either normalized tags or empty slices, not nil. When multiple tag entries are provided and there are more folders than entries, remaining folders now get empty slices instead of nil, preventing nil pointer dereference in downstream code. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix NormalizeTagList to return empty slice instead of nil Change NormalizeTagList to always return a non-nil slice. When all tags are empty or whitespace after normalization, return an empty slice instead of nil. This prevents nil pointer dereferences in downstream code that expects a valid (possibly empty) slice. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add nil safety check for v.tags pointer Add a safety check to handle the case where v.tags might be nil, preventing a nil pointer dereference. If v.tags is nil, use an empty string instead. This is defensive programming to prevent panics in edge cases. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add volume.tags flag to weed server and weed mini commands Add the volume.tags CLI option to both the 'weed server' and 'weed mini' commands. This allows users to specify disk tags when running the combined server modes, just like they can with 'weed volume'. The flag uses the same format and description as the volume command: comma-separated tag groups per data dir with ':' separators (e.g. fast:ssd,archive). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <copilot@github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	3 weeks ago
Chris Lu	4f647e1036	Worker set its working directory (#8461 ) * set working directory * consolidate to worker directory * working directory * correct directory name * refactoring to use wildcard matcher * simplify * cleaning ec working directory * fix reference * clean * adjust test	3 weeks ago
blitt001	3d81d5bef7	Fix S3 signature verification behind reverse proxies (#8444 ) * Fix S3 signature verification behind reverse proxies When SeaweedFS is deployed behind a reverse proxy (e.g. nginx, Kong, Traefik), AWS S3 Signature V4 verification fails because the Host header the client signed with (e.g. "localhost:9000") differs from the Host header SeaweedFS receives on the backend (e.g. "seaweedfs:8333"). This commit adds a new -s3.externalUrl parameter (and S3_EXTERNAL_URL environment variable) that tells SeaweedFS what public-facing URL clients use to connect. When set, SeaweedFS uses this host value for signature verification instead of the Host header from the incoming request. New parameter: -s3.externalUrl (flag) or S3_EXTERNAL_URL (environment variable) Example: -s3.externalUrl=http://localhost:9000 Example: S3_EXTERNAL_URL=https://s3.example.com The environment variable is particularly useful in Docker/Kubernetes deployments where the external URL is injected via container config. The flag takes precedence over the environment variable when both are set. At startup, the URL is parsed and default ports are stripped to match AWS SDK behavior (port 80 for HTTP, port 443 for HTTPS), so "http://s3.example.com:80" and "http://s3.example.com" are equivalent. Bugs fixed: - Default port stripping was removed by a prior PR, causing signature mismatches when clients connect on standard ports (80/443) - X-Forwarded-Port was ignored when X-Forwarded-Host was not present - Scheme detection now uses proper precedence: X-Forwarded-Proto > TLS connection > URL scheme > "http" - Test expectations for standard port stripping were incorrect - expectedHost field in TestSignatureV4WithForwardedPort was declared but never actually checked (self-referential test) * Add Docker integration test for S3 proxy signature verification Docker Compose setup with nginx reverse proxy to validate that the -s3.externalUrl parameter (or S3_EXTERNAL_URL env var) correctly resolves S3 signature verification when SeaweedFS runs behind a proxy. The test uses nginx proxying port 9000 to SeaweedFS on port 8333, with X-Forwarded-Host/Port/Proto headers set. SeaweedFS is configured with -s3.externalUrl=http://localhost:9000 so it uses "localhost:9000" for signature verification, matching what the AWS CLI signs with. The test can be run with aws CLI on the host or without it by using the amazon/aws-cli Docker image with --network host. Test covers: create-bucket, list-buckets, put-object, head-object, list-objects-v2, get-object, content round-trip integrity, delete-object, and delete-bucket — all through the reverse proxy. * Create s3-proxy-signature-tests.yml * fix CLI * fix CI * Update s3-proxy-signature-tests.yml * address comments * Update Dockerfile * add user * no need for fuse * Update s3-proxy-signature-tests.yml * debug * weed mini * fix health check * health check * fix health checking --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Chris Lu <chris.lu@gmail.com>	3 weeks ago
Chris Lu	d2b92938ee	Make EC detection context aware (#8449 ) * Make EC detection context aware * Update register.go * Speed up EC detection planning * Add tests for EC detection planner * optimizations detection.go: extracted ParseCollectionFilter (exported) and feed it into the detection loop so both detection and tracing share the same parsing/whitelisting logic; the detection loop now iterates on a sorted list of volume IDs, checks the context at every iteration, and only sets hasMore when there are still unprocessed groups after hitting maxResults, keeping runtime bounded while still scheduling planned tasks before returning the results. erasure_coding_handler.go: dropped the duplicated inline filter parsing in emitErasureCodingDetectionDecisionTrace and now reuse erasurecodingtask.ParseCollectionFilter, and the summary suffix logic now only accounts for the hasMore case that can actually happen. detection_test.go: updated the helper topology builder to use master_pb.VolumeInformationMessage (matching the current protobuf types) and tightened the cancellation/max-results tests so they reliably exercise the detection logic (cancel before calling Detection, and provide enough disks so one result is produced before the limit). * use working directory * fix compilation * fix compilation * rename * go vet * fix getenv * address comments, fix error	3 weeks ago
Chris Lu	7f6e58b791	Fix SFTP file upload failures with JWT filer tokens (#8448 ) * Fix SFTP file upload failures with JWT filer tokens (issue #8425) When JWT authentication is enabled for filer operations via jwt.filer_signing.* configuration, SFTP server file upload requests were rejected because they lacked JWT authorization headers. Changes: - Added JWT signing key and expiration fields to SftpServer struct - Modified putFile() to generate and include JWT tokens in upload requests - Enhanced SFTPServiceOptions with JWT configuration fields - Updated SFTP command startup to load and pass JWT config to service This allows SFTP uploads to authenticate with JWT-enabled filers, consistent with how other SeaweedFS components (S3 API, file browser) handle filer auth. Fixes #8425 * Apply suggestion from @gemini-code-assist[bot] Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	3 weeks ago
Anton	427c975ff3	fix(plugin/worker): make VacuumHandler report MaxExecutionConcurrency from worker startup flag (#8435 ) * fix(plugin/worker): make VacuumHandler report MaxExecutionConcurrency from worker startup flag Previously, MaxExecutionConcurrency was hardcoded to 2 in VacuumHandler.Capability(). The scheduler's schedulerWorkerExecutionLimit() takes the minimum of the UI-configured PerWorkerExecutionConcurrency and the worker-reported capability limit, so the hardcoded value silently capped each worker to 2 concurrent vacuum executions regardless of the --max-execute flag passed at worker startup. Pass maxExecutionConcurrency into NewVacuumHandler() and wire it through buildPluginWorkerHandler/buildPluginWorkerHandlers so the capability reflects the actual worker configuration. The default falls back to 2 when the value is unset or zero. * Update weed/command/worker_runtime.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Anton Ustyugov <anton@devops> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	4 weeks ago
Chris Lu	8d59ef41d5	Admin UI: replace gin with mux (#8420 ) * Replace admin gin router with mux * Update layout_templ.go * Harden admin handlers * Add login CSRF handling * Fix filer copy naming conflict * address comments * address comments	4 weeks ago
Chris Lu	e596542295	Move SQL engine and PostgreSQL server to their own binaries (#8417 ) * Drop SQL engine and PostgreSQL server * Split SQL tooling into weed-db and weed-sql * move * fix building	4 weeks ago
Chris Lu	e4b70c2521	go fix	4 weeks ago
Chris Lu	8ec9ff4a12	Refactor plugin system and migrate worker runtime (#8369 ) * admin: add plugin runtime UI page and route wiring * pb: add plugin gRPC contract and generated bindings * admin/plugin: implement worker registry, runtime, monitoring, and config store * admin/dash: wire plugin runtime and expose plugin workflow APIs * command: add flags to enable plugin runtime * admin: rename remaining plugin v2 wording to plugin * admin/plugin: add detectable job type registry helper * admin/plugin: add scheduled detection and dispatch orchestration * admin/plugin: prefetch job type descriptors when workers connect * admin/plugin: add known job type discovery API and UI * admin/plugin: refresh design doc to match current implementation * admin/plugin: enforce per-worker scheduler concurrency limits * admin/plugin: use descriptor runtime defaults for scheduler policy * admin/ui: auto-load first known plugin job type on page open * admin/plugin: bootstrap persisted config from descriptor defaults * admin/plugin: dedupe scheduled proposals by dedupe key * admin/ui: add job type and state filters for plugin monitoring * admin/ui: add per-job-type plugin activity summary * admin/plugin: split descriptor read API from schema refresh * admin/ui: keep plugin summary metrics global while tables are filtered * admin/plugin: retry executor reservation before timing out * admin/plugin: expose scheduler states for monitoring * admin/ui: show per-job-type scheduler states in plugin monitor * pb/plugin: rename protobuf package to plugin * admin/plugin: rename pluginRuntime wiring to plugin * admin/plugin: remove runtime naming from plugin APIs and UI * admin/plugin: rename runtime files to plugin naming * admin/plugin: persist jobs and activities for monitor recovery * admin/plugin: lease one detector worker per job type * admin/ui: show worker load from plugin heartbeats * admin/plugin: skip stale workers for detector and executor picks * plugin/worker: add plugin worker command and stream runtime scaffold * plugin/worker: implement vacuum detect and execute handlers * admin/plugin: document external vacuum plugin worker starter * command: update plugin.worker help to reflect implemented flow * command/admin: drop legacy Plugin V2 label * plugin/worker: validate vacuum job type and respect min interval * plugin/worker: test no-op detect when min interval not elapsed * command/admin: document plugin.worker external process * plugin/worker: advertise configured concurrency in hello * command/plugin.worker: add jobType handler selection * command/plugin.worker: test handler selection by job type * command/plugin.worker: persist worker id in workingDir * admin/plugin: document plugin.worker jobType and workingDir flags * plugin/worker: support cancel request for in-flight work * plugin/worker: test cancel request acknowledgements * command/plugin.worker: document workingDir and jobType behavior * plugin/worker: emit executor activity events for monitor * plugin/worker: test executor activity builder * admin/plugin: send last successful run in detection request * admin/plugin: send cancel request when detect or execute context ends * admin/plugin: document worker cancel request responsibility * admin/handlers: expose plugin scheduler states API in no-auth mode * admin/handlers: test plugin scheduler states route registration * admin/plugin: keep worker id on worker-generated activity records * admin/plugin: test worker id propagation in monitor activities * admin/dash: always initialize plugin service * command/admin: remove plugin enable flags and default to enabled * admin/dash: drop pluginEnabled constructor parameter * admin/plugin UI: stop checking plugin enabled state * admin/plugin: remove docs for plugin enable flags * admin/dash: remove unused plugin enabled check method * admin/dash: fallback to in-memory plugin init when dataDir fails * admin/plugin API: expose worker gRPC port in status * command/plugin.worker: resolve admin gRPC port via plugin status * split plugin UI into overview/configuration/monitoring pages * Update layout_templ.go * add volume_balance plugin worker handler * wire plugin.worker CLI for volume_balance job type * add erasure_coding plugin worker handler * wire plugin.worker CLI for erasure_coding job type * support multi-job handlers in plugin worker runtime * allow plugin.worker jobType as comma-separated list * admin/plugin UI: rename to Workers and simplify config view * plugin worker: queue detection requests instead of capacity reject * Update plugin_worker.go * plugin volume_balance: remove force_move/timeout from worker config UI * plugin erasure_coding: enforce local working dir and cleanup * admin/plugin UI: rename admin settings to job scheduling * admin/plugin UI: persist and robustly render detection results * admin/plugin: record and return detection trace metadata * admin/plugin UI: show detection process and decision trace * plugin: surface detector decision trace as activities * mini: start a plugin worker by default * admin/plugin UI: split monitoring into detection and execution tabs * plugin worker: emit detection decision trace for EC and balance * admin workers UI: split monitoring into detection and execution pages * plugin scheduler: skip proposals for active assigned/running jobs * admin workers UI: add job queue tab * plugin worker: add dummy stress detector and executor job type * admin workers UI: reorder tabs to detection queue execution * admin workers UI: regenerate plugin template * plugin defaults: include dummy stress and add stress tests * plugin dummy stress: rotate detection selections across runs * plugin scheduler: remove cross-run proposal dedupe * plugin queue: track pending scheduled jobs * plugin scheduler: wait for executor capacity before dispatch * plugin scheduler: skip detection when waiting backlog is high * plugin: add disk-backed job detail API and persistence * admin ui: show plugin job detail modal from job id links * plugin: generate unique job ids instead of reusing proposal ids * plugin worker: emit heartbeats on work state changes * plugin registry: round-robin tied executor and detector picks * add temporary EC overnight stress runner * plugin job details: persist and render EC execution plans * ec volume details: color data and parity shard badges * shard labels: keep parity ids numeric and color-only distinction * admin: remove legacy maintenance UI routes and templates * admin: remove dead maintenance endpoint helpers * Update layout_templ.go * remove dummy_stress worker and command support * refactor plugin UI to job-type top tabs and sub-tabs * migrate weed worker command to plugin runtime * remove plugin.worker command and keep worker runtime with metrics * update helm worker args for jobType and execution flags * set plugin scheduling defaults to global 16 and per-worker 4 * stress: fix RPC context reuse and remove redundant variables in ec_stress_runner * admin/plugin: fix lifecycle races, safe channel operations, and terminal state constants * admin/dash: randomize job IDs and fix priority zero-value overwrite in plugin API * admin/handlers: implement buffered rendering to prevent response corruption * admin/plugin: implement debounced persistence flusher and optimize BuildJobDetail memory lookups * admin/plugin: fix priority overwrite and implement bounded wait in scheduler reserve * admin/plugin: implement atomic file writes and fix run record side effects * admin/plugin: use P prefix for parity shard labels in execution plans * admin/plugin: enable parallel execution for cancellation tests * admin: refactor time.Time fields to pointers for better JSON omitempty support * admin/plugin: implement pointer-safe time assignments and comparisons in plugin core * admin/plugin: fix time assignment and sorting logic in plugin monitor after pointer refactor * admin/plugin: update scheduler activity tracking to use time pointers * admin/plugin: fix time-based run history trimming after pointer refactor * admin/dash: fix JobSpec struct literal in plugin API after pointer refactor * admin/view: add D/P prefixes to EC shard badges for UI consistency * admin/plugin: use lifecycle-aware context for schema prefetching * Update ec_volume_details_templ.go * admin/stress: fix proposal sorting and log volume cleanup errors * stress: refine ec stress runner with math/rand and collection name - Added Collection field to VolumeEcShardsDeleteRequest for correct filename construction. - Replaced crypto/rand with seeded math/rand PRNG for bulk payloads. - Added documentation for EcMinAge zero-value behavior. - Added logging for ignored errors in volume/shard deletion. * admin: return internal server error for plugin store failures Changed error status code from 400 Bad Request to 500 Internal Server Error for failures in GetPluginJobDetail to correctly reflect server-side errors. * admin: implement safe channel sends and graceful shutdown sync - Added sync.WaitGroup to Plugin struct to manage background goroutines. - Implemented safeSendCh helper using recover() to prevent panics on closed channels. - Ensured Shutdown() waits for all background operations to complete. * admin: robustify plugin monitor with nil-safe time and record init - Standardized nil-safe assignment for time.Time pointers (CreatedAt, UpdatedAt, CompletedAt). - Ensured persistJobDetailSnapshot initializes new records correctly if they don't exist on disk. - Fixed debounced persistence to trigger immediate write on job completion. admin: improve scheduler shutdown behavior and logic guards - Replaced brittle error string matching with explicit r.shutdownCh selection for shutdown detection. - Removed redundant nil guard in buildScheduledJobSpec. - Standardized WaitGroup usage for schedulerLoop. * admin: implement deep copy for job parameters and atomic write fixes - Implemented deepCopyGenericValue and used it in cloneTrackedJob to prevent shared state. - Ensured atomicWriteFile creates parent directories before writing. * admin: remove unreachable branch in shard classification Removed an unreachable 'totalShards <= 0' check in classifyShardID as dataShards and parityShards are already guarded. * admin: secure UI links and use canonical shard constants - Added rel="noopener noreferrer" to external links for security. - Replaced magic number 14 with erasure_coding.TotalShardsCount. - Used renderEcShardBadge for missing shard list consistency. * admin: stabilize plugin tests and fix regressions - Composed a robust plugin_monitor_test.go to handle asynchronous persistence. - Updated all time.Time literals to use timeToPtr helper. - Added explicit Shutdown() calls in tests to synchronize with debounced writes. - Fixed syntax errors and orphaned struct literals in tests. * Potential fix for code scanning alert no. 278: Slice memory allocation with excessive size value Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * Potential fix for code scanning alert no. 283: Uncontrolled data used in path expression Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * admin: finalize refinements for error handling, scheduler, and race fixes - Standardized HTTP 500 status codes for store failures in plugin_api.go. - Tracked scheduled detection goroutines with sync.WaitGroup for safe shutdown. - Fixed race condition in safeSendDetectionComplete by extracting channel under lock. - Implemented deep copy for JobActivity details. - Used defaultDirPerm constant in atomicWriteFile. * test(ec): migrate admin dockertest to plugin APIs * admin/plugin_api: fix RunPluginJobTypeAPI to return 500 for server-side detection/filter errors * admin/plugin_api: fix ExecutePluginJobAPI to return 500 for job execution failures * admin/plugin_api: limit parseProtoJSONBody request body to 1MB to prevent unbounded memory usage * admin/plugin: consolidate regex to package-level validJobTypePattern; add char validation to sanitizeJobID * admin/plugin: fix racy Shutdown channel close with sync.Once * admin/plugin: track sendLoop and recv goroutines in WorkerStream with r.wg * admin/plugin: document writeProtoFiles atomicity — .pb is source of truth, .json is human-readable only * admin/plugin: extract activityLess helper to deduplicate nil-safe OccurredAt sort comparators * test/ec: check http.NewRequest errors to prevent nil req panics * test/ec: replace deprecated ioutil/math/rand, fix stale step comment 5.1→3.1 * plugin(ec): raise default detection and scheduling throughput limits * topology: include empty disks in volume list and EC capacity fallback * topology: remove hard 10-task cap for detection planning * Update ec_volume_details_templ.go * adjust default * fix tests --------- Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>	1 month ago
Chris Lu	3300874cb5	filer: add default log purging to master maintenance scripts (#8359 ) * filer: add default log purging to master maintenance scripts * filer: fix default maintenance scripts to include full set of tasks * filer: refactor maintenance scripts to avoid duplication	1 month ago
Chris Lu	0d8588e3ae	S3: Implement IAM defaults and STS signing key fallback (#8348 ) * S3: Implement IAM defaults and STS signing key fallback logic * S3: Refactor startup order to init SSE-S3 key manager before IAM * S3: Derive STS signing key from KEK using HKDF for security isolation * S3: Document STS signing key fallback in security.toml * fix(s3api): refine anonymous access logic and secure-by-default behavior - Initialize anonymous identity by default in `NewIdentityAccessManagement` to prevent nil pointer exceptions. - Ensure `ReplaceS3ApiConfiguration` preserves the anonymous identity if not present in the new configuration. - Update `NewIdentityAccessManagement` signature to accept `filerClient`. - In legacy mode (no policy engine), anonymous defaults to Deny (no actions), preserving secure-by-default behavior. - Use specific `LookupAnonymous` method instead of generic map lookup. - Update tests to accommodate signature changes and verify improved anonymous handling. * feat(s3api): make IAM configuration optional - Start S3 API server without a configuration file if `EnableIam` option is set. - Default to `Allow` effect for policy engine when no configuration is provided (Zero-Config mode). - Handle empty configuration path gracefully in `loadIAMManagerFromConfig`. - Add integration test `iam_optional_test.go` to verify empty config behavior. * fix(iamapi): fix signature mismatch in NewIdentityAccessManagementWithStore * fix(iamapi): properly initialize FilerClient instead of passing nil * fix(iamapi): properly initialize filer client for IAM management - Instead of passing `nil`, construct a `wdclient.FilerClient` using the provided `Filers` addresses. - Ensure `NewIdentityAccessManagementWithStore` receives a valid `filerClient` to avoid potential nil pointer dereferences or limited functionality. * clean: remove dead code in s3api_server.go * refactor(s3api): improve IAM initialization, safety and anonymous access security * fix(s3api): ensure IAM config loads from filer after client init * fix(s3): resolve test failures in integration, CORS, and tagging tests - Fix CORS tests by providing explicit anonymous permissions config - Fix S3 integration tests by setting admin credentials in init - Align tagging test credentials in CI with IAM defaults - Added goroutine to retry IAM config load in iamapi server * fix(s3): allow anonymous access to health targets and S3 Tables when identities are present * fix(ci): use /healthz for Caddy health check in awscli tests * iam, s3api: expose DefaultAllow from IAM and Policy Engine This allows checking the global "Open by Default" configuration from other components like S3 Tables. * s3api/s3tables: support DefaultAllow in permission logic and handler Updated CheckPermissionWithContext to respect the DefaultAllow flag in PolicyContext. This enables "Open by Default" behavior for unauthenticated access in zero-config environments. Added a targeted unit test to verify the logic. * s3api/s3tables: propagate DefaultAllow through handlers Propagated the DefaultAllow flag to individual handlers for namespaces, buckets, tables, policies, and tagging. This ensures consistent "Open by Default" behavior across all S3 Tables API endpoints. * s3api: wire up DefaultAllow for S3 Tables API initialization Updated registerS3TablesRoutes to query the global IAM configuration and set the DefaultAllow flag on the S3 Tables API server. This completes the end-to-end propagation required for anonymous access in zero-config environments. Added a SetDefaultAllow method to S3TablesApiServer to facilitate this. * s3api: fix tests by adding DefaultAllow to mock IAM integrations The IAMIntegration interface was updated to include DefaultAllow(), breaking several mock implementations in tests. This commit fixes the build errors by adding the missing method to the mocks. * env * ensure ports * env * env * fix default allow * add one more test using non-anonymous user * debug * add more debug * less logs	1 month ago
Lisandro Pin	0721e3c1e9	Rework volume compaction (a.k.a vacuuming) logic to cleanly support new parameters. (#8337 ) We'll leverage on this to support a "ignore broken needles" option, necessary to properly recover damaged volumes, as described in https://github.com/seaweedfs/seaweedfs/issues/7442#issuecomment-3897784283 .	1 month ago
Chris Lu	b57429ef2e	Switch empty-folder cleanup to bucket policy (#8292 ) * Fix Spark _temporary cleanup and add issue #8285 regression test * Generalize empty folder cleanup for Spark temp artifacts * Revert synchronous folder pruning and add cleanup diagnostics * Add actionable empty-folder cleanup diagnostics * Fix Spark temp marker cleanup in async folder cleaner * Fix Spark temp cleanup with implicit directory markers * Keep explicit directory markers non-implicit * logging * more logs * Switch empty-folder cleanup to bucket policy * Seaweed-X-Amz-Allow-Empty-Folders * less logs * go vet * less logs * refactoring	1 month ago
Chris Lu	ba8e2aaae9	Fix master leader election when grpc ports change (#8272 ) * Fix master leader detection when grpc ports change * Canonicalize self peer entry to avoid raft self-alias panic * Normalize and deduplicate master peer addresses	1 month ago
Chris Lu	15d0a46679	Normalize and deduplicate master peer addresses	1 month ago
Chris Lu	ae27e17e6f	Canonicalize self peer entry to avoid raft self-alias panic	1 month ago
Chris Lu	02dac23119	Fix master leader detection when grpc ports change	1 month ago
Chris Lu	e6ee293c17	Add table operations test (#8241 ) * Add Trino blog operations test * Update test/s3tables/catalog_trino/trino_blog_operations_test.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * feat: add table bucket path helpers and filer operations - Add table object root and table location mapping directories - Implement ensureDirectory, upsertFile, deleteEntryIfExists helpers - Support table location bucket mapping for S3 access * feat: manage table bucket object roots on creation/deletion - Create .objects directory for table buckets on creation - Clean up table object bucket paths on deletion - Enable S3 operations on table bucket object roots * feat: add table location mapping for Iceberg REST - Track table location bucket mappings when tables are created/updated/deleted - Enable location-based routing for S3 operations on table data * feat: route S3 operations to table bucket object roots - Route table-s3 bucket names to mapped table paths - Route table buckets to object root directories - Support table location bucket mapping lookup * feat: emit table-s3 locations from Iceberg REST - Generate unique table-s3 bucket names with UUID suffix - Store table metadata under table bucket paths - Return table-s3 locations for Trino compatibility * fix: handle missing directories in S3 list operations - Propagate ErrNotFound from ListEntries for non-existent directories - Treat missing directories as empty results for list operations - Fixes Trino non-empty location checks on table creation * test: improve Trino CSV parsing for single-value results - Sanitize Trino output to skip jline warnings - Handle single-value CSV results without header rows - Strip quotes from numeric values in tests * refactor: use bucket path helpers throughout S3 API - Replace direct bucket path operations with helper functions - Leverage centralized table bucket routing logic - Improve maintainability with consistent path resolution * fix: add table bucket cache and improve filer error handling - Cache table bucket lookups to reduce filer overhead on repeated checks - Use filer_pb.CreateEntry and filer_pb.UpdateEntry helpers to check resp.Error - Fix delete order in handler_bucket_get_list_delete: delete table object before directory - Make location mapping errors best-effort: log and continue, don't fail API - Update table location mappings to delete stale prior bucket mappings on update - Add 1-second sleep before timestamp time travel query to ensure timestamps are in past - Fix CSV parsing: examine all lines, not skip first; handle single-value rows * fix: properly handle stale metadata location mapping cleanup - Capture oldMetadataLocation before mutation in handleUpdateTable - Update updateTableLocationMapping to accept both old and new locations - Use passed-in oldMetadataLocation to detect location changes - Delete stale mapping only when location actually changes - Pass empty string for oldLocation in handleCreateTable (new tables have no prior mapping) - Improve logging to show old -> new location transitions * refactor: cleanup imports and cache design - Remove unused 'sync' import from bucket_paths.go - Use filer_pb.UpdateEntry helper in setExtendedAttribute and deleteExtendedAttribute for consistent error handling - Add dedicated tableBucketCache map[string]bool to BucketRegistry instead of mixing concerns with metadataCache - Improve cache separation: table buckets cache is now separate from bucket metadata cache * fix: improve cache invalidation and add transient error handling Cache invalidation (critical fix): - Add tableLocationCache to BucketRegistry for location mapping lookups - Clear tableBucketCache and tableLocationCache in RemoveBucketMetadata - Prevents stale cache entries when buckets are deleted/recreated Transient error handling: - Only cache table bucket lookups when conclusive (found or ErrNotFound) - Skip caching on transient errors (network, permission, etc) - Prevents marking real table buckets as non-table due to transient failures Performance optimization: - Cache tableLocationDir results to avoid repeated filer RPCs on hot paths - tableLocationDir now checks cache before making expensive filer lookups - Cache stores empty string for 'not found' to avoid redundant lookups Code clarity: - Add comment to deleteDirectory explaining DeleteEntry response lacks Error field * go fmt * fix: mirror transient error handling in tableLocationDir and optimize bucketDir Transient error handling: - tableLocationDir now only caches definitive results - Mirrors isTableBucket behavior to prevent treating transient errors as permanent misses - Improves reliability on flaky systems or during recovery Performance optimization: - bucketDir avoids redundant isTableBucket call via bucketRoot - Directly use s3a.option.BucketsPath for regular buckets - Saves one cache lookup for every non-table bucket operation * fix: revert bucketDir optimization to preserve bucketRoot logic The optimization to directly use BucketsPath bypassed bucketRoot's logic and caused issues with S3 list operations on delimiter+prefix cases. Revert to using path.Join(s3a.bucketRoot(bucket), bucket) which properly handles all bucket types and ensures consistent path resolution across the codebase. The slight performance cost of an extra cache lookup is worth the correctness and consistency benefits. * feat: move table buckets under /buckets Add a table-bucket marker attribute, reuse bucket metadata cache for table bucket detection, and update list/validation/UI/test paths to treat table buckets as /buckets entries. * Fix S3 Tables code review issues - handler_bucket_create.go: Fix bucket existence check to properly validate entryResp.Entry before setting s3BucketExists flag (nil Entry should not indicate existing bucket) - bucket_paths.go: Add clarifying comment to bucketRoot() explaining unified buckets root path for all bucket types - file_browser_data.go: Optimize by extracting table bucket check early to avoid redundant WithFilerClient call * Fix list prefix delimiter handling * Handle list errors conservatively * Fix Trino FOR TIMESTAMP query - use past timestamp Iceberg requires the timestamp to be strictly in the past. Use current_timestamp - interval '1' second instead of current_timestamp. --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	1 month ago
Chris Lu	a3b83f8808	test: add Trino Iceberg catalog integration test (#8228 ) * test: add Trino Iceberg catalog integration test - Create test/s3/catalog_trino/trino_catalog_test.go with TestTrinoIcebergCatalog - Tests integration between Trino SQL engine and SeaweedFS Iceberg REST catalog - Starts weed mini with all services and Trino in Docker container - Validates Iceberg catalog schema creation and listing operations - Uses native S3 filesystem support in Trino with path-style access - Add workflow job to s3-tables-tests.yml for CI execution * fix: preserve AWS environment credentials when replacing S3 configuration When S3 configuration is loaded from filer/db, it replaces the identities list and inadvertently removes AWS_ACCESS_KEY_ID credentials that were added from environment variables. This caused auth to remain disabled even though valid credentials were present. Fix by preserving environment-based identities when replacing the configuration and re-adding them after the replacement. This ensures environment credentials persist across configuration reloads and properly enable authentication. * fix: use correct ServerAddress format with gRPC port encoding The admin server couldn't connect to master because the master address was missing the gRPC port information. Use pb.NewServerAddress() which properly encodes both HTTP and gRPC ports in the address string. Changes: - weed/command/mini.go: Use pb.NewServerAddress for master address in admin - test/s3/policy/policy_test.go: Store and use gRPC ports for master/filer addresses This fix applies to: 1. Admin server connection to master (mini.go) 2. Test shell commands that need master/filer addresses (policy_test.go) * move * move * fix: always include gRPC port in server address encoding The NewServerAddress() function was omitting the gRPC port from the address string when it matched the port+10000 convention. However, gRPC port allocation doesn't always follow this convention - when the calculated port is busy, an alternative port is allocated. This caused a bug where: 1. Master's gRPC port was allocated as 50661 (sequential, not port+10000) 2. Address was encoded as '192.168.1.66:50660' (gRPC port omitted) 3. Admin client called ToGrpcAddress() which assumed port+10000 offset 4. Admin tried to connect to 60660 but master was on 50661 → connection failed Fix: Always include explicit gRPC port in address format (host:httpPort.grpcPort) unless gRPC port is 0. This makes addresses unambiguous and works regardless of the port allocation strategy used. Impacts: All server-to-server gRPC connections now use properly formatted addresses. * test: fix Iceberg REST API readiness check The Iceberg REST API endpoints require authentication. When checked without credentials, the API returns 403 Forbidden (not 401 Unauthorized). The readiness check now accepts both auth error codes (401/403) as indicators that the service is up and ready, it just needs credentials. This fixes the 'Iceberg REST API did not become ready' test failure. * Fix AWS SigV4 signature verification for base64-encoded payload hashes AWS SigV4 canonical requests must use hex-encoded SHA256 hashes, but the X-Amz-Content-Sha256 header may be transmitted as base64. Changes: - Added normalizePayloadHash() function to convert base64 to hex - Call normalizePayloadHash() in extractV4AuthInfoFromHeader() - Added encoding/base64 import Fixes 403 Forbidden errors on POST requests to Iceberg REST API when clients send base64-encoded content hashes in the header. Impacted services: Iceberg REST API, S3Tables * Fix AWS SigV4 signature verification for base64-encoded payload hashes AWS SigV4 canonical requests must use hex-encoded SHA256 hashes, but the X-Amz-Content-Sha256 header may be transmitted as base64. Changes: - Added normalizePayloadHash() function to convert base64 to hex - Call normalizePayloadHash() in extractV4AuthInfoFromHeader() - Added encoding/base64 import - Removed unused fmt import Fixes 403 Forbidden errors on POST requests to Iceberg REST API when clients send base64-encoded content hashes in the header. Impacted services: Iceberg REST API, S3Tables * pass sigv4 * s3api: fix identity preservation and logging levels - Ensure environment-based identities are preserved during config replacement - Update accessKeyIdent and nameToIdentity maps correctly - Downgrade informational logs to V(2) to reduce noise * test: fix trino integration test and s3 policy test - Pin Trino image version to 479 - Fix port binding to 0.0.0.0 for Docker connectivity - Fix S3 policy test hang by correctly assigning MiniClusterCtx - Improve port finding robustness in policy tests * ci: pre-pull trino image to avoid timeouts - Pull trinodb/trino:479 after Docker setup - Ensure image is ready before integration tests start * iceberg: remove unused checkAuth and improve logging - Remove unused checkAuth method - Downgrade informational logs to V(2) - Ensure loggingMiddleware uses a status writer for accurate reported codes - Narrow catch-all route to avoid interfering with other subsystems * iceberg: fix build failure by removing unused s3api import * Update iceberg.go * use warehouse * Update trino_catalog_test.go	1 month ago
Chris Lu	a04e8dd00b	Support Linux file/dir ACL in weed mount (#8233 ) * Support Linux file/dir ACL in weed mount #8229 * Update weed/command/mount_std.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	1 month ago
Chris Lu	e39a4c2041	fix flaky test	1 month ago
Chris Lu	bd4e7ff14e	command: fix s3 panic in filer command (#8208 ) command: fix s3 panic in filer command due to uninitialized options This fixes a nil pointer dereference panic when starting the S3 server via the `weed filer` command by correctly initializing `iamReadOnly` and `portIceberg` flags. Relates to #8200	1 month ago
Chris Lu	72a8f598f2	Fix Maintenance Task Sorting and Refactor Log Persistence (#8199 ) * fix float stepping * do not auto refresh * only logs when non 200 status * fix maintenance task sorting and cleanup redundant handler logic * Refactor log retrieval to persist to disk and fix slowness - Move log retrieval to disk-based persistence in GetMaintenanceTaskDetail - Implement background log fetching on task completion in worker_grpc_server.go - Implement async background refresh for in-progress tasks - Completely remove blocking gRPC calls from the UI path to fix 10s timeouts - Cleanup debug logs and performance profiling code * Ensure consistent deterministic sorting in config_persistence cleanup * Replace magic numbers with constants and remove debug logs - Added descriptive constants for truncation limits and timeouts in admin_server.go and worker_grpc_server.go - Replaced magic numbers with these constants throughout the codebase - Verified removal of stdout debug printing - Ensured consistent truncation logic during log persistence * Address code review feedback on history truncation and logging logic - Fix AssignmentHistory double-serialization by copying task in GetMaintenanceTaskDetail - Fix handleTaskCompletion logging logic (mutually exclusive success/failure logs) - Remove unused Timeout field from LogRequestContext and sync select timeouts with constants - Ensure AssignmentHistory is only provided in the top-level field for better JSON structure * Implement goroutine leak protection and request deduplication - Add request deduplication in RequestTaskLogs to prevent multiple concurrent fetches for the same task - Implement safe cleanup in timeout handlers to avoid race conditions in pendingLogRequests map - Add a 10s cooldown for background log refreshes in GetMaintenanceTaskDetail to prevent spamming - Ensure all persistent log-fetching goroutines are bounded and efficiently managed * Fix potential nil pointer panics in maintenance handlers - Add nil checks for adminServer in ShowTaskDetail, ShowMaintenanceWorkers, and UpdateTaskConfig - Update getMaintenanceQueueData to return a descriptive error instead of nil when adminServer is uninitialized - Ensure internal helper methods consistently check for adminServer initialization before use * Strictly enforce disk-only log reading - Remove background log fetching from GetMaintenanceTaskDetail to prevent timeouts and network calls during page view - Remove unused lastLogFetch tracking fields to clean up dead code - Ensure logs are only updated upon task completion via handleTaskCompletion * Refactor GetWorkerLogs to read from disk - Update /api/maintenance/workers/:id/logs endpoint to use configPersistence.LoadTaskExecutionLogs - Remove synchronous gRPC call RequestTaskLogs to prevent timeouts and bad gateway errors - Ensure consistent log retrieval behavior across the application (disk-only) * Fix timestamp parsing in log viewer - Update task_detail.templ JS to handle both ISO 8601 strings and Unix timestamps - Fix "Invalid time value" error when displaying logs fetched from disk - Regenerate templates * master: fallback to HDD if SSD volumes are full in Assign * worker: improve EC detection logging and fix skip counters * worker: add Sync method to TaskLogger interface * worker: implement Sync and ensure logs are flushed before task completion * admin: improve task log retrieval with retries and better timeouts * admin: robust timestamp parsing in task detail view	1 month ago
Chris Lu	2ff1cd9fc9	format	2 months ago
Chris Lu	1274cf038c	s3: enforce authentication and JSON error format for Iceberg REST Catalog (#8192 ) * s3: enforce authentication and JSON error format for Iceberg REST Catalog * s3/iceberg: align error exception types with OpenAPI spec examples * s3api: refactor AuthenticateRequest to return identity object * s3/iceberg: propagate full identity object to request context * s3/iceberg: differentiate NotAuthorizedException and ForbiddenException * s3/iceberg: reject requests if authenticator is nil to prevent auth bypass * s3/iceberg: refactor Auth middleware to build context incrementally and use switch for error mapping * s3api: update misleading comment for authRequestWithAuthType * s3api: return ErrAccessDenied if IAM is not configured to prevent auth bypass * s3/iceberg: optimize context update in Auth middleware * s3api: export CanDo for external authorization use * s3/iceberg: enforce identity-based authorization in all API handlers * s3api: fix compilation errors by updating internal CanDo references * s3/iceberg: robust identity validation and consistent action usage in handlers * s3api: complete CanDo rename across tests and policy engine integration * s3api: fix integration tests by allowing admin access when auth is disabled and explicit gRPC ports * duckdb * create test bucket	2 months ago
Chris Lu	2bb21ea276	feat: Add Iceberg REST Catalog server and admin UI (#8175 ) * feat: Add Iceberg REST Catalog server Implement Iceberg REST Catalog API on a separate port (default 8181) that exposes S3 Tables metadata through the Apache Iceberg REST protocol. - Add new weed/s3api/iceberg package with REST handlers - Implement /v1/config endpoint returning catalog configuration - Implement namespace endpoints (list/create/get/head/delete) - Implement table endpoints (list/create/load/head/delete/update) - Add -port.iceberg flag to S3 standalone server (s3.go) - Add -s3.port.iceberg flag to combined server mode (server.go) - Add -s3.port.iceberg flag to mini cluster mode (mini.go) - Support prefix-based routing for multiple catalogs The Iceberg REST server reuses S3 Tables metadata storage under /table-buckets and enables DuckDB, Spark, and other Iceberg clients to connect to SeaweedFS as a catalog. * feat: Add Iceberg Catalog pages to admin UI Add admin UI pages to browse Iceberg catalogs, namespaces, and tables. - Add Iceberg Catalog menu item under Object Store navigation - Create iceberg_catalog.templ showing catalog overview with REST info - Create iceberg_namespaces.templ listing namespaces in a catalog - Create iceberg_tables.templ listing tables in a namespace - Add handlers and routes in admin_handlers.go - Add Iceberg data provider methods in s3tables_management.go - Add Iceberg data types in types.go The Iceberg Catalog pages provide visibility into the same S3 Tables data through an Iceberg-centric lens, including REST endpoint examples for DuckDB and PyIceberg. * test: Add Iceberg catalog integration tests and reorg s3tables tests - Reorganize existing s3tables tests to test/s3tables/table-buckets/ - Add new test/s3tables/catalog/ for Iceberg REST catalog tests - Add TestIcebergConfig to verify /v1/config endpoint - Add TestIcebergNamespaces to verify namespace listing - Add TestDuckDBIntegration for DuckDB connectivity (requires Docker) - Update CI workflow to use new test paths * fix: Generate proper random UUIDs for Iceberg tables Address code review feedback: - Replace placeholder UUID with crypto/rand-based UUID v4 generation - Add detailed TODO comments for handleUpdateTable stub explaining the required atomic metadata swap implementation * fix: Serve Iceberg on localhost listener when binding to different interface Address code review feedback: properly serve the localhost listener when the Iceberg server is bound to a non-localhost interface. * ci: Add Iceberg catalog integration tests to CI Add new job to run Iceberg catalog tests in CI, along with: - Iceberg package build verification - Iceberg unit tests - Iceberg go vet checks - Iceberg format checks * fix: Address code review feedback for Iceberg implementation - fix: Replace hardcoded account ID with s3_constants.AccountAdminId in buildTableBucketARN() - fix: Improve UUID generation error handling with deterministic fallback (timestamp + PID + counter) - fix: Update handleUpdateTable to return HTTP 501 Not Implemented instead of fake success - fix: Better error handling in handleNamespaceExists to distinguish 404 from 500 errors - fix: Use relative URL in template instead of hardcoded localhost:8181 - fix: Add HTTP timeout to test's waitForService function to avoid hangs - fix: Use dynamic ephemeral ports in integration tests to avoid flaky parallel failures - fix: Add Iceberg port to final port configuration logging in mini.go * fix: Address critical issues in Iceberg implementation - fix: Cache table UUIDs to ensure persistence across LoadTable calls The UUID now remains stable for the lifetime of the server session. TODO: For production, UUIDs should be persisted in S3 Tables metadata. - fix: Remove redundant URL-encoded namespace parsing mux router already decodes %1F to \x1F before passing to handlers. Redundant ReplaceAll call could cause bugs with literal %1F in namespace. * fix: Improve test robustness and reduce code duplication - fix: Make DuckDB test more robust by failing on unexpected errors Instead of silently logging errors, now explicitly check for expected conditions (extension not available) and skip the test appropriately. - fix: Extract username helper method to reduce duplication Created getUsername() helper in AdminHandlers to avoid duplicating the username retrieval logic across Iceberg page handlers. * fix: Add mutex protection to table UUID cache Protects concurrent access to the tableUUIDs map with sync.RWMutex. Uses read-lock for fast path when UUID already cached, and write-lock for generating new UUIDs. Includes double-check pattern to handle race condition between read-unlock and write-lock. * style: fix go fmt errors * feat(iceberg): persist table UUID in S3 Tables metadata * feat(admin): configure Iceberg port in Admin UI and commands * refactor: address review comments (flags, tests, handlers) - command/mini: fix tracking of explicit s3.port.iceberg flag - command/admin: add explicit -iceberg.port flag - admin/handlers: reuse getUsername helper - tests: use 127.0.0.1 for ephemeral ports and os.Stat for file size check * test: check error from FileStat in verify_gc_empty_test	2 months ago
Chris Lu	2ee6e4f391	mount: refresh and evict hot dir cache (#8174 ) * mount: refresh and evict hot dir cache * mount: guard dir update window and extend TTL * mount: reuse timestamp for cache mark * Apply suggestion from @gemini-code-assist[bot] Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * mount: make dir cache tuning configurable * mount: dedupe dir update notices * mount: restore invalidate-all cache helper * mount: keep hot dir tuning constants * mount: centralize cache state reset * mount: mark refresh completion time * mount: allow disabling idle eviction --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2 months ago
Chris Lu	c106532b79	fix: prevent MiniClusterCtx race conditions in command shutdown Capture global MiniClusterCtx into local variables before goroutine/select evaluation to prevent nil dereference/data race when context is reset to nil after nil check. Applied to filer, master, volume, and s3 commands.	2 months ago
Chris Lu	2f155ee5ee	feat: Add S3 Tables support for Iceberg tabular data (#8147 ) * s3tables: extract utility and filer operations to separate modules - Move ARN parsing, path helpers, and metadata structures to utils.go - Extract all extended attribute and filer operations to filer_ops.go - Reduces code duplication and improves modularity - Improves code organization and maintainability * s3tables: split table bucket operations into focused modules - Create bucket_create.go for CreateTableBucket operation - Create bucket_get_list_delete.go for Get, List, Delete operations - Related operations grouped for better maintainability - Each file has a single, clear responsibility - Improves code clarity and makes it easier to test * s3tables: simplify handler by removing duplicate utilities - Reduce handler.go from 370 to 195 lines (47% reduction) - Remove duplicate ARN parsing and path helper functions - Remove filer operation methods moved to filer_ops.go - Remove metadata structure definitions moved to utils.go - Keep handler focused on request routing and response formatting - Maintains all functionality with improved code organization * s3tables: complete s3tables package implementation - namespace.go: namespace CRUD operations (310 lines) - table.go: table CRUD operations with Iceberg schema support (409 lines) - policy.go: resource policies and tagging operations (419 lines) - types.go: request/response types and error definitions (290 lines) - All handlers updated to use standalone utilities from utils.go - All files follow single responsibility principle * s3api: add S3 Tables integration layer - Create s3api_tables.go to integrate S3 Tables with S3 API server - Implement S3 Tables route matcher for X-Amz-Target header - Register S3 Tables routes with API router - Provide gRPC filer client interface for S3 Tables handlers - All S3 Tables operations accessible via S3 API endpoint * s3api: register S3 Tables routes in API server - Add S3 Tables route registration in s3api_server.go registerRouter method - Enable S3 Tables API operations to be routed through S3 API server - Routes handled by s3api_tables.go integration layer - Minimal changes to existing S3 API structure * test: add S3 Tables test infrastructure - Create setup.go with TestCluster and S3TablesClient definitions - Create client.go with HTTP client methods for all operations - Test utilities and client methods organized for reusability - Foundation for S3 Tables integration tests * test: add S3 Tables integration tests - Comprehensive integration tests for all 23 S3 Tables operations - Test cluster setup based on existing S3 integration tests - Tests cover: * Table bucket lifecycle (create, get, list, delete) * Namespace operations * Table CRUD with Iceberg schema * Table bucket and table policies * Resource tagging operations - Ready for CI/CD pipeline integration * ci: add S3 Tables integration tests to GitHub Actions - Create new workflow for S3 Tables integration testing - Add build verification job for s3tables package and s3api integration - Add format checking for S3 Tables code - Add go vet checks for code quality - Workflow runs on all pull requests - Includes test output logging and artifact upload on failure * s3tables: add handler_ prefix to operation handler files - Rename bucket_create.go → handler_bucket_create.go - Rename bucket_get_list_delete.go → handler_bucket_get_list_delete.go - Rename namespace.go → handler_namespace.go - Rename table.go → handler_table.go - Rename policy.go → handler_policy.go Improves file organization by clearly identifying handler implementations. No code changes, refactoring only. * s3tables test: refactor to eliminate duplicate definitions - Move all client methods to client.go - Remove duplicate types/constants from s3tables_integration_test.go - Keep setup.go for test infrastructure - Keep integration test logic in s3tables_integration_test.go - Clean up unused imports - Test compiles successfully * Delete client_methods.go * s3tables: add bucket name validation and fix error handling - Add isValidBucketName validation function for [a-z0-9_-] characters - Validate bucket name characters match ARN parsing regex - Fix error handling in WithFilerClient closure - properly check for lookup errors - Add error handling for json.Marshal calls (metadata and tags) - Improve error messages and logging * s3tables: add error handling for json.Marshal calls - Add error handling in handler_namespace.go (metadata marshaling) - Add error handling in handler_table.go (metadata and tags marshaling) - Add error handling in handler_policy.go (tag marshaling in TagResource and UntagResource) - Return proper errors with context instead of silently ignoring failures * s3tables: replace custom splitPath with stdlib functions - Remove custom splitPath implementation (23 lines) - Use filepath.Dir and filepath.Base from stdlib - More robust and handles edge cases correctly - Reduces code duplication * s3tables: improve error handling specificity in ListTableBuckets - Specifically check for 'not found' errors instead of catching all errors - Return empty list only when directory doesn't exist - Propagate other errors (network, permission) with context - Prevents masking real errors * s3api_tables: optimize action validation with map lookup - Replace O(n) slice iteration with O(1) map lookup - Move s3TablesActionsMap to package level - Avoid recreating the map on every function call - Improves performance for request validation * s3tables: implement permission checking and authorization - Add permissions.go with permission definitions and checks - Define permissions for all 21 S3 Tables operations - Add permission checking helper functions - Add getPrincipalFromRequest to extract caller identity - Implement access control in CreateTableBucket, GetTableBucket, DeleteTableBucket - Return 403 Forbidden for unauthorized operations - Only bucket owner can perform operations (extensible for future policies) - Add AuthError type for authorization failures * workflow: fix s3 tables tests path and working directory The workflow was failing because it was running inside 'weed' directory, but the tests are at the repository root. Removed working-directory default and updated relative paths to weed source. * workflow: remove emojis from echo statements * test: format s3tables client.go * workflow: fix go install path to ./weed * ci: fail s3 tables tests if any command in pipeline fails * s3tables: use path.Join for path construction and align namespace paths * s3tables: improve integration test stability and error reporting * s3tables: propagate request context to filer operations * s3tables: clean up unused code and improve error response formatting * Refine S3 Tables implementation to address code review feedback - Standardize namespace representation to []string - Improve listing logic with pagination and StartFromFileName - Enhance error handling with sentinel errors and robust checks - Add JSON encoding error logging - Fix CI workflow to use gofmt -l - Standardize timestamps in directory creation - Validate single-level namespaces * s3tables: further refinements to filer operations and utilities - Add multi-segment namespace support to ARN parsing - Refactor permission checking to use map lookup - Wrap lookup errors with ErrNotFound in filer operations - Standardize splitPath to use path package * test: improve S3 Tables client error handling and cleanup - Add detailed error reporting when decoding failure responses - Remove orphaned comments and unused sections * command: implement graceful shutdown for mini cluster - Introduce MiniClusterCtx to coordinate shutdown across mini services - Update Master, Volume, Filer, S3, and WebDAV servers to respect context cancellation - Ensure all resources are cleaned up properly during test teardown - Integrate MiniClusterCtx in s3tables integration tests * s3tables: fix pagination and enhance error handling in list/delete operations - Fix InclusiveStartFrom logic to ensure exclusive start on continued pages - Prevent duplicates in bucket, namespace, and table listings - Fail fast on listing errors during bucket and namespace deletion - Stop swallowing errors in handleListTables and return proper HTTP error responses * s3tables: align ARN formatting and optimize resource handling - Update generateTableARN to match AWS S3 Tables specification - Move defer r.Body.Close() to follow standard Go patterns - Remove unused generateNamespaceARN helper * command: fix stale error variable logging in filer serving goroutines - Use local 'err' variable instead of stale 'e' from outer scope - Applied to both TLS and non-TLS paths for local listener * s3tables: implement granular authorization and refine error responses - Remove mandatory ACTION_ADMIN at the router level - Enforce granular permissions in bucket and namespace handlers - Prioritize AccountID in ExtractPrincipalFromContext for ARN matching - Distinguish between 404 (NoSuchBucket) and 500 (InternalError) in metadata lookups - Clean up unused imports in s3api_tables.go * test: refactor S3 Tables client for DRYness and multi-segment namespaces - Implement doRequestAndDecode to eliminate HTTP boilerplate - Update client API to accept []string for namespaces to support hierarchy - Standardize error response decoding across all client methods * test: update integration tests to match refactored S3 Tables client - Pass namespaces as []string to support hierarchical structures - Adapt test calls to new client API signatures * s3tables: normalize filer errors and use standard helpers - Migrate from custom ErrNotFound to filer_pb.ErrNotFound - Use filer_pb.LookupEntry for automatic error normalization - Normalize entryExists and attribute lookups * s3tables: harden namespace validation and correct ARN parsing - Prohibit path traversal (".", "..") and "/" in namespaces - Restrict namespace characters to [a-z0-9_] for consistency - Switch to url.PathUnescape for correct decoding of ARN path components - Align ARN parsing regex with single-segment namespace validation * s3tables: improve robustness, security, and error propagation in handlers - Implement strict table name validation (prevention of path traversal and character enforcement) - Add nil checks for entry.Entry in all listing loops to prevent panics - Propagate backend errors instead of swallowing them or assuming 404 - Correctly map filer_pb.ErrNotFound to appropriate S3 error codes - Standardize existence checks across bucket, namespace, and table handlers * test: add miniClusterMutex to prevent race conditions - Introduce sync.Mutex to protect global state (os.Args, os.Chdir) - Ensure serialized initialization of the mini cluster runner - Fix intermittent race conditions during parallel test execution * s3tables: improve error handling and permission logic - Update handleGetNamespace to distinguish between 404 and 500 errors - Refactor CanManagePolicy to use CheckPermission for consistent enforcement - Ensure empty identities are correctly handled in policy management checks * s3tables: optimize regex usage and improve version token uniqueness - Pre-compile regex patterns as package-level variables to avoid re-compilation overhead on every call - Add a random component to version token generation to reduce collision probability under high concurrency * s3tables: harden auth and error handling - Add authorization checks to all S3 Tables handlers (policy, table ops) to enforce security - Improve error handling to distinguish between NotFound (404) and InternalError (500) - Fix directory FileMode usage in filer_ops - Improve test randomness for version tokens - Update permissions comments to acknowledge IAM gaps * S3 Tables: fix gRPC stream loop handling for list operations - Correctly handle io.EOF to terminate loops gracefully. - Propagate other errors to prevent silent failures. - Ensure all list results are processed effectively. * S3 Tables: validate ARN namespace to prevent path traversal - Enforce validation on decoded namespace in parseTableFromARN. - Ensures path components are safe after URL unescaping. * S3 Tables: secure API router with IAM authentication - Wrap S3 Tables handler with authenticateS3Tables. - Use AuthSignatureOnly to enforce valid credentials while delegating granular authorization to handlers. - Prevent anonymous access to all S3 Tables endpoints. * S3 Tables: fix gRPC stream loop handling in namespace handlers - Correctly handle io.EOF in handleListNamespaces and handleDeleteNamespace. - Propagate other errors to prevent silent failures or accidental data loss. - Added necessary io import. * S3 Tables: use os.ModeDir constant in filer_ops.go - Replace magic number 1<<31 with os.ModeDir for better readability. - Added necessary os import. * s3tables: improve principal extraction using identity context * s3tables: remove duplicate comment in permissions.go * s3tables test: improve error reporting on decoding failure * s3tables: implement validateTableName helper * s3tables: add table name validation and 404 propagation to policy handlers * s3tables: add table name validation and cleanup duplicated logic in table handlers * s3tables: ensure root tables directory exists before bucket creation * s3tables: implement token-based pagination for table buckets listing * s3tables: implement token-based pagination for namespace listing * s3tables: refine permission helpers to align with operation names * s3tables: return 404 in handleDeleteNamespace if namespace not found * s3tables: fix cross-namespace pagination in listTablesInAllNamespaces * s3tables test: expose pagination parameters in client list methods * s3tables test: update integration tests for new client API * s3tables: use crypto/rand for secure version token generation Replaced math/rand with crypto/rand to ensure version tokens are cryptographically secure and unpredictable for optimistic concurrency control. * s3tables: improve account ID handling and define missing error codes Updated getPrincipalFromRequest to prioritize X-Amz-Account-ID header and added getAccountID helper. Defined ErrVersionTokenMismatch and ErrCodeConflict for better optimistic concurrency support. * s3tables: update bucket handlers for multi-account support Ensured bucket ownership is correctly attributed to the authenticated account ID and updated ARNs to use the request-derived account ID. Added standard S3 existence checks for bucket deletion. * s3tables: update namespace handlers for multi-account support Updated namespace creation to use authenticated account ID for ownership and unified permission checks across all namespace operations to use the correct account principal. * s3tables: implement optimistic concurrency for table deletion Added VersionToken validation to handleDeleteTable. Refactored table listing to use request context for accurate ARN generation and fixed cross-namespace pagination issues. * s3tables: improve resource resolution and error mapping for policies and tagging Refactored resolveResourcePath to return resource type, enabling accurate NoSuchBucket vs NoSuchTable error codes. Added existence checks before deleting policies. * s3tables: enhance test robustness and resilience Updated random string generation to use crypto/rand in s3tables tests. Increased resilience of IAM distributed tests by adding "connection refused" to retryable errors. * s3tables: remove legacy principal fallback header Removed the fallback to X-Amz-Principal in getPrincipalFromRequest as S3 Tables is a new feature and does not require legacy header support. * s3tables: remove unused ExtractPrincipalFromContext function Removed the unused ExtractPrincipalFromContext utility and its accompanying iam/utils import to keep the new s3tables codebase clean. * s3tables: allow hyphens in namespace and table names Relaxed regex validation in utils.go to support hyphens in S3 Tables namespaces and table names, improving consistency with S3 bucket naming and allowing derived names from services like S3 Storage Lens. * s3tables: add isAuthError helper to handler.go * s3tables: refactor permission checks to use resource owner in bucket handlers * s3tables: refactor permission checks to use resource owner in namespace handlers * s3tables: refactor permission checks to use resource owner in table handlers * s3tables: refactor permission checks to use resource owner in policy and tagging handlers * ownerAccountID * s3tables: implement strict AWS-aligned name validation for buckets, namespaces, and tables * s3tables: enforce strict resource ownership and implement result filtering for buckets * s3tables: enforce strict resource ownership and implement result filtering for namespaces * s3tables: enforce strict resource ownership and implement result filtering for tables * s3tables: align getPrincipalFromRequest with account ID for IAM compatibility * s3tables: fix inconsistent permission check in handleCreateTableBucket * s3tables: improve pagination robustness and error handling in table listing handlers * s3tables: refactor handleDeleteTableBucket to use strongly typed AuthError * s3tables: align ARN regex patterns with S3 standards and refactor to constants * s3tables: standardize access denied errors using ErrAccessDenied constant * go fmt * s3tables: fix double-write issue in handleListTables Remove premature HTTP error writes from within WithFilerClient closure to prevent duplicate status code responses. Error handling is now consistently performed at the top level using isAuthError. * s3tables: update bucket name validation message Remove "underscores" from error message to accurately reflect that bucket names only allow lowercase letters, numbers, and hyphens. * s3tables: add table policy test coverage Add comprehensive test coverage for table policy operations: - Added PutTablePolicy, GetTablePolicy, DeleteTablePolicy methods to test client - Implemented testTablePolicy lifecycle test validating Put/Get/Delete operations - Verified error handling for missing policies * follow aws spec * s3tables: add request body size limiting Add request body size limiting (10MB) to readRequestBody method: - Define maxRequestBodySize constant to prevent unbounded reads - Use io.LimitReader to enforce size limit - Add explicit error handling for oversized requests - Prevents potential DoS attacks via large request bodies * S3 Tables API now properly enforces resource policies addressing the critical security gap where policies were created but never evaluated. * s3tables: Add upper bound validation for MaxTables parameter MaxTables is user-controlled and influences gRPC ListEntries limits via uint32(maxTables2). Without an upper bound, very large values can overflow uint32 or cause excessively large directory scans. Cap MaxTables to 1000 and return InvalidRequest for out-of-range values, consistent with S3 MaxKeys handling. s3tables: Add upper bound validation for MaxBuckets parameter MaxBuckets is user-controlled and used in uint32(maxBuckets2) for ListEntries. Very large values can overflow uint32 or trigger overly expensive scans. Cap MaxBuckets to 1000 and reject out-of-range values, consistent with MaxTables handling and S3 MaxKeys validation elsewhere in the codebase. s3tables: Validate bucket name in parseBucketNameFromARN() Enforce the same bucket name validation rules (length, characters, reserved prefixes/suffixes) when extracting from ARN. This prevents accepting ARNs that the system would never create and ensures consistency with CreateTableBucket validation. * s3tables: Fix parseTableFromARN() namespace and table name validation - Remove dead URL unescape for namespace (regex [a-z0-9_]+ cannot contain percent-escapes) - Add URL decoding and validation of extracted table name via validateTableName() to prevent callers from bypassing request validation done in other paths * s3tables: Rename tableMetadataInternal.Schema to Metadata The field name 'Schema' was confusing given it holds a TableMetadata struct and serializes as 'metadata' in JSON. Rename to 'Metadata' for clarity and consistency with the JSON tag and intended meaning. s3tables: Improve bucket name validation error message Replace misleading character-only error message with generic 'invalid bucket name'. The isValidBucketName() function checks multiple constraints beyond character set (length, reserved prefixes/suffixes, start/end rules), so a specific character message is inaccurate. * s3tables: Separate permission checks for tagging and untagging - Add CanTagResource() to check TagResource permission - Add CanUntagResource() to check UntagResource permission - Update CanManageTags() to check both operations (OR logic) This prevents UntagResource from incorrectly checking 'ManageTags' permission and ensures each operation validates the correct permission when per-operation permissions are enforced. * s3tables: Consolidate getPrincipalFromRequest and getAccountID into single method Both methods had identical implementations - they return the account ID from request header or fall back to handler's default. Remove the duplicate getPrincipalFromRequest and use getAccountID throughout, with updated comment explaining its dual role as both caller identity and principal for permission checks. * s3tables: Fetch bucket policy in handleListTagsForResource for permission evaluation Update handleListTagsForResource to fetch and pass bucket policy to CheckPermission, matching the behavior of handleTagResource/handleUntagResource. This enables bucket-policy-based permission grants to be evaluated for ListTagsForResource, not just ownership-based checks. * s3tables: Extract resource owner and bucket extraction into helper method Create extractResourceOwnerAndBucket() helper to consolidate the repeated pattern of unmarshaling metadata and extracting bucket name from resource path. This pattern was duplicated in handleTagResource, handleListTagsForResource, and handleUntagResource. Update all three handlers to use the helper. Also update remaining uses of getPrincipalFromRequest() (in handler_bucket_create, handler_bucket_get_list_delete, handler_namespace) to use getAccountID() after consolidating the two identical methods. * s3tables: Add log message when cluster shutdown times out The timeout path (2 second wait for graceful shutdown) was silent. Add a warning log message when it occurs to help diagnose flaky test issues and indicate when the mini cluster didn't shut down cleanly. * s3tables: Use policy_engine wildcard matcher for complete IAM compatibility Replace the custom suffix-only wildcard implementation in matchesActionPattern and matchesPrincipal with the policy_engine.MatchesWildcard function from PR #8052. This enables full wildcard support including: - Middle wildcards: s3tables:GetTable matches GetTable - Question mark wildcards: Get? matches any single character - Combined patterns: s3tables:Table* matches any action containing 'Table' Benefits: - Code reuse: eliminates duplicate wildcard logic - Complete IAM compatibility: supports all AWS wildcard patterns - Performance: uses efficient O(n) backtracking algorithm - Consistency: same wildcard behavior across S3 API and S3 Tables Add comprehensive unit tests covering exact matches, suffix wildcards, middle wildcards, question marks, and combined patterns for both action and principal matching. * go fmt * s3tables: Fix vet error - remove undefined c.t reference in Stop() The TestCluster.Stop() method doesn't have access to testing.T object. Remove the log statement and keep the timeout handling comment for clarity. The original intent (warning about shutdown timeout) is still captured in the code comment explaining potential issues. * clean up * s3tables: Add t field to TestCluster for logging Add testing.T field to TestCluster struct and initialize it in startMiniCluster. This allows Stop() to properly log warnings when cluster shutdown times out. Includes the t field in the test cluster initialization and restores the logging statement in Stop(). s3tables: Fix bucket policy error handling in permission checks Replace error-swallowing pattern where all errors from getExtendedAttribute were ignored for bucket policy reads. Now properly distinguish between: - ErrAttributeNotFound: Policy not found is expected; continue with empty policy - Other errors: Return internal server error and stop processing Applied fix to all bucket policy reads in: - handleDeleteTableBucketPolicy (line 220) - handleTagResource (line 313) - handleUntagResource (line 405) - handleListTagsForResource (line 488) - And additional occurrences in closures This prevents silent failures and ensures policy-related errors are surfaced to callers rather than being silently ignored. * s3tables: Pre-validate namespace to return 400 instead of 500 Move validateNamespace call outside of filerClient.WithFilerClient closure so that validation errors return HTTP 400 (InvalidRequest) instead of 500 (InternalError). Before: Validation error inside closure → treated as internal error → 500 After: Validation error before closure → handled as bad request → 400 This provides correct error semantics: namespace validation is an input validation issue, not a server error. * Update weed/s3api/s3tables/handler.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * s3tables: Normalize action names to include service prefix Add automatic normalization of operations to full IAM-style action names (e.g., 's3tables:CreateTableBucket') in CheckPermission(). This ensures policy statements using prefixed actions (s3tables:) correctly match operations evaluated by permission helpers. Also fixes incorrect r.Context() passed to GetIdentityNameFromContext which expects http.Request. Now passes r directly. * s3tables: Use policy framework for table creation authorization Replace strict ownership check in CreateTable with policy-based authorization. Now checks both namespace and bucket policies for CreateTable permission, allowing delegation via resource policies while still respecting owner bypass. Authorization logic: - Namespace policy grants CreateTable → allowed - Bucket policy grants CreateTable → allowed - Otherwise → denied (even if same owner) This enables cross-principal table creation via policies while maintaining security through explicit allow/deny semantics. * s3tables: Use policy framework for GetTable authorization Replace strict ownership check with policy-based authorization in GetTable. Now checks both table and bucket policies for GetTable permission, allowing authorized non-owners to read table metadata. Authorization logic: - Table policy grants GetTable → allowed - Bucket policy grants GetTable → allowed - Otherwise → 404 NotFound (no access disclosed) Maintains security through policy evaluation while enabling read delegation. * s3tables: Generate ARNs using resource owner account ID Change ARN generation to use resource OwnerAccountID instead of caller identity (h.getAccountID(r)). This ensures ARNs are stable and consistent regardless of which principal accesses the resource. Updated generateTableBucketARN and generateTableARN function signatures to accept ownerAccountID parameter. All call sites updated to pass the resource owner's account ID from metadata. This prevents ARN inconsistency issues when multiple principals have access to the same resource via policies. * s3tables: Fix remaining policy error handling in namespace and bucket handlers Replace silent error swallowing (err == nil) with proper error distinction for bucket policy reads. Now properly checks ErrAttributeNotFound and propagates other errors as internal server errors. Fixed 5 locations: - handleCreateNamespace (policy fetch) - handleDeleteNamespace (policy fetch) - handleListNamespaces (policy fetch) - handleGetNamespace (policy fetch) - handleGetTableBucket (policy fetch) This prevents masking of filer issues when policies cannot be read due to I/O errors or other transient failures. * ci: Pin GitHub Actions to commit SHAs for s3-tables-tests Update all action refs to use pinned commit SHAs instead of floating tags: - actions/checkout: @v6 → @8e8c483 (v4) - actions/setup-go: @v6 → @0c52d54 (v5) - actions/upload-artifact: @v6 → @65d8626 (v4) Pinned SHAs improve reproducibility and reduce supply chain risk by preventing accidental or malicious changes in action releases. Aligns with repository conventions used in other workflows (e.g., go.yml). * s3tables: Add resource ARN validation to policy evaluation Implement resource-specific policy validation to prevent over-broad permission grants. Add matchesResource and matchesResourcePattern functions to validate statement Resource fields against specific resource ARNs. Add new CheckPermissionWithResource function that includes resource ARN validation, while keeping CheckPermission unchanged for backward compatibility. This enables policies to grant access to specific resources only: - statements with Resource: "arn:aws:s3tables:...:bucket/specific-bucket/" will only match when accessing that specific bucket - statements without Resource field match all resources (implicit ) - resource patterns support wildcards (* for any sequence, ? for single char) For future use: Handlers can call CheckPermissionWithResource with the target resource ARN to enforce resource-level access control. * Revert "ci: Pin GitHub Actions to commit SHAs for s3-tables-tests" This reverts commit `01da26fbcb`. * s3tables: Remove duplicate bucket extraction logic in helper Move bucket name extraction outside the if/else block in extractResourceOwnerAndBucket since the logic is identical for both ResourceTypeTable and ResourceTypeBucket cases. This reduces code duplication and improves maintainability. The extraction pattern (parts[1] from /tables/{bucket}/...) works for both resource types, so it's now performed once before the type-specific metadata unmarshaling. * go fmt * s3tables: Fix ownership consistency across handlers Address three related ownership consistency issues: 1. CreateNamespace now sets OwnerAccountID to bucketMetadata.OwnerAccountID instead of request principal. This prevents namespaces created by delegated callers (via bucket policy) from becoming unmanageable, since ListNamespaces filters by bucket owner. 2. CreateTable now: - Fetches bucket metadata to use correct owner for bucket policy evaluation - Uses namespaceMetadata.OwnerAccountID for namespace policy checks - Uses bucketMetadata.OwnerAccountID for bucket policy checks - Sets table OwnerAccountID to namespaceMetadata.OwnerAccountID (inherited) 3. GetTable now: - Fetches bucket metadata to use correct owner for bucket policy evaluation - Uses metadata.OwnerAccountID for table policy checks - Uses bucketMetadata.OwnerAccountID for bucket policy checks This ensures: - Bucket owner retains implicit "owner always allowed" behavior even when evaluating bucket policies - Ownership hierarchy is consistent (namespace owned by bucket, table owned by namespace) - Cross-principal delegation via policies doesn't break ownership chains * s3tables: Fix ListTables authorization and policy parsing Make ListTables authorization consistent with GetTable/CreateTable: 1. ListTables authorization now evaluates policies instead of owner-only checks: - For namespace listing: checks namespace policy AND bucket policy - For bucket-wide listing: checks bucket policy - Uses CanListTables permission framework 2. Remove owner-only filter in listTablesWithClient that prevented policy-based sharing of tables. Authorization is now enforced at the handler level, so all tables in the namespace/bucket are returned to authorized callers (who have access either via ownership or policy). 3. Add flexible PolicyDocument.UnmarshalJSON to support both single-object and array forms of Statement field: - Handles: {"Statement": {...}} - Handles: {"Statement": [{...}, {...}]} - Improves AWS IAM compatibility This ensures cross-account table listing works when delegated via bucket/namespace policies, consistent with the authorization model for other operations. * go fmt * s3tables: Separate table name pattern constant for clarity Define a separate tableNamePatternStr constant for the table name component in the ARN regex, even though it currently has the same value as tableNamespacePatternStr. This improves code clarity and maintainability, making it easier to modify if the naming rules for tables and namespaces diverge in the future. * refactor --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2 months ago
Chris Lu	580c2b4ad4	command: fix stale error variable logging in filer serving goroutines - Use local 'err' variable instead of stale 'e' from outer scope - Applied to both TLS and non-TLS paths for local listener	2 months ago
Chris Lu	01c17478ae	command: implement graceful shutdown for mini cluster - Introduce MiniClusterCtx to coordinate shutdown across mini services - Update Master, Volume, Filer, S3, and WebDAV servers to respect context cancellation - Ensure all resources are cleaned up properly during test teardown - Integrate MiniClusterCtx in s3tables integration tests	2 months ago
Chris Lu	6542d1e0aa	Enable weed fuse on FreeBSD (#8146 ) * Enable weed fuse on FreeBSD * Update weed/command/fuse_notsupported.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * Update weed/command/fuse_std.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2 months ago
MorezMartin	20952aa514	Fix jwt error in admin UI (#8140 ) * add jwt token in weed admin headers requests * add jwt token to header for download * :s/upload/download * filer_signing.read despite of filer_signing key * finalize filer_browser_handlers.go * admin: add JWT authorization to file browser handlers * security: fix typos in JWT read validation descriptions * Move security.toml to example and secure keys * security: address PR feedback on JWT enforcement and example keys * security: refactor JWT logic and improve example keys readability * Update docker/Dockerfile.local Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Chris Lu <chris.lu@gmail.com> Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2 months ago

1 2 3 4 5 ...

1392 Commits (master)