* Add volume dir tags to topology
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add preferred tag config for EC
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Prioritize EC destinations by tags
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add EC placement planner tag tests
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Refactor EC placement tests to reuse buildActiveTopology
Remove buildActiveTopologyWithDiskTags helper function and consolidate
tag setup inline in test cases. Tests now use UpdateTopology to apply
tags after topology creation, reusing the existing buildActiveTopology
function rather than duplicating its logic.
All tag scenario tests pass:
- TestECPlacementPlannerPrefersTaggedDisks
- TestECPlacementPlannerFallsBackWhenTagsInsufficient
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Consolidate normalizeTagList into shared util package
Extract normalizeTagList from three locations (volume.go,
detection.go, erasure_coding_handler.go) into new weed/util/tag.go
as exported NormalizeTagList function. Replace all duplicate
implementations with imports and calls to util.NormalizeTagList.
This improves code reuse and maintainability by centralizing
tag normalization logic.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add PreferredTags to EC config persistence
Add preferred_tags field to ErasureCodingTaskConfig protobuf with field
number 5. Update GetConfigSpec to include preferred_tags field in the
UI configuration schema. Add PreferredTags to ToTaskPolicy to serialize
config to protobuf. Add PreferredTags to FromTaskPolicy to deserialize
from protobuf with defensive copy to prevent external mutation.
This allows EC preferred tags to be persisted and restored across
worker restarts.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add defensive copy for Tags slice in DiskLocation
Copy the incoming tags slice in NewDiskLocation instead of storing
by reference. This prevents external callers from mutating the
DiskLocation.Tags slice after construction, improving encapsulation
and preventing unexpected changes to disk metadata.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add doc comment to buildCandidateSets method
Document the tiered candidate selection and fallback behavior. Explain
that for a planner with preferredTags, it accumulates disks matching
each tag in order into progressively larger tiers, emits a candidate
set once a tier reaches shardsNeeded, and finally falls back to the
full candidates set if preferred-tag tiers are insufficient.
This clarifies the intended semantics for future maintainers.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Apply final PR review fixes
1. Update parseVolumeTags to replicate single tag entry to all folders
instead of leaving some folders with nil tags. This prevents nil
pointer dereferences when processing folders without explicit tags.
2. Add defensive copy in ToTaskPolicy for PreferredTags slice to match
the pattern used in FromTaskPolicy, preventing external mutation of
the returned TaskPolicy.
3. Add clarifying comment in buildCandidateSets explaining that the
shardsNeeded <= 0 branch is a defensive check for direct callers,
since selectDestinations guarantees shardsNeeded > 0.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix nil pointer dereference in parseVolumeTags
Ensure all folder tags are initialized to either normalized tags or
empty slices, not nil. When multiple tag entries are provided and there
are more folders than entries, remaining folders now get empty slices
instead of nil, preventing nil pointer dereference in downstream code.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix NormalizeTagList to return empty slice instead of nil
Change NormalizeTagList to always return a non-nil slice. When all tags
are empty or whitespace after normalization, return an empty slice
instead of nil. This prevents nil pointer dereferences in downstream
code that expects a valid (possibly empty) slice.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add nil safety check for v.tags pointer
Add a safety check to handle the case where v.tags might be nil,
preventing a nil pointer dereference. If v.tags is nil, use an empty
string instead. This is defensive programming to prevent panics in
edge cases.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add volume.tags flag to weed server and weed mini commands
Add the volume.tags CLI option to both the 'weed server' and 'weed mini'
commands. This allows users to specify disk tags when running the
combined server modes, just like they can with 'weed volume'.
The flag uses the same format and description as the volume command:
comma-separated tag groups per data dir with ':' separators
(e.g. fast:ssd,archive).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* refactor ec shard distribution
* fix shard assignment merge and mount errors
* fix mount error aggregation scope
* make WithFields compatible and wrap errors
* Prevent concurrent maintenance tasks per volume
* fix panic
* fix(s3api): correctly extract host header port when X-Forwarded-Port is present
* test(s3api): add test cases for misreported X-Forwarded-Port
* set working directory
* consolidate to worker directory
* working directory
* correct directory name
* refactoring to use wildcard matcher
* simplify
* cleaning ec working directory
* fix reference
* clean
* adjust test
* feat: add customizable plugin display names and weights
- Add weight field to JobTypeCapability proto message
- Modify ListKnownJobTypes() to return JobTypeInfo with display names and weights
- Modify ListPluginJobTypes() to return JobTypeInfo instead of string
- Sort plugins by weight (descending) then alphabetically
- Update admin API to return enriched job type metadata
- Update plugin UI template to display names instead of IDs
- Consolidate API by reusing existing function names instead of suffixed variants
* perf: optimize plugin job type capability lookup and add null-safe parsing
- Pre-calculate job type capabilities in a map to reduce O(n*m) nested loops
to O(n+m) lookup time in ListKnownJobTypes()
- Add parseJobTypeItem() helper function for null-safe job type item parsing
- Refactor plugin.templ to use parseJobTypeItem() in all job type access points
(hasJobType, applyInitialNavigation, ensureActiveNavigation, renderTopTabs)
- Deterministic capability resolution by using first worker's capability
* templ
* refactor: use parseJobTypeItem helper consistently in plugin.templ
Replace duplicated job type extraction logic at line 1296-1298 with
parseJobTypeItem() helper function for consistency and maintainability.
* improve: prefer richer capability metadata and add null-safety checks
- Improve capability selection in ListKnownJobTypes() to prefer capabilities
with non-empty DisplayName and higher Weight across all workers instead of
first-wins approach. Handles mixed-version clusters better.
- Add defensive null checks in renderJobTypeSummary() to safely access
parseJobTypeItem() result before property access
- Ensures malformed or missing entries won't break the rendering pipeline
* fix: preserve existing DisplayName when merging capabilities
Fix capability merge logic to respect existing DisplayName values:
- If existing has DisplayName but candidate doesn't, preserve existing
- If existing doesn't have DisplayName but candidate does, use candidate
- Only use Weight comparison if DisplayName status is equal
- Prevents higher-weight capabilities with empty DisplayName from
overriding capabilities with non-empty DisplayName
* feat: drop table location mapping support
Disable external metadata locations for S3 Tables and remove the table location
mapping index entirely. Table metadata must live under the table bucket paths,
so lookups no longer use mapping directories.
Changes:
- Remove mapping lookup and cache from bucket path resolution
- Reject metadataLocation in CreateTable and UpdateTable
- Remove mapping helpers and tests
* compile
* refactor
* fix: accept metadataLocation in S3 Tables API requests
We removed the external table location mapping feature, but still need to
accept and store metadataLocation values from clients like Trino. The mapping
feature was an internal implementation detail that mapped external buckets to
internal table paths. The metadataLocation field itself is part of the S3 Tables
API and should be preserved.
* fmt
* fix: handle MetadataLocation in UpdateTable requests
Mirror handleCreateTable behavior by updating metadata.MetadataLocation
when req.MetadataLocation is provided in UpdateTable requests. This ensures
table metadata location can be updated, not just set during creation.
* fix: move table location mappings to /etc/s3tables to avoid bucket name validation
Fixes#8362 - table location mappings were stored under /buckets/.table-location-mappings
which fails bucket name validation because it starts with a dot. Moving them to
/etc/s3tables resolves the migration error for upgrades.
Changes:
- Table location mappings now stored under /etc/s3tables
- Ensure parent /etc directory exists before creating /etc/s3tables
- Normal writes go to new location only (no legacy compatibility)
- Removed bucket name validation exception for old location
* refactor: simplify lookupTableLocationMapping by removing redundant mappingPath parameter
The mappingPath function parameter was redundant as the path can be derived
from mappingDir and bucket using path.Join. This simplifies the code and
reduces the risk of path mismatches between parameters.
* Fix S3 signature verification behind reverse proxies
When SeaweedFS is deployed behind a reverse proxy (e.g. nginx, Kong,
Traefik), AWS S3 Signature V4 verification fails because the Host header
the client signed with (e.g. "localhost:9000") differs from the Host
header SeaweedFS receives on the backend (e.g. "seaweedfs:8333").
This commit adds a new -s3.externalUrl parameter (and S3_EXTERNAL_URL
environment variable) that tells SeaweedFS what public-facing URL clients
use to connect. When set, SeaweedFS uses this host value for signature
verification instead of the Host header from the incoming request.
New parameter:
-s3.externalUrl (flag) or S3_EXTERNAL_URL (environment variable)
Example: -s3.externalUrl=http://localhost:9000
Example: S3_EXTERNAL_URL=https://s3.example.com
The environment variable is particularly useful in Docker/Kubernetes
deployments where the external URL is injected via container config.
The flag takes precedence over the environment variable when both are set.
At startup, the URL is parsed and default ports are stripped to match
AWS SDK behavior (port 80 for HTTP, port 443 for HTTPS), so
"http://s3.example.com:80" and "http://s3.example.com" are equivalent.
Bugs fixed:
- Default port stripping was removed by a prior PR, causing signature
mismatches when clients connect on standard ports (80/443)
- X-Forwarded-Port was ignored when X-Forwarded-Host was not present
- Scheme detection now uses proper precedence: X-Forwarded-Proto >
TLS connection > URL scheme > "http"
- Test expectations for standard port stripping were incorrect
- expectedHost field in TestSignatureV4WithForwardedPort was declared
but never actually checked (self-referential test)
* Add Docker integration test for S3 proxy signature verification
Docker Compose setup with nginx reverse proxy to validate that the
-s3.externalUrl parameter (or S3_EXTERNAL_URL env var) correctly
resolves S3 signature verification when SeaweedFS runs behind a proxy.
The test uses nginx proxying port 9000 to SeaweedFS on port 8333,
with X-Forwarded-Host/Port/Proto headers set. SeaweedFS is configured
with -s3.externalUrl=http://localhost:9000 so it uses "localhost:9000"
for signature verification, matching what the AWS CLI signs with.
The test can be run with aws CLI on the host or without it by using
the amazon/aws-cli Docker image with --network host.
Test covers: create-bucket, list-buckets, put-object, head-object,
list-objects-v2, get-object, content round-trip integrity,
delete-object, and delete-bucket — all through the reverse proxy.
* Create s3-proxy-signature-tests.yml
* fix CLI
* fix CI
* Update s3-proxy-signature-tests.yml
* address comments
* Update Dockerfile
* add user
* no need for fuse
* Update s3-proxy-signature-tests.yml
* debug
* weed mini
* fix health check
* health check
* fix health checking
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Chris Lu <chris.lu@gmail.com>
Add cosi.bucketClassParameters to allow passing arbitrary parameters
to the default BucketClass resource. This enables use cases like
tiered storage where a diskType parameter needs to be set on the
BucketClass to route objects to specific volume servers.
When bucketClassParameters is empty (default), the BucketClass is
rendered without a parameters block, preserving backward compatibility.
Signed-off-by: Kirill Ilin <stitch14@yandex.ru>
Co-authored-by: Claude <noreply@anthropic.com>
Some filesystems, such as XFS, may over-allocate disk spaces when using
volume preallocation. Remove this option from the default docker entrypoint
scripts to allow volumes to use only the necessary disk space.
Fixes: https://github.com/seaweedfs/seaweedfs/issues/6465#issuecomment-3964174718
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Make EC detection context aware
* Update register.go
* Speed up EC detection planning
* Add tests for EC detection planner
* optimizations
detection.go: extracted ParseCollectionFilter (exported) and feed it into the detection loop so both detection and tracing share the same parsing/whitelisting logic; the detection loop now iterates on a sorted list of volume IDs, checks the context at every iteration, and only sets hasMore when there are still unprocessed groups after hitting maxResults, keeping runtime bounded while still scheduling planned tasks before returning the results.
erasure_coding_handler.go: dropped the duplicated inline filter parsing in emitErasureCodingDetectionDecisionTrace and now reuse erasurecodingtask.ParseCollectionFilter, and the summary suffix logic now only accounts for the hasMore case that can actually happen.
detection_test.go: updated the helper topology builder to use master_pb.VolumeInformationMessage (matching the current protobuf types) and tightened the cancellation/max-results tests so they reliably exercise the detection logic (cancel before calling Detection, and provide enough disks so one result is produced before the limit).
* use working directory
* fix compilation
* fix compilation
* rename
* go vet
* fix getenv
* address comments, fix error
* Fix SFTP file upload failures with JWT filer tokens (issue #8425)
When JWT authentication is enabled for filer operations via jwt.filer_signing.*
configuration, SFTP server file upload requests were rejected because they lacked
JWT authorization headers.
Changes:
- Added JWT signing key and expiration fields to SftpServer struct
- Modified putFile() to generate and include JWT tokens in upload requests
- Enhanced SFTPServiceOptions with JWT configuration fields
- Updated SFTP command startup to load and pass JWT config to service
This allows SFTP uploads to authenticate with JWT-enabled filers, consistent
with how other SeaweedFS components (S3 API, file browser) handle filer auth.
Fixes#8425
* Apply suggestion from @gemini-code-assist[bot]
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
---------
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
TestS3MultipartOperationsInheritPutObjectPermissions verifies that multipart
upload operations (CreateMultipartUpload, UploadPart, ListParts,
CompleteMultipartUpload, AbortMultipartUpload, ListMultipartUploads) work
correctly when a user has only s3:PutObject permission granted.
This test validates the behavior where multipart operations are implicitly
granted when s3:PutObject is authorized, as multipart upload is an
implementation detail of putting objects in S3.
* Allow multipart upload operations when s3:PutObject is authorized
Multipart upload is an implementation detail of putting objects, not a separate
permission. When a policy grants s3:PutObject, it should implicitly allow:
- s3:CreateMultipartUpload
- s3:UploadPart
- s3:CompleteMultipartUpload
- s3:AbortMultipartUpload
- s3:ListParts
This fixes a compatibility issue where clients like PyArrow that use multipart
uploads by default would fail even though the role had s3:PutObject permission.
The session policy intersection still applies - both the identity-based policy
AND session policy must allow s3:PutObject for multipart operations to work.
Implementation:
- Added constants for S3 multipart action strings
- Added multipartActionSet to efficiently check if action is multipart-related
- Updated MatchesAction method to implicitly grant multipart when PutObject allowed
* Update weed/s3api/policy_engine/types.go
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* Add s3:ListMultipartUploads to multipart action set
Include s3:ListMultipartUploads in the multipartActionSet so that listing
multipart uploads is implicitly granted when s3:PutObject is authorized.
ListMultipartUploads is a critical part of the multipart upload workflow,
allowing clients to query in-progress uploads before completing them.
Changes:
- Added s3ListMultipartUploads constant definition
- Included s3ListMultipartUploads in multipartActionSet initialization
- Existing references to multipartActionSet automatically now cover ListMultipartUploads
All policy engine tests pass (0.351s execution time)
* Refactor: reuse multipart action constants from s3_constants package
Remove duplicate constant definitions from policy_engine/types.go and import
the canonical definitions from s3api/s3_constants/s3_actions.go instead.
This eliminates duplication and ensures a single source of truth for
multipart action strings:
- ACTION_CREATE_MULTIPART_UPLOAD
- ACTION_UPLOAD_PART
- ACTION_COMPLETE_MULTIPART
- ACTION_ABORT_MULTIPART
- ACTION_LIST_PARTS
- ACTION_LIST_MULTIPART_UPLOADS
All policy engine tests pass (0.350s execution time)
* Fix S3_ACTION_LIST_MULTIPART_UPLOADS constant value
Move S3_ACTION_LIST_MULTIPART_UPLOADS from bucket operations to multipart
operations section and change value from 's3:ListBucketMultipartUploads' to
's3:ListMultipartUploads' to match the action strings used in policy_engine
and s3_actions.go.
This ensures consistent action naming across all S3 constant definitions.
* refactor names
* Fix S3 action constant mismatches and MatchesAction early return bug
Fix two critical issues in policy engine:
1. S3Actions map had incorrect multipart action mappings:
- 'ListMultipartUploads' was 's3:ListMultipartUploads' (should be 's3:ListBucketMultipartUploads')
- 'ListParts' was 's3:ListParts' (should be 's3:ListMultipartUploadParts')
These mismatches caused authorization checks to fail for list operations
2. CompiledStatement.MatchesAction() had early return bug:
- Previously returned true immediately upon first direct action match
- This prevented scanning remaining matchers for s3:PutObject permission
- Now scans ALL matchers before returning, tracking both direct match and PutObject grant
- Ensures multipart operations inherit s3:PutObject authorization even when
explicitly requested action doesn't match (e.g., s3:ListMultipartUploadParts)
Changes:
- Track matchedAction flag to defer
Fix two critical issues in policy engine:
1. S3Actions map had incorrect multipart action mappings:
- 'ListMultipartUploads' was 's3:ListMultipartUplPer
1. S3Actions map had incorrect multiparAll - 'ListMultipartUploads(0.334s execution time)
* Refactor S3Actions map to use s3_constants
Replace hardcoded action strings in the S3Actions map with references to
canonical S3_ACTION_* constants from s3_constants/s3_action_strings.go.
Benefits:
- Single source of truth for S3 action values
- Eliminates string duplication across codebase
- Ensures consistency between policy engine and middleware
- Reduces maintenance burden when action strings need updates
All policy engine tests pass (0.334s execution time)
* Remove unused S3Actions map
The S3Actions map in types.go was never referenced anywhere in the codebase.
All action mappings are handled by GetActionMappings() in integration.go instead.
This removes 42 lines of dead code.
* Fix test: reload configuration function must also reload IAM state
TestEmbeddedIamAttachUserPolicyRefreshesIAM was failing because the test's
reloadConfigurationFunc only updated mockConfig but didn't reload the actual IAM
state. When AttachUserPolicy calls refreshIAMConfiguration(), it would use the
test's incomplete reload function instead of the real LoadS3ApiConfigurationFromCredentialManager().
Fixed by making the test's reloadConfigurationFunc also call
e.iam.LoadS3ApiConfigurationFromCredentialManager() so lookupByIdentityName()
sees the updated policy attachments.
---------
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>