* Fix: Initialize filer CredentialManager with filer address
* The fix involves checking for directory existence before creation.
* adjust error message
* Fix: Implement FilerAddressSetter in PropagatingCredentialStore
* Refactor: Reorder credential manager initialization in filer server
* refactor
* full integration with iceberg-go
* Table Commit Operations (handleUpdateTable)
* s3tables: fix Iceberg v2 compliance and namespace properties
This commit ensures SeaweedFS Iceberg REST Catalog is compliant with
Iceberg Format Version 2 by:
- Using iceberg-go's table.NewMetadataWithUUID for strict v2 compliance.
- Explicitly initializing namespace properties to empty maps.
- Removing omitempty from required Iceberg response fields.
- Fixing CommitTableRequest unmarshaling using table.Requirements and table.Updates.
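A minimal Go sketch of the omitempty change described above; `loadTableResult` and its fields are illustrative stand-ins, not the actual SeaweedFS types:
```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical response shape: required Iceberg REST fields carry no
// omitempty tag, so zero values still serialize and strict clients
// always see every required key.
type loadTableResult struct {
	MetadataLocation string            `json:"metadata-location"` // was omitempty
	Config           map[string]string `json:"config"`
}

func main() {
	out, _ := json.Marshal(loadTableResult{Config: map[string]string{}})
	fmt.Println(string(out)) // {"metadata-location":"","config":{}}
}
```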
* s3tables: automate Iceberg integration tests
- Added Makefile for local test execution and cluster management.
- Added docker-compose for PyIceberg compatibility kit.
- Added Go integration test harness for PyIceberg.
- Updated GitHub CI to run Iceberg catalog tests automatically.
* s3tables: update PyIceberg test suite for compatibility
- Updated test_rest_catalog.py to use latest PyIceberg transaction APIs.
- Updated Dockerfile to include pyarrow and pandas dependencies.
- Improved namespace and table handling in integration tests.
* s3tables: address review feedback on Iceberg Catalog
- Implemented robust metadata version parsing and incrementing.
- Ensured table metadata changes are persisted during commit (handleUpdateTable).
- Standardized namespace property initialization for consistency.
- Fixed unused variable and incorrect struct field build errors.
* s3tables: finalize Iceberg REST Catalog and optimize tests
- Implemented robust metadata versioning and persistence.
- Standardized namespace property initialization.
- Optimized integration tests using pre-built Docker image.
- Added strict property persistence validation to test suite.
- Fixed build errors from previous partial updates.
* Address PR review: fix Table UUID stability, implement S3Tables UpdateTable, and support full metadata persistence
* fix: Iceberg catalog stable UUIDs, metadata persistence, and file writing
- Ensure table UUIDs are stable (do not regenerate on load).
- Persist full table metadata (Iceberg JSON) in s3tables extended attributes.
- Add `MetadataVersion` to explicitly track version numbers, replacing regex parsing.
- Implement `saveMetadataFile` to persist metadata JSON files to the Filer on commit.
- Update `CreateTable` and `UpdateTable` handlers to use the new logic.
* test: bind weed mini to 0.0.0.0 in integration tests to fix Docker connectivity
* Iceberg: fix metadata handling in REST catalog
- Add nil guard in createTable
- Fix updateTable to correctly load existing metadata from storage
- Ensure full metadata persistence on updates
- Populate loadTable result with parsed metadata
* S3Tables: add auth checks and fix response fields in UpdateTable
- Add CheckPermissionWithContext to UpdateTable handler
- Include TableARN and MetadataLocation in UpdateTable response
- Use ErrCodeConflict (409) for version token mismatches
* Tests: improve Iceberg catalog test infrastructure and cleanup
- Makefile: use PID file for precise process killing
- test_rest_catalog.py: remove unused variables and fix f-strings
* Iceberg: fix variable shadowing in UpdateTable
- Rename inner loop variable `req` to `requirement` to avoid shadowing outer request variable
* S3Tables: simplify MetadataVersion initialization
- Use `max(req.MetadataVersion, 1)` instead of anonymous function
* Tests: remove unicode characters from S3 tables integration test logs
- Remove unicode checkmarks from test output for cleaner logs
* Iceberg: improve metadata persistence robustness
- Fix MetadataLocation in LoadTableResult to fallback to generated location
- Improve saveMetadataFile to ensure directory hierarchy existence and robust error handling
* helm: add Iceberg REST catalog support to S3 service
---------
Co-authored-by: yalin.sahin <yalin.sahin@tradition.ch>
* s3: enforce authentication and JSON error format for Iceberg REST Catalog
* s3/iceberg: align error exception types with OpenAPI spec examples
* s3api: refactor AuthenticateRequest to return identity object
* s3/iceberg: propagate full identity object to request context
* s3/iceberg: differentiate NotAuthorizedException and ForbiddenException
* s3/iceberg: reject requests if authenticator is nil to prevent auth bypass
* s3/iceberg: refactor Auth middleware to build context incrementally and use switch for error mapping
* s3api: update misleading comment for authRequestWithAuthType
* s3api: return ErrAccessDenied if IAM is not configured to prevent auth bypass
* s3/iceberg: optimize context update in Auth middleware
* s3api: export CanDo for external authorization use
* s3/iceberg: enforce identity-based authorization in all API handlers
* s3api: fix compilation errors by updating internal CanDo references
* s3/iceberg: robust identity validation and consistent action usage in handlers
* s3api: complete CanDo rename across tests and policy engine integration
* s3api: fix integration tests by allowing admin access when auth is disabled and explicit gRPC ports
* duckdb
* create test bucket
* feat: Add Iceberg REST Catalog server
Implement Iceberg REST Catalog API on a separate port (default 8181)
that exposes S3 Tables metadata through the Apache Iceberg REST protocol.
- Add new weed/s3api/iceberg package with REST handlers
- Implement /v1/config endpoint returning catalog configuration
- Implement namespace endpoints (list/create/get/head/delete)
- Implement table endpoints (list/create/load/head/delete/update)
- Add -port.iceberg flag to S3 standalone server (s3.go)
- Add -s3.port.iceberg flag to combined server mode (server.go)
- Add -s3.port.iceberg flag to mini cluster mode (mini.go)
- Support prefix-based routing for multiple catalogs
The Iceberg REST server reuses S3 Tables metadata storage under
/table-buckets and enables DuckDB, Spark, and other Iceberg clients
to connect to SeaweedFS as a catalog.
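A hedged sketch of the endpoint layout this commit describes, using gorilla/mux; the handler wiring and exact path shapes are assumptions, and the real server additionally supports prefix-based routing for multiple catalogs:
```go
package main

import (
	"net/http"

	"github.com/gorilla/mux"
)

// newIcebergRouter registers the /v1 routes listed in the commit above.
// A single stub handler stands in for the real per-endpoint handlers.
func newIcebergRouter(h http.HandlerFunc) *mux.Router {
	r := mux.NewRouter()
	r.HandleFunc("/v1/config", h).Methods(http.MethodGet)
	r.HandleFunc("/v1/namespaces", h).Methods(http.MethodGet, http.MethodPost)
	r.HandleFunc("/v1/namespaces/{namespace}", h).
		Methods(http.MethodGet, http.MethodHead, http.MethodDelete)
	r.HandleFunc("/v1/namespaces/{namespace}/tables", h).
		Methods(http.MethodGet, http.MethodPost)
	r.HandleFunc("/v1/namespaces/{namespace}/tables/{table}", h).
		Methods(http.MethodGet, http.MethodHead, http.MethodDelete, http.MethodPost)
	return r
}

func main() {
	stub := func(w http.ResponseWriter, _ *http.Request) { w.WriteHeader(http.StatusOK) }
	_ = http.ListenAndServe(":8181", newIcebergRouter(stub)) // default Iceberg port
}
```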
* feat: Add Iceberg Catalog pages to admin UI
Add admin UI pages to browse Iceberg catalogs, namespaces, and tables.
- Add Iceberg Catalog menu item under Object Store navigation
- Create iceberg_catalog.templ showing catalog overview with REST info
- Create iceberg_namespaces.templ listing namespaces in a catalog
- Create iceberg_tables.templ listing tables in a namespace
- Add handlers and routes in admin_handlers.go
- Add Iceberg data provider methods in s3tables_management.go
- Add Iceberg data types in types.go
The Iceberg Catalog pages provide visibility into the same S3 Tables
data through an Iceberg-centric lens, including REST endpoint examples
for DuckDB and PyIceberg.
* test: Add Iceberg catalog integration tests and reorg s3tables tests
- Reorganize existing s3tables tests to test/s3tables/table-buckets/
- Add new test/s3tables/catalog/ for Iceberg REST catalog tests
- Add TestIcebergConfig to verify /v1/config endpoint
- Add TestIcebergNamespaces to verify namespace listing
- Add TestDuckDBIntegration for DuckDB connectivity (requires Docker)
- Update CI workflow to use new test paths
* fix: Generate proper random UUIDs for Iceberg tables
Address code review feedback:
- Replace placeholder UUID with crypto/rand-based UUID v4 generation
- Add detailed TODO comments for handleUpdateTable stub explaining
the required atomic metadata swap implementation
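A minimal sketch of crypto/rand-based UUID v4 generation as described; the helper name is hypothetical:
```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newUUIDv4 builds an RFC 4122 version-4 UUID from crypto/rand bytes.
func newUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // version 4
	b[8] = (b[8] & 0x3f) | 0x80 // RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

func main() {
	id, err := newUUIDv4()
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}
```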
* fix: Serve Iceberg on localhost listener when binding to different interface
Address code review feedback: properly serve the localhost listener
when the Iceberg server is bound to a non-localhost interface.
* ci: Add Iceberg catalog integration tests to CI
Add new job to run Iceberg catalog tests in CI, along with:
- Iceberg package build verification
- Iceberg unit tests
- Iceberg go vet checks
- Iceberg format checks
* fix: Address code review feedback for Iceberg implementation
- fix: Replace hardcoded account ID with s3_constants.AccountAdminId in buildTableBucketARN()
- fix: Improve UUID generation error handling with deterministic fallback (timestamp + PID + counter)
- fix: Update handleUpdateTable to return HTTP 501 Not Implemented instead of fake success
- fix: Better error handling in handleNamespaceExists to distinguish 404 from 500 errors
- fix: Use relative URL in template instead of hardcoded localhost:8181
- fix: Add HTTP timeout to test's waitForService function to avoid hangs
- fix: Use dynamic ephemeral ports in integration tests to avoid flaky parallel failures
- fix: Add Iceberg port to final port configuration logging in mini.go
* fix: Address critical issues in Iceberg implementation
- fix: Cache table UUIDs to ensure persistence across LoadTable calls
The UUID now remains stable for the lifetime of the server session.
TODO: For production, UUIDs should be persisted in S3 Tables metadata.
- fix: Remove redundant URL-encoded namespace parsing
mux router already decodes %1F to \x1F before passing to handlers.
Redundant ReplaceAll call could cause bugs with literal %1F in namespace.
* fix: Improve test robustness and reduce code duplication
- fix: Make DuckDB test more robust by failing on unexpected errors
Instead of silently logging errors, now explicitly check for expected
conditions (extension not available) and skip the test appropriately.
- fix: Extract username helper method to reduce duplication
Created getUsername() helper in AdminHandlers to avoid duplicating
the username retrieval logic across Iceberg page handlers.
* fix: Add mutex protection to table UUID cache
Protects concurrent access to the tableUUIDs map with sync.RWMutex.
Uses read-lock for fast path when UUID already cached, and write-lock
for generating new UUIDs. Includes double-check pattern to handle race
condition between read-unlock and write-lock.
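A short sketch of the read-lock fast path and double-check pattern this commit describes; the type and method names are assumptions:
```go
package main

import (
	"fmt"
	"sync"
)

// uuidCache guards the table-UUID map with an RWMutex, as described above.
type uuidCache struct {
	mu   sync.RWMutex
	byID map[string]string
}

func (c *uuidCache) get(table string, generate func() string) string {
	c.mu.RLock()
	id, ok := c.byID[table] // fast path under the read lock
	c.mu.RUnlock()
	if ok {
		return id
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	if id, ok := c.byID[table]; ok { // double-check: another writer may have won
		return id
	}
	id = generate()
	c.byID[table] = id
	return id
}

func main() {
	c := &uuidCache{byID: map[string]string{}}
	fmt.Println(c.get("ns.t", func() string { return "uuid-1" }))
	fmt.Println(c.get("ns.t", func() string { return "uuid-2" })) // still uuid-1
}
```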
* style: fix go fmt errors
* feat(iceberg): persist table UUID in S3 Tables metadata
* feat(admin): configure Iceberg port in Admin UI and commands
* refactor: address review comments (flags, tests, handlers)
- command/mini: fix tracking of explicit s3.port.iceberg flag
- command/admin: add explicit -iceberg.port flag
- admin/handlers: reuse getUsername helper
- tests: use 127.0.0.1 for ephemeral ports and os.Stat for file size check
* test: check error from FileStat in verify_gc_empty_test
* Implement RPC skeleton for regular/EC volumes scrubbing.
See https://github.com/seaweedfs/seaweedfs/issues/8018 for details.
* Minor proto improvements for `ScrubVolume()`, `ScrubEcVolume()`:
- Add fields for scrubbing details in `ScrubVolumeResponse` and `ScrubEcVolumeResponse`,
instead of reporting these through RPC errors.
- Return a list of broken shards when scrubbing EC volumes, via `EcShardInfo`.
* Bootstrap persistent state for volume servers.
This PR implements logic to load and save persistent state information for
storages associated with volume servers, and to report state changes back to
masters via heartbeat messages.
More work ensues!
See https://github.com/seaweedfs/seaweedfs/issues/7977 for details.
* Add volume server RPCs to read and update state flags.
* Block RPC operations writing to volume servers when maintenance mode is on.
* fix: skip exhausted blocks before creating an interval
* refactor: optimize interval creation and fix logic duplication
* docs: add docstring for LocateData
* refactor: extract moveToNextBlock helper to deduplicate logic
* fix: use int64 for block index comparison to prevent overflow
* test: add unit test for LocateData boundary crossing (issue #8179)
* fix: skip exhausted blocks to prevent negative interval size and panics (issue #8179)
* refactor: apply review suggestions for test maintainability and code style
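A generic illustration of the boundary fix in this series (not the actual LocateData code): exhausted blocks are skipped with int64 indices before an interval is cut, so the interval size cannot go negative:
```go
package main

import "fmt"

type block struct{ size int64 }

// locate advances past blocks the offset has already exhausted before an
// interval would be created, using int64 indices so large offsets cannot
// overflow. Illustrative only; see issue #8179 for the real context.
func locate(blocks []block, offset int64) (idx, inBlock int64) {
	idx, inBlock = 0, offset
	for idx < int64(len(blocks)) && inBlock >= blocks[idx].size {
		inBlock -= blocks[idx].size // exhausted: move to the next block
		idx++
	}
	return idx, inBlock
}

func main() {
	// Offset 8 in three 4-byte blocks lands at the start of block 2, not at a
	// negative remainder in block 1.
	fmt.Println(locate([]block{{4}, {4}, {4}}, 8)) // 2 0
}
```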
* s3tables: add Iceberg file layout validation for table buckets
This PR adds file layout validation for table buckets to enforce Apache
Iceberg table structure. Files uploaded to table buckets must conform
to the expected Iceberg layout:
- metadata/ directory: contains metadata files (*.json, *.avro)
  - v*.metadata.json (table metadata)
  - snap-*.avro (snapshot manifests)
  - *-m*.avro (manifest files)
  - version-hint.text
- data/ directory: contains data files (*.parquet, *.orc, *.avro)
  - Supports partition paths (e.g., year=2024/month=01/)
  - Supports bucket subdirectories
The validator exports functions for use by the S3 API:
- IsTableBucketPath: checks if a path is under /table-buckets/
- GetTableInfoFromPath: extracts bucket/namespace/table from path
- ValidateTableBucketUpload: validates file layout for table bucket uploads
- ValidateTableBucketUploadWithClient: validates with filer client access
Invalid uploads receive InvalidIcebergLayout error response.
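A hedged sketch of the exported path helpers named above, assuming table paths of the form /table-buckets/&lt;bucket&gt;/&lt;namespace&gt;/&lt;table&gt;/...:
```go
package main

import (
	"fmt"
	"strings"
)

// IsTableBucketPath reports whether a path falls under /table-buckets/.
func IsTableBucketPath(p string) bool {
	return strings.HasPrefix(p, "/table-buckets/")
}

// GetTableInfoFromPath extracts bucket, namespace, and table segments,
// rejecting paths with missing or empty segments.
func GetTableInfoFromPath(p string) (bucket, namespace, table string, ok bool) {
	if !IsTableBucketPath(p) {
		return
	}
	parts := strings.SplitN(strings.TrimPrefix(p, "/table-buckets/"), "/", 4)
	if len(parts) < 3 || parts[0] == "" || parts[1] == "" || parts[2] == "" {
		return
	}
	return parts[0], parts[1], parts[2], true
}

func main() {
	fmt.Println(GetTableInfoFromPath("/table-buckets/b/ns/t/data/x.parquet"))
}
```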
* Address review comments: regex performance, error handling, stricter patterns
* Fix validateMetadataFile and validateDataFile to handle subdirectories and directory creation
* Fix error handling, metadata validation, reduce code duplication
* Fix empty remainingPath handling for directory paths
* Refactor: unify validateMetadataFile and validateDataFile
* Refactor: extract UUID pattern constant
* fix: allow Iceberg partition and directory paths without trailing slashes
Modified validateFile to correctly handle directory paths that do not end with a trailing slash.
This ensures that paths like 'data/year=2024' are validated as directories if they match
partition or subdirectory patterns, rather than being incorrectly rejected as invalid files.
Added comprehensive test cases for various directory and partition path combinations.
* refactor: use standard path package and idiomatic returns
Simplified directory and filename extraction in validateFile by using the standard
path package (aliased as pathpkg). This improves readability and avoids manual string
manipulation. Also updated GetTableInfoFromPath to use naked returns for named
return values, aligning with Go conventions for short functions.
* feat: enforce strict Iceberg top-level directories and metadata restrictions
Implemented strict validation for Iceberg layout:
- Bare top-level keys like 'metadata' and 'data' are now rejected; they must have
a trailing slash or a subpath.
- Subdirectories under 'metadata/' are now prohibited to enforce the flat structure
required by Iceberg.
- Updated the test suite with negative test cases and ensured proper formatting.
* feat: allow table root directory markers in ValidateTableBucketUpload
Modified ValidateTableBucketUpload to short-circuit and return nil when the
relative path within a table is empty. This occurs when a trailing slash is
used on the table directory (e.g., /table-buckets/mybucket/myns/mytable/).
Added a test case 'table dir with slash' to verify this behavior.
* test: add regression cases for metadata subdirs and table markers
Enforced a strictly flat structure for the metadata directory by removing the
"directory without trailing slash" fallback in validateFile for metadata.
Added regression test cases:
- metadata/nested (must fail)
- /table-buckets/.../mytable/ (must pass)
Verified all tests pass.
* feat: reject double slashes in Iceberg table paths
Modified validateDirectoryPath to return an error when encountering empty path
segments, effectively rejecting double slashes like 'data//file.parquet'.
Updated validateFile to use manual path splitting instead of the 'path' package
for intermediate directories to ensure redundant slashes are not auto-cleaned
before validation. Added regression tests for various double slash scenarios.
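A simplified stand-in for the double-slash check: manual splitting keeps empty segments visible where path.Clean would silently drop them:
```go
package main

import (
	"fmt"
	"strings"
)

// validateSegments rejects empty path segments, so "data//file.parquet"
// fails validation. One trailing slash is trimmed first to allow directory
// markers; the real validator handles more cases than this sketch.
func validateSegments(rel string) error {
	rel = strings.TrimSuffix(rel, "/")
	for _, seg := range strings.Split(rel, "/") {
		if seg == "" {
			return fmt.Errorf("invalid Iceberg path %q: empty path segment", rel)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateSegments("data/year=2024/file.parquet")) // <nil>
	fmt.Println(validateSegments("data//file.parquet"))          // error
}
```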
* refactor: separate isMetadata logic in validateDirectoryPath
Following reviewer feedback, refactored validateDirectoryPath to explicitly
separate the handling of metadata and data paths. This improves readability
and clarifies the function's intent while maintaining the strict validation rules
and double-slash rejection previously implemented.
* feat: validate bucket, namespace, and table path segments
Updated ValidateTableBucketUpload to ensure that bucket, namespace, and table
segments in the path are non-empty. This prevents invalid paths like
'/table-buckets//myns/mytable/...' from being accepted during upload.
Added regression tests for various empty segment scenarios.
* Update weed/s3api/s3tables/iceberg_layout.go
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* feat: block double-slash bypass in table relative paths
Added a guard in ValidateTableBucketUpload to reject tableRelativePath if it
starts with a '/' or contains '//'. This ensures that paths like
'/table-buckets/b/ns/t//data/file.parquet' are properly rejected and cannot
bypass the layout validation. Added regression tests to verify.
---------
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* mount: refresh and evict hot dir cache
* mount: guard dir update window and extend TTL
* mount: reuse timestamp for cache mark
* Apply suggestion from @gemini-code-assist[bot]
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* mount: make dir cache tuning configurable
* mount: dedupe dir update notices
* mount: restore invalidate-all cache helper
* mount: keep hot dir tuning constants
* mount: centralize cache state reset
* mount: mark refresh completion time
* mount: allow disabling idle eviction
---------
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* Add shared s3tables manager
* Add s3tables shell commands
* Add s3tables admin API
* Add s3tables admin UI
* Fix admin s3tables namespace create
* Rename table buckets menu
* Centralize s3tables tag validation
* Reuse s3tables manager in admin
* Extract s3tables list limit
* Add s3tables bucket ARN helper
* Remove write middleware from s3tables APIs
* Fix bucket link and policy hint
* Fix table tag parsing and nav link
* Disable namespace table link on invalid ARN
* Improve s3tables error decode
* Return flag parse errors for s3tables tag
* Accept query params for namespace create
* Bind namespace create form data
* Read s3tables JS data from DOM
* s3tables: allow empty region ARN
* shell: pass s3tables account id
* shell: require account for table buckets
* shell: use bucket name for namespaces
* shell: use bucket name for tables
* shell: use bucket name for tags
* admin: add table buckets links in file browser
* s3api: reuse s3tables tag validation
* admin: harden s3tables UI handlers
* fix admin list table buckets
* allow admin s3tables access
* validate s3tables bucket tags
* log s3tables bucket metadata errors
* rollback table bucket on owner failure
* show s3tables bucket owner
* add s3tables iam conditions
* Add s3tables user permissions UI
* Authorize s3tables using identity actions
* Add s3tables permissions to user modal
* Disambiguate bucket scope in user permissions
* Block table bucket names that match S3 buckets
* Pretty-print IAM identity JSON
* Include tags in s3tables permission context
* admin: refactor S3 Tables inline JavaScript into a separate file
* s3tables: extend IAM policy condition operators support
* shell: use LookupEntry wrapper for s3tables bucket conflict check
* admin: handle buildBucketPermissions validation in create/update flows
* fix Filer startup failure due to JWT on / path #8149
- Comment out JWT keys in security.toml.example
- Revert Dockerfile.local change that enabled security by default
- Exempt GET/HEAD on / from JWT check for health checks
* refactor: simplify JWT bypass condition as per PR feedback
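A minimal sketch of the kind of bypass condition these two commits describe; the real guard lives in the filer's JWT middleware and may be shaped differently:
```go
package main

import (
	"fmt"
	"net/http"
)

// skipJwtForHealthCheck exempts GET/HEAD on / so health checks succeed
// without a token.
func skipJwtForHealthCheck(r *http.Request) bool {
	return r.URL.Path == "/" &&
		(r.Method == http.MethodGet || r.Method == http.MethodHead)
}

func main() {
	r, _ := http.NewRequest(http.MethodGet, "http://filer:8888/", nil)
	fmt.Println(skipJwtForHealthCheck(r)) // true: health checks pass without JWT
}
```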
* iam: add ECDSA support for OIDC token validation
Fixes seaweedfs/seaweedfs#8148
* iam: refactor OIDC ECDSA tests and add failure cases
- Refactored TestOIDCProviderJWTValidationECDSA to use t.Run
- Added sub-tests for expired token, wrong key, invalid issuer, and invalid audience
* Update weed/iam/oidc/oidc_provider_test.go
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* iam: improve error type assertions for OIDC invalid signature tests
- Updated both RSA and ECDSA tests to specifically check for ErrProviderInvalidToken
* iam: pad EC coordinates in OIDC tests to comply with RFC 7518
- Coordinates are now zero-padded to the full field size (e.g., 32 bytes for P-256)
- Ensures interoperability with strict OIDC providers
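A short sketch of the RFC 7518 padding rule the test fix follows; the function is illustrative, not the test's actual helper:
```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"encoding/base64"
	"fmt"
	"math/big"
)

// jwkCoordinate zero-pads an EC coordinate to the curve's full field size
// (32 bytes for P-256) before base64url encoding, so short big.Int values
// still produce fixed-width JWK coordinates.
func jwkCoordinate(v *big.Int, curveBits int) string {
	buf := make([]byte, (curveBits+7)/8)
	v.FillBytes(buf) // left-pads with zero bytes
	return base64.RawURLEncoding.EncodeToString(buf)
}

func main() {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		panic(err)
	}
	bits := key.Curve.Params().BitSize
	fmt.Println(jwkCoordinate(key.X, bits), jwkCoordinate(key.Y, bits))
}
```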
---------
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* shell: allow spaces in arguments via quoting (#8157)
- updated argument splitting regex to handle quoted segments
- added robust quote stripping to remove matching quotes from flags
- added unit tests for regex splitting and flag parsing
* shell: use robust state machine parser for command line arguments
- replaced regex-based splitter with splitCommandLine state machine
- added escape character support in splitCommandLine and stripQuotes
- updated unit tests to include escaped quotes and single-quote literals
- addressed feedback regarding escaped quotes handling (#8157)
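A compact sketch of a state-machine splitter in the spirit of splitCommandLine; this is not the shell's actual implementation:
```go
package main

import "fmt"

// splitCommandLine tracks an active quote character and backslash escapes
// instead of using a regex, and returns unquoted tokens directly. Empty
// quoted tokens are dropped in this simplified version.
func splitCommandLine(line string) []string {
	var args []string
	var cur []rune
	var quote rune // active quote char, or 0
	escaped := false
	flush := func() {
		if len(cur) > 0 {
			args = append(args, string(cur))
			cur = cur[:0]
		}
	}
	for _, r := range line {
		switch {
		case escaped:
			cur = append(cur, r)
			escaped = false
		case r == '\\' && quote != '\'': // backslashes are literal in single quotes
			escaped = true
		case quote != 0:
			if r == quote {
				quote = 0 // closing quote
			} else {
				cur = append(cur, r)
			}
		case r == '"' || r == '\'':
			quote = r
		case r == ' ' || r == '\t':
			flush()
		default:
			cur = append(cur, r)
		}
	}
	flush()
	return args
}

func main() {
	fmt.Printf("%q\n", splitCommandLine(`fs.mv "a dir/file 1" 'b c' d\ e`))
	// ["fs.mv" "a dir/file 1" "b c" "d e"]
}
```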
* shell: detect unbalanced quotes in stripQuotes
- modified stripQuotes to return the original string if quotes are unbalanced
- added test cases for unbalanced quotes in shell_liner_test.go
* shell: refactor shared parsing logic into parseShellInput helper
- unified splitting and unquoting logic into a single state machine
- splitCommandLine now returns unquoted tokens directly
- simplified processEachCmd by removing redundant unquoting loop
- improved maintainability by eliminating code duplication
* shell: detect trailing backslash in stripQuotes
- updated parseShellInput to include escaped state in unbalanced flag
- stripQuotes now returns original string if it ends with an unescaped backslash
- added test case for trailing backslash in shell_liner_test.go
* Refactored doTraverseBfsAndSaving to use context cancellation.
If the saving process fails, the traversal is stopped immediately
to prevent workers from blocking on the output channel.
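A minimal sketch of the cancellation pattern described: a failed saver cancels the context so traversal workers blocked on the output channel can exit:
```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// traverse sends items to the saver, but aborts as soon as the context is
// cancelled instead of blocking forever on the output channel.
func traverse(ctx context.Context, out chan<- string, items []string) error {
	for _, it := range items {
		select {
		case out <- it:
		case <-ctx.Done():
			return ctx.Err() // stop immediately once saving has failed
		}
	}
	return nil
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	out := make(chan string) // unbuffered: traversal blocks when the saver stops
	go func() {
		<-out    // "save" one entry, then fail
		cancel() // saving failed: unblock the traversal
	}()
	err := traverse(ctx, out, []string{"a", "b", "c"})
	fmt.Println(errors.Is(err, context.Canceled)) // true
}
```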
* Ensure metadata load correctly restores entries with their original TTL
instead of overwriting with the default from filer.conf.
Fixes #8159.
* s3api: ensure MD5 is calculated or reused during CopyObject
Fixes #8155
- Capture and reuse source MD5 for direct copies
- Calculate MD5 for small inline objects during copy
* s3api: refactor encryption logic and safe MD5 copying
- Extract duplicated bucket default encryption logic into helper
- Use safe append copy for MD5 slice to avoid shared modifications
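A tiny demonstration of the safe-copy idiom: appending to a nil slice allocates fresh backing storage, so the stored MD5 cannot be mutated through the source slice:
```go
package main

import "fmt"

func main() {
	src := []byte{0xde, 0xad, 0xbe, 0xef} // e.g. a source object's MD5
	dst := append([]byte(nil), src...)    // fresh backing array, not an alias
	src[0] = 0x00                         // mutating the source...
	fmt.Printf("%x\n", dst)               // ...leaves the copy intact: deadbeef
}
```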
* refactor
* avoids unnecessary MD5 recalculations for small files
* fix: correct chunk size in encrypted uploads
When uploading with encryption enabled (-encryptVolumeData flag),
the chunk metadata was incorrectly storing the encrypted data size
instead of the original uncompressed size.
This caused data corruption when reading files back because:
1. Encrypted data is larger than original data
2. After decryption and decompression, the actual data is smaller
3. Size validation in readEncryptedUrl would fail
4. Files appeared truncated with null bytes
The fix ensures uploadResult.Size uses clearDataLen (the original
uncompressed size) in both cipher and non-cipher code paths,
matching the expected behavior.
Fixes: #8151
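A self-contained illustration of the size bookkeeping (the compress/encrypt stand-in is fake): chunk metadata must record the clear, uncompressed length, not the ciphertext length:
```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

// compressThenPad stands in for the real compress+encrypt pipeline; the
// point is only that the output length differs from the clear length.
func compressThenPad(clear []byte) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write(clear)
	zw.Close()
	return append(buf.Bytes(), make([]byte, 16)...) // fake cipher overhead
}

func main() {
	clear := []byte("hello, seaweedfs")
	clearDataLen := len(clear)
	stored := compressThenPad(clear)
	size := int64(clearDataLen) // correct: the original size, per the fix
	fmt.Println(size, "!=", len(stored))
}
```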
* test: add comprehensive tests for encrypted upload size fix
Add unit tests to verify that encrypted uploads correctly store the
original uncompressed data size, not the encrypted size.
Tests cover:
- Cipher key generation and encryption/decryption roundtrip
- Compression + encryption roundtrip with size preservation
- Different data sizes
- Compression detection and effectiveness
- Size behavior demonstration showing the bug and fix
All tests verify that metadata stores the original size (19 bytes in
the key example) not the encrypted size (75 bytes), which is critical
for preventing data corruption on read operations.
Issue: #8151
* remove emojis
* Update upload_content_test.go
* Delete upload_content_test.go
* Capture the global MiniClusterCtx into local variables before goroutine/select
evaluation, to prevent a nil dereference/data race when the context is reset to
nil after the nil check. Applied to the filer, master, volume, and s3 commands.
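A minimal sketch of the capture-before-use pattern; MiniClusterCtx here is a stand-in global, not the actual command wiring:
```go
package main

import (
	"context"
	"fmt"
)

// MiniClusterCtx stands in for the real global; the fix is to snapshot it
// into a local before the nil check so a concurrent reset to nil cannot land
// between the check and the use.
var MiniClusterCtx context.Context

func waitForShutdown(stop <-chan struct{}) {
	ctx := MiniClusterCtx // capture once, before the nil check and the select
	if ctx == nil {
		<-stop
		return
	}
	select {
	case <-ctx.Done():
	case <-stop:
	}
}

func main() {
	stop := make(chan struct{})
	close(stop)
	waitForShutdown(stop) // safe even if another goroutine resets the global
	fmt.Println("done")
}
```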
* Fetch and evaluate table policies in DeleteTable handler to support policy-based
delegation. Aligns authorization behavior with GetTable and ListTables handlers
instead of only checking ownership.
* Explicitly handle ErrAttributeNotFound vs. other errors when fetching the
bucket policy. Return errors for unexpected failures to prevent masking filer
issues and ensure correct authorization decisions.
* Combine length, character, and reserved pattern validation into validateBucketName(),
which returns descriptive error messages. Keep isValidBucketName() for backward
compatibility. This simplifies handler validation and provides better error reporting.
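A hedged sketch of the consolidated validator; the length limits, character set, and reserved prefixes shown are assumptions:
```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var bucketNameRe = regexp.MustCompile(`^[a-z0-9][a-z0-9-]*[a-z0-9]$`)

var reservedPrefixes = []string{"xn--", "sthree-"} // illustrative reserved patterns

// validateBucketName folds the length, character, and reserved-pattern checks
// into one function with descriptive error messages.
func validateBucketName(name string) error {
	if len(name) < 3 || len(name) > 63 {
		return fmt.Errorf("bucket name %q must be 3-63 characters", name)
	}
	if !bucketNameRe.MatchString(name) {
		return fmt.Errorf("bucket name %q contains invalid characters", name)
	}
	for _, p := range reservedPrefixes {
		if strings.HasPrefix(name, p) {
			return fmt.Errorf("bucket name %q uses reserved prefix %q", name, p)
		}
	}
	return nil
}

// isValidBucketName is kept for backward compatibility, as the commit notes.
func isValidBucketName(name string) bool { return validateBucketName(name) == nil }

func main() {
	fmt.Println(validateBucketName("my-table-bucket")) // <nil>
	fmt.Println(isValidBucketName("xn--bad"))          // false
}
```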