* admin: fix file browser page size selector
Fix the file browser pagination page-size selectors to read from explicit select element IDs instead of this.value, which in templ-generated handlers could resolve to undefined and produce limit=undefined in requests.
Add a focused template render regression test to prevent this from recurring.
Fixes #8284
* revert file browser template regression test
* fix ec.encode skipping volumes when one replica is on a full disk
This fixes issue #8218. Previously, ec.encode would skip a volume if ANY
of its replicas resided on a disk with low free volume count. Now it
accepts the volume if AT LEAST ONE replica is on a healthy disk.
* refine noFreeDisk counter logic in ec.encode
Decrement noFreeDisk when a volume initially marked as bad is later
found to have a healthy replica, keeping the summary statistics
accurate.
* defer noFreeDisk counting and refine logging in ec.encode
Updated logging to be replica-scoped and deferred noFreeDisk counting to
the final pass over vidMap. This ensures that the counter only reflects
volumes that are definitively excluded because all replicas are on full
disks.
* filter replicas by free space during ec.encode
Updated doEcEncode to filter out replicas on disks with
FreeVolumeCount < 2 before selecting the best replica for encoding,
so EC shards are never generated on a disk too full to hold them,
even when the source replica itself is healthy.
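A minimal Go sketch of the resulting selection rule, using hypothetical stand-in types (the real logic lives in the shell's ec.encode command and its vidMap bookkeeping):
```go
type disk struct{ freeVolumeCount int }
type replica struct {
	server string
	disk   disk
}

// pickEcSourceReplica drops replicas whose disk has fewer than 2 free volume
// slots, accepts the volume if at least one replica survives, and reports
// false only when every replica sits on a full disk (the noFreeDisk case,
// counted in the final pass over vidMap).
func pickEcSourceReplica(replicas []replica) (*replica, bool) {
	var healthy []replica
	for _, r := range replicas {
		if r.disk.freeVolumeCount >= 2 {
			healthy = append(healthy, r)
		}
	}
	if len(healthy) == 0 {
		return nil, false // all replicas on full disks: skip volume, bump noFreeDisk
	}
	return &healthy[0], true // stand-in for "select the best replica"
}
```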
* fix issue #8230: volume.fsck deletion logic to respect purgeAbsent flag
This commit fixes two issues in volume.fsck:
1. Missing chunks in existing volumes are now deleted if -reallyDeleteFilerEntries is set.
2. Missing volumes are now properly handled when a -volumeId filter is specified, allowing deletion of filer entries for those volumes.
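A hedged sketch of the fixed decision flow; the helper name and flag plumbing are illustrative, not the exact volume.fsck internals:
```go
// shouldPurgeFilerEntry gates deletion of a filer entry found by volume.fsck.
func shouldPurgeFilerEntry(volumeExists, matchesVolumeIDFilter,
	reallyDeleteFilerEntries, purgeAbsent bool) bool {
	if !matchesVolumeIDFilter {
		return false // respect the -volumeId filter
	}
	if volumeExists {
		// missing chunks inside an existing volume
		return reallyDeleteFilerEntries
	}
	// the whole volume is absent
	return purgeAbsent
}
```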
* address PR feedback for issue #8230
- Ensure volume filter is applied before reporting missing volumes
- Fix potential nil-pointer dereferences in httpDelete method
- Use proper error checking throughout httpDelete
* address second round PR feedback for issue #8230
- Use fmt.Fprintf(c.writer, ...) instead of fmt.Printf
- Add missing newline in "deleting path" log message
* Add Iceberg table details view
* Enhance Iceberg catalog browsing UI
* Fix Iceberg UI security and logic issues
- Fix selectSchema() and partitionFieldsFromFullMetadata() to always search for matching IDs instead of checking != 0
- Fix snapshotsFromFullMetadata() to defensively copy before sorting, preventing mutation of the caller's slice (see the copy-before-sort sketch after this list)
- Fix XSS vulnerabilities in s3tables.js: replace innerHTML with textContent/createElement for user-controlled data
- Fix deleteIcebergTable() to redirect to namespace tables list on details page instead of reloading
- Fix data-bs-target in iceberg_namespaces.templ: remove templ.SafeURL for CSS selector
- Add catalogName to delete modal data attributes for proper redirect
- Remove unused hidden inputs from create table form (icebergTableBucketArn, icebergTableNamespace)
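The defensive-copy fix reduces to a standard copy-before-sort pattern; a minimal sketch with a stand-in snapshot type:
```go
import "sort"

type snapshot struct{ timestampMs int64 }

// sortedSnapshots orders snapshots by timestamp without mutating the
// caller's slice: copy first, then sort the copy.
func sortedSnapshots(snapshots []snapshot) []snapshot {
	sorted := make([]snapshot, len(snapshots))
	copy(sorted, snapshots)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].timestampMs < sorted[j].timestampMs
	})
	return sorted
}
```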
* Regenerate templ files for Iceberg UI updates
* Support complex Iceberg type objects in schema
Change Type field from string to json.RawMessage in both IcebergSchemaFieldInfo
and internal icebergSchemaField to properly handle Iceberg spec's complex type
objects (e.g. {"type": "struct", "fields": [...]}). Currently test data
only shows primitive string types, but this change makes the implementation
defensively robust for future complex types by preserving the exact JSON
representation. Add typeToString() helper and update schema extraction
functions to marshal string types as JSON. Update template to convert
json.RawMessage to string for display.
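Roughly what the change looks like, assuming illustrative field names around the json.RawMessage Type:
```go
import "encoding/json"

// IcebergSchemaFieldInfo keeps Type as raw JSON so primitive strings and
// complex type objects both round-trip unchanged (other fields illustrative).
type IcebergSchemaFieldInfo struct {
	ID       int             `json:"id"`
	Name     string          `json:"name"`
	Required bool            `json:"required"`
	Type     json.RawMessage `json:"type"`
}

// typeToString unwraps a primitive string type for display; complex type
// objects fall through to a placeholder.
func typeToString(raw json.RawMessage) string {
	var s string
	if err := json.Unmarshal(raw, &s); err == nil {
		return s // primitive, e.g. "long"
	}
	return "(complex)" // e.g. {"type":"struct","fields":[...]}
}
```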
* Regenerate templ files for Type field changes
* templ
* Fix additional Iceberg UI issues from code review
- Fix lazy-load flag that was set before the async operation completed, which prevented
retries on error; the loaded flag is now set only after a successful load, and errors are
thrown to the caller for proper error handling and UI updates
- Add zero-time guards for CreatedAt and ModifiedAt fields in table details to avoid
displaying Go zero-time values; render a dash when the time is zero (see the sketch
after this list)
- Add URL path escaping for all catalog/namespace/table names in URLs to prevent
malformed URLs when names contain special characters like /, ?, or #
- Remove redundant innerHTML clear in loadIcebergNamespaceTables that cleared twice
before appending the table list
- Fix selectSnapshotForMetrics to remove != 0 guard for consistency with selectSchema
fix; now always searches for CurrentSnapshotID without zero-value gate
- Enhance typeToString() helper to display '(complex)' for non-primitive JSON types
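Two of these guards in sketch form, assuming an illustrative route shape for the escaped URL:
```go
import (
	"net/url"
	"time"
)

// formatTimeOrDash renders a dash for Go's zero time instead of
// "0001-01-01T00:00:00Z".
func formatTimeOrDash(t time.Time) string {
	if t.IsZero() {
		return "-"
	}
	return t.Format(time.RFC3339)
}

// tableDetailsPath escapes each segment so names containing /, ?, or #
// still produce a well-formed URL (the route itself is illustrative).
func tableDetailsPath(catalog, namespace, table string) string {
	return "/s3tables/" + url.PathEscape(catalog) +
		"/" + url.PathEscape(namespace) +
		"/" + url.PathEscape(table)
}
```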
* Regenerate templ files for Phase 3 updates
* Fix template generation to use correct file paths
Run templ generate from repo root instead of weed/admin directory to ensure
generated _templ.go files have correct absolute paths in error messages
(e.g., 'weed/admin/view/app/iceberg_table_details.templ' instead of
'app/iceberg_table_details.templ'). This ensures both 'make admin-generate'
at repo root and 'make generate' in weed/admin directory produce identical
output with consistent file path references.
* Regenerate template files with correct path references
* Validate S3 Tables names in UI
- Add client-side validation for table bucket and namespace names to surface
errors for invalid characters (dots/underscores) before submission
- Use HTML validity messages with reportValidity for immediate feedback
- Update namespace helper text to reflect actual constraints (single-level,
lowercase letters, numbers, and underscores)
* Regenerate templ files for namespace helper text
* Fix Iceberg catalog REST link and actions
* Disallow S3 object access on table buckets
* Validate Iceberg layout for table bucket objects
* Fix REST API link to /v1/config
* merge iceberg page with table bucket page
* Allowed Trino/Iceberg stats files in metadata validation
* fixes
- Backend/data handling:
- Normalized Iceberg type display and fallback handling in weed/admin/dash/s3tables_management.go.
- Fixed snapshot fallback pointer semantics in weed/admin/dash/s3tables_management.go.
- Added CSRF token generation/propagation/validation for namespace create/delete in:
- weed/admin/dash/csrf.go
- weed/admin/dash/auth_middleware.go
- weed/admin/dash/middleware.go
- weed/admin/dash/s3tables_management.go
- weed/admin/view/layout/layout.templ
- weed/admin/static/js/s3tables.js
- UI/template fixes:
- Zero-time guards for CreatedAt fields in:
- weed/admin/view/app/iceberg_namespaces.templ
- weed/admin/view/app/iceberg_tables.templ
- Fixed invalid templ-in-script interpolation and host/port rendering in:
- weed/admin/view/app/iceberg_catalog.templ
- weed/admin/view/app/s3tables_buckets.templ
- Added data-catalog-name consistency on Iceberg delete action in weed/admin/view/app/iceberg_tables.templ.
- Updated retry wording in weed/admin/static/js/s3tables.js.
- Regenerated all affected _templ.go files.
- S3 API/comment follow-ups:
- Reused cached table-bucket validator in weed/s3api/bucket_paths.go.
- Added validation-failure debug logging in weed/s3api/s3api_object_handlers_tagging.go.
- Added multipart path-validation design comment in weed/s3api/s3api_object_handlers_multipart.go.
- Build tooling:
- Fixed templ generate working directory issues in weed/admin/Makefile (watch + pattern rule).
* populate data
* test/s3tables: harden populate service checks
* admin: skip table buckets in object-store bucket list
* admin sidebar: move object store to top-level links
* admin iceberg catalog: guard zero times and escape links
* admin forms: add csrf/error handling and client-side name validation
* admin s3tables: fix namespace delete modal redeclaration
* admin: replace native confirm dialogs with modal helpers
* admin modal-alerts: remove noisy confirm usage console log
* reduce logs
* test/s3tables: use partitioned tables in trino and spark populate
* admin file browser: normalize filer ServerAddress for HTTP parsing
* Fix disk errors handling in vacuum compaction
When a disk reports IO errors during vacuum compaction (e.g., 'read /mnt/d1/weed/oc_xyz.dat: input/output error'), the vacuum task should signal the error to the master so it can:
1. Drop the faulty volume replica
2. Rebuild the replica from healthy copies
Changes:
- Add checkReadWriteError() calls in vacuum read paths (ReadNeedleBlob, ReadData, ScanVolumeFile) to flag EIO errors in volume.lastIoError
- Preserve error wrapping using %w format instead of %v so EIO propagates correctly
- The existing heartbeat logic will detect lastIoError and remove the bad volume
Fixes issue #8237
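The wrapping change matters because errors.Is can only see EIO through a %w chain; a minimal sketch with stand-in Volume fields:
```go
import (
	"errors"
	"fmt"
	"syscall"
)

type Volume struct{ lastIoError error } // stand-in for the real volume type

func (v *Volume) checkReadWriteError(err error) {
	if errors.Is(err, syscall.EIO) {
		v.lastIoError = err // heartbeat logic will drop this replica
	}
}

func (v *Volume) readNeedleBlob(read func() ([]byte, error)) ([]byte, error) {
	data, err := read()
	if err != nil {
		err = fmt.Errorf("read needle blob: %w", err) // %w keeps EIO visible
		v.checkReadWriteError(err)
		return nil, err
	}
	return data, nil
}
```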
* error
* s3: fix health check endpoints returning 404 for HEAD requests #8243
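The usual shape of such a fix in net/http terms (a sketch, not the exact s3 server wiring):
```go
import "net/http"

// healthHandler answers both GET and HEAD; net/http discards the body for
// HEAD automatically, so writing the status is enough.
func healthHandler(w http.ResponseWriter, r *http.Request) {
	switch r.Method {
	case http.MethodGet, http.MethodHead:
		w.WriteHeader(http.StatusOK)
	default:
		w.WriteHeader(http.StatusMethodNotAllowed)
	}
}
```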
* Add Spark Iceberg catalog integration tests and CI support
Implement comprehensive integration tests for Spark with SeaweedFS Iceberg REST catalog:
- Basic CRUD operations (Create, Read, Update, Delete) on Iceberg tables
- Namespace (database) management
- Data insertion, querying, and deletion
- Time travel capabilities via snapshot versioning
- Compatible with SeaweedFS S3 and Iceberg REST endpoints
Tests mirror the structure of existing Trino integration tests but use Spark's
Python SQL API and PySpark for testing.
Add GitHub Actions CI job for spark-iceberg-catalog-tests in s3-tables-tests.yml
to automatically run Spark integration tests on pull requests.
* fmt
* Fix Spark integration tests - code review feedback
* go mod tidy
* Add go mod tidy step to integration test jobs
Add 'go mod tidy' step before test runs for all integration test jobs:
- s3-tables-tests
- iceberg-catalog-tests
- trino-iceberg-catalog-tests
- spark-iceberg-catalog-tests
This ensures dependencies are clean before running tests.
* Fix remaining Spark operations test issues
Address final code review comments:
Setup & Initialization:
- Add waitForSparkReady() helper function that polls Spark readiness
with backoff instead of a hardcoded 10-second sleep (sketched after
these notes)
- Extract setupSparkTestEnv() helper to reduce boilerplate duplication
between TestSparkCatalogBasicOperations and TestSparkTimeTravel
- Both tests now use helpers for consistent, reliable setup
Assertions & Validation:
- Make setup-critical operations (namespace, table creation, initial
insert) use t.Fatalf instead of t.Errorf to fail fast
- Validate setupSQL output in TestSparkTimeTravel and fail if not
'Setup complete'
- Add validation after second INSERT in TestSparkTimeTravel:
verify row count increased to 2 before time travel test
- Add context to error messages with namespace and tableName params
Code Quality:
- Remove code duplication between test functions
- All critical paths now properly validated
- Consistent error handling throughout
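A sketch of the polling helper under these assumptions (the probe and timings are illustrative):
```go
import (
	"context"
	"fmt"
	"time"
)

// waitForSparkReady polls a readiness probe with exponential backoff instead
// of a fixed sleep, returning early as soon as Spark answers.
func waitForSparkReady(ctx context.Context, probe func() error) error {
	backoff := 500 * time.Millisecond
	for {
		if err := probe(); err == nil {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("spark not ready: %w", ctx.Err())
		case <-time.After(backoff):
		}
		if backoff < 8*time.Second {
			backoff *= 2
		}
	}
}
```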
* Fix go vet errors in S3 Tables tests
Fixes:
1. setup_test.go (Spark):
- Add missing import: github.com/testcontainers/testcontainers-go/wait
- Use wait.ForLog instead of undefined testcontainers.NewLogStrategy
- Remove unused strings import
2. trino_catalog_test.go:
- Use net.JoinHostPort instead of fmt.Sprintf for address formatting
- Properly handles IPv6 addresses by wrapping them in brackets
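Why JoinHostPort matters for IPv6, in two lines:
```go
import (
	"fmt"
	"net"
)

func addressExamples() {
	bad := fmt.Sprintf("%s:%d", "::1", 8080) // "::1:8080" (ambiguous for IPv6)
	good := net.JoinHostPort("::1", "8080")  // "[::1]:8080" (brackets added)
	fmt.Println(bad, good)
}
```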
* Use weed mini for simpler SeaweedFS startup
Replace complex multi-process startup (master, volume, filer, s3)
with single 'weed mini' command that starts all services together.
Benefits:
- Simpler, more reliable startup
- Single weed mini process vs 4 separate processes
- Automatic coordination between components
- Better port management with no manual coordination
Changes:
- Remove separate master, volume, filer process startup
- Use weed mini with -master.port, -filer.port, -s3.port flags (see the startup sketch below)
- Keep Iceberg REST as separate service (still needed)
- Increase timeout to 15s for port readiness (weed mini startup)
- Remove volumePort and filerProcess fields from TestEnvironment
- Simplify cleanup to only handle two processes (mini, iceberg rest)
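A sketch of the simplified startup using the flags named above (the port values are placeholders):
```go
import (
	"fmt"
	"os/exec"
)

// startWeedMini launches the single all-in-one process that replaces the
// separate master/volume/filer/s3 startup.
func startWeedMini() (*exec.Cmd, error) {
	cmd := exec.Command("weed", "mini",
		"-master.port", "9333",
		"-filer.port", "8888",
		"-s3.port", "8333",
	)
	if err := cmd.Start(); err != nil {
		return nil, fmt.Errorf("start weed mini: %w", err)
	}
	return cmd, nil
}
```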
* Clean up dead code and temp directory leaks
Fixes:
1. Remove dead s3Process field and cleanup:
- weed mini bundles S3 gateway, no separate process needed
- Removed s3Process field from TestEnvironment
- Removed unnecessary s3Process cleanup code
2. Fix temp config directory leak:
- Add sparkConfigDir field to TestEnvironment
- Store returned configDir in writeSparkConfig
- Clean up sparkConfigDir in Cleanup() with os.RemoveAll
- Prevents accumulation of temp directories in test runs
3. Simplify Cleanup:
- Now handles only necessary processes (weed mini, iceberg rest)
- Removes both seaweedfsDataDir and sparkConfigDir
- Cleaner shutdown sequence
* Use weed mini's built-in Iceberg REST and fix python binary
Changes:
- Add -s3.port.iceberg flag to weed mini for built-in Iceberg REST Catalog
- Remove separate 'weed server' process for Iceberg REST
- Remove icebergRestProcess field from TestEnvironment
- Simplify Cleanup() to only manage weed mini + Spark
- Add port readiness check for iceberg REST from weed mini
- Set Spark container Cmd to '/bin/sh -c sleep 3600' to keep it running
- Change python to python3 in container.Exec calls
This simplifies to truly one all-in-one weed mini process (master, filer, s3,
iceberg-rest) plus just the Spark container.
* go fmt
* clean up
* Bind on a non-loopback IP for container access, align Iceberg metadata saves/locations with table locations, and rework Spark time travel to use TIMESTAMP AS OF with safe timestamp extraction.
* shared mini start
* Fixed internal directory creation under /buckets so .objects paths can auto-create without failing bucket-name validation, which restores table bucket object writes
* fix path
Updated table bucket objects to write under `/buckets/<bucket>` and saved Iceberg metadata there, adjusting the Spark time-travel timestamp to committed_at +1s. Rebuilt the weed binary (`go install ./weed`) and confirmed passing tests for Spark and Trino with focused test commands.
* Updated table bucket creation to stop creating /buckets/.objects and switched Trino REST warehouse to s3://<bucket> to match Iceberg layout.
* Stabilize S3Tables integration tests
* Fix timestamp extraction and remove dead code in bucketDir
* Use table bucket as warehouse in s3tables tests
* Update trino_blog_operations_test.go
* Add the CASCADE option to handle any remaining table metadata/files in the schema directory
* skip namespace not empty
* Add Trino blog operations test
* Update test/s3tables/catalog_trino/trino_blog_operations_test.go
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* feat: add table bucket path helpers and filer operations
- Add table object root and table location mapping directories
- Implement ensureDirectory, upsertFile, deleteEntryIfExists helpers
- Support table location bucket mapping for S3 access
* feat: manage table bucket object roots on creation/deletion
- Create .objects directory for table buckets on creation
- Clean up table object bucket paths on deletion
- Enable S3 operations on table bucket object roots
* feat: add table location mapping for Iceberg REST
- Track table location bucket mappings when tables are created/updated/deleted
- Enable location-based routing for S3 operations on table data
* feat: route S3 operations to table bucket object roots
- Route table-s3 bucket names to mapped table paths
- Route table buckets to object root directories
- Support table location bucket mapping lookup
* feat: emit table-s3 locations from Iceberg REST
- Generate unique table-s3 bucket names with UUID suffix
- Store table metadata under table bucket paths
- Return table-s3 locations for Trino compatibility
* fix: handle missing directories in S3 list operations
- Propagate ErrNotFound from ListEntries for non-existent directories
- Treat missing directories as empty results for list operations
- Fixes Trino non-empty location checks on table creation
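The core of the fix as a sketch; listEntries stands in for the real filer client call:
```go
import (
	"errors"

	"github.com/seaweedfs/seaweedfs/weed/pb/filer_pb"
)

// stand-in for the real ListEntries call against the filer
var listEntries func(dir string) ([]*filer_pb.Entry, error)

// listOrEmpty treats a missing directory as an empty listing, which is what
// Trino's "location must be empty" probe expects on table creation.
func listOrEmpty(dir string) ([]*filer_pb.Entry, error) {
	entries, err := listEntries(dir)
	if errors.Is(err, filer_pb.ErrNotFound) {
		return nil, nil // missing directory behaves as an empty result
	}
	return entries, err
}
```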
* test: improve Trino CSV parsing for single-value results
- Sanitize Trino output to skip jline warnings
- Handle single-value CSV results without header rows
- Strip quotes from numeric values in tests
* refactor: use bucket path helpers throughout S3 API
- Replace direct bucket path operations with helper functions
- Leverage centralized table bucket routing logic
- Improve maintainability with consistent path resolution
* fix: add table bucket cache and improve filer error handling
- Cache table bucket lookups to reduce filer overhead on repeated checks
- Use filer_pb.CreateEntry and filer_pb.UpdateEntry helpers to check resp.Error
- Fix delete order in handler_bucket_get_list_delete: delete table object before directory
- Make location mapping errors best-effort: log and continue, don't fail API
- Update table location mappings to delete stale prior bucket mappings on update
- Add a 1-second sleep before the timestamp time-travel query to ensure timestamps are in the past
- Fix CSV parsing: examine all lines instead of skipping the first; handle single-value rows
* fix: properly handle stale metadata location mapping cleanup
- Capture oldMetadataLocation before mutation in handleUpdateTable
- Update updateTableLocationMapping to accept both old and new locations
- Use passed-in oldMetadataLocation to detect location changes
- Delete stale mapping only when location actually changes
- Pass empty string for oldLocation in handleCreateTable (new tables have no prior mapping)
- Improve logging to show old -> new location transitions
* refactor: cleanup imports and cache design
- Remove unused 'sync' import from bucket_paths.go
- Use filer_pb.UpdateEntry helper in setExtendedAttribute and deleteExtendedAttribute for consistent error handling
- Add dedicated tableBucketCache map[string]bool to BucketRegistry instead of mixing concerns with metadataCache
- Improve cache separation: table buckets cache is now separate from bucket metadata cache
* fix: improve cache invalidation and add transient error handling
Cache invalidation (critical fix):
- Add tableLocationCache to BucketRegistry for location mapping lookups
- Clear tableBucketCache and tableLocationCache in RemoveBucketMetadata
- Prevents stale cache entries when buckets are deleted/recreated
Transient error handling:
- Only cache table bucket lookups when conclusive (found or ErrNotFound)
- Skip caching on transient errors (network, permission, etc)
- Prevents marking real table buckets as non-table due to transient failures
Performance optimization:
- Cache tableLocationDir results to avoid repeated filer RPCs on hot paths
- tableLocationDir now checks cache before making expensive filer lookups
- Cache stores empty string for 'not found' to avoid redundant lookups
Code clarity:
- Add comment to deleteDirectory explaining DeleteEntry response lacks Error field
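A sketch of the conclusive-only caching rule, with stand-in registry fields and lookup:
```go
import (
	"errors"
	"sync"

	"github.com/seaweedfs/seaweedfs/weed/pb/filer_pb"
)

type BucketRegistry struct {
	mu               sync.RWMutex
	tableBucketCache map[string]bool
	lookup           func(bucket string) (bool, error) // stand-in filer lookup
}

// isTableBucket caches only conclusive answers: a definite hit or a definite
// ErrNotFound. Transient errors fall through uncached, so a real table
// bucket is never permanently misclassified.
func (r *BucketRegistry) isTableBucket(bucket string) (bool, error) {
	r.mu.RLock()
	cached, ok := r.tableBucketCache[bucket]
	r.mu.RUnlock()
	if ok {
		return cached, nil
	}
	isTable, err := r.lookup(bucket)
	if err != nil && !errors.Is(err, filer_pb.ErrNotFound) {
		return false, err // transient: don't poison the cache
	}
	found := err == nil && isTable
	r.mu.Lock()
	r.tableBucketCache[bucket] = found // conclusive result: cache it
	r.mu.Unlock()
	return found, nil
}
```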
* go fmt
* fix: mirror transient error handling in tableLocationDir and optimize bucketDir
Transient error handling:
- tableLocationDir now only caches definitive results
- Mirrors isTableBucket behavior to prevent treating transient errors as permanent misses
- Improves reliability on flaky systems or during recovery
Performance optimization:
- bucketDir avoids redundant isTableBucket call via bucketRoot
- Directly use s3a.option.BucketsPath for regular buckets
- Saves one cache lookup for every non-table bucket operation
* fix: revert bucketDir optimization to preserve bucketRoot logic
The optimization to directly use BucketsPath bypassed bucketRoot's logic
and caused issues with S3 list operations on delimiter+prefix cases.
Revert to using path.Join(s3a.bucketRoot(bucket), bucket) which properly
handles all bucket types and ensures consistent path resolution across
the codebase.
The slight performance cost of an extra cache lookup is worth the correctness
and consistency benefits.
* feat: move table buckets under /buckets
Add a table-bucket marker attribute, reuse bucket metadata cache for table bucket detection, and update list/validation/UI/test paths to treat table buckets as /buckets entries.
* Fix S3 Tables code review issues
- handler_bucket_create.go: Fix bucket existence check to properly validate
entryResp.Entry before setting s3BucketExists flag (nil Entry should not
indicate existing bucket)
- bucket_paths.go: Add clarifying comment to bucketRoot() explaining unified
buckets root path for all bucket types
- file_browser_data.go: Optimize by extracting table bucket check early to
avoid redundant WithFilerClient call
* Fix list prefix delimiter handling
* Handle list errors conservatively
* Fix Trino FOR TIMESTAMP query - use past timestamp
Iceberg requires the timestamp to be strictly in the past.
Use current_timestamp - interval '1' second instead of current_timestamp.
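The query shape, built in Go as the tests do (the table name is illustrative):
```go
import "fmt"

// timeTravelQuery uses a strictly-past timestamp, since Iceberg rejects
// FOR TIMESTAMP AS OF values that are not in the past.
func timeTravelQuery(table string) string {
	return fmt.Sprintf(
		"SELECT count(*) FROM %s FOR TIMESTAMP AS OF current_timestamp - interval '1' second",
		table,
	)
}
```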
---------
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>