seaweedfs

Commit Graph

Author	SHA1	Message	Date
Chris Lu	14c863dbff	docs(volume-server): refocus plan on native rust parity	4 weeks ago
Chris Lu	6bb9d8bac2	docs(volume_server): log head readDeleted parity coverage	4 weeks ago
Chris Lu	cc80ad3643	test(volume_server/http): add head readDeleted parity coverage	4 weeks ago
Chris Lu	9009e38f7b	docs(volume_server): log ping volume-server unreachable coverage	4 weeks ago
Chris Lu	b9fbb85af2	test(volume_server/grpc): add ping unreachable volume-server target case	4 weeks ago
Chris Lu	47d3001572	docs(volume_server): log csv query payload parity coverage	4 weeks ago
Chris Lu	a12dd5f8d3	test(volume_server/grpc): cover csv-query payload no-output parity	4 weeks ago
Chris Lu	8e614486a3	docs(volume_server): log tail-receiver interruption coverage	4 weeks ago
Chris Lu	a5864c3eb6	test(volume_server/grpc): cover tail-receiver source-unavailable branch	4 weeks ago
Chris Lu	6302809442	docs(volume_server): log tail sender cancellation coverage	4 weeks ago
Chris Lu	27a80f7607	test(volume_server/grpc): add tail-sender cancellation interruption coverage	4 weeks ago
Chris Lu	ec429e0361	docs(volume_server): log framework port-range hardening and rerun	4 weeks ago
Chris Lu	90e82b15ce	test(volume_server/framework): allocate volume ports within safe grpc-offset range	4 weeks ago
Chris Lu	a3e1ee1653	docs(volume_server): log mkcol method parity coverage	4 weeks ago
Chris Lu	2ab30900d4	test(volume_server/http): add mkcol unsupported-method parity	4 weeks ago
Chris Lu	62ee14fa61	docs(volume_server): log read-all-needles multi-volume coverage	4 weeks ago
Chris Lu	ab95a6ef15	test(volume_server/grpc): cover read-all-needles multi-volume success	4 weeks ago
Chris Lu	24965fd489	docs(volume_server): log head conditional precedence coverage	4 weeks ago
Chris Lu	ed23e290fc	test(volume_server/http): expand head conditional precedence coverage	4 weeks ago
Chris Lu	9b57fb6961	docs(volume_server): log ec batch delete success coverage	4 weeks ago
Chris Lu	1bb40b6bc5	test(volume_server/grpc): add ec batch delete success coverage	4 weeks ago
Chris Lu	34e342da63	docs(volume_server): log replicated write failure coverage	4 weeks ago
Chris Lu	4835d34438	test(volume_server/http): cover replicated write failure when replication unmet	4 weeks ago
Chris Lu	5814729def	docs(volume_server): log ec-only read meta coverage	4 weeks ago
Chris Lu	37bf9b5ebf	test(volume_server/grpc): cover ec-only read needle meta unsupported path	4 weeks ago
Chris Lu	19201df6d7	docs(volume_server): log oversized upload limit coverage	4 weeks ago
Chris Lu	4d61cbdeed	test(volume_server/http): cover oversized upload file-size limit rejection	4 weeks ago
Chris Lu	3ce883624e	docs(volume_server): log jwt ui access override coverage	4 weeks ago
Chris Lu	de974c05d5	test(volume_server/http): cover jwt ui access override behavior	4 weeks ago
Chris Lu	7768fda023	docs(volume_server): record proxy-mode validation and CI matrix	4 weeks ago
Chris Lu	6ce4d7eded	docs(volume_server): record rust-mode full-suite validation	4 weeks ago
Chris Lu	d402573ea8	docs(volume_server): document rust-mode harness and tracking	4 weeks ago
Chris Lu	7beab85c21	test(volume_server/framework): support selectable volume server binary	4 weeks ago
Chris Lu	f44e25b422	fix(iam): ensure access key status is persisted and defaulted to Active (#8341 ) * Fix master leader election startup issue Fixes #error-log-leader-not-selected-yet * not useful test * fix(iam): ensure access key status is persisted and defaulted to Active * make pb * update tests * using constants	4 weeks ago
dependabot[bot]	35b6e895cc	build(deps): bump org.apache.avro:avro from 1.11.4 to 1.11.5 in /test/kafka/kafka-client-loadtest/tools (#8339 ) build(deps): bump org.apache.avro:avro Bumps org.apache.avro:avro from 1.11.4 to 1.11.5. --- updated-dependencies: - dependency-name: org.apache.avro:avro dependency-version: 1.11.5 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	1 month ago
Chris Lu	49a64f50f1	Add session policy support to IAM (#8338 ) * Add session policy support to IAM - Implement policy evaluation for session tokens in policy_engine.go - Add session_policy field to session claims for tracking applied policies - Update STS service to include session policies in token generation - Add IAM integration tests for session policy validation - Update IAM manager to support policy attachment to sessions - Extend S3 API STS endpoint to handle session policy restrictions * fix: optimize session policy evaluation and add documentation * sts: add NormalizeSessionPolicy helper for inline session policies * sts: support inline session policies for AssumeRoleWithWebIdentity and credential-based flows * s3api: parse and normalize Policy parameter for STS HTTP handlers * tests: add session policy unit tests and integration tests for inline policy downscoping * tests: add s3tables STS inline policy integration * iam: handle user principals and validate tokens * sts: enforce inline session policy size limit * tests: harden s3tables STS integration config * iam: clarify principal policy resolution errors * tests: improve STS integration endpoint selection	1 month ago
Chris Lu	beeb375a88	Add volume server integration test suite and CI workflow (#8322 ) * docs(volume_server): add integration test development plan * test(volume_server): add integration harness and profile matrix * test(volume_server/http): add admin and options integration coverage * test(volume_server/grpc): add state and status integration coverage * test(volume_server): auto-build weed binary and harden cluster startup * test(volume_server/http): add upload read range head delete coverage * test(volume_server/grpc): expand admin lifecycle and state coverage * docs(volume_server): update progress tracker for implemented tests * test(volume_server/http): cover if-none-match and invalid-range branches * test(volume_server/grpc): add batch delete integration coverage * docs(volume_server): log latest HTTP and gRPC test coverage * ci(volume_server): run volume server integration tests in github actions * test(volume_server/grpc): add needle status configure ping and leave coverage * docs(volume_server): record additional grpc coverage progress * test(volume_server/grpc): add vacuum integration coverage * docs(volume_server): record vacuum test coverage progress * test(volume_server/grpc): add read and write needle blob error-path coverage * docs(volume_server): record data rw grpc coverage progress * test(volume_server/http): add jwt auth integration coverage * test(volume_server/grpc): add sync copy and stream error-path coverage * docs(volume_server): record jwt and sync/copy test coverage * test(volume_server/grpc): add scrub and query integration coverage * test(volume_server/grpc): add volume tail sender and receiver coverage * docs(volume_server): record scrub query and tail test progress * test(volume_server/grpc): add readonly writable and collection lifecycle coverage * test(volume_server/http): add public-port cors and method parity coverage * test(volume_server/grpc): add blob meta and read-all success path coverage * test(volume_server/grpc): expand scrub and query variation coverage * test(volume_server/grpc): add tiering and remote fetch error-path coverage * test(volume_server/http): add unchanged write and delete edge-case coverage * test(volume_server/grpc): add ping unknown and unreachable target coverage * test(volume_server/grpc): add volume delete only-empty variation coverage * test(volume_server/http): add jwt fid-mismatch auth coverage * test(volume_server/grpc): add scrub ec auto-select empty coverage * test(volume_server/grpc): stabilize ping timestamp assertion * docs(volume_server): update integration coverage progress log * test(volume_server/grpc): add tier remote backend and config variation coverage * docs(volume_server): record tier remote variation progress * test(volume_server/grpc): add incremental copy and receive-file protocol coverage * test(volume_server/http): add read path shape and if-modified-since coverage * test(volume_server/grpc): add copy-file compaction and receive-file success coverage * test(volume_server/http): add passthrough headers and static asset coverage * test(volume_server/grpc): add ping filer unreachable coverage * docs(volume_server): record copy receive and http variant progress * test(volume_server/grpc): add erasure coding maintenance and missing-path coverage * docs(volume_server): record initial erasure coding rpc coverage * test(volume_server/http): add multi-range multipart response coverage * docs(volume_server): record multi-range http coverage progress * test(volume_server/grpc): add query empty-stripe no-match coverage * docs(volume_server): record query no-match stream behavior coverage * test(volume_server/http): add upload throttling timeout and replicate bypass coverage * docs(volume_server): record upload throttling coverage progress * test(volume_server/http): add download throttling timeout coverage * docs(volume_server): record download throttling coverage progress * test(volume_server/http): add jwt wrong-cookie fid mismatch coverage * docs(volume_server): record jwt wrong-cookie mismatch coverage * test(volume_server/http): add jwt expired-token rejection coverage * docs(volume_server): record jwt expired-token coverage * test(volume_server/http): add jwt query and cookie transport coverage * docs(volume_server): record jwt token transport coverage * test(volume_server/http): add jwt token-source precedence coverage * docs(volume_server): record jwt token-source precedence coverage * test(volume_server/http): add jwt header-over-cookie precedence coverage * docs(volume_server): record jwt header cookie precedence coverage * test(volume_server/http): add jwt query-over-cookie precedence coverage * docs(volume_server): record jwt query cookie precedence coverage * test(volume_server/grpc): add setstate version mismatch and nil-state coverage * docs(volume_server): record setstate validation coverage * test(volume_server/grpc): add readonly persist-true lifecycle coverage * docs(volume_server): record readonly persist variation coverage * test(volume_server/http): add options origin cors header coverage * docs(volume_server): record options origin cors coverage * test(volume_server/http): add trace unsupported-method parity coverage * docs(volume_server): record trace method parity coverage * test(volume_server/grpc): add batch delete cookie-check variation coverage * docs(volume_server): record batch delete cookie-check coverage * test(volume_server/grpc): add admin lifecycle missing and maintenance variants * docs(volume_server): record admin lifecycle edge-case coverage * test(volume_server/grpc): add mixed batch delete status matrix coverage * docs(volume_server): record mixed batch delete matrix coverage * test(volume_server/http): add jwt-profile ui access gating coverage * docs(volume_server): record jwt ui-gating http coverage * test(volume_server/http): add propfind unsupported-method parity coverage * docs(volume_server): record propfind method parity coverage * test(volume_server/grpc): add volume configure success and rollback-path coverage * docs(volume_server): record volume configure branch coverage * test(volume_server/grpc): add volume needle status missing-path coverage * docs(volume_server): record volume needle status error-path coverage * test(volume_server/http): add readDeleted query behavior coverage * docs(volume_server): record readDeleted http behavior coverage * test(volume_server/http): add delete ts override parity coverage * docs(volume_server): record delete ts parity coverage * test(volume_server/grpc): add invalid blob/meta offset coverage * docs(volume_server): record invalid blob/meta offset coverage * test(volume_server/grpc): add read-all mixed volume abort coverage * docs(volume_server): record read-all mixed-volume abort coverage * test(volume_server/http): assert head response body parity * docs(volume_server): record head body parity assertion * test(volume_server/grpc): assert status state and memory payload completeness * docs(volume_server): record volume server status payload coverage * test(volume_server/grpc): add batch delete chunk-manifest rejection coverage * docs(volume_server): record batch delete chunk-manifest coverage * test(volume_server/grpc): add query cookie-mismatch eof parity coverage * docs(volume_server): record query cookie-mismatch parity coverage * test(volume_server/grpc): add ping master success target coverage * docs(volume_server): record ping master success coverage * test(volume_server/http): add head if-none-match conditional parity * docs(volume_server): record head if-none-match parity coverage * test(volume_server/http): add head if-modified-since parity coverage * docs(volume_server): record head if-modified-since parity coverage * test(volume_server/http): add connect unsupported-method parity coverage * docs(volume_server): record connect method parity coverage * test(volume_server/http): assert options allow-headers cors parity * docs(volume_server): record options allow-headers coverage * test(volume_server/framework): add dual volume cluster integration harness * test(volume_server/http): add missing-local read mode proxy redirect local coverage * docs(volume_server): record read mode missing-local matrix coverage * test(volume_server/http): add download over-limit replica proxy fallback coverage * docs(volume_server): record download replica fallback coverage * test(volume_server/http): add missing-local readDeleted proxy redirect parity coverage * docs(volume_server): record missing-local readDeleted mode coverage * test(volume_server/framework): add single-volume cluster with filer harness * test(volume_server/grpc): add ping filer success target coverage * docs(volume_server): record ping filer success coverage * test(volume_server/http): add proxied-loop guard download timeout coverage * docs(volume_server): record proxied-loop download coverage * test(volume_server/http): add disabled upload and download limit coverage * docs(volume_server): record disabled throttling path coverage * test(volume_server/grpc): add idempotent volume server leave coverage * docs(volume_server): record leave idempotence coverage * test(volume_server/http): add redirect collection query preservation coverage * docs(volume_server): record redirect collection query coverage * test(volume_server/http): assert admin server headers on status and health * docs(volume_server): record admin server header coverage * test(volume_server/http): assert healthz request-id echo parity * docs(volume_server): record healthz request-id parity coverage * test(volume_server/http): add over-limit invalid-vid download branch coverage * docs(volume_server): record over-limit invalid-vid branch coverage * test(volume_server/http): add public-port static asset coverage * docs(volume_server): record public static endpoint coverage * test(volume_server/http): add public head method parity coverage * docs(volume_server): record public head parity coverage * test(volume_server/http): add throttling wait-then-proceed path coverage * docs(volume_server): record throttling wait-then-proceed coverage * test(volume_server/http): add read cookie-mismatch not-found coverage * docs(volume_server): record read cookie-mismatch coverage * test(volume_server/http): add throttling timeout-recovery coverage * docs(volume_server): record throttling timeout-recovery coverage * test(volume_server/grpc): add ec generate mount info unmount lifecycle coverage * docs(volume_server): record ec positive lifecycle coverage * test(volume_server/grpc): add ec shard read and blob delete lifecycle coverage * docs(volume_server): record ec shard read/blob delete lifecycle coverage * test(volume_server/grpc): add ec rebuild and to-volume error branch coverage * docs(volume_server): record ec rebuild and to-volume branch coverage * test(volume_server/grpc): add ec shards-to-volume success roundtrip coverage * docs(volume_server): record ec shards-to-volume success coverage * test(volume_server/grpc): add ec receive and copy-file missing-source coverage * docs(volume_server): record ec receive and copy-file coverage * test(volume_server/grpc): add ec last-shard delete cleanup coverage * docs(volume_server): record ec last-shard delete cleanup coverage * test(volume_server/grpc): add volume copy success path coverage * docs(volume_server): record volume copy success coverage * test(volume_server/grpc): add volume copy overwrite-destination coverage * docs(volume_server): record volume copy overwrite coverage * test(volume_server/http): add write error-path variant coverage * docs(volume_server): record http write error-path coverage * test(volume_server/http): add conditional header precedence coverage * docs(volume_server): record conditional header precedence coverage * test(volume_server/http): add oversized combined range guard coverage * docs(volume_server): record oversized range guard coverage * test(volume_server/http): add image resize and crop read coverage * docs(volume_server): record image transform coverage * test(volume_server/http): add chunk-manifest expansion and bypass coverage * docs(volume_server): record chunk-manifest read coverage * test(volume_server/http): add compressed read encoding matrix coverage * docs(volume_server): record compressed read matrix coverage * test(volume_server/grpc): add tail receiver source replication coverage * docs(volume_server): record tail receiver replication coverage * test(volume_server/grpc): add tail sender large-needle chunking coverage * docs(volume_server): record tail sender chunking coverage * test(volume_server/grpc): add ec-backed volume needle status coverage * docs(volume_server): record ec-backed needle status coverage * test(volume_server/grpc): add ec shard copy from peer success coverage * docs(volume_server): record ec shard copy success coverage * test(volume_server/http): add chunk-manifest delete child cleanup coverage * docs(volume_server): record chunk-manifest delete cleanup coverage * test(volume_server/http): add chunk-manifest delete failure-path coverage * docs(volume_server): record chunk-manifest delete failure coverage * test(volume_server/grpc): add ec shard copy source-unavailable coverage * docs(volume_server): record ec shard copy source-unavailable coverage * parallel	1 month ago
Chris Lu	796f23f68a	Fix STS InvalidAccessKeyId and request body consumption issues (#8328 ) * Fix STS InvalidAccessKeyId and request body consumption in Lakekeeper integration test * Remove debug prints * Add Lakekeeper integration tests to CI * Fix connection refused in CI by binding to 0.0.0.0 * Add timeout to docker run in Lakekeeper integration test * Update weed/s3api/auth_credentials.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	1 month ago
Chris Lu	25ea48227f	Fix STS temporary credentials to use ASIA prefix instead of AKIA (#8326 ) Temporary credentials from STS AssumeRole were using "AKIA" prefix (permanent IAM user credentials) instead of "ASIA" prefix (temporary security credentials). This violates AWS conventions and may cause compatibility issues with AWS SDKs that validate credential types. Changes: - Rename generateAccessKeyId to generateTemporaryAccessKeyId for clarity - Update function to use ASIA prefix for temporary credentials - Add unit tests to verify ASIA prefix format (weed/iam/sts/credential_prefix_test.go) - Add integration test to verify ASIA prefix in S3 API (test/s3/iam/s3_sts_credential_prefix_test.go) - Ensure AWS-compatible credential format (ASIA + 16 hex chars) The credentials are already deterministic (SHA256-based from session ID) and the SessionToken is correctly set to the JWT token, so this is just a prefix fix to follow AWS standards. Fixes #8312	1 month ago
Chris Lu	0082c47e04	Test: Add RisingWave DML verification test (#8317 ) * Test: Verify RisingWave DML operations (INSERT, UPDATE, DELETE) support * Test: Refine RisingWave DML test (remove sleeps, use polling)	1 month ago
Chris Lu	c1a9263e37	Fix STS AssumeRole with POST body param (#8320 ) * Fix STS AssumeRole with POST body param and add integration test * Add STS integration test to CI workflow * Address code review feedback: fix HPP vulnerability and style issues * Refactor: address code review feedback - Fix HTTP Parameter Pollution vulnerability in UnifiedPostHandler - Refactor permission check logic for better readability - Extract test helpers to testutil/docker.go to reduce duplication - Clean up imports and simplify context setting * Add SigV4-style test variant for AssumeRole POST body routing - Added ActionInBodyWithSigV4Style test case to validate real-world scenario - Test confirms routing works correctly for AWS SigV4-signed requests - Addresses code review feedback about testing with SigV4 signatures * Fix: always set identity in context when non-nil - Ensure UnifiedPostHandler always calls SetIdentityInContext when identity is non-nil - Only call SetIdentityNameInContext when identity.Name is non-empty - This ensures downstream handlers (embeddedIam.DoActions) always have access to identity - Addresses potential issue where empty identity.Name would skip context setting	1 month ago
Chris Lu	b8ef48c8f1	Add RisingWave catalog tests (#8308 ) * Add RisingWave catalog tests for S3 tables * Add RisingWave catalog integration tests to CI workflow * Refactor RisingWave catalog tests based on PR feedback * Address PR feedback: optimize checks, cleanup logs * fix tests * consistent	1 month ago
Chris Lu	7151181d54	fix flaky tests	1 month ago
Chris Lu	b57429ef2e	Switch empty-folder cleanup to bucket policy (#8292 ) * Fix Spark _temporary cleanup and add issue #8285 regression test * Generalize empty folder cleanup for Spark temp artifacts * Revert synchronous folder pruning and add cleanup diagnostics * Add actionable empty-folder cleanup diagnostics * Fix Spark temp marker cleanup in async folder cleaner * Fix Spark temp cleanup with implicit directory markers * Keep explicit directory markers non-implicit * logging * more logs * Switch empty-folder cleanup to bucket policy * Seaweed-X-Amz-Allow-Empty-Folders * less logs * go vet * less logs * refactoring	1 month ago
Chris Lu	17f85361e9	Remove unsupported iceberg rest signing-region from tests and docs (#8289 )	1 month ago
Chris Lu	d6825ffce2	Iceberg: implement stage-create finalize flow (phase 1) (#8279 ) * iceberg: implement stage-create and create-on-commit finalize * iceberg: add create validation error typing and stage-create integration test * tests: merge stage-create integration check into catalog suite * tests: cover stage-create finalize lifecycle in catalog integration * iceberg: persist and cleanup stage-create markers * iceberg: add stage-create rollout flag and marker pruning * docs: add stage-create support design and rollout plan * docs: drop stage-create design draft from PR * iceberg: use conservative 72h stage-marker retention * iceberg: address review comments on create-on-commit and tests * iceberg: keep stage-create metadata out of table location * refactor(iceberg): split iceberg.go into focused files	1 month ago
Chris Lu	458c12fb99	test: add Spark S3 integration regression for issue #8234 (#8249 ) * test: add Spark S3 regression integration test for issue 8234 * Update test/s3/spark/setup_test.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	1 month ago
Chris Lu	5a0204310c	Add Iceberg admin UI (#8246 ) * Add Iceberg table details view * Enhance Iceberg catalog browsing UI * Fix Iceberg UI security and logic issues - Fix selectSchema() and partitionFieldsFromFullMetadata() to always search for matching IDs instead of checking != 0 - Fix snapshotsFromFullMetadata() to defensive-copy before sorting to prevent mutating caller's slice - Fix XSS vulnerabilities in s3tables.js: replace innerHTML with textContent/createElement for user-controlled data - Fix deleteIcebergTable() to redirect to namespace tables list on details page instead of reloading - Fix data-bs-target in iceberg_namespaces.templ: remove templ.SafeURL for CSS selector - Add catalogName to delete modal data attributes for proper redirect - Remove unused hidden inputs from create table form (icebergTableBucketArn, icebergTableNamespace) * Regenerate templ files for Iceberg UI updates * Support complex Iceberg type objects in schema Change Type field from string to json.RawMessage in both IcebergSchemaFieldInfo and internal icebergSchemaField to properly handle Iceberg spec's complex type objects (e.g. {"type": "struct", "fields": [...]}). Currently test data only shows primitive string types, but this change makes the implementation defensively robust for future complex types by preserving the exact JSON representation. Add typeToString() helper and update schema extraction functions to marshal string types as JSON. Update template to convert json.RawMessage to string for display. * Regenerate templ files for Type field changes * templ * Fix additional Iceberg UI issues from code review - Fix lazy-load flag that was set before async operation completed, preventing retries on error; now sets loaded flag only after successful load and throws error to caller for proper error handling and UI updates - Add zero-time guards for CreatedAt and ModifiedAt fields in table details to avoid displaying Go zero-time values; render dash when time is zero - Add URL path escaping for all catalog/namespace/table names in URLs to prevent malformed URLs when names contain special characters like /, ?, or # - Remove redundant innerHTML clear in loadIcebergNamespaceTables that cleared twice before appending the table list - Fix selectSnapshotForMetrics to remove != 0 guard for consistency with selectSchema fix; now always searches for CurrentSnapshotID without zero-value gate - Enhance typeToString() helper to display '(complex)' for non-primitive JSON types * Regenerate templ files for Phase 3 updates * Fix template generation to use correct file paths Run templ generate from repo root instead of weed/admin directory to ensure generated _templ.go files have correct absolute paths in error messages (e.g., 'weed/admin/view/app/iceberg_table_details.templ' instead of 'app/iceberg_table_details.templ'). This ensures both 'make admin-generate' at repo root and 'make generate' in weed/admin directory produce identical output with consistent file path references. * Regenerate template files with correct path references * Validate S3 Tables names in UI - Add client-side validation for table bucket and namespace names to surface errors for invalid characters (dots/underscores) before submission - Use HTML validity messages with reportValidity for immediate feedback - Update namespace helper text to reflect actual constraints (single-level, lowercase letters, numbers, and underscores) * Regenerate templ files for namespace helper text * Fix Iceberg catalog REST link and actions * Disallow S3 object access on table buckets * Validate Iceberg layout for table bucket objects * Fix REST API link to /v1/config * merge iceberg page with table bucket page * Allowed Trino/Iceberg stats files in metadata validation * fixes - Backend/data handling: - Normalized Iceberg type display and fallback handling in weed/admin/dash/s3tables_management.go. - Fixed snapshot fallback pointer semantics in weed/admin/dash/s3tables_management.go. - Added CSRF token generation/propagation/validation for namespace create/delete in: - weed/admin/dash/csrf.go - weed/admin/dash/auth_middleware.go - weed/admin/dash/middleware.go - weed/admin/dash/s3tables_management.go - weed/admin/view/layout/layout.templ - weed/admin/static/js/s3tables.js - UI/template fixes: - Zero-time guards for CreatedAt fields in: - weed/admin/view/app/iceberg_namespaces.templ - weed/admin/view/app/iceberg_tables.templ - Fixed invalid templ-in-script interpolation and host/port rendering in: - weed/admin/view/app/iceberg_catalog.templ - weed/admin/view/app/s3tables_buckets.templ - Added data-catalog-name consistency on Iceberg delete action in weed/admin/view/app/iceberg_tables.templ. - Updated retry wording in weed/admin/static/js/s3tables.js. - Regenerated all affected _templ.go files. - S3 API/comment follow-ups: - Reused cached table-bucket validator in weed/s3api/bucket_paths.go. - Added validation-failure debug logging in weed/s3api/s3api_object_handlers_tagging.go. - Added multipart path-validation design comment in weed/s3api/s3api_object_handlers_multipart.go. - Build tooling: - Fixed templ generate working directory issues in weed/admin/Makefile (watch + pattern rule). * populate data * test/s3tables: harden populate service checks * admin: skip table buckets in object-store bucket list * admin sidebar: move object store to top-level links * admin iceberg catalog: guard zero times and escape links * admin forms: add csrf/error handling and client-side name validation * admin s3tables: fix namespace delete modal redeclaration * admin: replace native confirm dialogs with modal helpers * admin modal-alerts: remove noisy confirm usage console log * reduce logs * test/s3tables: use partitioned tables in trino and spark populate * admin file browser: normalize filer ServerAddress for HTTP parsing	1 month ago
Chris Lu	403592bb9f	Add Spark Iceberg catalog integration tests and CI support (#8242 ) * Add Spark Iceberg catalog integration tests and CI support Implement comprehensive integration tests for Spark with SeaweedFS Iceberg REST catalog: - Basic CRUD operations (Create, Read, Update, Delete) on Iceberg tables - Namespace (database) management - Data insertion, querying, and deletion - Time travel capabilities via snapshot versioning - Compatible with SeaweedFS S3 and Iceberg REST endpoints Tests mirror the structure of existing Trino integration tests but use Spark's Python SQL API and PySpark for testing. Add GitHub Actions CI job for spark-iceberg-catalog-tests in s3-tables-tests.yml to automatically run Spark integration tests on pull requests. * fmt * Fix Spark integration tests - code review feedback * go mod tidy * Add go mod tidy step to integration test jobs Add 'go mod tidy' step before test runs for all integration test jobs: - s3-tables-tests - iceberg-catalog-tests - trino-iceberg-catalog-tests - spark-iceberg-catalog-tests This ensures dependencies are clean before running tests. * Fix remaining Spark operations test issues Address final code review comments: Setup & Initialization: - Add waitForSparkReady() helper function that polls Spark readiness with backoff instead of hardcoded 10-second sleep - Extract setupSparkTestEnv() helper to reduce boilerplate duplication between TestSparkCatalogBasicOperations and TestSparkTimeTravel - Both tests now use helpers for consistent, reliable setup Assertions & Validation: - Make setup-critical operations (namespace, table creation, initial insert) use t.Fatalf instead of t.Errorf to fail fast - Validate setupSQL output in TestSparkTimeTravel and fail if not 'Setup complete' - Add validation after second INSERT in TestSparkTimeTravel: verify row count increased to 2 before time travel test - Add context to error messages with namespace and tableName params Code Quality: - Remove code duplication between test functions - All critical paths now properly validated - Consistent error handling throughout * Fix go vet errors in S3 Tables tests Fixes: 1. setup_test.go (Spark): - Add missing import: github.com/testcontainers/testcontainers-go/wait - Use wait.ForLog instead of undefined testcontainers.NewLogStrategy - Remove unused strings import 2. trino_catalog_test.go: - Use net.JoinHostPort instead of fmt.Sprintf for address formatting - Properly handles IPv6 addresses by wrapping them in brackets * Use weed mini for simpler SeaweedFS startup Replace complex multi-process startup (master, volume, filer, s3) with single 'weed mini' command that starts all services together. Benefits: - Simpler, more reliable startup - Single weed mini process vs 4 separate processes - Automatic coordination between components - Better port management with no manual coordination Changes: - Remove separate master, volume, filer process startup - Use weed mini with -master.port, -filer.port, -s3.port flags - Keep Iceberg REST as separate service (still needed) - Increase timeout to 15s for port readiness (weed mini startup) - Remove volumePort and filerProcess fields from TestEnvironment - Simplify cleanup to only handle two processes (mini, iceberg rest) * Clean up dead code and temp directory leaks Fixes: 1. Remove dead s3Process field and cleanup: - weed mini bundles S3 gateway, no separate process needed - Removed s3Process field from TestEnvironment - Removed unnecessary s3Process cleanup code 2. Fix temp config directory leak: - Add sparkConfigDir field to TestEnvironment - Store returned configDir in writeSparkConfig - Clean up sparkConfigDir in Cleanup() with os.RemoveAll - Prevents accumulation of temp directories in test runs 3. Simplify Cleanup: - Now handles only necessary processes (weed mini, iceberg rest) - Removes both seaweedfsDataDir and sparkConfigDir - Cleaner shutdown sequence * Use weed mini's built-in Iceberg REST and fix python binary Changes: - Add -s3.port.iceberg flag to weed mini for built-in Iceberg REST Catalog - Remove separate 'weed server' process for Iceberg REST - Remove icebergRestProcess field from TestEnvironment - Simplify Cleanup() to only manage weed mini + Spark - Add port readiness check for iceberg REST from weed mini - Set Spark container Cmd to '/bin/sh -c sleep 3600' to keep it running - Change python to python3 in container.Exec calls This simplifies to truly one all-in-one weed mini process (master, filer, s3, iceberg-rest) plus just the Spark container. * go fmt * clean up * bind on a non-loopback IP for container access, aligned Iceberg metadata saves/locations with table locations, and reworked Spark time travel to use TIMESTAMP AS OF with safe timestamp extraction. * shared mini start * Fixed internal directory creation under /buckets so .objects paths can auto-create without failing bucket-name validation, which restores table bucket object writes * fix path Updated table bucket objects to write under `/buckets/<bucket>` and saved Iceberg metadata there, adjusting Spark time-travel timestamp to committed_at +1s. Rebuilt the weed binary (`go install ./weed`) and confirmed passing tests for Spark and Trino with focused test commands. * Updated table bucket creation to stop creating /buckets/.objects and switched Trino REST warehouse to s3://<bucket> to match Iceberg layout. * Stabilize S3Tables integration tests * Fix timestamp extraction and remove dead code in bucketDir * Use table bucket as warehouse in s3tables tests * Update trino_blog_operations_test.go * adds the CASCADE option to handle any remaining table metadata/files in the schema directory * skip namespace not empty	1 month ago
Chris Lu	e6ee293c17	Add table operations test (#8241 ) * Add Trino blog operations test * Update test/s3tables/catalog_trino/trino_blog_operations_test.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * feat: add table bucket path helpers and filer operations - Add table object root and table location mapping directories - Implement ensureDirectory, upsertFile, deleteEntryIfExists helpers - Support table location bucket mapping for S3 access * feat: manage table bucket object roots on creation/deletion - Create .objects directory for table buckets on creation - Clean up table object bucket paths on deletion - Enable S3 operations on table bucket object roots * feat: add table location mapping for Iceberg REST - Track table location bucket mappings when tables are created/updated/deleted - Enable location-based routing for S3 operations on table data * feat: route S3 operations to table bucket object roots - Route table-s3 bucket names to mapped table paths - Route table buckets to object root directories - Support table location bucket mapping lookup * feat: emit table-s3 locations from Iceberg REST - Generate unique table-s3 bucket names with UUID suffix - Store table metadata under table bucket paths - Return table-s3 locations for Trino compatibility * fix: handle missing directories in S3 list operations - Propagate ErrNotFound from ListEntries for non-existent directories - Treat missing directories as empty results for list operations - Fixes Trino non-empty location checks on table creation * test: improve Trino CSV parsing for single-value results - Sanitize Trino output to skip jline warnings - Handle single-value CSV results without header rows - Strip quotes from numeric values in tests * refactor: use bucket path helpers throughout S3 API - Replace direct bucket path operations with helper functions - Leverage centralized table bucket routing logic - Improve maintainability with consistent path resolution * fix: add table bucket cache and improve filer error handling - Cache table bucket lookups to reduce filer overhead on repeated checks - Use filer_pb.CreateEntry and filer_pb.UpdateEntry helpers to check resp.Error - Fix delete order in handler_bucket_get_list_delete: delete table object before directory - Make location mapping errors best-effort: log and continue, don't fail API - Update table location mappings to delete stale prior bucket mappings on update - Add 1-second sleep before timestamp time travel query to ensure timestamps are in past - Fix CSV parsing: examine all lines, not skip first; handle single-value rows * fix: properly handle stale metadata location mapping cleanup - Capture oldMetadataLocation before mutation in handleUpdateTable - Update updateTableLocationMapping to accept both old and new locations - Use passed-in oldMetadataLocation to detect location changes - Delete stale mapping only when location actually changes - Pass empty string for oldLocation in handleCreateTable (new tables have no prior mapping) - Improve logging to show old -> new location transitions * refactor: cleanup imports and cache design - Remove unused 'sync' import from bucket_paths.go - Use filer_pb.UpdateEntry helper in setExtendedAttribute and deleteExtendedAttribute for consistent error handling - Add dedicated tableBucketCache map[string]bool to BucketRegistry instead of mixing concerns with metadataCache - Improve cache separation: table buckets cache is now separate from bucket metadata cache * fix: improve cache invalidation and add transient error handling Cache invalidation (critical fix): - Add tableLocationCache to BucketRegistry for location mapping lookups - Clear tableBucketCache and tableLocationCache in RemoveBucketMetadata - Prevents stale cache entries when buckets are deleted/recreated Transient error handling: - Only cache table bucket lookups when conclusive (found or ErrNotFound) - Skip caching on transient errors (network, permission, etc) - Prevents marking real table buckets as non-table due to transient failures Performance optimization: - Cache tableLocationDir results to avoid repeated filer RPCs on hot paths - tableLocationDir now checks cache before making expensive filer lookups - Cache stores empty string for 'not found' to avoid redundant lookups Code clarity: - Add comment to deleteDirectory explaining DeleteEntry response lacks Error field * go fmt * fix: mirror transient error handling in tableLocationDir and optimize bucketDir Transient error handling: - tableLocationDir now only caches definitive results - Mirrors isTableBucket behavior to prevent treating transient errors as permanent misses - Improves reliability on flaky systems or during recovery Performance optimization: - bucketDir avoids redundant isTableBucket call via bucketRoot - Directly use s3a.option.BucketsPath for regular buckets - Saves one cache lookup for every non-table bucket operation * fix: revert bucketDir optimization to preserve bucketRoot logic The optimization to directly use BucketsPath bypassed bucketRoot's logic and caused issues with S3 list operations on delimiter+prefix cases. Revert to using path.Join(s3a.bucketRoot(bucket), bucket) which properly handles all bucket types and ensures consistent path resolution across the codebase. The slight performance cost of an extra cache lookup is worth the correctness and consistency benefits. * feat: move table buckets under /buckets Add a table-bucket marker attribute, reuse bucket metadata cache for table bucket detection, and update list/validation/UI/test paths to treat table buckets as /buckets entries. * Fix S3 Tables code review issues - handler_bucket_create.go: Fix bucket existence check to properly validate entryResp.Entry before setting s3BucketExists flag (nil Entry should not indicate existing bucket) - bucket_paths.go: Add clarifying comment to bucketRoot() explaining unified buckets root path for all bucket types - file_browser_data.go: Optimize by extracting table bucket check early to avoid redundant WithFilerClient call * Fix list prefix delimiter handling * Handle list errors conservatively * Fix Trino FOR TIMESTAMP query - use past timestamp Iceberg requires the timestamp to be strictly in the past. Use current_timestamp - interval '1' second instead of current_timestamp. --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	1 month ago

1 2 3 4 5

226 Commits (61befd10fccf3ab050d49c0fbb52a2f8fe4e656d)