Chris Lu
49a64f50f1
Add session policy support to IAM ( #8338 )
* Add session policy support to IAM
- Implement policy evaluation for session tokens in policy_engine.go
- Add session_policy field to session claims for tracking applied policies
- Update STS service to include session policies in token generation
- Add IAM integration tests for session policy validation
- Update IAM manager to support policy attachment to sessions
- Extend S3 API STS endpoint to handle session policy restrictions
* fix: optimize session policy evaluation and add documentation
* sts: add NormalizeSessionPolicy helper for inline session policies
* sts: support inline session policies for AssumeRoleWithWebIdentity and credential-based flows
* s3api: parse and normalize Policy parameter for STS HTTP handlers
* tests: add session policy unit tests and integration tests for inline policy downscoping
* tests: add s3tables STS inline policy integration
* iam: handle user principals and validate tokens
* sts: enforce inline session policy size limit
* tests: harden s3tables STS integration config
* iam: clarify principal policy resolution errors
* tests: improve STS integration endpoint selection
1 week ago
Chris Lu
beeb375a88
Add volume server integration test suite and CI workflow ( #8322 )
* docs(volume_server): add integration test development plan
* test(volume_server): add integration harness and profile matrix
* test(volume_server/http): add admin and options integration coverage
* test(volume_server/grpc): add state and status integration coverage
* test(volume_server): auto-build weed binary and harden cluster startup
* test(volume_server/http): add upload read range head delete coverage
* test(volume_server/grpc): expand admin lifecycle and state coverage
* docs(volume_server): update progress tracker for implemented tests
* test(volume_server/http): cover if-none-match and invalid-range branches
* test(volume_server/grpc): add batch delete integration coverage
* docs(volume_server): log latest HTTP and gRPC test coverage
* ci(volume_server): run volume server integration tests in github actions
* test(volume_server/grpc): add needle status configure ping and leave coverage
* docs(volume_server): record additional grpc coverage progress
* test(volume_server/grpc): add vacuum integration coverage
* docs(volume_server): record vacuum test coverage progress
* test(volume_server/grpc): add read and write needle blob error-path coverage
* docs(volume_server): record data rw grpc coverage progress
* test(volume_server/http): add jwt auth integration coverage
* test(volume_server/grpc): add sync copy and stream error-path coverage
* docs(volume_server): record jwt and sync/copy test coverage
* test(volume_server/grpc): add scrub and query integration coverage
* test(volume_server/grpc): add volume tail sender and receiver coverage
* docs(volume_server): record scrub query and tail test progress
* test(volume_server/grpc): add readonly writable and collection lifecycle coverage
* test(volume_server/http): add public-port cors and method parity coverage
* test(volume_server/grpc): add blob meta and read-all success path coverage
* test(volume_server/grpc): expand scrub and query variation coverage
* test(volume_server/grpc): add tiering and remote fetch error-path coverage
* test(volume_server/http): add unchanged write and delete edge-case coverage
* test(volume_server/grpc): add ping unknown and unreachable target coverage
* test(volume_server/grpc): add volume delete only-empty variation coverage
* test(volume_server/http): add jwt fid-mismatch auth coverage
* test(volume_server/grpc): add scrub ec auto-select empty coverage
* test(volume_server/grpc): stabilize ping timestamp assertion
* docs(volume_server): update integration coverage progress log
* test(volume_server/grpc): add tier remote backend and config variation coverage
* docs(volume_server): record tier remote variation progress
* test(volume_server/grpc): add incremental copy and receive-file protocol coverage
* test(volume_server/http): add read path shape and if-modified-since coverage
* test(volume_server/grpc): add copy-file compaction and receive-file success coverage
* test(volume_server/http): add passthrough headers and static asset coverage
* test(volume_server/grpc): add ping filer unreachable coverage
* docs(volume_server): record copy receive and http variant progress
* test(volume_server/grpc): add erasure coding maintenance and missing-path coverage
* docs(volume_server): record initial erasure coding rpc coverage
* test(volume_server/http): add multi-range multipart response coverage
* docs(volume_server): record multi-range http coverage progress
* test(volume_server/grpc): add query empty-stripe no-match coverage
* docs(volume_server): record query no-match stream behavior coverage
* test(volume_server/http): add upload throttling timeout and replicate bypass coverage
* docs(volume_server): record upload throttling coverage progress
* test(volume_server/http): add download throttling timeout coverage
* docs(volume_server): record download throttling coverage progress
* test(volume_server/http): add jwt wrong-cookie fid mismatch coverage
* docs(volume_server): record jwt wrong-cookie mismatch coverage
* test(volume_server/http): add jwt expired-token rejection coverage
* docs(volume_server): record jwt expired-token coverage
* test(volume_server/http): add jwt query and cookie transport coverage
* docs(volume_server): record jwt token transport coverage
* test(volume_server/http): add jwt token-source precedence coverage
* docs(volume_server): record jwt token-source precedence coverage
* test(volume_server/http): add jwt header-over-cookie precedence coverage
* docs(volume_server): record jwt header cookie precedence coverage
* test(volume_server/http): add jwt query-over-cookie precedence coverage
* docs(volume_server): record jwt query cookie precedence coverage
* test(volume_server/grpc): add setstate version mismatch and nil-state coverage
* docs(volume_server): record setstate validation coverage
* test(volume_server/grpc): add readonly persist-true lifecycle coverage
* docs(volume_server): record readonly persist variation coverage
* test(volume_server/http): add options origin cors header coverage
* docs(volume_server): record options origin cors coverage
* test(volume_server/http): add trace unsupported-method parity coverage
* docs(volume_server): record trace method parity coverage
* test(volume_server/grpc): add batch delete cookie-check variation coverage
* docs(volume_server): record batch delete cookie-check coverage
* test(volume_server/grpc): add admin lifecycle missing and maintenance variants
* docs(volume_server): record admin lifecycle edge-case coverage
* test(volume_server/grpc): add mixed batch delete status matrix coverage
* docs(volume_server): record mixed batch delete matrix coverage
* test(volume_server/http): add jwt-profile ui access gating coverage
* docs(volume_server): record jwt ui-gating http coverage
* test(volume_server/http): add propfind unsupported-method parity coverage
* docs(volume_server): record propfind method parity coverage
* test(volume_server/grpc): add volume configure success and rollback-path coverage
* docs(volume_server): record volume configure branch coverage
* test(volume_server/grpc): add volume needle status missing-path coverage
* docs(volume_server): record volume needle status error-path coverage
* test(volume_server/http): add readDeleted query behavior coverage
* docs(volume_server): record readDeleted http behavior coverage
* test(volume_server/http): add delete ts override parity coverage
* docs(volume_server): record delete ts parity coverage
* test(volume_server/grpc): add invalid blob/meta offset coverage
* docs(volume_server): record invalid blob/meta offset coverage
* test(volume_server/grpc): add read-all mixed volume abort coverage
* docs(volume_server): record read-all mixed-volume abort coverage
* test(volume_server/http): assert head response body parity
* docs(volume_server): record head body parity assertion
* test(volume_server/grpc): assert status state and memory payload completeness
* docs(volume_server): record volume server status payload coverage
* test(volume_server/grpc): add batch delete chunk-manifest rejection coverage
* docs(volume_server): record batch delete chunk-manifest coverage
* test(volume_server/grpc): add query cookie-mismatch eof parity coverage
* docs(volume_server): record query cookie-mismatch parity coverage
* test(volume_server/grpc): add ping master success target coverage
* docs(volume_server): record ping master success coverage
* test(volume_server/http): add head if-none-match conditional parity
* docs(volume_server): record head if-none-match parity coverage
* test(volume_server/http): add head if-modified-since parity coverage
* docs(volume_server): record head if-modified-since parity coverage
* test(volume_server/http): add connect unsupported-method parity coverage
* docs(volume_server): record connect method parity coverage
* test(volume_server/http): assert options allow-headers cors parity
* docs(volume_server): record options allow-headers coverage
* test(volume_server/framework): add dual volume cluster integration harness
* test(volume_server/http): add missing-local read mode proxy redirect local coverage
* docs(volume_server): record read mode missing-local matrix coverage
* test(volume_server/http): add download over-limit replica proxy fallback coverage
* docs(volume_server): record download replica fallback coverage
* test(volume_server/http): add missing-local readDeleted proxy redirect parity coverage
* docs(volume_server): record missing-local readDeleted mode coverage
* test(volume_server/framework): add single-volume cluster with filer harness
* test(volume_server/grpc): add ping filer success target coverage
* docs(volume_server): record ping filer success coverage
* test(volume_server/http): add proxied-loop guard download timeout coverage
* docs(volume_server): record proxied-loop download coverage
* test(volume_server/http): add disabled upload and download limit coverage
* docs(volume_server): record disabled throttling path coverage
* test(volume_server/grpc): add idempotent volume server leave coverage
* docs(volume_server): record leave idempotence coverage
* test(volume_server/http): add redirect collection query preservation coverage
* docs(volume_server): record redirect collection query coverage
* test(volume_server/http): assert admin server headers on status and health
* docs(volume_server): record admin server header coverage
* test(volume_server/http): assert healthz request-id echo parity
* docs(volume_server): record healthz request-id parity coverage
* test(volume_server/http): add over-limit invalid-vid download branch coverage
* docs(volume_server): record over-limit invalid-vid branch coverage
* test(volume_server/http): add public-port static asset coverage
* docs(volume_server): record public static endpoint coverage
* test(volume_server/http): add public head method parity coverage
* docs(volume_server): record public head parity coverage
* test(volume_server/http): add throttling wait-then-proceed path coverage
* docs(volume_server): record throttling wait-then-proceed coverage
* test(volume_server/http): add read cookie-mismatch not-found coverage
* docs(volume_server): record read cookie-mismatch coverage
* test(volume_server/http): add throttling timeout-recovery coverage
* docs(volume_server): record throttling timeout-recovery coverage
* test(volume_server/grpc): add ec generate mount info unmount lifecycle coverage
* docs(volume_server): record ec positive lifecycle coverage
* test(volume_server/grpc): add ec shard read and blob delete lifecycle coverage
* docs(volume_server): record ec shard read/blob delete lifecycle coverage
* test(volume_server/grpc): add ec rebuild and to-volume error branch coverage
* docs(volume_server): record ec rebuild and to-volume branch coverage
* test(volume_server/grpc): add ec shards-to-volume success roundtrip coverage
* docs(volume_server): record ec shards-to-volume success coverage
* test(volume_server/grpc): add ec receive and copy-file missing-source coverage
* docs(volume_server): record ec receive and copy-file coverage
* test(volume_server/grpc): add ec last-shard delete cleanup coverage
* docs(volume_server): record ec last-shard delete cleanup coverage
* test(volume_server/grpc): add volume copy success path coverage
* docs(volume_server): record volume copy success coverage
* test(volume_server/grpc): add volume copy overwrite-destination coverage
* docs(volume_server): record volume copy overwrite coverage
* test(volume_server/http): add write error-path variant coverage
* docs(volume_server): record http write error-path coverage
* test(volume_server/http): add conditional header precedence coverage
* docs(volume_server): record conditional header precedence coverage
* test(volume_server/http): add oversized combined range guard coverage
* docs(volume_server): record oversized range guard coverage
* test(volume_server/http): add image resize and crop read coverage
* docs(volume_server): record image transform coverage
* test(volume_server/http): add chunk-manifest expansion and bypass coverage
* docs(volume_server): record chunk-manifest read coverage
* test(volume_server/http): add compressed read encoding matrix coverage
* docs(volume_server): record compressed read matrix coverage
* test(volume_server/grpc): add tail receiver source replication coverage
* docs(volume_server): record tail receiver replication coverage
* test(volume_server/grpc): add tail sender large-needle chunking coverage
* docs(volume_server): record tail sender chunking coverage
* test(volume_server/grpc): add ec-backed volume needle status coverage
* docs(volume_server): record ec-backed needle status coverage
* test(volume_server/grpc): add ec shard copy from peer success coverage
* docs(volume_server): record ec shard copy success coverage
* test(volume_server/http): add chunk-manifest delete child cleanup coverage
* docs(volume_server): record chunk-manifest delete cleanup coverage
* test(volume_server/http): add chunk-manifest delete failure-path coverage
* docs(volume_server): record chunk-manifest delete failure coverage
* test(volume_server/grpc): add ec shard copy source-unavailable coverage
* docs(volume_server): record ec shard copy source-unavailable coverage
* parallel
1 week ago
Chris Lu
c433fee36a
s3api: fix AccessDenied by correctly propagating principal ARN in vended tokens ( #8330 )
* s3api: fix AccessDenied by correctly propagating principal ARN in vended tokens
* s3api: update TestLoadS3ApiConfiguration to match standardized ARN format
* s3api: address PR review comments (nil-safety and cleanup)
* s3api: address second round of PR review comments (cleanups and naming conventions)
* s3api: address third round of PR review comments (unify default account ID and duplicate log)
* s3api: address fourth round of PR review comments (define defaultAccountID as constant)
1 week ago
Chris Lu
1e4f30c56f
pb: fix IPv6 double brackets in ServerAddress formatting ( #8329 )
* pb: fix IPv6 double brackets in ServerAddress formatting
* pb: refactor IPv6 tests into table-driven test
* util: add JoinHostPortStr and use it in pb to avoid unsafe port parsing
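The double-bracket problem can be reproduced and avoided with a small sketch. `joinHostPort` below is a hypothetical helper, not the `JoinHostPortStr` added in this commit; it assumes the host may already arrive bracketed and strips one pair of brackets before delegating to `net.JoinHostPort`.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// joinHostPort is a hypothetical helper. net.JoinHostPort adds brackets
// whenever the host contains a colon, so an already bracketed IPv6 host
// would otherwise end up wrapped twice.
func joinHostPort(host, port string) string {
	if strings.HasPrefix(host, "[") && strings.HasSuffix(host, "]") {
		host = host[1 : len(host)-1]
	}
	return net.JoinHostPort(host, port)
}

func main() {
	fmt.Println(joinHostPort("::1", "8080"))
	fmt.Println(joinHostPort("[::1]", "8080"))
	fmt.Println(joinHostPort("localhost", "8080"))
}
```

Keeping the port as a string, as the helper name in the commit suggests, avoids parsing it into an integer just to format it back.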
1 week ago
Chris Lu
796f23f68a
Fix STS InvalidAccessKeyId and request body consumption issues ( #8328 )
* Fix STS InvalidAccessKeyId and request body consumption in Lakekeeper integration test
* Remove debug prints
* Add Lakekeeper integration tests to CI
* Fix connection refused in CI by binding to 0.0.0.0
* Add timeout to docker run in Lakekeeper integration test
* Update weed/s3api/auth_credentials.go
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
1 week ago
FivegenLLC
951eeefb76
fix(s3): lifecycle TTL rules inherit replication and volumeGrowthCount from filer config ( #8321 )
* fix(s3): lifecycle TTL rules inherit replication from parent path and filer config
PutBucketLifecycleConfiguration wrote filer.conf entries with empty replication,
so the effective replication could differ from the operator default. Now we resolve
replication from the parent path rule (MatchStorageRule), then the filer global config;
only Replication is set on the rule (no DataCenter/Rack/DataNode for S3).
* add volumeGrowthCount
* review
---------
Co-authored-by: Dmitiy Gushchin <dag@fivegen.ru>
1 week ago
Chris Lu
25ea48227f
Fix STS temporary credentials to use ASIA prefix instead of AKIA ( #8326 )
Temporary credentials from STS AssumeRole were using "AKIA" prefix
(permanent IAM user credentials) instead of "ASIA" prefix (temporary
security credentials). This violates AWS conventions and may cause
compatibility issues with AWS SDKs that validate credential types.
Changes:
- Rename generateAccessKeyId to generateTemporaryAccessKeyId for clarity
- Update function to use ASIA prefix for temporary credentials
- Add unit tests to verify ASIA prefix format (weed/iam/sts/credential_prefix_test.go)
- Add integration test to verify ASIA prefix in S3 API (test/s3/iam/s3_sts_credential_prefix_test.go)
- Ensure AWS-compatible credential format (ASIA + 16 hex chars)
The credentials are already deterministic (SHA256-based from session ID)
and the SessionToken is correctly set to the JWT token, so this is just
a prefix fix to follow AWS standards.
Fixes #8312
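A deterministic key of the shape described above (ASIA + 16 hex chars, derived from the session ID via SHA256) might look like the sketch below. `temporaryAccessKeyID` is a hypothetical stand-in for `generateTemporaryAccessKeyId`; the exact hash input, which bytes of the digest are used, and the letter case of the hex digits are assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// temporaryAccessKeyID is a hypothetical illustration: it hashes the session
// ID and keeps the first 8 digest bytes, which encode to 16 hex characters.
func temporaryAccessKeyID(sessionID string) string {
	sum := sha256.Sum256([]byte(sessionID))
	return "ASIA" + hex.EncodeToString(sum[:8])
}

func main() {
	id := temporaryAccessKeyID("example-session")
	fmt.Println(id, len(id))
}
```

Because the key is a pure function of the session ID, the same session always maps to the same access key ID without storing extra state.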
1 week ago
Chris Lu
0082c47e04
Test: Add RisingWave DML verification test ( #8317 )
* Test: Verify RisingWave DML operations (INSERT, UPDATE, DELETE) support
* Test: Refine RisingWave DML test (remove sleeps, use polling)
1 week ago
Lukas
abd681b54b
Fix service name in the worker deployment (seaweedfs#8314) ( #8315 )
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
1 week ago
Chris Lu
4e1065e485
Fix: preserve request body for STS signature verification ( #8324 )
* Fix: preserve request body for STS signature verification
- Save and restore request body in UnifiedPostHandler after ParseForm()
- This allows STS handler to verify signatures correctly
- Fixes 'invalid AWS signature: 53' error (ErrContentSHA256Mismatch)
- ParseForm() consumes the body, so we need to restore it for downstream handlers
* Improve error handling in UnifiedPostHandler
- Add http.MaxBytesReader to limit body size to 10 MiB (iamRequestBodyLimit)
- Add proper error handling for io.ReadAll failures
- Log errors when body reading fails
- Prevents DoS attacks from oversized request bodies
- Addresses code review feedback
1 week ago
Chris Lu
c1a9263e37
Fix STS AssumeRole with POST body param ( #8320 )
* Fix STS AssumeRole with POST body param and add integration test
* Add STS integration test to CI workflow
* Address code review feedback: fix HPP vulnerability and style issues
* Refactor: address code review feedback
- Fix HTTP Parameter Pollution vulnerability in UnifiedPostHandler
- Refactor permission check logic for better readability
- Extract test helpers to testutil/docker.go to reduce duplication
- Clean up imports and simplify context setting
* Add SigV4-style test variant for AssumeRole POST body routing
- Added ActionInBodyWithSigV4Style test case to validate real-world scenario
- Test confirms routing works correctly for AWS SigV4-signed requests
- Addresses code review feedback about testing with SigV4 signatures
* Fix: always set identity in context when non-nil
- Ensure UnifiedPostHandler always calls SetIdentityInContext when identity is non-nil
- Only call SetIdentityNameInContext when identity.Name is non-empty
- This ensures downstream handlers (embeddedIam.DoActions) always have access to identity
- Addresses potential issue where empty identity.Name would skip context setting
1 week ago
Chris Lu
6bd6bba594
Fix inconsistent admin argument in worker pods ( #8316 )
* Fix inconsistent admin argument in worker pods
* Use seaweedfs.componentName for admin service naming
1 week ago
Chris Lu
b8ef48c8f1
Add RisingWave catalog tests ( #8308 )
* Add RisingWave catalog tests for S3 tables
* Add RisingWave catalog integration tests to CI workflow
* Refactor RisingWave catalog tests based on PR feedback
* Address PR feedback: optimize checks, cleanup logs
* fix tests
* consistent
1 week ago
Chris Lu
75faf826d4
Fix LevelDB panic on lazy reload ( #8269 ) ( #8307 )
* fix LevelDB panic on lazy reload
Implemented a thread-safe reload mechanism using double-checked
locking and a retry loop in Get, Put, and Delete. Added a concurrency
test to verify the fix and prevent regressions.
Fixes #8269
* refactor: use helper for leveldb fix and remove deprecated ioutil
* fix: prevent deadlock by using getFromDb helper
Extracted DB lookup to internal helper to avoid recursive RLock in Put/Delete methods.
Updated Get to use the helper as well.
* fix: resolve syntax error and commit deadlock prevention
Fixed a duplicate function declaration syntax error.
Verified that getFromDb helper correctly prevents recursive RLock scenarios.
* refactor: remove redundant timeout checks
Removed nested `if m.ldbTimeout > 0` checks in Get, Put, and Delete
methods as suggested in PR review.
1 week ago
Lisandro Pin
221bd237c4
Fix file stat collection metric bug for the `cluster.status` command. ( #8302 )
When the `--files` flag is present, `cluster.status` will scrape file metrics
from volume servers to provide detailed stats on those. The progress indicator
was not being updated properly, though, so the command would complete before
the indicator reached 100%.
1 week ago
Chris Lu
a3136c523f
Fix volume.fsck 401 Unauthorized by adding JWT to HTTP delete requests ( #8306 )
* Fix volume.fsck 401 Unauthorized by adding JWT to HTTP delete requests
* Additionally, for performance, consider fetching the jwt.filer_signing.key once before any loops that call httpDelete, rather than inside httpDelete itself, to avoid repeated configuration lookups.
1 week ago
Chris Lu
ac242d04ee
one time manual run
1 week ago
Chris Lu
21543134c8
fix manual build process
1 week ago
Chris Lu
8b5d31e5eb
s3api/policy_engine: use forwarded client IP for aws:SourceIp ( #8304 )
* s3api: honor forwarded source IP for policy conditions
Prefer X-Forwarded-For/X-Real-Ip before RemoteAddr when populating aws:SourceIp in policy condition evaluation. Also avoid noisy parsing behavior for unix socket markers and add coverage for precedence/fallback paths.
Fixes #8301
* s3api: simplify remote addr parsing
* s3api: guard aws:SourceIp against DNS hosts
* s3api: simplify remote addr fallback
* s3api: simplify remote addr parsing
* Update weed/s3api/policy_engine/engine.go
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Fix TestExtractConditionValuesFromRequestSourceIPPrecedence using trusted private IP
* Refactor extractSourceIP to use R-to-L XFF parsing and net.IP.IsPrivate
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
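Right-to-left X-Forwarded-For parsing with `net.IP.IsPrivate`, as named in the last bullet above, might be sketched like this. `clientIPFromXFF` is a hypothetical function, not the commit's `extractSourceIP`; treating private and loopback addresses as trusted proxy hops is an assumption about the intent.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// clientIPFromXFF is a hypothetical helper. It walks the header from the
// right (the hop closest to the server) and returns the first address that
// is neither private nor loopback, falling back to the connection address.
func clientIPFromXFF(xff, remoteAddr string) string {
	parts := strings.Split(xff, ",")
	for i := len(parts) - 1; i >= 0; i-- {
		ip := net.ParseIP(strings.TrimSpace(parts[i]))
		if ip == nil {
			continue // skip DNS names and malformed entries
		}
		if ip.IsPrivate() || ip.IsLoopback() {
			continue // assumed to be a trusted proxy hop
		}
		return ip.String()
	}
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		return remoteAddr // no port present, use the value as-is
	}
	return host
}

func main() {
	fmt.Println(clientIPFromXFF("203.0.113.7, 10.0.0.2", "10.0.0.1:5000"))
	fmt.Println(clientIPFromXFF("", "198.51.100.4:443"))
}
```

Scanning from the right means a client cannot spoof its address by prepending entries to the header, since only the entries appended by trusted proxies are skipped.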
1 week ago
Chris Lu
7151181d54
fix flaky tests
1 week ago
Lisandro Pin
e657e7d827
Implement local scrubbing for EC volumes. ( #8283 )
1 week ago
Lisandro Pin
2a73219397
Add weed shell command `volumeServer.state` to query/update volume server state settings. ( #8271 )
Add weed shell command `volumeServer.state` to query/update volume server states.
1 week ago
Chris Lu
7fcbffed7f
filer.sync: support manifest chunks ( #8299 )
* filer.sync support manifest chunks
* filersink: address manifest sync review feedback
1 week ago
Chris Lu
be0379f6fd
Fix filer.sync retry on stale chunk ( #8298 )
* Fix filer.sync stale chunk uploads
* Tweak filersink stale logging
1 week ago
Chris Lu
b57429ef2e
Switch empty-folder cleanup to bucket policy ( #8292 )
* Fix Spark _temporary cleanup and add issue #8285 regression test
* Generalize empty folder cleanup for Spark temp artifacts
* Revert synchronous folder pruning and add cleanup diagnostics
* Add actionable empty-folder cleanup diagnostics
* Fix Spark temp marker cleanup in async folder cleaner
* Fix Spark temp cleanup with implicit directory markers
* Keep explicit directory markers non-implicit
* logging
* more logs
* Switch empty-folder cleanup to bucket policy
* Seaweed-X-Amz-Allow-Empty-Folders
* less logs
* go vet
* less logs
* refactoring
1 week ago
Chris Lu
5c365e7090
s3api: return 400 for invalid namespace query in REST table routes ( #8296 )
* s3api: reject invalid namespace query in REST table routes
* s3api: expand namespace validation REST tests
1 week ago
Chris Lu
822dbed552
s3api: fix ListObjectsV2 NextContinuationToken duplication for nested prefix ( #8294 )
* s3api: fix duplicate ListObjectsV2 continuation token for nested prefix
* s3api: include prefix in common-prefix continuation token
1 week ago
Chris Lu
2d97685390
ci: fix container_release_unified manual dispatch and workflow parsing ( #8293 )
ci: fix unified release workflow dispatch matrix filtering
1 week ago
Chris Lu
1b2f719d7c
admin: fix file browser items-per-page selector ( #8291 )
* admin: fix file browser page size selector
Fix file browser pagination page-size selectors to use explicit select IDs instead of this.value in templ-generated handlers, which could resolve to undefined and produce limit=undefined in requests.
Add a focused template render regression test to prevent this from recurring.
Fixes #8284
* revert file browser template regression test
1 week ago
Chris Lu
b73bd08470
ci: move manual container builds to unified release workflow ( #8290 )
* ci: move manual dev container build into unified release workflow
* ci: make unified manual container build release-tag based
1 week ago
Chris Lu
17f85361e9
Remove unsupported iceberg rest signing-region from tests and docs ( #8289 )
1 week ago
Chris Lu
b261c89675
Fix RocksDB container build compatibility and add manual rocksdb dispatch ( #8288 )
* Fix RocksDB build compatibility and add manual rocksdb trigger
* Upgrade RocksDB defaults and keep grocksdb v1.10.7
* Add manual latest-image trigger inputs for ref and variant
* Allow manual latest build to set image tag and source ref
* Fix manual variant selection using setup job matrix output
1 week ago
Chris Lu
0385acba02
s3tables: fix shared table-location bucket mapping collisions ( #8286 )
* s3tables: prevent shared table-location bucket mapping overwrite
* Update weed/s3api/bucket_paths.go
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
---------
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
1 week ago
Chris Lu
d6825ffce2
Iceberg: implement stage-create finalize flow (phase 1) ( #8279 )
* iceberg: implement stage-create and create-on-commit finalize
* iceberg: add create validation error typing and stage-create integration test
* tests: merge stage-create integration check into catalog suite
* tests: cover stage-create finalize lifecycle in catalog integration
* iceberg: persist and cleanup stage-create markers
* iceberg: add stage-create rollout flag and marker pruning
* docs: add stage-create support design and rollout plan
* docs: drop stage-create design draft from PR
* iceberg: use conservative 72h stage-marker retention
* iceberg: address review comments on create-on-commit and tests
* iceberg: keep stage-create metadata out of table location
* refactor(iceberg): split iceberg.go into focused files
1 week ago
Chris Lu
d88f6ed0af
Iceberg commit reliability: preserve statistics updates and return 409 conflicts ( #8277 )
* iceberg: harden table commit updates and conflict handling
* iceberg: refine commit retry and statistics patching
* iceberg: cleanup metadata on non-conflict commit errors
2 weeks ago
Chris Lu
5ae3be44d1
iceberg: persist namespace properties for create/get ( #8276 )
* iceberg: persist namespace properties via s3tables metadata
* iceberg: simplify namespace properties normalization
* s3tables: broaden namespace properties round-trip test
* adjust logs
* adjust logs
2 weeks ago
Chris Lu
1c62808c0e
iceberg: wire pagination for list namespaces/tables REST APIs ( #8275 )
* s3api/iceberg: wire list pagination tokens and page size
* fmt
* Update weed/s3api/iceberg/iceberg.go
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
---------
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2 weeks ago
Chris Lu
db76eb26e7
compile
2 weeks ago
Chris Lu
4ccc7668ce
admin: resolve merge conflicts
2 weeks ago
Chris Lu
aef2de3109
s3tables: support multi-level namespaces in parser/admin paths ( #8273 )
* s3tables: support multi-level namespace normalization
* admin: handle namespace parsing errors centrally
* admin: clean namespace validation duplication
2 weeks ago
Chris Lu
be26ce74ce
s3tables: support multi-level namespace normalization
2 weeks ago
Chris Lu
0b80f055c2
Merge branch 'fix/8270-leader-not-elected'
2 weeks ago
Chris Lu
af8273386d
4.12
2 weeks ago
Chris Lu
ba8e2aaae9
Fix master leader election when grpc ports change ( #8272 )
* Fix master leader detection when grpc ports change
* Canonicalize self peer entry to avoid raft self-alias panic
* Normalize and deduplicate master peer addresses
2 weeks ago
Chris Lu
15d0a46679
Normalize and deduplicate master peer addresses
2 weeks ago
Chris Lu
ae27e17e6f
Canonicalize self peer entry to avoid raft self-alias panic
2 weeks ago
Chris Lu
02dac23119
Fix master leader detection when grpc ports change
2 weeks ago
Lisandro Pin
f400fb44a0
Update `cluster.status` to resolve file details on EC volumes. ( #8268 )
Also parallelizes queries for file metrics collections when the `--files`
flag is specified, and improves the command's output for readability:
```
> cluster.status --files
collecting file stats: 100%
cluster:
  id: topo
  status: LOCKED
  nodes: 10
  topology: 1 DC, 10 disks on 1 rack
volumes:
  total: 3 volumes, 1 collection
  max size: 32 GB
  regular: 1/80 volume on 3 replicas, 3 writable (100%), 0 read-only (0%)
  EC: 2 EC volumes on 28 shards (14 shards/volume)
storage:
  total: 269 MB (522 MB raw, 193.95%)
  regular volumes: 91 MB (272 MB raw, 300%)
  EC volumes: 178 MB (250 MB raw, 140%)
files:
  total: 363 files, 300 readable (82.64%), 63 deleted (17.35%), avg 522 kB per file
  regular: 168 files, 105 readable (62.5%), 63 deleted (37.5%), avg 540 kB per file
  EC: 195 files, 195 readable (100%), 0 deleted (0%), avg 506 kB per file
```
2 weeks ago
Chris Lu
30812b85f3
fix ec.encode skipping volumes when one replica is on a full disk ( #8227 )
* fix ec.encode skipping volumes when one replica is on a full disk
This fixes issue #8218. Previously, ec.encode would skip a volume if ANY
of its replicas resided on a disk with low free volume count. Now it
accepts the volume if AT LEAST ONE replica is on a healthy disk.
* refine noFreeDisk counter logic in ec.encode
Ensure noFreeDisk is decremented if a volume initially marked as bad
is later found to have a healthy replica. This ensures accurate
summary statistics.
* defer noFreeDisk counting and refine logging in ec.encode
Updated logging to be replica-scoped and deferred noFreeDisk counting to
the final pass over vidMap. This ensures that the counter only reflects
volumes that are definitively excluded because all replicas are on full
disks.
* filter replicas by free space during ec.encode
Updated doEcEncode to filter out replicas on disks with
FreeVolumeCount < 2 before selecting the best replica for encoding.
This ensures that EC shards are not generated on healthy source
replicas that happen to be on disks with low free space.
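The selection rule described above (accept a volume if at least one replica is on a healthy disk, and count it as noFreeDisk only when all replicas are on full disks) can be sketched as follows. The `replica` type and both functions are hypothetical simplifications; only the `FreeVolumeCount < 2` threshold comes from the commit message.

```go
package main

import (
	"fmt"
	"sort"
)

// replica is a hypothetical, simplified view of one volume replica and the
// free volume slots on the disk that holds it.
type replica struct {
	Node            string
	FreeVolumeCount int
}

// minFreeVolumeCount mirrors the "FreeVolumeCount < 2" threshold.
const minFreeVolumeCount = 2

// healthyReplicas keeps only replicas whose disk has enough free slots.
func healthyReplicas(replicas []replica) []replica {
	var out []replica
	for _, r := range replicas {
		if r.FreeVolumeCount >= minFreeVolumeCount {
			out = append(out, r)
		}
	}
	return out
}

// selectVolumes returns the volume IDs with at least one healthy replica
// and counts the volumes excluded because every replica is on a full disk.
func selectVolumes(vidMap map[uint32][]replica) (selected []uint32, noFreeDisk int) {
	for vid, replicas := range vidMap {
		if len(healthyReplicas(replicas)) > 0 {
			selected = append(selected, vid)
		} else {
			noFreeDisk++
		}
	}
	sort.Slice(selected, func(i, j int) bool { return selected[i] < selected[j] })
	return selected, noFreeDisk
}

func main() {
	vidMap := map[uint32][]replica{
		1: {{Node: "a", FreeVolumeCount: 0}, {Node: "b", FreeVolumeCount: 5}},
		2: {{Node: "a", FreeVolumeCount: 0}, {Node: "c", FreeVolumeCount: 1}},
	}
	selected, noFreeDisk := selectVolumes(vidMap)
	fmt.Println(selected, noFreeDisk)
}
```

Counting excluded volumes in a single final pass, rather than incrementing and later decrementing, matches the deferred counting the third bullet describes.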
2 weeks ago
Chris Lu
6a61037333
fix issue #8230 : volume.fsck deletion logic to respect purgeAbsent flag ( #8266 )
* fix issue #8230 : volume.fsck deletion logic to respect purgeAbsent flag
This commit fixes two issues in volume.fsck:
1. Missing chunks in existing volumes are now deleted if -reallyDeleteFilerEntries is set.
2. Missing volumes are now properly handled when a -volumeId filter is specified, allowing deletion of filer entries for those volumes.
* address PR feedback for issue #8230
- Ensure volume filter is applied before reporting missing volumes
- Fix potential nil-pointer dereferences in httpDelete method
- Use proper error checking throughout httpDelete
* address second round PR feedback for issue #8230
- Use fmt.Fprintf(c.writer, ...) instead of fmt.Printf
- Add missing newline in "deleting path" log message
2 weeks ago