* Fix STS InvalidAccessKeyId and request body consumption in Lakekeeper integration test
* Remove debug prints
* Add Lakekeeper integration tests to CI
* Fix connection refused in CI by binding to 0.0.0.0
* Add timeout to docker run in Lakekeeper integration test
* Update weed/s3api/auth_credentials.go
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix(s3): lifecycle TTL rules inherit replication from parent path and filer config
PutBucketLifecycleConfiguration wrote filer.conf entries with empty replication,
so effective replication could differ from operator default. Now we resolve
replication from parent path rule (MatchStorageRule) then filer global config;
only Replication is set on the rule (no DataCenter/Rack/DataNode for S3).
* add volumeGrowthCount
* review
---------
Co-authored-by: Dmitiy Gushchin <dag@fivegen.ru>
Temporary credentials from STS AssumeRole were using "AKIA" prefix
(permanent IAM user credentials) instead of "ASIA" prefix (temporary
security credentials). This violates AWS conventions and may cause
compatibility issues with AWS SDKs that validate credential types.
Changes:
- Rename generateAccessKeyId to generateTemporaryAccessKeyId for clarity
- Update function to use ASIA prefix for temporary credentials
- Add unit tests to verify ASIA prefix format (weed/iam/sts/credential_prefix_test.go)
- Add integration test to verify ASIA prefix in S3 API (test/s3/iam/s3_sts_credential_prefix_test.go)
- Ensure AWS-compatible credential format (ASIA + 16 hex chars)
The credentials are already deterministic (SHA256-based from session ID)
and the SessionToken is correctly set to the JWT token, so this is just
a prefix fix to follow AWS standards.
Fixes#8312
* Fix: preserve request body for STS signature verification
- Save and restore request body in UnifiedPostHandler after ParseForm()
- This allows STS handler to verify signatures correctly
- Fixes 'invalid AWS signature: 53' error (ErrContentSHA256Mismatch)
- ParseForm() consumes the body, so we need to restore it for downstream handlers
* Improve error handling in UnifiedPostHandler
- Add http.MaxBytesReader to limit body size to 10 MiB (iamRequestBodyLimit)
- Add proper error handling for io.ReadAll failures
- Log errors when body reading fails
- Prevents DoS attacks from oversized request bodies
- Addresses code review feedback
* Fix STS AssumeRole with POST body param and add integration test
* Add STS integration test to CI workflow
* Address code review feedback: fix HPP vulnerability and style issues
* Refactor: address code review feedback
- Fix HTTP Parameter Pollution vulnerability in UnifiedPostHandler
- Refactor permission check logic for better readability
- Extract test helpers to testutil/docker.go to reduce duplication
- Clean up imports and simplify context setting
* Add SigV4-style test variant for AssumeRole POST body routing
- Added ActionInBodyWithSigV4Style test case to validate real-world scenario
- Test confirms routing works correctly for AWS SigV4-signed requests
- Addresses code review feedback about testing with SigV4 signatures
* Fix: always set identity in context when non-nil
- Ensure UnifiedPostHandler always calls SetIdentityInContext when identity is non-nil
- Only call SetIdentityNameInContext when identity.Name is non-empty
- This ensures downstream handlers (embeddedIam.DoActions) always have access to identity
- Addresses potential issue where empty identity.Name would skip context setting
* fix LevelDB panic on lazy reload
Implemented a thread-safe reload mechanism using double-checked
locking and a retry loop in Get, Put, and Delete. Added a concurrency
test to verify the fix and prevent regressions.
Fixes#8269
* refactor: use helper for leveldb fix and remove deprecated ioutil
* fix: prevent deadlock by using getFromDb helper
Extracted DB lookup to internal helper to avoid recursive RLock in Put/Delete methods.
Updated Get to use the helper as well.
* fix: resolve syntax error and commit deadlock prevention
Fixed a duplicate function declaration syntax error.
Verified that getFromDb helper correctly prevents recursive RLock scenarios.
* refactor: remove redundant timeout checks
Removed nested `if m.ldbTimeout > 0` checks in Get, Put, and Delete
methods as suggested in PR review.
When the `--files` flag is present, `cluster.status` will scrape file metrics
from volume servers to provide detailed stats on those. The progress indicator
was not being updated properly though, so the command would complete before
it read 100%.
* Fix volume.fsck 401 Unauthorized by adding JWT to HTTP delete requests
* Additionally, for performance, consider fetching the jwt.filer_signing.key once before any loops that call httpDelete, rather than inside httpDelete itself, to avoid repeated configuration lookups.
* admin: fix file browser page size selector
Fix file browser pagination page-size selectors to use explicit select IDs instead of this.value in templ-generated handlers, which could resolve to undefined and produce limit=undefined in requests.
Add a focused template render regression test to prevent this from recurring.
Fixes#8284
* revert file browser template regression test
* fix ec.encode skipping volumes when one replica is on a full disk
This fixes issue #8218. Previously, ec.encode would skip a volume if ANY
of its replicas resided on a disk with low free volume count. Now it
accepts the volume if AT LEAST ONE replica is on a healthy disk.
* refine noFreeDisk counter logic in ec.encode
Ensure noFreeDisk is decremented if a volume initially marked as bad
is later found to have a healthy replica. This ensures accurate
summary statistics.
* defer noFreeDisk counting and refine logging in ec.encode
Updated logging to be replica-scoped and deferred noFreeDisk counting to
the final pass over vidMap. This ensures that the counter only reflects
volumes that are definitively excluded because all replicas are on full
disks.
* filter replicas by free space during ec.encode
Updated doEcEncode to filter out replicas on disks with
FreeVolumeCount < 2 before selecting the best replica for encoding.
This ensures that EC shards are not generated on healthy source
replicas that happen to be on disks with low free space.
* fix issue #8230: volume.fsck deletion logic to respect purgeAbsent flag
This commit fixes two issues in volume.fsck:
1. Missing chunks in existing volumes are now deleted if -reallyDeleteFilerEntries is set.
2. Missing volumes are now properly handled when a -volumeId filter is specified, allowing deletion of filer entries for those volumes.
* address PR feedback for issue #8230
- Ensure volume filter is applied before reporting missing volumes
- Fix potential nil-pointer dereferences in httpDelete method
- Use proper error checking throughout httpDelete
* address second round PR feedback for issue #8230
- Use fmt.Fprintf(c.writer, ...) instead of fmt.Printf
- Add missing newline in "deleting path" log message