Go's ReadTTL calls fitTtlCount which normalizes to the coarsest unit
that fits: 7 days = 1 week, so "7d" becomes {Count:1, Unit:Week}
which displays as "1w". Both Go and Rust normalize identically.
Rename the Rust volume server binary to weed-volume for consistency
with the Go weed binary naming. The library crate name remains
seaweed_volume to avoid changing all internal imports.
Go's request_id.New() generates "%X%08X" (timestamp hex + random hex),
not a UUID. Update Rust to use the same format, and fix the test
assertion which expected UUID format (>= 32 chars with hyphens).
Go's flag package uses single dash (-port) while clap uses double dash
(--port). Add normalize_args_vec() to convert single-dash long options
to double-dash before clap parsing, so both formats work. Update test
framework to use single-dash flags matching Go convention.
- Add StreamingBody (http_body::Body) for chunked reads of files >1MB,
avoiding OOM by reading 64KB at a time via spawn_blocking + pread
- Add NeedleStreamInfo and meta-only read path to avoid loading full
needle body when streaming
- Add RustMultiVolumeCluster test framework and MultiCluster interface
so TestVolumeMoveHandlesInFlightWrites works with Rust volume servers
- Remove TestVolumeMoveHandlesInFlightWrites from CI skip list (now passes)
- All 117 Go integration tests + 8 Rust integration tests pass (100%)
- Add remote_storage module with RemoteStorageClient trait
- Implement S3RemoteStorageClient using aws-sdk-s3 (covers all S3-compatible
providers: AWS, Wasabi, Backblaze, Aliyun, etc.)
- FetchAndWriteNeedle now fetches data from S3, writes locally as needle,
and replicates to peers
- Add 3 integration tests using weed mini as S3 backend:
- Full round-trip fetch from S3
- Byte-range (partial) read from S3
- Error handling for non-existent S3 objects
- Add TestCluster interface in framework/cluster_interface.go
- Add StartVolumeCluster() factory that selects Go or Rust volume server
based on VOLUME_SERVER_IMPL env var
- Update all grpc/ and http/ test files to use StartVolumeCluster()
instead of StartSingleVolumeCluster(), enabling the same test suite
to run against either implementation
- cluster_rust.go: test framework to start Go master + Rust volume server
- test/volume_server/rust/: 8 integration tests (healthz, status, ping,
write/read/delete round-trip, volume lifecycle, get/set state,
server status, metrics endpoint)
- rust-volume-server-tests.yml: CI workflow with Rust unit tests and
Go+Rust integration tests
* request_id: add shared request middleware
* s3err: preserve request ids in responses and logs
* iam: reuse request ids in XML responses
* sts: reuse request ids in XML responses
* request_id: drop legacy header fallback
* request_id: use AWS-style request id format
* iam: fix AWS-compatible XML format for ErrorResponse and field ordering
- ErrorResponse uses bare <RequestId> at root level instead of
<ResponseMetadata> wrapper, matching the AWS IAM error response spec
- Move CommonResponse to last field in success response structs so
<ResponseMetadata> serializes after result elements
- Add randomness to request ID generation to avoid collisions
- Add tests for XML ordering and ErrorResponse format
* iam: remove duplicate error_response_test.go
Test is already covered by responses_test.go.
* address PR review comments
- Guard against typed nil pointers in SetResponseRequestID before
interface assertion (CodeRabbit)
- Use regexp instead of strings.Index in test helpers for extracting
request IDs (Gemini)
* request_id: prevent spoofing, fix nil-error branch, thread reqID to error writers
- Ensure() now always generates a server-side ID, ignoring client-sent
x-amz-request-id headers to prevent request ID spoofing. Uses a
private context key (contextKey{}) instead of the header string.
- writeIamErrorResponse in both iamapi and embedded IAM now accepts
reqID as a parameter instead of calling Ensure() internally, ensuring
a single request ID per request lifecycle.
- The nil-iamError branch in writeIamErrorResponse now writes a 500
Internal Server Error response instead of returning silently.
- Updated tests to set request IDs via context (not headers) and added
tests for spoofing prevention and context reuse.
* sts: add request-id consistency assertions to ActionInBody tests
* test: update admin test to expect server-generated request IDs
The test previously sent a client x-amz-request-id header and expected
it echoed back. Since Ensure() now ignores client headers to prevent
spoofing, update the test to verify the server returns a non-empty
server-generated request ID instead.
* iam: add generic WithRequestID helper alongside reflection-based fallback
Add WithRequestID[T] that uses generics to take the address of a value
type, satisfying the pointer receiver on SetRequestId without reflection.
The existing SetResponseRequestID is kept for the two call sites that
operate on interface{} (from large action switches where the concrete
type varies at runtime). Generics cannot replace reflection there since
Go cannot infer type parameters from interface{}.
* Remove reflection and generics from request ID setting
Call SetRequestId directly on concrete response types in each switch
branch before boxing into interface{}, eliminating the need for
WithRequestID (generics) and SetResponseRequestID (reflection).
* iam: return pointer responses in action dispatch
* Fix IAM error handling consistency and ensure request IDs on all responses
- UpdateUser/CreatePolicy error branches: use writeIamErrorResponse instead
of s3err.WriteErrorResponse to preserve IAM formatting and request ID
- ExecuteAction: accept reqID parameter and generate one if empty, ensuring
every response carries a RequestId regardless of caller
* Clean up inline policies on DeleteUser and UpdateUser rename
DeleteUser: remove InlinePolicies[userName] from policy storage before
removing the identity, so policies are not orphaned.
UpdateUser: move InlinePolicies[userName] to InlinePolicies[newUserName]
when renaming, so GetUserPolicy/DeleteUserPolicy work under the new name.
Both operations persist the updated policies and return an error if
the storage write fails, preventing partial state.
* Make EC detection context aware
* Update register.go
* Speed up EC detection planning
* Add tests for EC detection planner
* optimizations
detection.go: extracted ParseCollectionFilter (exported) and feed it into the detection loop so both detection and tracing share the same parsing/whitelisting logic; the detection loop now iterates on a sorted list of volume IDs, checks the context at every iteration, and only sets hasMore when there are still unprocessed groups after hitting maxResults, keeping runtime bounded while still scheduling planned tasks before returning the results.
erasure_coding_handler.go: dropped the duplicated inline filter parsing in emitErasureCodingDetectionDecisionTrace and now reuse erasurecodingtask.ParseCollectionFilter, and the summary suffix logic now only accounts for the hasMore case that can actually happen.
detection_test.go: updated the helper topology builder to use master_pb.VolumeInformationMessage (matching the current protobuf types) and tightened the cancellation/max-results tests so they reliably exercise the detection logic (cancel before calling Detection, and provide enough disks so one result is produced before the limit).
* use working directory
* fix compilation
* fix compilation
* rename
* go vet
* fix getenv
* address comments, fix error
* Enhance volume.merge command with deduplication and disk-based backend
* Fix copyVolume function call with correct argument order and missing bool parameter
* Revert "Fix copyVolume function call with correct argument order and missing bool parameter"
This reverts commit 7b4a190643.
* Fix critical issues: per-replica writable tracking, tail goroutine cancellation via done channel, and debug logging for allocation failures
* Optimize memory usage with watermark approach for duplicate detection
* Fix critical issues: swap copyVolume arguments, increase idle timeout, remove file double-close, use glog for logging
* Replace temporary file with in-memory buffer for needle blob serialization
* test(volume.merge): Add comprehensive unit and integration tests
Add 7 unit tests covering:
- Ordering by timestamp
- Cross-stream duplicate deduplication
- Empty stream handling
- Complex multi-stream deduplication
- Single stream passthrough
- Large needle ID support
- LastModified fallback when timestamp unavailable
Add 2 integration validation tests:
- TestMergeWorkflowValidation: Documents 9-stage merge workflow
- TestMergeEdgeCaseHandling: Validates 10 edge case handling
All tests passing (9/9)
* fix(volume.merge): Use time window for deduplication to handle clock skew
The same needle ID can have different timestamps on different servers due to
clock skew and replication lag. Needles with the same ID within a 5-second
time window are now treated as duplicates (same write with timestamp variance).
Key changes:
- Add mergeDeduplicationWindowNs constant (5 seconds)
- Replace exact timestamp matching with time window comparison
- Use windowInitialized flag to properly detect window transitions
- Add TestMergeNeedleStreamsTimeWindowDeduplication test
This ensures that replicated writes with slight timestamp differences are
properly deduplicated during merge, while separate updates to the same file
ID (outside the window) are preserved.
All tests passing (10/10)
* test: Add volume.merge integration tests with 5 comprehensive test cases
* test: integration tests for volume.merge command
* Fix integration tests: use TripleVolumeCluster for volume.merge testing
- Created new TripleVolumeCluster framework (cluster_triple.go) with 3 volume servers
- Rebuilt weed binary with volume.merge command compiled in
- Updated all 5 integration tests to use TripleVolumeCluster instead of DualVolumeCluster
- Tests now properly allocate volumes on 2 servers and let merge allocate on 3rd
- All 5 integration tests now pass:
- TestVolumeMergeBasic
- TestVolumeMergeReadonly
- TestVolumeMergeRestore
- TestVolumeMergeTailNeedles
- TestVolumeMergeDivergentReplicas
* Refactor test framework: use parameterized server count instead of hardcoded
- Renamed TripleVolumeCluster to MultiVolumeCluster with serverCount parameter
- Replaced hardcoded volumePort0/1/2 with slices for flexible server count
- Updated StartTripleVolumeCluster as backward-compatible wrapper calling StartMultiVolumeCluster(t, profile, 3)
- Made directory creation, port allocation, and server startup loop-based
- Updated accessor methods (VolumeAdminAddress, VolumeGRPCAddress, etc.) to support any server count
- All 5 integration tests continue to pass with new parameterized cluster framework
- Enables future testing with 2, 4, 5+ volume servers by calling StartMultiVolumeCluster directly
* Consolidate cluster frameworks: StartDualVolumeCluster now uses MultiVolumeCluster
- Made DualVolumeCluster a type alias for MultiVolumeCluster
- Updated StartDualVolumeCluster to call StartMultiVolumeCluster(t, profile, 2)
- Removed duplicate code from cluster_dual.go (now just 17 lines)
- All existing tests using StartDualVolumeCluster continue to work without changes
- Backward compatible: existing code continues to use the old function signatures
- Added wrapper functions in cluster_multi.go for StartTripleVolumeCluster
- Enables unified cluster management across all test suites
* Address PR review comments: improve error handling and clean up code
- Replace parse error swallow with proper error return
- Log cleanup and restoration errors instead of silently discarding them
- Remove unused offset field from memoryBackendFile struct
- Fix WriteAt buffer truncation bug to preserve trailing bytes
- All unit tests passing (10/10)
- Code compiles successfully
* Fix PR review findings: test improvements and code quality
- Add timeout to runWeedShell to prevent hanging
- Add server 1 readonly status verification in tests
- Assert merge fails when replicas writable (not just log output)
- Replace sleep with polling for writable restoration check
- Fix WriteAt stale data snapshot bug in memoryBackendFile
- Fix startVolume error logging to show current server log
- Fix volumePubPorts double assignment in port allocation
- Rename test to reflect behavior: DoesNotDeduplicateAcrossWindows
- Fix misleading dedup window comment
Unit tests: 10/10 passing
Binary: Compiles successfully
* Fix test assumption: merge command marks volumes readonly automatically
TestVolumeMergeReadonly was expecting merge to fail on writable volumes, but the
merge command is designed to mark volumes readonly as part of its operation. Fixed
test to verify merge succeeds on writable volumes and properly restores writable
state afterward. Removed redundant Test 2 code that duplicated the new behavior.
* fmt
* Fix deduplication logic to correctly handle same-stream vs cross-stream duplicates
The dedup map previously used only NeedleId as key, causing same-stream
overwrites to be incorrectly skipped as duplicates. Changed to track which
stream first processed each needle ID in the current window:
- Cross-stream duplicates (same ID from different streams, within window) are skipped
- Same-stream duplicates (overwrites from same stream) are kept
- Map now stores: needleId -> streamIndex of first occurrence in window
Added TestMergeNeedleStreamsSameStreamDuplicates to verify same-stream
overwrites are preserved while cross-stream duplicates are skipped.
All unit tests passing (11/11)
Binary compiles successfully
* Fix master leader election startup issue
Fixes #error-log-leader-not-selected-yet
* not useful test
* fix(iam): ensure access key status is persisted and defaulted to Active
* make pb
* update tests
* using constants