* Add multi-partition-spec compaction and delete-aware compaction (Phase 3)
Multi-partition-spec compaction:
- Add SpecID to compactionBin struct and group by spec+partition key
- Remove the len(specIDs) > 1 skip that blocked spec-evolved tables
- Write per-spec manifests in compaction commit using specByID map
- Use per-bin PartitionSpec when calling NewDataFileBuilder
Delete-aware compaction:
- Add ApplyDeletes config (default: true) with readBoolConfig helper
- Implement position delete collection (file_path + pos Parquet columns)
- Implement equality delete collection (field ID to column mapping)
- Update mergeParquetFiles to filter rows via position deletes (binary
search) and equality deletes (hash set lookup)
- Smart delete manifest carry-forward: drop when all data files compacted
- Fix EXISTING/DELETED entries to include sequence numbers
Add tests for multi-spec bins, delete collection, merge filtering, and
end-to-end compaction with position/equality/mixed deletes.
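A minimal sketch of the row filtering described above, with all names
hypothetical: per-file position deletes are kept sorted so membership is
a binary search, while equality-deleted rows are matched against a hash
set of composite keys.

```go
package sketch

import "slices"

// isDeleted reports whether the row at position pos in the given data
// file should be dropped. posDeletes maps a normalized file path to its
// sorted delete positions; eqDeleted holds composite equality keys.
func isDeleted(filePath string, pos int64, rowKey string,
	posDeletes map[string][]int64, eqDeleted map[string]struct{}) bool {
	if positions := posDeletes[filePath]; len(positions) > 0 {
		if _, found := slices.BinarySearch(positions, pos); found {
			return true
		}
	}
	_, hit := eqDeleted[rowKey]
	return hit
}
```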
* Add structured metrics and per-bin progress to iceberg maintenance
- Change return type of all four operations from (string, error) to
(string, map[string]int64, error) with structured metric counts
(files_merged, snapshots_expired, orphans_removed, duration_ms, etc.)
- Add onProgress callback to compactDataFiles for per-bin progress
- In Execute, pass progress callback that sends JobProgressUpdate with
per-bin stage messages
- Accumulate per-operation metrics with dot-prefixed keys
(e.g. compact.files_merged) into OutputValues on completion
- Update testing_api.go wrappers and integration test call sites
- Add tests: TestCompactDataFilesMetrics, TestExpireSnapshotsMetrics,
TestExecuteCompletionOutputValues
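A rough sketch of the new shape, with names approximated from the
bullets above: each operation returns a metrics map next to its summary
string, and completion folds the maps into OutputValues under
dot-prefixed keys.

```go
package sketch

// operation approximates the changed signature:
// (summary, metrics, error) instead of (summary, error).
type operation func() (string, map[string]int64, error)

// progressFunc approximates the per-bin callback Execute passes to
// compactDataFiles.
type progressFunc func(stage string, binsDone, binsTotal int)

// accumulate copies per-operation metrics into OutputValues with
// dot-prefixed keys, e.g. "compact.files_merged".
func accumulate(outputValues map[string]int64, opName string, metrics map[string]int64) {
	for k, v := range metrics {
		outputValues[opName+"."+k] = v
	}
}
```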
* Address review feedback: group equality deletes by field IDs, use metric constants
- Group equality deletes by distinct equality_ids sets so different
delete files with different equality columns are handled correctly
- Use length-prefixed type-aware encoding in buildEqualityKey to avoid
ambiguity between types and collisions from null bytes
- Extract metric key strings into package-level constants
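A sketch of the grouping idea, assuming a hypothetical groupKey helper:
sort each delete file's equality_ids so the same column set always maps
to the same group key, regardless of field-ID order.

```go
package sketch

import (
	"fmt"
	"sort"
)

// groupKey canonicalizes a delete file's equality_ids; delete files
// with the same column set share a group. Illustrative only.
func groupKey(equalityIDs []int) string {
	ids := append([]int(nil), equalityIDs...)
	sort.Ints(ids)
	return fmt.Sprint(ids) // e.g. "[1 3 7]"
}
```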
* Fix buildEqualityKey to use length-prefixed type-aware encoding
The previous implementation used plain String() concatenation with null
byte separators, which caused type ambiguity (int 123 vs string "123")
and separator collisions when values contain null bytes. Now each value
is serialized as "kind:length:value" for unambiguous composite keys.
This fix was missed in the prior cherry-pick due to a merge conflict.
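A minimal illustration of the scheme (helper name hypothetical):

```go
package sketch

import "fmt"

// encodeComponent renders one value as "kind:length:value": int 123
// becomes "int:3:123" and string "123" becomes "string:3:123", so the
// types cannot collide, and the explicit length bounds each component
// even when the value contains null bytes.
func encodeComponent(kind, value string) string {
	return fmt.Sprintf("%s:%d:%s", kind, len(value), value)
}
```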
* Address nitpick review comments
- Document patchManifestContentToDeletes workaround: explain that
iceberg-go WriteManifest cannot create delete manifests, and note
the fail-fast validation on pattern match
- Document makeTestEntries: note that specID field is ignored and
callers should use makeTestEntriesWithSpec for multi-spec testing
* fmt
* Fix path normalization, manifest threshold, and artifact filename collisions
- Normalize file paths in position delete collection and lookup so that
absolute S3 URLs and relative paths match correctly
- Fix rewriteManifests threshold check to count only data manifests
(was including delete manifests in the count and metric)
- Add random suffix to artifact filenames in compactDataFiles and
rewriteManifests to prevent collisions between concurrent runs
- Sort compaction bins by SpecID then PartitionKey for deterministic
ordering across specs
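The normalization idea behind the first bullet, sketched with a
hypothetical helper; the actual code may differ in detail.

```go
package sketch

import "strings"

// normalizePath reduces "s3://bucket/a/b.parquet" and "/a/b.parquet"
// to the same comparable form ("a/b.parquet") before map lookups.
func normalizePath(p string) string {
	if i := strings.Index(p, "://"); i >= 0 {
		p = p[i+3:] // drop the scheme
		if j := strings.IndexByte(p, '/'); j >= 0 {
			p = p[j+1:] // drop the bucket segment
		}
	}
	return strings.TrimPrefix(p, "/")
}
```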
* Fix pos delete read, deduplicate column resolution, minor cleanups
- Remove broken Column() guard in position delete reading that silently
defaulted pos to 0; unconditionally extract Int64() instead
- Deduplicate column resolution in readEqualityDeleteFile by calling
resolveEqualityColIndices instead of inlining the same logic
- Add warning log in readBoolConfig for unrecognized string values
- Fix CompactDataFiles call site in integration test to capture 3 return
values
* Advance progress on all bins, deterministic manifest order, assert metrics
- Call onProgress for every bin iteration including skipped/failed bins
so progress reporting never appears stalled
- Sort spec IDs before iterating specEntriesMap to produce deterministic
manifest list ordering across runs
- Assert expected metric keys in CompactDataFiles integration test
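The sorted-keys pattern behind the deterministic manifest ordering,
sketched with approximated types: Go randomizes map iteration order, so
the keys are collected and sorted before the walk.

```go
package sketch

import "sort"

// sortedSpecIDs returns the spec IDs of specEntriesMap in ascending
// order so every run emits manifests in the same sequence.
func sortedSpecIDs(specEntriesMap map[int][]string) []int {
	ids := make([]int, 0, len(specEntriesMap))
	for id := range specEntriesMap {
		ids = append(ids, id)
	}
	sort.Ints(ids)
	return ids
}
```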
---------
Co-authored-by: Copilot <copilot@github.com>
When S3StorageClass is empty (the default), aws.String("") was passed
as the StorageClass in PutObject requests. While AWS S3 treats this as
"use default," S3-compatible providers (e.g. SharkTech) reject it with
InvalidStorageClass. Only set StorageClass when a non-empty value is
configured, letting the provider use its default.
Fixes #8644
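A sketch of the fix in AWS SDK for Go v1 style (which the aws.String
reference above suggests); surrounding names are illustrative.

```go
package sketch

import (
	"io"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/s3"
)

// buildPutObjectInput leaves StorageClass unset unless a value is
// configured, so providers that reject "" use their own default.
func buildPutObjectInput(bucket, key, storageClass string, body io.ReadSeeker) *s3.PutObjectInput {
	input := &s3.PutObjectInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(key),
		Body:   body,
	}
	if storageClass != "" {
		input.StorageClass = aws.String(storageClass)
	}
	return input
}
```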
Change iceberg target_file_size config from bytes to MB
Rename the config field from target_file_size_bytes to
target_file_size_mb with a default of 256 (MB). The value is
converted to bytes internally. This makes the config more
user-friendly — entering 256 is clearer than 268435456.
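The conversion, sketched (function name hypothetical):

```go
package sketch

// The config stores megabytes; the handler expands it to bytes once at
// read time, falling back to the documented default of 256 MB.
const defaultTargetFileSizeMB int64 = 256

func targetFileSizeBytes(targetFileSizeMB int64) int64 {
	if targetFileSizeMB <= 0 {
		targetFileSizeMB = defaultTargetFileSizeMB
	}
	return targetFileSizeMB * 1024 * 1024 // 256 -> 268435456
}
```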
Co-authored-by: Copilot <copilot@github.com>
* Add iceberg_maintenance plugin worker handler (Phase 1)
Implement automated Iceberg table maintenance as a new plugin worker job
type. The handler scans S3 table buckets for tables needing maintenance
and executes operations in the correct Iceberg order: expire snapshots,
remove orphan files, and rewrite manifests.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add data file compaction to iceberg maintenance handler (Phase 2)
Implement bin-packing compaction for small Parquet data files:
- Enumerate data files from manifests, group by partition
- Merge small files using parquet-go (read rows, write merged output)
- Create new manifest with ADDED/DELETED/EXISTING entries
- Commit new snapshot with compaction metadata
Add 'compact' operation to maintenance order (runs before expire_snapshots),
configurable via target_file_size_bytes and min_input_files thresholds.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
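A bin-packing sketch consistent with the description above; the types,
names, and exact threshold handling are illustrative, not the actual
implementation.

```go
package sketch

type dataFile struct {
	PartitionKey string
	Size         int64
}

// packBins groups data files by partition key, then packs each group
// into bins that stay under targetSize; groups with fewer than
// minInputFiles files are skipped as not worth compacting.
func packBins(files []dataFile, targetSize int64, minInputFiles int) [][]dataFile {
	byPartition := make(map[string][]dataFile)
	for _, f := range files {
		byPartition[f.PartitionKey] = append(byPartition[f.PartitionKey], f)
	}
	var bins [][]dataFile
	for _, group := range byPartition {
		if len(group) < minInputFiles {
			continue
		}
		var cur []dataFile
		var curSize int64
		for _, f := range group {
			if len(cur) > 0 && curSize+f.Size > targetSize {
				bins = append(bins, cur)
				cur, curSize = nil, 0
			}
			cur = append(cur, f)
			curSize += f.Size
		}
		if len(cur) > 0 {
			bins = append(bins, cur)
		}
	}
	return bins
}
```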
* Fix memory exhaustion in mergeParquetFiles by processing files sequentially
Previously all source Parquet files were loaded into memory simultaneously,
risking OOM when a compaction bin contained many small files. Now each file
is loaded, its rows are streamed into the output writer, and its data is
released before the next file is loaded — keeping peak memory proportional
to one input file plus the output buffer.
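The lifecycle, sketched over stub interfaces that stand in for the
parquet-go reader and writer; the point is when files are opened and
released, not the real API.

```go
package sketch

import "io"

type rowSource interface {
	io.Closer
	Next() (any, error) // io.EOF once the file is drained
}

type rowSink interface{ Write(row any) error }

// mergeSequentially opens, streams, and closes one source at a time,
// keeping peak memory near a single input file plus the output buffer.
func mergeSequentially(open func(path string) (rowSource, error), paths []string, out rowSink) error {
	for _, p := range paths {
		src, err := open(p)
		if err != nil {
			return err
		}
		err = drain(src, out)
		src.Close() // release this file before loading the next
		if err != nil {
			return err
		}
	}
	return nil
}

func drain(src rowSource, out rowSink) error {
	for {
		row, err := src.Next()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		if err := out.Write(row); err != nil {
			return err
		}
	}
}
```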
* Validate bucket/namespace/table names against path traversal
Reject names containing '..', '/', or '\' in Execute to prevent
directory traversal via crafted job parameters.
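A minimal validation sketch (helper name hypothetical):

```go
package sketch

import (
	"fmt"
	"strings"
)

// validateName rejects any bucket/namespace/table name that could
// escape its directory when spliced into a path.
func validateName(name string) error {
	if strings.Contains(name, "..") || strings.ContainsAny(name, `/\`) {
		return fmt.Errorf("invalid name %q: path traversal characters", name)
	}
	return nil
}
```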
* Add filer address failover in iceberg maintenance handler
Try each filer address from cluster context in order instead of only
using the first one. This improves resilience when the primary filer
is temporarily unreachable.
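The failover loop, sketched with a stand-in for the real gRPC call:

```go
package sketch

import "fmt"

// withFilerFailover tries each filer address in order and returns on
// the first success.
func withFilerFailover(addresses []string, tryFiler func(addr string) error) error {
	if len(addresses) == 0 {
		return fmt.Errorf("no filer addresses available")
	}
	var lastErr error
	for _, addr := range addresses {
		if lastErr = tryFiler(addr); lastErr == nil {
			return nil
		}
	}
	return fmt.Errorf("all filer addresses failed: %w", lastErr)
}
```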
* Add separate MinManifestsToRewrite config for manifest rewrite threshold
The rewrite_manifests operation was reusing MinInputFiles (meant for
compaction bin file counts) as its manifest count threshold. Add a
dedicated MinManifestsToRewrite field with its own config UI section
and default value (5) so the two thresholds can be tuned independently.
* Fix risky mtime fallback in orphan removal that could delete new files
When entry.Attributes is nil, mtime defaulted to Unix epoch (1970),
which would always be older than the safety threshold, causing the
file to be treated as eligible for deletion. Skip entries with nil
Attributes instead, matching the safer logic in operations.go.
* Fix undefined function references in iceberg_maintenance_handler.go
Use the exported function names (ShouldSkipDetectionByInterval,
BuildDetectorActivity, BuildExecutorActivity) matching their
definitions in vacuum_handler.go.
* Remove duplicated iceberg maintenance handler in favor of iceberg/ subpackage
The IcebergMaintenanceHandler and its compaction code in the parent
pluginworker package duplicated the logic already present in the
iceberg/ subpackage (which self-registers via init()). The old code
lacked stale-plan guards, proper path normalization, CAS-based xattr
updates, and error-returning parseOperations.
Since the registry pattern (default "all") makes the old handler
unreachable, remove it entirely. All functionality is provided by
iceberg.Handler with the reviewed improvements.
* Fix MinManifestsToRewrite clamping to match UI minimum of 2
The clamp reset values below 2 to the default of 5, contradicting the
UI's advertised MinValue of 2. Clamp to 2 instead.
* Sort entries by size descending in splitOversizedBin for better packing
Entries were processed in insertion order, which is non-deterministic
because it derives from map iteration. Sorting largest-first before the
splitting loop improves bin-packing efficiency by filling bins more evenly.
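The ordering step, sketched with an illustrative entry type:

```go
package sketch

import "sort"

type binEntry struct{ Size int64 }

// sortBySizeDesc orders entries largest-first before the splitting
// loop, replacing the arbitrary order left over from map iteration.
func sortBySizeDesc(entries []binEntry) {
	sort.Slice(entries, func(i, j int) bool {
		return entries[i].Size > entries[j].Size
	})
}
```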
* Add context cancellation check to drainReader loop
The row-streaming loop in drainReader did not check ctx between
iterations, making long compaction merges uncancellable. Check
ctx.Done() at the top of each iteration.
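The cancellation pattern, sketched with stand-in row functions:

```go
package sketch

import "context"

// drainWithCancel polls ctx at the top of every iteration so a
// long-running merge aborts promptly between rows.
func drainWithCancel(ctx context.Context, nextRow func() (any, bool), write func(any) error) error {
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		default:
		}
		row, ok := nextRow()
		if !ok {
			return nil
		}
		if err := write(row); err != nil {
			return err
		}
	}
}
```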
* Fix splitOversizedBin to always respect targetSize limit
The minFiles check in the split condition allowed bins to grow past
targetSize when they had fewer than minFiles entries, defeating the
OOM protection. Now bins always split at targetSize, and a trailing
runt with fewer than minFiles entries is merged into the previous bin.
* Add integration tests for iceberg table maintenance plugin worker
Tests start a real weed mini cluster, create S3 buckets and Iceberg
table metadata via filer gRPC, then exercise the iceberg.Handler
operations (ExpireSnapshots, RemoveOrphans, RewriteManifests) against
the live filer. A full maintenance cycle test runs all operations in
sequence and verifies metadata consistency.
Also adds exported method wrappers (testing_api.go) so the integration
test package can call the unexported handler methods.
* Fix splitOversizedBin dropping files and add source path to drainReader errors
The runt-merge step could leave leading bins with fewer than minFiles
entries (e.g. [80,80,10,10] with targetSize=100, minFiles=2 would drop
the first 80-byte file). Replace the filter-based approach with an
iterative merge that folds any sub-minFiles bin into its smallest
neighbor, preserving all eligible files.
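A sketch of the iterative merge, assuming bins hold file sizes and
reading "smallest neighbor" as the adjacent bin with the smaller total;
the real code may differ.

```go
package sketch

// mergeRunts repeatedly folds any bin with fewer than minFiles entries
// into its smaller adjacent neighbor, so every eligible file stays in
// some bin. Each merge shrinks the slice by one, so the loop terminates.
func mergeRunts(bins [][]int64, minFiles int) [][]int64 {
	for len(bins) > 1 {
		runt := -1
		for i, b := range bins {
			if len(b) < minFiles {
				runt = i
				break
			}
		}
		if runt < 0 {
			break // no undersized bins remain
		}
		target := runt - 1
		if runt == 0 || (runt+1 < len(bins) && total(bins[runt+1]) < total(bins[runt-1])) {
			target = runt + 1
		}
		bins[target] = append(bins[target], bins[runt]...)
		bins = append(bins[:runt], bins[runt+1:]...)
	}
	return bins
}

func total(sizes []int64) (sum int64) {
	for _, s := range sizes {
		sum += s
	}
	return
}
```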
Also add the source file path to drainReader error messages so callers
can identify which Parquet file caused a read/write failure.
* Harden integration test error handling
- s3put: fail immediately on HTTP 4xx/5xx instead of logging and
continuing
- lookupEntry: distinguish NotFound (return nil) from unexpected RPC
errors (fail the test)
- writeOrphan and orphan creation in FullMaintenanceCycle: check
CreateEntryResponse.Error in addition to the RPC error
* go fmt
---------
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>