seaweedfs

Chris Lu c4d642b8aa fix(ec): gather shards from all disk locations before rebuild (#8633)

* fix(ec): gather shards from all disk locations before rebuild (#8631)

  Fix "too few shards given" error during ec.rebuild on multi-disk volume servers. The root cause has two parts:

  1. VolumeEcShardsRebuild only looked at a single disk location for shard files. On multi-disk servers, the existing local shards could be on one disk while copied shards were placed on another, causing the rebuild to see fewer shards than actually available.
  2. VolumeEcShardsCopy had a DiskId condition (req.DiskId == 0 && len(vs.store.Locations) > 0) that was always true, making the FindFreeLocation fallback dead code. This meant copies always went to Locations[0] regardless of where existing shards were.

  Changes:
  - VolumeEcShardsRebuild now finds the location with the most shards, then gathers shard files from other locations via hard links (or symlinks for cross-device) before rebuilding. Gathered files are cleaned up after rebuild.
  - VolumeEcShardsCopy now only uses Locations[DiskId] when DiskId > 0 (explicitly set). Otherwise, it prefers the location that already has the EC volume, falling back to HDD then any free location.
  - generateMissingEcFiles now logs shard counts and provides a clear error message when not enough shards are found, instead of passing through to the opaque reedsolomon "too few shards given" error.

* fix(ec): update test to match skip behavior for unrepairable volumes

  The test expected an error for volumes with insufficient shards, but commit `5acb4578a` changed unrepairable volumes to be skipped with a log message instead of returning an error. Update the test to verify the skip behavior and log output.

* fix(ec): address PR review comments

  - Add comment clarifying DiskId=0 means "not specified" (protobuf default), callers must use DiskId >= 1 to target a specific disk.
  - Log warnings on cleanup failures for gathered shard links.

* fix(ec): read shard files from other disks directly instead of linking

  Replace the hard link / symlink gathering approach with passing additional search directories into RebuildEcFiles. The rebuild function now opens shard files directly from whichever disk they live on, avoiding filesystem link operations and cleanup. RebuildEcFiles and RebuildEcFilesWithContext gain a variadic additionalDirs parameter (backward compatible with existing callers).

* fix(ec): clarify DiskId selection semantics in VolumeEcShardsCopy comment

* fix(ec): avoid empty files on failed rebuild; don't skip ecx-only locations

  - generateMissingEcFiles: two-pass approach: first discover present/missing shards and check reconstructability, only then create output files. This avoids leaving behind empty truncated shard files when there are too few shards to rebuild.
  - VolumeEcShardsRebuild: compute hasEcx before skipping zero-shard locations. A location with an .ecx file but no shard files (all shards on other disks) is now a valid rebuild candidate instead of being silently skipped.

* fix(ec): select ecx-only location as rebuildLocation when none chosen yet

  When rebuildLocation is nil and a location has hasEcx=true but existingShardCount=0 (all shards on other disks), the condition 0 > 0 was false so it was never promoted to rebuildLocation. Add rebuildLocation == nil to the predicate so the first location with an .ecx file is always selected as a candidate.

2 days ago
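The selection predicate described in the last fix can be illustrated with a minimal Go sketch. This is not the SeaweedFS code: the `diskLocation` type, its fields, and `pickRebuildLocation` are hypothetical names invented for illustration, and the sketch assumes only locations holding an .ecx file are candidates. It shows why adding a nil check lets a location with an .ecx file but zero shards be chosen when nothing has been selected yet.

```go
package main

import "fmt"

// diskLocation is a hypothetical stand-in for one disk location on a
// volume server; it is not the real SeaweedFS type.
type diskLocation struct {
	Dir        string
	HasEcx     bool // an .ecx index file for the volume exists here
	ShardCount int  // number of shard files for the volume found here
}

// pickRebuildLocation returns the candidate location to rebuild in,
// or nil if no location has an .ecx file (an assumption of this sketch).
func pickRebuildLocation(locs []diskLocation) *diskLocation {
	var best *diskLocation
	for i := range locs {
		loc := &locs[i]
		if !loc.HasEcx {
			continue
		}
		// Without the best == nil check, an ecx-only location with
		// ShardCount == 0 would never win a "0 > 0" comparison.
		if best == nil || loc.ShardCount > best.ShardCount {
			best = loc
		}
	}
	return best
}

func main() {
	locs := []diskLocation{
		{Dir: "/data1", HasEcx: false, ShardCount: 10},
		{Dir: "/data2", HasEcx: true, ShardCount: 0},
	}
	if loc := pickRebuildLocation(locs); loc != nil {
		fmt.Println("rebuild in", loc.Dir)
	}
}
```

In this sketch, ties keep the first matching location, which matches the commit's wording that the first location with an .ecx file is selected as a candidate.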
command.go	refactor	1 year ago
command_cluster_check.go	chore: execute goimports to format the code (#7983)	2 months ago
command_cluster_ps.go	chore: execute goimports to format the code (#7983)	2 months ago
command_cluster_raft_add.go	chore: execute goimports to format the code (#7983)	2 months ago
command_cluster_raft_leader_transfer.go	chore: execute goimports to format the code (#7983)	2 months ago
command_cluster_raft_leader_transfer_test.go	chore: execute goimports to format the code (#7983)	2 months ago
command_cluster_raft_ps.go	chore: execute goimports to format the code (#7983)	2 months ago
command_cluster_raft_remove.go	chore: execute goimports to format the code (#7983)	2 months ago
command_cluster_status.go	Fix file stat collection metric bug for the `cluster.status` command. (#8302)	1 month ago
command_cluster_status_test.go	Update `cluster.status` to resolve file details on EC volumes. (#8268)	1 month ago
command_collection_delete.go	Unify the parameter to disable dry-run on weed shell commands to `-apply` (instead of `-force`). (#7450)	4 months ago
command_collection_list.go	chore: execute goimports to format the code (#7983)	2 months ago
command_ec_balance.go	add admin script worker (#8491)	2 weeks ago
command_ec_common.go	add admin script worker (#8491)	2 weeks ago
command_ec_common_avoid_test.go	Enhance EC balancing to separate parity and data shards (#8038)	2 months ago
command_ec_common_test.go	fix: EC rebalance fails with replica placement 000 (#7812)	3 months ago
command_ec_decode.go	Respect -minFreeSpace during ec.decode (#8467)	2 weeks ago
command_ec_encode.go	fix ec.encode skipping volumes when one replica is on a full disk (#8227)	1 month ago
command_ec_encode_test.go	add tests	1 month ago
command_ec_rebuild.go	Fix ec.rebuild failing on unrepairable volumes instead of skipping (#8632)	3 days ago
command_ec_rebuild_test.go	fix(ec): gather shards from all disk locations before rebuild (#8633)	2 days ago
command_ec_scrub.go	Implement local scrubbing for EC volumes. (#8283)	1 month ago
command_ec_test.go	Fix reporting of EC shard sizes from nodes to masters. (#7835)	3 months ago
command_fs_cat.go	Shell: Added a helper function `isHelpRequest()` (#7380)	5 months ago
command_fs_cd.go	Shell: Added a helper function `isHelpRequest()` (#7380)	5 months ago
command_fs_configure.go	Unify the parameter to disable dry-run on weed shell commands to `-apply` (instead of `-force`). (#7450)	4 months ago
command_fs_du.go	Shell: Added a helper function `isHelpRequest()` (#7380)	5 months ago
command_fs_log.go	chore: execute goimports to format the code (#7983)	2 months ago
command_fs_ls.go	Shell: Added a helper function `isHelpRequest()` (#7380)	5 months ago
command_fs_merge_volumes.go	fix(shell): show planned size in fs.mergeVolumes log to clarify size limit check (#8553)	6 days ago
command_fs_meta_cat.go	Shell: Added a helper function `isHelpRequest()` (#7380)	5 months ago
command_fs_meta_change_volume_id.go	Fix volume.fsck -forcePurging -reallyDeleteFromVolume to fail fast on filer traversal errors (#8015)	2 months ago
command_fs_meta_load.go	use ReadFull (#40) (#8240)	1 month ago
command_fs_meta_notify.go	Fix volume.fsck -forcePurging -reallyDeleteFromVolume to fail fast on filer traversal errors (#8015)	2 months ago
command_fs_meta_save.go	shell: fix potential deadlock in fs.meta.save BFS traversal	2 months ago
command_fs_mkdir.go	Shell: Added a helper function `isHelpRequest()` (#7380)	5 months ago
command_fs_mv.go	Shell: Added a helper function `isHelpRequest()` (#7380)	5 months ago
command_fs_pwd.go	Shell: Added a helper function `isHelpRequest()` (#7380)	5 months ago
command_fs_rm.go	Shell: Added a helper function `isHelpRequest()` (#7380)	5 months ago
command_fs_tree.go	Shell: Added a helper function `isHelpRequest()` (#7380)	5 months ago
command_fs_verify.go	shell: update fs.verify and volume.fsck for new BFS signature	2 months ago
command_lock_unlock.go	chore: execute goimports to format the code (#7983)	2 months ago
command_mount_configure.go	Clean up logs and deprecated functions (#7339)	5 months ago
command_mq_balance.go	chore: execute goimports to format the code (#7983)	2 months ago
command_mq_topic_compact.go	Add Kafka Gateway (#7231)	5 months ago
command_mq_topic_configure.go	chore: execute goimports to format the code (#7983)	2 months ago
command_mq_topic_desc.go	chore: execute goimports to format the code (#7983)	2 months ago
command_mq_topic_list.go	chore: execute goimports to format the code (#7983)	2 months ago
command_mq_topic_truncate.go	Message Queue: Add sql querying (#7185)	6 months ago
command_remote_cache.go	s3: fix remote object not caching (#7790)	3 months ago
command_remote_configure.go	add one example	3 months ago
command_remote_copy_local.go	fix(shell): set LastLocalSyncTsNs in remote.copy.local so remote.uncache works (#8604)	6 days ago
command_remote_meta_sync.go	Fix remote.meta.sync TTL issue (#8021) (#8030)	2 months ago
command_remote_mount.go	feat(filer): add lazy directory listing for remote mounts (#8615)	4 days ago
command_remote_mount_buckets.go	feat(remote.mount): add -metadataStrategy flag to control metadata caching (#8568)	5 days ago
command_remote_uncache.go	shell: add minCacheAge flag to remote.uncache command (#8225)	1 month ago
command_remote_uncache_test.go	shell: add minCacheAge flag to remote.uncache command (#8225)	1 month ago
command_remote_unmount.go	chore: execute goimports to format the code (#7983)	2 months ago
command_s3_bucket_create.go	feat(shell): add Object Lock management commands (#8141)	2 months ago
command_s3_bucket_delete.go	fix: admin UI bucket delete now properly deletes collection and checks Object Lock (#7734)	3 months ago
command_s3_bucket_list.go	shell: add -owner flag to s3.bucket.create command (#7728)	3 months ago
command_s3_bucket_lock.go	feat(shell): add Object Lock management commands (#8141)	2 months ago
command_s3_bucket_owner.go	shell: add -owner flag to s3.bucket.create command (#7728)	3 months ago
command_s3_bucket_quota.go	convert error fromating to %w everywhere (#6995)	8 months ago
command_s3_bucket_quota_check.go	Reduce memory allocations in hot paths (#7725)	3 months ago
command_s3_bucket_versioning.go	Chart createBuckets config #8368: Add TTL, Object Lock, and Versioning support (#8375)	3 weeks ago
command_s3_circuitbreaker.go	chore(deps): bump gocloud.dev from 0.40.0 to 0.41.0 (#6679)	12 months ago
command_s3_circuitbreaker_test.go	refactor(shell): readability improvements (#3704)	4 years ago
command_s3_clean_uploads.go	convert error fromating to %w everywhere (#6995)	8 months ago
command_s3_configure.go	fix(iam): ensure access key status is persisted and defaulted to Active (#8341)	1 month ago
command_s3_policy.go	IAM Policy Management via gRPC (#8109)	2 months ago
command_s3tables_bucket.go	Add s3tables shell and admin UI (#8172)	2 months ago
command_s3tables_namespace.go	Add s3tables shell and admin UI (#8172)	2 months ago
command_s3tables_table.go	Add s3tables shell and admin UI (#8172)	2 months ago
command_s3tables_tag.go	Add s3tables shell and admin UI (#8172)	2 months ago
command_sleep.go	add admin script worker (#8491)	2 weeks ago
command_volume_balance.go	[shell]: volume balance capacity by min volume density (#8026)	4 weeks ago
command_volume_balance_test.go	Shell: support regular expression for collection selection (#7158)	7 months ago
command_volume_check_disk.go	Fix handling of fixed read-only volumes for `volume.check.disk`. (#7612)	3 months ago
command_volume_check_disk_test.go	Mutex command output writes for `volume.check.disk`. (#7605)	3 months ago
command_volume_configure_replication.go	Fix #8040: Support '_default' keyword in collectionPattern to match default collection (#8046)	2 months ago
command_volume_copy.go	Fix live volume move tail timestamp (#8440)	3 weeks ago
command_volume_delete.go	chore: execute goimports to format the code (#7983)	2 months ago
command_volume_delete_empty.go	Unify the parameter to disable dry-run on weed shell commands to `-apply` (instead of `-force`). (#7450)	4 months ago
command_volume_fix_replication.go	Fix #8040: Support '_default' keyword in collectionPattern to match default collection (#8046)	2 months ago
command_volume_fix_replication_test.go	Fix/copy before delete replication (#6064)	1 year ago
command_volume_fsck.go	Fix volume.fsck crashing on EC volumes and add multi-volume vacuum support (#8406)	3 weeks ago
command_volume_grow.go	chore: execute goimports to format the code (#7983)	2 months ago
command_volume_list.go	Fix inconsistent TTL reporting in volume.list #8158 (#8164)	2 months ago
command_volume_list_test.go	opt: reduce ShardsInfo memory usage with bitmap and sorted slice (#7974)	2 months ago
command_volume_mark.go	refactor	1 year ago
command_volume_merge.go	Adds volume.merge command with deduplication and disk-based backend (#8441)	3 weeks ago
command_volume_merge_test.go	go fmt	3 weeks ago
command_volume_mount.go	chore: execute goimports to format the code (#7983)	2 months ago
command_volume_move.go	Fix live volume move tail timestamp (#8440)	3 weeks ago
command_volume_replica_check.go	fix: sync replica entries before ec.encode and volume.tier.move (#7798)	3 months ago
command_volume_replica_check_test.go	chore: execute goimports to format the code (#7983)	2 months ago
command_volume_scrub.go	Implement full scrubbing for regular volumes (#8254)	1 month ago
command_volume_server_evacuate.go	Enhance EC balancing to separate parity and data shards (#8038)	2 months ago
command_volume_server_evacuate_test.go	fix tests	3 years ago
command_volume_server_leave.go	Unify the parameter to disable dry-run on weed shell commands to `-apply` (instead of `-force`). (#7450)	4 months ago
command_volume_server_state.go	Add weed shell command `volumeServer.state` to query/update volume server state settings. (#8271)	1 month ago
command_volume_tier_download.go	Shell: support regular expression for collection selection (#7158)	7 months ago
command_volume_tier_move.go	Fix #8040: Support '_default' keyword in collectionPattern to match default collection (#8046)	2 months ago
command_volume_tier_upload.go	Shell: support regular expression for collection selection (#7158)	7 months ago
command_volume_unmount.go	chore: execute goimports to format the code (#7983)	2 months ago
command_volume_vacuum.go	Fix volume.fsck crashing on EC volumes and add multi-volume vacuum support (#8406)	3 weeks ago
command_volume_vacuum_disable.go	refactor	1 year ago
command_volume_vacuum_enable.go	refactor	1 year ago
commands.go	add admin script worker (#8491)	2 weeks ago
common.go	Fix #8040: Support '_default' keyword in collectionPattern to match default collection (#8046)	2 months ago
common_test.go	Humanize output for `weed.server` by default (#7758)	3 months ago
ec_proportional_rebalance.go	Fix reporting of EC shard sizes from nodes to masters. (#7835)	3 months ago
ec_proportional_rebalance_test.go	fix: EC rebalance fails with replica placement 000 (#7812)	3 months ago
ec_rebalance_slots_test.go	opt: reduce ShardsInfo memory usage with bitmap and sorted slice (#7974)	2 months ago
s3tables_helpers.go	Add s3tables shell and admin UI (#8172)	2 months ago
shell_liner.go	shell: allow spaces in arguments via quoting (#8157) (#8165)	2 months ago
shell_liner_test.go	shell: allow spaces in arguments via quoting (#8157) (#8165)	2 months ago
volume.ecshards.txt	Humanize output for `weed.server` by default (#7758)	3 months ago
volume.list.txt	Humanize output for `weed.server` by default (#7758)	3 months ago
volume.list2.txt	Humanize output for `weed.server` by default (#7758)	3 months ago