* check for nil needle map before compaction sync
When CommitCompact runs concurrently, it sets v.nm = nil under
dataFileAccessLock. CompactByIndex does not hold that lock, so
v.nm.Sync() can hit a nil pointer. Add an early nil check to
return an error instead of crashing.
Fixes#8591
* guard copyDataBasedOnIndexFile size check against nil needle map
The post-compaction size validation at line 538 accesses
v.nm.ContentSize() and v.nm.DeletedSize(). If CommitCompact has
concurrently set v.nm to nil, this causes a SIGSEGV. Skip the
validation when v.nm is nil since the actual data copy uses local
needle maps (oldNm/newNm) and is unaffected.
Fixes#8591
* use atomic.Bool for compaction flags to prevent concurrent vacuum races
The isCompacting and isCommitCompacting flags were plain bools
read and written from multiple goroutines without synchronization.
This allowed concurrent vacuums on the same volume to pass the
guard checks and run simultaneously, leading to the nil pointer
crash. Using atomic.Bool with CompareAndSwap ensures only one
compaction or commit can run per volume at a time.
Fixes#8591
* use go-version-file in CI workflows instead of hardcoded versions
Use go-version-file: 'go.mod' so CI automatically picks up the Go
version from go.mod, avoiding future version drift. Reordered
checkout before setup-go in go.yml and e2e.yml so go.mod is
available. Removed the now-unused GO_VERSION env vars.
* capture v.nm locally in CompactByIndex to close TOCTOU race
A bare nil check on v.nm followed by v.nm.Sync() has a race window
where CommitCompact can set v.nm = nil between the two. Snapshot
the pointer into a local variable so the nil check and Sync operate
on the same reference.
* add dynamic timeouts to plugin worker vacuum gRPC calls
All vacuum gRPC calls used context.Background() with no deadline,
so the plugin scheduler's execution timeout could kill a job while
a large volume compact was still in progress. Use volume-size-scaled
timeouts matching the topology vacuum approach: 3 min/GB for compact,
1 min/GB for check, commit, and cleanup.
Fixes#8591
* Revert "add dynamic timeouts to plugin worker vacuum gRPC calls"
This reverts commit 80951934c3.
* unify compaction lifecycle into single atomic flag
Replace separate isCompacting and isCommitCompacting flags with a
single isCompactionInProgress atomic.Bool. This ensures CompactBy*,
CommitCompact, Close, and Destroy are mutually exclusive — only one
can run at a time per volume.
Key changes:
- All entry points use CompareAndSwap(false, true) to claim exclusive
access. CompactByVolumeData and CompactByIndex now also guard v.nm
and v.DataBackend with local captures.
- Close() waits for the flag outside dataFileAccessLock to avoid
deadlocking with CommitCompact (which holds the flag while waiting
for the lock). It claims the flag before acquiring the lock so no
new compaction can start.
- Destroy() uses CAS instead of a racy Load check, preventing
concurrent compaction from racing with volume teardown.
- unmountVolumeByCollection no longer deletes from the map;
DeleteCollectionFromDiskLocation removes entries only after
successful Destroy, preventing orphaned volumes on failure.
Fixes#8591
* fix: use keyed fields in struct literals
- Replace unsafe reflect.StringHeader/SliceHeader with safe unsafe.String/Slice (weed/query/sqltypes/unsafe.go)
- Add field names to Type_ScalarType struct literals (weed/mq/schema/schema_builder.go)
- Add Duration field name to FlexibleDuration struct literals across test files
- Add field names to bson.D struct literals (weed/filer/mongodb/mongodb_store_kv.go)
Fixes go vet warnings about unkeyed struct literals.
* fix: remove unreachable code
- Remove unreachable return statements after infinite for loops
- Remove unreachable code after if/else blocks where all paths return
- Simplify recursive logic by removing unnecessary for loop (inode_to_path.go)
- Fix Type_ScalarType literal to use enum value directly (schema_builder.go)
- Call onCompletionFn on stream error (subscribe_session.go)
Files fixed:
- weed/query/sqltypes/unsafe.go
- weed/mq/schema/schema_builder.go
- weed/mq/client/sub_client/connect_to_sub_coordinator.go
- weed/filer/redis3/ItemList.go
- weed/mq/client/agent_client/subscribe_session.go
- weed/mq/broker/broker_grpc_pub_balancer.go
- weed/mount/inode_to_path.go
- weed/util/skiplist/name_list.go
* fix: avoid copying lock values in protobuf messages
- Use proto.Merge() instead of direct assignment to avoid copying sync.Mutex in S3ApiConfiguration (iamapi_server.go)
- Add explicit comments noting that channel-received values are already copies before taking addresses (volume_grpc_client_to_master.go)
The protobuf messages contain sync.Mutex fields from the message state, which should not be copied.
Using proto.Merge() properly merges messages without copying the embedded mutex.
* fix: correct byte array size for uint32 bit shift operations
The generateAccountId() function only needs 4 bytes to create a uint32 value.
Changed from allocating 8 bytes to 4 bytes to match the actual usage.
This fixes go vet warning about shifting 8-bit values (bytes) by more than 8 bits.
* fix: ensure context cancellation on all error paths
In broker_client_subscribe.go, ensure subscriberCancel() is called on all error return paths:
- When stream creation fails
- When partition assignment fails
- When sending initialization message fails
This prevents context leaks when an error occurs during subscriber creation.
* fix: ensure subscriberCancel called for CreateFreshSubscriber stream.Send error
Ensure subscriberCancel() is called when stream.Send fails in CreateFreshSubscriber.
* ci: add go vet step to prevent future lint regressions
- Add go vet step to GitHub Actions workflow
- Filter known protobuf lock warnings (MessageState sync.Mutex)
These are expected in generated protobuf code and are safe
- Prevents accumulation of go vet errors in future PRs
- Step runs before build to catch issues early
* fix: resolve remaining syntax and logic errors in vet fixes
- Fixed syntax errors in filer_sync.go caused by missing closing braces
- Added missing closing brace for if block and function
- Synchronized fixes to match previous commits on branch
* fix: add missing return statements to daemon functions
- Add 'return false' after infinite loops in filer_backup.go and filer_meta_backup.go
- Satisfies declared bool return type signatures
- Maintains consistency with other daemon functions (runMaster, runFilerSynchronize, runWorker)
- While unreachable, explicitly declares the return satisfies function signature contract
* fix: add nil check for onCompletionFn in SubscribeMessageRecord
- Check if onCompletionFn is not nil before calling it
- Prevents potential panic if nil function is passed
- Matches pattern used in other callback functions
* docs: clarify unreachable return statements in daemon functions
- Add comments documenting that return statements satisfy function signature
- Explains that these returns follow infinite loops and are unreachable
- Improves code clarity for future maintainers