Address PR feedback with minor optimizations:
- Add MaxLoggedErrorDetails constant (replaces magic number 10)
- Pre-allocate slices and maps in processRetryBatch for efficiency
- Improve log message formatting to use constant
These changes improve code maintainability and runtime performance
without altering functionality.
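A minimal sketch of the two changes, assuming names other than MaxLoggedErrorDetails are illustrative:

```go
package main

import "fmt"

// MaxLoggedErrorDetails caps how many individual error details are logged
// per batch, replacing the previous magic number 10.
const MaxLoggedErrorDetails = 10

// processRetryBatch is a hypothetical shape of the method: slices and maps
// are pre-allocated to the batch size to avoid repeated reallocation.
func processRetryBatch(fileIds []string) {
	// Pre-allocate with the batch size instead of growing incrementally.
	succeeded := make([]string, 0, len(fileIds))
	failed := make(map[string]error, len(fileIds))

	for _, id := range fileIds {
		// ... attempt deletion; record into succeeded or failed ...
		succeeded = append(succeeded, id)
	}

	// Log at most MaxLoggedErrorDetails individual errors.
	logged := 0
	for id, err := range failed {
		if logged >= MaxLoggedErrorDetails {
			fmt.Printf("... and %d more errors\n", len(failed)-logged)
			break
		}
		fmt.Printf("delete %s failed: %v\n", id, err)
		logged++
	}
	_ = succeeded
}

func main() {
	processRetryBatch([]string{"3,01637037d6"})
}
```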
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Extract large callback functions into dedicated private methods to
improve code organization and maintainability.
Changes:
1. Extract processDeletionBatch method
- Handles deletion of a batch of file IDs
- Classifies errors (success, not found, retryable, permanent)
- Manages retry queue additions
- Consolidates logging logic
2. Extract processRetryBatch method
- Handles retry attempts for previously failed deletions
- Processes retry results and updates queue
- Symmetric to processDeletionBatch for consistency
Benefits:
- Main loop functions (loopProcessingDeletion, loopProcessingDeletionRetry)
are now concise and focused on orchestration
- Business logic is separated into testable methods
- Reduced nesting depth improves readability
- Easier to understand control flow at a glance
- Better separation of concerns
The refactored methods follow the single responsibility principle,
making the codebase more maintainable and easier to extend.
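A toy sketch of the resulting shape (the Filer type and batch plumbing here are stand-ins, not the real signatures):

```go
package main

import "fmt"

// Filer is a stand-in type so the sketch compiles; the real methods hang
// off the filer's deletion processor.
type Filer struct {
	deleted []string
}

// loopProcessingDeletion stays focused on orchestration: pull a batch,
// delegate to the extracted method, repeat.
func (f *Filer) loopProcessingDeletion(batches [][]string) {
	for _, batch := range batches {
		f.processDeletionBatch(batch)
	}
}

// processDeletionBatch carries the extracted business logic: per-file
// deletion, error classification, retry-queue additions, and logging.
func (f *Filer) processDeletionBatch(fileIds []string) {
	for _, id := range fileIds {
		// ... delete id, classify the error, maybe enqueue a retry ...
		f.deleted = append(f.deleted, id)
	}
}

func main() {
	f := &Filer{}
	f.loopProcessingDeletion([][]string{{"a", "b"}, {"c"}})
	fmt.Println(f.deleted)
}
```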
1. Replace interface{} with any in heap methods
- Addresses modern Go style (Go 1.18+)
- Improves code readability
2. Enhance isRetryableError documentation
- Acknowledge string matching brittleness
- Add comprehensive TODO for future improvements:
* Use HTTP status codes (503, 429, etc.)
* Implement structured error types with errors.Is/As
* Extract gRPC status codes
* Add error wrapping for better context
- Document each error pattern with context
- Add defensive check for empty error strings
Current implementation remains pragmatic for initial release while
documenting a clear path for future robustness improvements. String
matching is acceptable for now but should be replaced with structured
error checking when refactoring the deletion pipeline.
Replace map-based retry queue with a min-heap for better scalability
and deterministic ordering.
Performance improvements:
- GetReadyItems: O(N) → O(K log N) where K is items retrieved
- AddOrUpdate: O(1) → O(log N) (acceptable trade-off)
- Early exit when checking ready items (heap top is earliest)
- No full iteration over all items while holding lock
Benefits:
- Deterministic processing order (earliest NextRetryAt first)
- Better scalability for large retry queues (thousands of items)
- Reduced lock contention duration
- Memory efficient (no separate slice reconstruction)
Implementation:
- Min-heap ordered by NextRetryAt using container/heap
- Dual index: heap for ordering + map for O(1) FileId lookups
- heap.Fix() used when updating existing items
- Comprehensive complexity documentation in comments
This addresses the performance bottleneck identified in GetReadyItems
where iterating over the entire map with a write lock could block
other goroutines in high-failure scenarios.
Replace hardcoded values with package-level constants for better
maintainability:
- DeletionRetryPollInterval (1 minute): interval for checking retry queue
- DeletionRetryBatchSize (1000): max items to process per iteration
This improves code readability and makes configuration changes easier.
Implement a retry queue with exponential backoff for handling transient
deletion failures, particularly when volumes are temporarily read-only.
Key features:
- Automatic retry for retryable errors (read-only volumes, network issues)
- Exponential backoff: 5min → 10min → 20min → ... (max 6 hours)
- Maximum 10 retry attempts per file before giving up
- Separate goroutine processing retry queue every minute
- Map-based retry queue for O(1) lookups and deletions
- Enhanced logging with retry/permanent error classification
- Consistent error detail limiting (max 10 total errors logged)
- Graceful shutdown support with quit channel for both processors
This addresses the issue where file deletions fail when volumes are
temporarily read-only (tiered volumes, maintenance, etc.) and these
deletions were previously lost.
* Modify batch deletion operations to return individual error results
Batch deletion operations now return per-file error results instead of one aggregated error, enabling better tracking of which specific files failed to delete and helping reduce orphan file issues.
* Simplified logging logic
* Optimized nested loop
* handles the edge case where the RPC succeeds but connection cleanup fails
* simplify
* simplify
* ignore 'not found' errors here