seaweedfs

Chris Lu b17e2b411a Add dynamic timeouts to plugin worker vacuum gRPC calls (#8593)		3 weeks ago

* add dynamic timeouts to plugin worker vacuum gRPC calls

  All vacuum gRPC calls used context.Background() with no deadline, so the plugin scheduler's execution timeout could kill a job while a large volume compact was still in progress. Use volume-size-scaled timeouts matching the topology vacuum approach: 3 min/GB for compact, 1 min/GB for check, commit, and cleanup.

  Fixes #8591

* scale scheduler execution timeout by volume size

  The scheduler's per-job execution timeout (default 240s) would kill vacuum jobs on large volumes before they finish. Three changes:

  1. Vacuum detection now includes estimated_runtime_seconds in job proposals, computed as 5 min/GB of volume size.
  2. The scheduler checks for estimated_runtime_seconds in job parameters and uses it as the execution timeout when larger than the default. This is a generic mechanism any handler can use.
  3. Vacuum task gRPC calls now use the passed-in ctx as parent instead of context.Background(), so scheduler cancellation propagates to in-flight RPCs.

* extend job type runtime when proposals need more time

  The JobTypeMaxRuntime (default 30 min) wraps both detection and execution. Its context is the parent of all per-job execution contexts, so even with per-job estimated_runtime_seconds, jobCtx would cancel everything when it expires.

  After detection, scan proposals for the maximum estimated_runtime_seconds. If any proposal needs more time than the remaining JobTypeMaxRuntime, create a new execution context with enough headroom. This lets large vacuum jobs complete without being killed by the job type deadline while still respecting the configured limit for normal-sized jobs.

* log missing volume size metric, remove dead minimum runtime guard

  Add a debug log in vacuumTimeout when t.volumeSize is 0 so operators can investigate why metrics are missing for a volume.

  Remove the unreachable estimatedRuntimeSeconds < 180 check in buildVacuumProposal: volumeSizeGB is always >= 1 (due to the +1 floor), so estimatedRuntimeSeconds is always >= 300.

* cap estimated runtime and fix status check context

  - Cap maxEstimatedRuntime and per-job timeout overrides to 8 hours to prevent unbounded timeouts from bad metrics.
  - Check execCtx.Err() instead of jobCtx.Err() for status reporting, since dispatch runs under execCtx, which may have a longer deadline. A successful dispatch under execCtx was misreported as "timeout" when jobCtx had expired.
DESIGN.md	Refactor plugin system and migrate worker runtime (#8369)	1 month ago
config_store.go	simplify plugin scheduler: remove configurable IdleSleepSeconds, use constant 61s	4 weeks ago
config_store_test.go	admin: auto migrating master maintenance scripts to admin_script plugin config (#8509)	4 weeks ago
job_execution_plan.go	Refactor plugin system and migrate worker runtime (#8369)	1 month ago
lock_manager.go	add admin script worker (#8491)	1 month ago
plugin.go	simplify plugin scheduler: remove configurable IdleSleepSeconds, use constant 61s	4 weeks ago
plugin_cancel_test.go	add admin script worker (#8491)	1 month ago
plugin_config_bootstrap_test.go	Refactor plugin system and migrate worker runtime (#8369)	1 month ago
plugin_detection_test.go	add admin script worker (#8491)	1 month ago
plugin_monitor.go	Plugin scheduler: sequential iterations with max runtime (#8496)	1 month ago
plugin_monitor_test.go	Refactor plugin system and migrate worker runtime (#8369)	1 month ago
plugin_scheduler.go	Add dynamic timeouts to plugin worker vacuum gRPC calls (#8593)	3 weeks ago
plugin_scheduler_test.go	Plugin scheduler: sequential iterations with max runtime (#8496)	1 month ago
plugin_schema_prefetch.go	Refactor plugin system and migrate worker runtime (#8369)	1 month ago
registry.go	Refactor plugin system and migrate worker runtime (#8369)	1 month ago
registry_test.go	Refactor plugin system and migrate worker runtime (#8369)	1 month ago
scheduler_config.go	simplify plugin scheduler: remove configurable IdleSleepSeconds, use constant 61s	4 weeks ago
scheduler_status.go	simplify plugin scheduler: remove configurable IdleSleepSeconds, use constant 61s	4 weeks ago
scheduler_status_test.go	add admin script worker (#8491)	1 month ago
types.go	Plugin scheduler: sequential iterations with max runtime (#8496)	1 month ago