From 31cb28d9d3f4ace011083157002af59737a3ee37 Mon Sep 17 00:00:00 2001 From: Chris Lu Date: Sun, 21 Dec 2025 12:47:27 -0800 Subject: [PATCH] feat: auto-configure optimal volume size limit based on available disk space (#7833) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * feat: auto-configure optimal volume size limit based on available disk space - Add calculateOptimalVolumeSizeMB() function with OS-independent disk detection - Reuses existing stats.NewDiskStatus() which works across Linux, macOS, Windows, BSD, Solaris - Algorithm: available disk / 100, rounded up to nearest power of 2 (64MB, 128MB, 256MB, 512MB, 1024MB) - Volume size capped to maximum of 1GB (1024MB) for better stability - Minimum volume size is 64MB - Uses efficient bits.Len() for power-of-2 rounding instead of floating-point operations - Only auto-calculates volume size if user didn't specify a custom value via -master.volumeSizeLimitMB - Respects user-specified values without override - Master logs whether value was auto-calculated or user-specified - Welcome message displays the configured volume size with correct format string ordering - Removed unused autoVolumeSizeMB variable (logging handles source tracking) Fixes: #0 * Refactor: Consolidate volume size constants and use robust flag detection for mini mode This commit addresses all code review feedback on the auto-optimal volume size feature: 1. **Consolidate hardcoded defaults into package-level constants** - Moved minVolumeSizeMB=64 and maxVolumeSizeMB=1024 from local function-scope constants to package-level constants for consistency and maintainability - All three volume size constants (min, default, max) now defined in one place 2. 
**Implement robust flag detection using flag.Visit()** - Added isFlagPassed() helper function using flag.Visit() to check if a CLI flag was explicitly passed on the command line - Replaces the previous implementation that checked if current value equals default (which could incorrectly assume user intent if default was specified) - Now correctly detects user override regardless of the actual value 3. **Restructure power-of-2 rounding logic for clarity** - Changed from 'only round if above min threshold' to 'always round to power-of-2 first, then apply min/max constraints' - More robust: works correctly even if min/max constants are adjusted in future - Clearer intent: all non-zero values go through consistent rounding logic 4. **Fix import ordering** - Added 'flag' import (aliased to fla9 package) to support isFlagPassed() - Added 'math/bits' import to support power-of-2 rounding Benefits: - Better code organization with all volume size limits in package constants - Correct user override detection that doesn't rely on value equality checks - More maintainable rounding logic that's easier to understand and modify - Consistent with SeaweedFS conventions (uses fla9 package like other commands) * fix: Address code review feedback for volume size calculation This commit resolves three code review comments for better code quality and robustness: 1. **Handle comma-separated directories in -dir flag** - The -dir flag accepts comma-separated list of directories, but the volume size calculation was passing the entire string to util.ResolvePath() - Now splits on comma and uses the first directory for disk space calculation - Added explanatory comment about the multi-directory support - Ensures the optimal size calculation works correctly in all scenarios 2. 
**Change disk detection failure from verbose log to warning** - When disk status cannot be determined, the warning is now logged via glog.Warningf() instead of glog.V(1).Infof() - Makes the event visible in default logs without requiring verbose mode - Better alerting for operators about fallback to default values 3. **Avoid recalculating availableMB/100 and define bytesPerMB constant** - Added bytesPerMB = 1024*1024 constant for clarity and reusability - Replaced hardcoded (1024 * 1024) with bytesPerMB constant - Store availableMB/100 in initialOptimalMB variable to avoid recalculation - Log message now references initialOptimalMB instead of recalculating - Improves maintainability and reduces redundant computation All three changes maintain the same logic while improving code quality and robustness as requested by the reviewer. * fix: Address rounding logic, logging clarity, and disk capacity measurement issues This commit resolves three additional code review comments to improve robustness and clarity of the volume size calculation: 1. **Fix power-of-2 rounding logic for edge cases** - The previous condition 'if optimalMB > 0' created a bug: when optimalMB=1, bits.Len(0)=0, resulting in 1<<0=1, which is below minimum (64MB) - Changed to explicitly handle zero case first: 'if optimalMB == 0' - Separate zero-handling from power-of-2 rounding ensures correct behavior: * optimalMB=0 → set to minVolumeSizeMB (64) * optimalMB>=1 → apply power-of-2 rounding - Then apply min/max constraints unconditionally - More explicit and easier to reason about correctness 2. 
**Use total disk capacity instead of free space for stable configuration** - Changed from diskStatus.Free (available space) to diskStatus.All (total capacity) - Free space varies based on current disk usage at startup time - This caused inconsistent volume sizes: same disk could get different sizes depending on how full it is when the service starts - Using total capacity ensures predictable, stable configuration across restarts - Better aligns with the intended behavior of sizing based on disk capacity - Added explanatory comments about why total capacity is more appropriate 3. **Improve log message clarity and accuracy** - Updated message to clearly show: * 'total disk capacity' instead of vague 'available disk' * 'capacity/100 before rounding' to match actual calculation * 'clamped to [min,max]' instead of 'capped to max' to show both bounds * Includes min and max values in log for context - More accurate and helpful for operators troubleshooting volume sizing These changes ensure the volume size calculation is both correct and predictable. * feat: Save mini configuration to file for persistence and documentation This commit adds persistent configuration storage for the 'weed mini' command, saving all non-default parameters to a JSON configuration file for: 1. **Configuration Documentation** - All parameters actually passed on the command line are saved - Provides a clear record of the running configuration - Useful for auditing and understanding how the system is configured 2. **Persistence of Auto-Calculated Values** - The auto-calculated optimal volume size (master.volumeSizeLimitMB) is saved with a note indicating it was auto-calculated - On restart, if the auto-calculated value exists, it won't be recalculated - Users can delete the auto-calculated entry to force recalculation on next startup - Provides stable, predictable configuration across restarts 3. 
**Configuration File Location** - Saved to: /.seaweedfs/mini.config.json - Uses the first directory from comma-separated -dir list - Directory is created automatically if it doesn't exist - JSON format for easy parsing and manual editing 4. **Implementation Details** - Uses flag.Visit() to collect only explicitly passed flags - Distinguishes between user-specified and auto-calculated values - Includes helpful notes in the JSON file - Graceful handling of save errors (logs warnings, doesn't fail startup) The configuration file includes all parameters such as: - IP and port settings (master, filer, volume, admin) - Data directories and metadata folders - Replication and collection settings - S3 and IAM configurations - Performance tuning parameters (concurrency limits, timeouts, etc.) - Auto-calculated volume size (if applicable) Example mini.config.json output: { "debug": "true", "dir": "/data/seaweedfs", "master.port": "9333", "filer.port": "8888", "volume.port": "9340", "master.volumeSizeLimitMB.auto": "256", "_note_auto_calculated": "This value was auto-calculated. Remove it to recalculate on next startup." } This allows operators to: - Review what configuration was active - Replicate the configuration on other systems - Understand the startup behavior - Control when auto-calculation occurs * refactor: Change configuration file format to match command-line options format Update the saved configuration format from JSON to shell-compatible options format that matches how options are expected to be passed on the command line. 
Configuration file: .seaweedfs/mini.options Format: Each line contains a command-line option in the format -name=value Benefits: - Format is compatible with shell scripts and can be sourced - Can be easily converted to command-line options - Human-readable and editable - Values with spaces are properly quoted - Includes helpful comments explaining auto-calculated values - Directly usable with weed mini command The file can be used in multiple ways: 1. Extract options: cat .seaweedfs/mini.options | grep -v '^#' | tr '\n' ' ' 2. Inline in command: weed mini \$(cat .seaweedfs/mini.options | grep -v '^#') 3. Manual review: cat .seaweedfs/mini.options * refactor: Save mini.options directly to -dir folder * docs: Update PR description with accurate algorithm and examples Update the function documentation comments to accurately reflect the implemented algorithm and provide real-world examples with actual calculated outputs. Changes: - Clarify that algorithm uses total disk capacity (not free space) - Document exact calculation: capacity in MB / 100 (integer division), round up to power of 2, clamp to [64,1024] - Add realistic examples showing input disk sizes and resulting volume sizes (1GB = 1024MB): * 1GB disk → 64MB (minimum) * 5GB disk → 64MB * 10GB disk → 128MB * 20GB disk → 256MB * 50GB disk → 512MB * 100GB disk → 1024MB (maximum) * 1TB disk → 1024MB (maximum) - Include note that values are rounded up to the next power of 2 and capped at 1GB This helps users understand the volume size calculation and predict what size will be set for their specific disk configurations. 
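The sizing rule described above (capacity in MB / 100, round up to a power of 2, clamp to [64,1024]) can be checked with a small stand-alone sketch. The helper name volumeSizeForCapacity is hypothetical and takes a byte count directly instead of calling stats.NewDiskStatus(); the constants mirror the ones this patch adds. Note that the capacity is converted to MB before dividing, so a 10GB disk gives 10240 / 100 = 102, not 0.1.

```go
package main

import (
	"fmt"
	"math/bits"
)

const (
	bytesPerMB      = 1024 * 1024
	minVolumeSizeMB = 64
	maxVolumeSizeMB = 1024
)

// volumeSizeForCapacity is a hypothetical helper, not part of the patch.
// It applies the same steps as calculateOptimalVolumeSizeMB to a raw byte count.
func volumeSizeForCapacity(totalBytes uint64) uint {
	// Capacity in MB, then 1% of it (integer division).
	optimalMB := uint(totalBytes / bytesPerMB / 100)
	if optimalMB == 0 {
		return minVolumeSizeMB
	}
	// Round up to the next power of 2; exact powers of 2 stay unchanged.
	optimalMB = 1 << bits.Len(optimalMB-1)
	if optimalMB < minVolumeSizeMB {
		return minVolumeSizeMB
	}
	if optimalMB > maxVolumeSizeMB {
		return maxVolumeSizeMB
	}
	return optimalMB
}

func main() {
	const gib = 1024 * bytesPerMB
	for _, g := range []uint64{1, 5, 10, 20, 50, 100, 1024} {
		fmt.Printf("%4d GB -> %d MB\n", g, volumeSizeForCapacity(g*gib))
	}
}
```

With these inputs the loop prints 64, 64, 128, 256, 512, 1024 and 1024 MB, so any disk of roughly 100GB or more ends up at the 1024MB cap.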
* feat: integrate configuration file loading into mini startup - Load mini.options file at startup if it exists - Apply loaded configuration options before normal initialization - CLI flags override file-based configuration - Exclude 'dir' option from being saved (environment-specific) - Configuration file format: option=value without leading dashes - Auto-calculated volume size persists with recalculation marker --- weed/command/mini.go | 220 ++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 216 insertions(+), 4 deletions(-) diff --git a/weed/command/mini.go b/weed/command/mini.go index 264370069..b61da5f2e 100644 --- a/weed/command/mini.go +++ b/weed/command/mini.go @@ -3,6 +3,7 @@ package command import ( "context" "fmt" + "math/bits" "net" "net/http" "os" @@ -17,6 +18,7 @@ import ( "github.com/seaweedfs/seaweedfs/weed/security" stats_collect "github.com/seaweedfs/seaweedfs/weed/stats" "github.com/seaweedfs/seaweedfs/weed/util" + flag "github.com/seaweedfs/seaweedfs/weed/util/fla9" "github.com/seaweedfs/seaweedfs/weed/util/grace" "github.com/seaweedfs/seaweedfs/weed/worker" "github.com/seaweedfs/seaweedfs/weed/worker/types" @@ -36,8 +38,12 @@ type MiniOptions struct { } const ( - miniVolumeMaxDataVolumeCounts = "0" // auto-configured based on free disk space - miniVolumeMinFreeSpace = "1" // 1% minimum free space + bytesPerMB = 1024 * 1024 // Bytes per MB + miniVolumeMaxDataVolumeCounts = "0" // auto-configured based on free disk space + miniVolumeMinFreeSpace = "1" // 1% minimum free space + minVolumeSizeMB = 64 // Minimum volume size in MB + defaultMiniVolumeSizeMB = 128 // Default volume size for mini mode + maxVolumeSizeMB = 1024 // Maximum volume size in MB (1GB) ) var ( @@ -125,7 +131,7 @@ func initMiniMasterFlags() { miniMasterOptions.portGrpc = cmdMini.Flag.Int("master.port.grpc", 0, "master server grpc listen port") miniMasterOptions.metaFolder = cmdMini.Flag.String("master.dir", "", "data directory to store meta data, default to same as -dir 
specified") miniMasterOptions.peers = cmdMini.Flag.String("master.peers", "", "all master nodes in comma separated ip:masterPort list (default: none for single master)") - miniMasterOptions.volumeSizeLimitMB = cmdMini.Flag.Uint("master.volumeSizeLimitMB", 128, "Master stops directing writes to oversized volumes (default: 128MB for mini)") + miniMasterOptions.volumeSizeLimitMB = cmdMini.Flag.Uint("master.volumeSizeLimitMB", defaultMiniVolumeSizeMB, "Master stops directing writes to oversized volumes (default: 128MB for mini)") miniMasterOptions.volumePreallocate = cmdMini.Flag.Bool("master.volumePreallocate", false, "Preallocate disk space for volumes.") miniMasterOptions.maxParallelVacuumPerServer = cmdMini.Flag.Int("master.maxParallelVacuumPerServer", 1, "maximum number of volumes to vacuum in parallel on one volume server") miniMasterOptions.defaultReplication = cmdMini.Flag.String("master.defaultReplication", "", "Default replication type if not specified.") @@ -256,8 +262,196 @@ func init() { initMiniAdminFlags() } +// calculateOptimalVolumeSizeMB calculates optimal volume size based on total disk capacity. +// +// Algorithm: +// 1. Read total disk capacity using the OS-independent stats.NewDiskStatus() +// 2. Divide total disk capacity by 100 to estimate optimal volume size +// 3. Round up to nearest power of 2 (64MB, 128MB, 256MB, 512MB, 1024MB, etc.) +// 4. 
Clamp the result to range [64MB, 1024MB] +// +// Examples (1GB = 1024MB, integer division; values are rounded up to the next power of 2 and clamped to [64MB, 1024MB]): +// - 1GB disk → 1024MB / 100 = 10MB → rounds to 16MB, clamped to 64MB (minimum) +// - 5GB disk → 5120MB / 100 = 51MB → rounds to 64MB +// - 10GB disk → 10240MB / 100 = 102MB → rounds to 128MB +// - 20GB disk → 20480MB / 100 = 204MB → rounds to 256MB +// - 50GB disk → 51200MB / 100 = 512MB → rounds to 512MB +// - 100GB disk → 102400MB / 100 = 1024MB → rounds to 1024MB (maximum) +// - 500GB disk → 512000MB / 100 = 5120MB → rounds to 8192MB, capped at 1024MB (maximum) +// - 1TB disk → 1048576MB / 100 = 10485MB → rounds to 16384MB, capped at 1024MB (maximum) +func calculateOptimalVolumeSizeMB(dataFolder string) uint { + // Get disk status for the data folder using OS-independent function + diskStatus := stats_collect.NewDiskStatus(dataFolder) + if diskStatus == nil || diskStatus.All == 0 { + glog.Warningf("Could not determine disk size, using default %dMB", defaultMiniVolumeSizeMB) + return defaultMiniVolumeSizeMB + } + + // Calculate optimal size: total disk capacity / 100 for stability + // Using total capacity (All) instead of free space ensures consistent volume size + // regardless of current disk usage. diskStatus.All is in bytes, convert to MB + totalCapacityMB := diskStatus.All / bytesPerMB + initialOptimalMB := uint(totalCapacityMB / 100) + optimalMB := initialOptimalMB + + // Round up to nearest power of 2: 64MB, 128MB, 256MB, 512MB, etc. 
+ // Minimum is 64MB, maximum is 1024MB (1GB) + if optimalMB == 0 { + // If the computed optimal size is 0, start from the minimum volume size + optimalMB = minVolumeSizeMB + } else { + // Round up to the nearest power of 2 + optimalMB = 1 << bits.Len(optimalMB-1) + } + + // Apply the minimum and maximum constraints + if optimalMB < minVolumeSizeMB { + optimalMB = minVolumeSizeMB + } else if optimalMB > maxVolumeSizeMB { + optimalMB = maxVolumeSizeMB + } + + glog.Infof("Optimal volume size: %dMB (total disk capacity: %dMB, capacity/100 before rounding: %dMB, rounded to nearest power of 2, clamped to [%d,%d]MB)", + optimalMB, totalCapacityMB, initialOptimalMB, minVolumeSizeMB, maxVolumeSizeMB) + + return optimalMB +} + +// isFlagPassed checks if a specific flag was passed on the command line +func isFlagPassed(name string) bool { + found := false + cmdMini.Flag.Visit(func(f *flag.Flag) { + if f.Name == name { + found = true + } + }) + return found +} + +// loadMiniConfigurationFile reads the mini.options file and returns parsed options +// File format: one option per line, without leading dash (e.g., "ip=127.0.0.1") +func loadMiniConfigurationFile(dataFolder string) (map[string]string, error) { + configFile := filepath.Join(util.ResolvePath(util.StringSplit(dataFolder, ",")[0]), "mini.options") + + options := make(map[string]string) + + // Check if file exists + data, err := os.ReadFile(configFile) + if err != nil { + if os.IsNotExist(err) { + // File doesn't exist - this is OK, return empty options + return options, nil + } + glog.Warningf("Failed to read configuration file %s: %v", configFile, err) + return options, err + } + + // Parse the file line by line + lines := strings.Split(string(data), "\n") + for _, line := range lines { + line = strings.TrimSpace(line) + + // Skip empty lines and comments + if len(line) == 0 || strings.HasPrefix(line, "#") { + continue + } + + // Remove leading dash if present + if strings.HasPrefix(line, "-") { + line = line[1:] + } 
+ + // Parse key=value + parts := strings.SplitN(line, "=", 2) + if len(parts) == 2 { + key := strings.TrimSpace(parts[0]) + value := strings.TrimSpace(parts[1]) + // Remove quotes if present + if (strings.HasPrefix(value, "\"") && strings.HasSuffix(value, "\"")) || + (strings.HasPrefix(value, "'") && strings.HasSuffix(value, "'")) { + value = value[1 : len(value)-1] + } + options[key] = value + } + } + + glog.Infof("Loaded %d options from configuration file %s", len(options), configFile) + return options, nil +} + +// applyConfigFileOptions sets command-line flags from loaded configuration file +func applyConfigFileOptions(options map[string]string) { + for key, value := range options { + // Set the flag value if it hasn't been explicitly set on command line + flag := cmdMini.Flag.Lookup(key) + if flag != nil { + // Only set if not already set (by command line) + if flag.Value.String() == flag.DefValue { + flag.Value.Set(value) + glog.V(2).Infof("Applied config file option: %s=%s", key, value) + } + } + } +} + +// saveMiniConfiguration saves the current mini configuration to a file +// The file format uses option=value format without leading dashes +func saveMiniConfiguration(dataFolder string) error { + configDir := util.ResolvePath(util.StringSplit(dataFolder, ",")[0]) + if err := os.MkdirAll(configDir, 0755); err != nil { + glog.Warningf("Failed to create config directory %s: %v", configDir, err) + return err + } + + configFile := filepath.Join(configDir, "mini.options") + + var sb strings.Builder + sb.WriteString("#!/bin/bash\n") + sb.WriteString("# Mini server configuration\n") + sb.WriteString("# Format: option=value (no leading dash)\n") + sb.WriteString("# This file is loaded on startup if it exists\n\n") + + // Collect all flags that were explicitly passed (except "dir") + cmdMini.Flag.Visit(func(f *flag.Flag) { + // Skip the "dir" option - it's environment-specific + if f.Name == "dir" { + return + } + value := f.Value.String() + // Quote the value if it 
contains spaces + if strings.Contains(value, " ") { + sb.WriteString(fmt.Sprintf("%s=\"%s\"\n", f.Name, value)) + } else { + sb.WriteString(fmt.Sprintf("%s=%s\n", f.Name, value)) + } + }) + + // Add auto-calculated volume size if it was computed + if !isFlagPassed("master.volumeSizeLimitMB") && miniMasterOptions.volumeSizeLimitMB != nil { + sb.WriteString(fmt.Sprintf("\n# Auto-calculated volume size based on total disk capacity\n")) + sb.WriteString(fmt.Sprintf("# Delete this line to force recalculation on next startup\n")) + sb.WriteString(fmt.Sprintf("master.volumeSizeLimitMB=%d\n", *miniMasterOptions.volumeSizeLimitMB)) + } + + if err := os.WriteFile(configFile, []byte(sb.String()), 0644); err != nil { + glog.Warningf("Failed to save configuration to %s: %v", configFile, err) + return err + } + + glog.Infof("Mini configuration saved to %s", configFile) + return nil +} + func runMini(cmd *Command, args []string) bool { + // Load configuration from file if it exists + configOptions, err := loadMiniConfigurationFile(*miniDataFolders) + if err != nil { + glog.Warningf("Error loading configuration file: %v", err) + } + // Apply loaded options to flags (CLI flags will override these) + applyConfigFileOptions(configOptions) + if *miniOptions.debug { grace.StartDebugServer(*miniOptions.debugPort) } @@ -325,6 +519,20 @@ func runMini(cmd *Command, args []string) bool { } miniFilerOptions.defaultLevelDbDirectory = miniMasterOptions.metaFolder + // Calculate and set optimal volume size limit based on available disk space + // Only auto-calculate if user didn't explicitly specify a value via -master.volumeSizeLimitMB + if !isFlagPassed("master.volumeSizeLimitMB") { + // User didn't override, use auto-calculated value + // The -dir flag can accept comma-separated directories; use the first one for disk space calculation + resolvedDataFolder := util.ResolvePath(util.StringSplit(*miniDataFolders, ",")[0]) + optimalVolumeSizeMB := calculateOptimalVolumeSizeMB(resolvedDataFolder) 
+ miniMasterOptions.volumeSizeLimitMB = &optimalVolumeSizeMB + glog.Infof("Mini started with auto-calculated optimal volume size limit: %dMB", optimalVolumeSizeMB) + } else { + // User specified a custom value + glog.Infof("Mini started with user-specified volume size limit: %dMB", *miniMasterOptions.volumeSizeLimitMB) + } + miniWhiteList := util.StringSplit(*miniWhiteListOption, ",") // Start all services with proper dependency coordination @@ -338,6 +546,9 @@ func runMini(cmd *Command, args []string) bool { // Print welcome message after all services are running printWelcomeMessage() + // Save configuration to file for persistence and documentation + saveMiniConfiguration(*miniDataFolders) + select {} } @@ -633,7 +844,7 @@ const welcomeMessageTemplate = ` Volume Server: http://%s:%d Optimized Settings: - • Volume size limit: 128MB + • Volume size limit: %dMB • Volume max: auto (based on free disk space) • Pre-stop seconds: 1 (faster shutdown) • Master peers: none (single master mode) @@ -673,6 +884,7 @@ func printWelcomeMessage() { *miniIp, *miniWebDavOptions.port, *miniIp, *miniAdminOptions.port, *miniIp, *miniOptions.v.port, + *miniMasterOptions.volumeSizeLimitMB, *miniDataFolders, )