Browse Source

feat: auto-configure optimal volume size limit based on available disk space (#7833)

* feat: auto-configure optimal volume size limit based on available disk space

- Add calculateOptimalVolumeSizeMB() function with OS-independent disk detection
- Reuses existing stats.NewDiskStatus() which works across Linux, macOS, Windows, BSD, Solaris
- Algorithm: available disk / 100, rounded up to nearest power of 2 (64MB, 128MB, 256MB, 512MB, 1024MB)
- Volume size capped to maximum of 1GB (1024MB) for better stability
- Minimum volume size is 64MB
- Uses efficient bits.Len() for power-of-2 rounding instead of floating-point operations
- Only auto-calculates volume size if user didn't specify a custom value via -master.volumeSizeLimitMB
- Respects user-specified values without override
- Master logs whether value was auto-calculated or user-specified
- Welcome message displays the configured volume size with correct format string ordering
- Removed unused autoVolumeSizeMB variable (logging handles source tracking)

Fixes: #0

* Refactor: Consolidate volume size constants and use robust flag detection for mini mode

This commit addresses all code review feedback on the auto-optimal volume size feature:

1. **Consolidate hardcoded defaults into package-level constants**
   - Moved minVolumeSizeMB=64 and maxVolumeSizeMB=1024 from local function-scope
     constants to package-level constants for consistency and maintainability
   - All three volume size constants (min, default, max) now defined in one place

2. **Implement robust flag detection using flag.Visit()**
   - Added isFlagPassed() helper function using flag.Visit() to check if a CLI
     flag was explicitly passed on the command line
   - Replaces the previous implementation that checked if current value equals
     default (which could incorrectly assume user intent if default was specified)
   - Now correctly detects user override regardless of the actual value

3. **Restructure power-of-2 rounding logic for clarity**
   - Changed from 'only round if above min threshold' to 'always round to power-of-2
     first, then apply min/max constraints'
   - More robust: works correctly even if min/max constants are adjusted in future
   - Clearer intent: all non-zero values go through consistent rounding logic

4. **Fix import ordering**
   - Added 'flag' import (aliased to fla9 package) to support isFlagPassed()
   - Added 'math/bits' import to support power-of-2 rounding

Benefits:
- Better code organization with all volume size limits in package constants
- Correct user override detection that doesn't rely on value equality checks
- More maintainable rounding logic that's easier to understand and modify
- Consistent with SeaweedFS conventions (uses fla9 package like other commands)

* fix: Address code review feedback for volume size calculation

This commit resolves three code review comments for better code quality and robustness:

1. **Handle comma-separated directories in -dir flag**
   - The -dir flag accepts comma-separated list of directories, but the volume size
     calculation was passing the entire string to util.ResolvePath()
   - Now splits on comma and uses the first directory for disk space calculation
   - Added explanatory comment about the multi-directory support
   - Ensures the optimal size calculation works correctly in all scenarios

2. **Change disk detection failure from verbose log to warning**
   - When disk status cannot be determined, the warning is now logged via
     glog.Warningf() instead of glog.V(1).Infof()
   - Makes the event visible in default logs without requiring verbose mode
   - Better alerting for operators about fallback to default values

3. **Avoid recalculating availableMB/100 and define bytesPerMB constant**
   - Added bytesPerMB = 1024*1024 constant for clarity and reusability
   - Replaced hardcoded (1024 * 1024) with bytesPerMB constant
   - Store availableMB/100 in initialOptimalMB variable to avoid recalculation
   - Log message now references initialOptimalMB instead of recalculating
   - Improves maintainability and reduces redundant computation

All three changes maintain the same logic while improving code quality and
robustness as requested by the reviewer.

* fix: Address rounding logic, logging clarity, and disk capacity measurement issues

This commit resolves three additional code review comments to improve robustness
and clarity of the volume size calculation:

1. **Fix power-of-2 rounding logic for edge cases**
   - The previous condition 'if optimalMB > 0' created a bug: when optimalMB=1,
     bits.Len(0)=0, resulting in 1<<0=1, which is below minimum (64MB)
   - Changed to explicitly handle zero case first: 'if optimalMB == 0'
   - Separate zero-handling from power-of-2 rounding ensures correct behavior:
     * optimalMB=0 → set to minVolumeSizeMB (64)
     * optimalMB>=1 → apply power-of-2 rounding
   - Then apply min/max constraints unconditionally
   - More explicit and easier to reason about correctness

2. **Use total disk capacity instead of free space for stable configuration**
   - Changed from diskStatus.Free (available space) to diskStatus.All (total capacity)
   - Free space varies based on current disk usage at startup time
   - This caused inconsistent volume sizes: same disk could get different sizes
     depending on how full it is when the service starts
   - Using total capacity ensures predictable, stable configuration across restarts
   - Better aligns with the intended behavior of sizing based on disk capacity
   - Added explanatory comments about why total capacity is more appropriate

3. **Improve log message clarity and accuracy**
   - Updated message to clearly show:
     * 'total disk capacity' instead of vague 'available disk'
     * 'capacity/100 before rounding' to match actual calculation
     * 'clamped to [min,max]' instead of 'capped to max' to show both bounds
     * Includes min and max values in log for context
   - More accurate and helpful for operators troubleshooting volume sizing

These changes ensure the volume size calculation is both correct and predictable.

* feat: Save mini configuration to file for persistence and documentation

This commit adds persistent configuration storage for the 'weed mini' command,
saving all non-default parameters to a JSON configuration file for:

1. **Configuration Documentation**
   - All parameters actually passed on the command line are saved
   - Provides a clear record of the running configuration
   - Useful for auditing and understanding how the system is configured

2. **Persistence of Auto-Calculated Values**
   - The auto-calculated optimal volume size (master.volumeSizeLimitMB) is saved
     with a note indicating it was auto-calculated
   - On restart, if the auto-calculated value exists, it won't be recalculated
   - Users can delete the auto-calculated entry to force recalculation on next startup
   - Provides stable, predictable configuration across restarts

3. **Configuration File Location**
   - Saved to: <data-folder>/.seaweedfs/mini.config.json
   - Uses the first directory from comma-separated -dir list
   - Directory is created automatically if it doesn't exist
   - JSON format for easy parsing and manual editing

4. **Implementation Details**
   - Uses flag.Visit() to collect only explicitly passed flags
   - Distinguishes between user-specified and auto-calculated values
   - Includes helpful notes in the JSON file
   - Graceful handling of save errors (logs warnings, doesn't fail startup)

The configuration file includes all parameters such as:
- IP and port settings (master, filer, volume, admin)
- Data directories and metadata folders
- Replication and collection settings
- S3 and IAM configurations
- Performance tuning parameters (concurrency limits, timeouts, etc.)
- Auto-calculated volume size (if applicable)

Example mini.config.json output:
{
  "debug": "true",
  "dir": "/data/seaweedfs",
  "master.port": "9333",
  "filer.port": "8888",
  "volume.port": "9340",
  "master.volumeSizeLimitMB.auto": "256",
  "_note_auto_calculated": "This value was auto-calculated. Remove it to recalculate on next startup."
}

This allows operators to:
- Review what configuration was active
- Replicate the configuration on other systems
- Understand the startup behavior
- Control when auto-calculation occurs

* refactor: Change configuration file format to match command-line options format

Update the saved configuration format from JSON to shell-compatible options format
that matches how options are expected to be passed on the command line.

Configuration file: .seaweedfs/mini.options

Format: Each line contains a command-line option in the format -name=value

Benefits:
- Format is compatible with shell scripts and can be sourced
- Can be easily converted to command-line options
- Human-readable and editable
- Values with spaces are properly quoted
- Includes helpful comments explaining auto-calculated values
- Directly usable with weed mini command

The file can be used in multiple ways:
1. Extract options: cat .seaweedfs/mini.options | grep -v '^#' | tr '\n' ' '
2. Inline in command: weed mini \$(cat .seaweedfs/mini.options | grep -v '^#')
3. Manual review: cat .seaweedfs/mini.options

* refactor: Save mini.options directly to -dir folder

* docs: Update PR description with accurate algorithm and examples

Update the function documentation comments to accurately reflect the implemented
algorithm and provide real-world examples with actual calculated outputs.

Changes:
- Clarify that algorithm uses total disk capacity (not free space)
- Document exact calculation: capacity/100, round to power of 2, clamp to [64,1024]
- Add realistic examples showing input disk sizes and resulting volume sizes:
  * 10GB disk → 64MB (minimum)
  * 100GB disk → 64MB (minimum)
  * 1TB disk → 64MB (minimum)
  * 6.4TB disk → 64MB
  * 12.8TB disk → 128MB
  * 100TB disk → 1024MB (maximum)
  * 1PB disk → 1024MB (maximum)
- Include note that values are rounded to next power of 2 and capped at 1GB

This helps users understand the volume size calculation and predict what size
will be set for their specific disk configurations.

* feat: integrate configuration file loading into mini startup

- Load mini.options file at startup if it exists
- Apply loaded configuration options before normal initialization
- CLI flags override file-based configuration
- Exclude 'dir' option from being saved (environment-specific)
- Configuration file format: option=value without leading dashes
- Auto-calculated volume size persists with recalculation marker
pull/7837/head
Chris Lu 2 days ago
committed by GitHub
parent
commit
31cb28d9d3
No known key found for this signature in database GPG Key ID: B5690EEEBB952194
  1. 220
      weed/command/mini.go

220
weed/command/mini.go

@ -3,6 +3,7 @@ package command
import (
"context"
"fmt"
"math/bits"
"net"
"net/http"
"os"
@ -17,6 +18,7 @@ import (
"github.com/seaweedfs/seaweedfs/weed/security"
stats_collect "github.com/seaweedfs/seaweedfs/weed/stats"
"github.com/seaweedfs/seaweedfs/weed/util"
flag "github.com/seaweedfs/seaweedfs/weed/util/fla9"
"github.com/seaweedfs/seaweedfs/weed/util/grace"
"github.com/seaweedfs/seaweedfs/weed/worker"
"github.com/seaweedfs/seaweedfs/weed/worker/types"
@ -36,8 +38,12 @@ type MiniOptions struct {
}
const (
miniVolumeMaxDataVolumeCounts = "0" // auto-configured based on free disk space
miniVolumeMinFreeSpace = "1" // 1% minimum free space
bytesPerMB = 1024 * 1024 // Bytes per MB
miniVolumeMaxDataVolumeCounts = "0" // auto-configured based on free disk space
miniVolumeMinFreeSpace = "1" // 1% minimum free space
minVolumeSizeMB = 64 // Minimum volume size in MB
defaultMiniVolumeSizeMB = 128 // Default volume size for mini mode
maxVolumeSizeMB = 1024 // Maximum volume size in MB (1GB)
)
var (
@ -125,7 +131,7 @@ func initMiniMasterFlags() {
miniMasterOptions.portGrpc = cmdMini.Flag.Int("master.port.grpc", 0, "master server grpc listen port")
miniMasterOptions.metaFolder = cmdMini.Flag.String("master.dir", "", "data directory to store meta data, default to same as -dir specified")
miniMasterOptions.peers = cmdMini.Flag.String("master.peers", "", "all master nodes in comma separated ip:masterPort list (default: none for single master)")
miniMasterOptions.volumeSizeLimitMB = cmdMini.Flag.Uint("master.volumeSizeLimitMB", 128, "Master stops directing writes to oversized volumes (default: 128MB for mini)")
miniMasterOptions.volumeSizeLimitMB = cmdMini.Flag.Uint("master.volumeSizeLimitMB", defaultMiniVolumeSizeMB, "Master stops directing writes to oversized volumes (default: 128MB for mini)")
miniMasterOptions.volumePreallocate = cmdMini.Flag.Bool("master.volumePreallocate", false, "Preallocate disk space for volumes.")
miniMasterOptions.maxParallelVacuumPerServer = cmdMini.Flag.Int("master.maxParallelVacuumPerServer", 1, "maximum number of volumes to vacuum in parallel on one volume server")
miniMasterOptions.defaultReplication = cmdMini.Flag.String("master.defaultReplication", "", "Default replication type if not specified.")
@ -256,8 +262,196 @@ func init() {
initMiniAdminFlags()
}
// calculateOptimalVolumeSizeMB calculates optimal volume size based on total disk capacity.
//
// Algorithm:
// 1. Read total disk capacity using the OS-independent stats.NewDiskStatus()
// 2. Divide total disk capacity by 100 to estimate optimal volume size
// 3. Round up to nearest power of 2 (64MB, 128MB, 256MB, 512MB, 1024MB, etc.)
// 4. Clamp the result to range [64MB, 1024MB]
//
// Examples (values are rounded to next power of 2 and capped at 1GB):
// - 10GB disk → 10 / 100 = 0.1MB → rounds to 64MB (minimum)
// - 100GB disk → 100 / 100 = 1MB → rounds to 1MB, clamped to 64MB (minimum)
// - 500GB disk → 500 / 100 = 5MB → rounds to 8MB, clamped to 64MB (minimum)
// - 1TB disk → 1000 / 100 = 10MB → rounds to 16MB, clamped to 64MB (minimum)
// - 6.4TB disk → 6400 / 100 = 64MB → rounds to 64MB
// - 12.8TB disk → 12800 / 100 = 128MB → rounds to 128MB
// - 100TB disk → 100000 / 100 = 1000MB → rounds to 1024MB (maximum)
// - 1PB disk → 1000000 / 100 = 10000MB → capped at 1024MB (maximum)
func calculateOptimalVolumeSizeMB(dataFolder string) uint {
// Get disk status for the data folder using OS-independent function
diskStatus := stats_collect.NewDiskStatus(dataFolder)
if diskStatus == nil || diskStatus.All == 0 {
glog.Warningf("Could not determine disk size, using default %dMB", defaultMiniVolumeSizeMB)
return defaultMiniVolumeSizeMB
}
// Calculate optimal size: total disk capacity / 100 for stability
// Using total capacity (All) instead of free space ensures consistent volume size
// regardless of current disk usage. diskStatus.All is in bytes, convert to MB
totalCapacityMB := diskStatus.All / bytesPerMB
initialOptimalMB := uint(totalCapacityMB / 100)
optimalMB := initialOptimalMB
// Round up to nearest power of 2: 64MB, 128MB, 256MB, 512MB, etc.
// Minimum is 64MB, maximum is 1024MB (1GB)
if optimalMB == 0 {
// If the computed optimal size is 0, start from the minimum volume size
optimalMB = minVolumeSizeMB
} else {
// Round up to the nearest power of 2
optimalMB = 1 << bits.Len(optimalMB-1)
}
// Apply the minimum and maximum constraints
if optimalMB < minVolumeSizeMB {
optimalMB = minVolumeSizeMB
} else if optimalMB > maxVolumeSizeMB {
optimalMB = maxVolumeSizeMB
}
glog.Infof("Optimal volume size: %dMB (total disk capacity: %dMB, capacity/100 before rounding: %dMB, rounded to nearest power of 2, clamped to [%d,%d]MB)",
optimalMB, totalCapacityMB, initialOptimalMB, minVolumeSizeMB, maxVolumeSizeMB)
return optimalMB
}
// isFlagPassed checks if a specific flag was passed on the command line
func isFlagPassed(name string) bool {
found := false
cmdMini.Flag.Visit(func(f *flag.Flag) {
if f.Name == name {
found = true
}
})
return found
}
// loadMiniConfigurationFile reads the mini.options file and returns parsed options
// File format: one option per line, without leading dash (e.g., "ip=127.0.0.1")
func loadMiniConfigurationFile(dataFolder string) (map[string]string, error) {
configFile := filepath.Join(util.ResolvePath(util.StringSplit(dataFolder, ",")[0]), "mini.options")
options := make(map[string]string)
// Check if file exists
data, err := os.ReadFile(configFile)
if err != nil {
if os.IsNotExist(err) {
// File doesn't exist - this is OK, return empty options
return options, nil
}
glog.Warningf("Failed to read configuration file %s: %v", configFile, err)
return options, err
}
// Parse the file line by line
lines := strings.Split(string(data), "\n")
for _, line := range lines {
line = strings.TrimSpace(line)
// Skip empty lines and comments
if len(line) == 0 || strings.HasPrefix(line, "#") {
continue
}
// Remove leading dash if present
if strings.HasPrefix(line, "-") {
line = line[1:]
}
// Parse key=value
parts := strings.SplitN(line, "=", 2)
if len(parts) == 2 {
key := strings.TrimSpace(parts[0])
value := strings.TrimSpace(parts[1])
// Remove quotes if present
if (strings.HasPrefix(value, "\"") && strings.HasSuffix(value, "\"")) ||
(strings.HasPrefix(value, "'") && strings.HasSuffix(value, "'")) {
value = value[1 : len(value)-1]
}
options[key] = value
}
}
glog.Infof("Loaded %d options from configuration file %s", len(options), configFile)
return options, nil
}
// applyConfigFileOptions sets command-line flags from loaded configuration file
func applyConfigFileOptions(options map[string]string) {
for key, value := range options {
// Set the flag value if it hasn't been explicitly set on command line
flag := cmdMini.Flag.Lookup(key)
if flag != nil {
// Only set if not already set (by command line)
if flag.Value.String() == flag.DefValue {
flag.Value.Set(value)
glog.V(2).Infof("Applied config file option: %s=%s", key, value)
}
}
}
}
// saveMiniConfiguration saves the current mini configuration to a file
// The file format uses option=value format without leading dashes
func saveMiniConfiguration(dataFolder string) error {
configDir := util.ResolvePath(util.StringSplit(dataFolder, ",")[0])
if err := os.MkdirAll(configDir, 0755); err != nil {
glog.Warningf("Failed to create config directory %s: %v", configDir, err)
return err
}
configFile := filepath.Join(configDir, "mini.options")
var sb strings.Builder
sb.WriteString("#!/bin/bash\n")
sb.WriteString("# Mini server configuration\n")
sb.WriteString("# Format: option=value (no leading dash)\n")
sb.WriteString("# This file is loaded on startup if it exists\n\n")
// Collect all flags that were explicitly passed (except "dir")
cmdMini.Flag.Visit(func(f *flag.Flag) {
// Skip the "dir" option - it's environment-specific
if f.Name == "dir" {
return
}
value := f.Value.String()
// Quote the value if it contains spaces
if strings.Contains(value, " ") {
sb.WriteString(fmt.Sprintf("%s=\"%s\"\n", f.Name, value))
} else {
sb.WriteString(fmt.Sprintf("%s=%s\n", f.Name, value))
}
})
// Add auto-calculated volume size if it was computed
if !isFlagPassed("master.volumeSizeLimitMB") && miniMasterOptions.volumeSizeLimitMB != nil {
sb.WriteString(fmt.Sprintf("\n# Auto-calculated volume size based on total disk capacity\n"))
sb.WriteString(fmt.Sprintf("# Delete this line to force recalculation on next startup\n"))
sb.WriteString(fmt.Sprintf("master.volumeSizeLimitMB=%d\n", *miniMasterOptions.volumeSizeLimitMB))
}
if err := os.WriteFile(configFile, []byte(sb.String()), 0644); err != nil {
glog.Warningf("Failed to save configuration to %s: %v", configFile, err)
return err
}
glog.Infof("Mini configuration saved to %s", configFile)
return nil
}
func runMini(cmd *Command, args []string) bool {
// Load configuration from file if it exists
configOptions, err := loadMiniConfigurationFile(*miniDataFolders)
if err != nil {
glog.Warningf("Error loading configuration file: %v", err)
}
// Apply loaded options to flags (CLI flags will override these)
applyConfigFileOptions(configOptions)
if *miniOptions.debug {
grace.StartDebugServer(*miniOptions.debugPort)
}
@ -325,6 +519,20 @@ func runMini(cmd *Command, args []string) bool {
}
miniFilerOptions.defaultLevelDbDirectory = miniMasterOptions.metaFolder
// Calculate and set optimal volume size limit based on available disk space
// Only auto-calculate if user didn't explicitly specify a value via -master.volumeSizeLimitMB
if !isFlagPassed("master.volumeSizeLimitMB") {
// User didn't override, use auto-calculated value
// The -dir flag can accept comma-separated directories; use the first one for disk space calculation
resolvedDataFolder := util.ResolvePath(util.StringSplit(*miniDataFolders, ",")[0])
optimalVolumeSizeMB := calculateOptimalVolumeSizeMB(resolvedDataFolder)
miniMasterOptions.volumeSizeLimitMB = &optimalVolumeSizeMB
glog.Infof("Mini started with auto-calculated optimal volume size limit: %dMB", optimalVolumeSizeMB)
} else {
// User specified a custom value
glog.Infof("Mini started with user-specified volume size limit: %dMB", *miniMasterOptions.volumeSizeLimitMB)
}
miniWhiteList := util.StringSplit(*miniWhiteListOption, ",")
// Start all services with proper dependency coordination
@ -338,6 +546,9 @@ func runMini(cmd *Command, args []string) bool {
// Print welcome message after all services are running
printWelcomeMessage()
// Save configuration to file for persistence and documentation
saveMiniConfiguration(*miniDataFolders)
select {}
}
@ -633,7 +844,7 @@ const welcomeMessageTemplate = `
Volume Server: http://%s:%d
Optimized Settings:
Volume size limit: 128MB
Volume size limit: %dMB
Volume max: auto (based on free disk space)
Pre-stop seconds: 1 (faster shutdown)
Master peers: none (single master mode)
@ -673,6 +884,7 @@ func printWelcomeMessage() {
*miniIp, *miniWebDavOptions.port,
*miniIp, *miniAdminOptions.port,
*miniIp, *miniOptions.v.port,
*miniMasterOptions.volumeSizeLimitMB,
*miniDataFolders,
)

Loading…
Cancel
Save