# Filer Benchmark Tool

A simple Go program to benchmark SeaweedFS filer performance and detect race conditions with concurrent file operations.

## Overview

This tool creates 300 (configurable) goroutines that concurrently:

1. Create empty files on the filer
2. Add multiple chunks to each file (with fake file IDs)
3. Verify the file was created successfully

This simulates the race condition scenario from [Issue #7062](https://github.com/seaweedfs/seaweedfs/issues/7062) where concurrent operations can lead to metadata inconsistencies.

## Usage

### Build and Run Directly

```bash
# Build the tool
go build -o bin/filer_benchmark ./cmd/filer_benchmark/

# Basic usage (single filer)
./bin/filer_benchmark -filers=localhost:8888

# Test with multiple filers
./bin/filer_benchmark -filers=localhost:8888,localhost:8889,localhost:8890

# High concurrency race condition test
./bin/filer_benchmark -goroutines=500 -loops=200 -verbose
```

### Using Helper Scripts

```bash
# Use the wrapper script with predefined configurations
./scripts/run_filer_benchmark.sh

# Run example test suite
./examples/run_filer_race_test.sh
```

## Configuration Options

| Flag | Default | Description |
|------|---------|-------------|
| `-filers` | `localhost:8888` | Comma-separated list of filer addresses |
| `-goroutines` | `300` | Number of concurrent goroutines |
| `-loops` | `100` | Number of operations per goroutine |
| `-chunkSize` | `1048576` | Chunk size in bytes (1MB) |
| `-chunksPerFile` | `5` | Number of chunks per file |
| `-testDir` | `/benchmark` | Test directory on filer |
| `-verbose` | `false` | Enable verbose error logging |

## Race Condition Detection

The tool detects race conditions by monitoring for these error patterns:

- `leveldb: closed` - Metadata cache closed during operation
- `transport is closing` - gRPC connection closed during operation
- `connection refused` - Network connectivity issues
- `not found after creation` - File disappeared after being created

## Example Output

```
============================================================
FILER BENCHMARK RESULTS
============================================================
Configuration:
  Filers: localhost:8888,localhost:8889,localhost:8890
  Goroutines: 300
  Loops per goroutine: 100
  Chunks per file: 5
  Chunk size: 1048576 bytes

Results:
  Total operations attempted: 30000
  Files successfully created: 29850
  Total chunks added: 149250
  Errors: 150
  Race condition errors: 23
  Success rate: 99.50%

Performance:
  Total duration: 45.2s
  Operations/second: 663.72
  Files/second: 660.18
  Chunks/second: 3300.88

Race Condition Analysis:
  Race condition rate: 0.0767%
  Race conditions detected: 23
  🟡 MODERATE race condition rate
  Overall error rate: 0.50%
============================================================
```

## Test Scenarios

### 1. Basic Functionality Test

```bash
./bin/filer_benchmark -goroutines=20 -loops=10
```

Low concurrency test to verify basic functionality.

### 2. Race Condition Reproduction

```bash
./bin/filer_benchmark -goroutines=500 -loops=100 -verbose
```

High concurrency test designed to trigger race conditions.

### 3. Multi-Filer Load Test

```bash
./bin/filer_benchmark -filers=filer1:8888,filer2:8888,filer3:8888 -goroutines=300
```

Distribute load across multiple filers.

### 4. Small Files Benchmark

```bash
./bin/filer_benchmark -chunkSize=4096 -chunksPerFile=1 -goroutines=1000
```

Test with many small files to stress metadata operations.

## How It Simulates Race Conditions

1. **Concurrent Operations**: Multiple goroutines perform file operations simultaneously
2. **Random Timing**: Small random delays create timing variations
3. **Fake Chunks**: Uses file IDs without actual volume server data to focus on metadata operations
4. **Verification Step**: Attempts to read files immediately after creation to catch race conditions
5. **Multiple Filers**: Distributes load randomly across multiple filer instances

## Prerequisites

- SeaweedFS master server running
- SeaweedFS filer server(s) running
- Go 1.19+ for building
- Network connectivity to filer endpoints

## Integration with Issue #7062

This tool reproduces the core problem from the original issue:

- **Concurrent file operations** (simulated by goroutines)
- **Metadata race conditions** (detected through error patterns)
- **Transport disconnections** (monitored in error analysis)
- **File inconsistencies** (caught by verification steps)

The key difference is this tool focuses on the filer metadata layer rather than the full CSI driver + mount stack, making it easier to isolate and debug the race condition.

## Debugging Findings

### Multi-Filer vs Single-Filer Connection Issue

**Problem**: When using multiple filers with independent stores (non-shared backend), the benchmark may fail with errors like:

- `update entry with chunks failed: rpc error: code = Unknown desc = not found /benchmark/file_X: filer: no entry is found in filer store`
- `CreateEntry /benchmark/file_X: /benchmark should be a directory`

**Root Cause**: The issue is NOT missing metadata events, but rather the benchmark's round-robin load balancing across filers:

1. **File Creation**: Benchmark creates `file_X` on `filer1`
2. **Chunk Updates**: Benchmark tries to update `file_X` on `filer2` or `filer3`
3. **Error**: `filer2`/`filer3` don't have `file_X` in their local store yet (metadata sync delay)

**Verification**: Running with single filer connection (`-filers localhost:18888`) while 3 filers are running shows **NO missed events**, confirming metadata synchronization works correctly.

**Solutions**:

- Ensure `/benchmark` directory exists on ALL filers before starting
- Use file affinity (same filer for create/update operations)
- Add retry logic for cross-filer operations
- Add small delays to allow metadata sync between operations
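
The file-affinity and retry ideas in the list above could be sketched roughly as follows. This is a minimal illustration, not the benchmark's actual code: `pickFiler`, `withRetry`, and `errNotFound` are hypothetical names, and `errNotFound` only stands in for the `no entry is found in filer store` error that a real client would have to recognize in the gRPC response.

```go
package main

import (
	"errors"
	"fmt"
	"hash/fnv"
	"time"
)

// errNotFound is a hypothetical stand-in for the
// "filer: no entry is found in filer store" error.
var errNotFound = errors.New("filer: no entry is found in filer store")

// pickFiler hashes the file path so that the create and update calls for
// one file always go to the same filer. filers must be non-empty.
func pickFiler(filers []string, path string) string {
	h := fnv.New32a()
	h.Write([]byte(path)) // Write on an FNV hash never returns an error
	return filers[h.Sum32()%uint32(len(filers))]
}

// withRetry calls op up to attempts times, retrying only on errNotFound
// and waiting a little longer before each retry (linear backoff).
func withRetry(attempts int, delay time.Duration, op func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		err = op()
		if err == nil || !errors.Is(err, errNotFound) {
			return err
		}
		if i < attempts-1 {
			time.Sleep(delay * time.Duration(i+1))
		}
	}
	return err
}

func main() {
	filers := []string{"localhost:8888", "localhost:8889", "localhost:8890"}
	path := "/benchmark/file_42"

	// Both the create and the chunk update would use this same address.
	fmt.Println("filer for", path, "is", pickFiler(filers, path))

	// Simulate an update that fails twice while metadata is still syncing.
	calls := 0
	err := withRetry(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errNotFound
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

Hashing the path gives affinity without any shared state between goroutines. The retry helper is only needed when an operation must go to a different filer than the one that created the file.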