Tree:
8d6bcddf60
add-ec-vacuum
add-filer-iam-grpc
add-iam-grpc-management
add_fasthttp_client
add_remote_storage
adding-message-queue-integration-tests
adjust-fsck-cutoff-default
admin/csrf-s3tables
allow-no-role-arn
also-delete-parent-directory-if-empty
avoid_releasing_temp_file_on_write
changing-to-zap
codex-rust-volume-server-bootstrap
collect-public-metrics
copilot/fix-helm-chart-installation
copilot/fix-s3-object-tagging-issue
copilot/make-renew-interval-configurable
copilot/make-renew-interval-configurable-again
copilot/sub-pr-7677
create-table-snapshot-api-design
data_query_pushdown
dependabot/maven/other/java/client/com.google.protobuf-protobuf-java-3.25.5
dependabot/maven/other/java/examples/org.apache.hadoop-hadoop-common-3.4.0
detect-and-plan-ec-tasks
do-not-retry-if-error-is-NotFound
ec-disk-type-support
enhance-erasure-coding
fasthttp
feature-8113-storage-class-disk-routing
feature/mini-port-detection
feature/modernize-s3-tests
feature/s3-multi-cert-support
feature/s3tables-improvements-and-spark-tests
filer1_maintenance_branch
fix-8303-s3-lifecycle-ttl-assign
fix-GetObjectLockConfigurationHandler
fix-bucket-name-case-7910
fix-helm-fromtoml-compatibility
fix-mount-http-parallelism
fix-mount-read-throughput-7504
fix-pr-7909
fix-s3-configure-consistency
fix-s3-object-tagging-issue-7589
fix-sts-session-token-7941
fix-versioning-listing-only
fix/iceberg-stage-create-semantics
fix/object-lock-delete-enforcement
fix/sts-body-preservation
fix/windows-test-file-cleanup
ftp
gh-pages
iam-multi-file-migration
iam-permissions-and-api
improve-fuse-mount
improve-fuse-mount2
logrus
master
message_send
mount2
mq-subscribe
mq2
nfs-cookie-prefix-list-fixes
optimize-delete-lookups
original_weed_mount
plugin-system-codex
plugin-system-phase1
plugin-ui-enhancements-restored
pr-7412
pr/7984
pr/8140
raft-dual-write
random_access_file
refactor-needle-read-operations
refactor-volume-write
remote_overlay
remove-implicit-directory-handling
revert-5134-patch-1
revert-5819-patch-1
revert-6434-bugfix-missing-s3-audit
s3-remote-cache-singleflight
s3-select
s3tables-by-claude
sub
tcp_read
test-reverting-lock-table
test_udp
testing
testing-sdx-generation
tikv
track-mount-e2e
upgrade-versions-to-4.00
volume_buffered_writes
worker-execute-ec-tasks
0.72
0.72.release
0.73
0.74
0.75
0.76
0.77
0.90
0.91
0.92
0.93
0.94
0.95
0.96
0.97
0.98
0.99
1.00
1.01
1.02
1.03
1.04
1.05
1.06
1.07
1.08
1.09
1.10
1.11
1.12
1.14
1.15
1.16
1.17
1.18
1.19
1.20
1.21
1.22
1.23
1.24
1.25
1.26
1.27
1.28
1.29
1.30
1.31
1.32
1.33
1.34
1.35
1.36
1.37
1.38
1.40
1.41
1.42
1.43
1.44
1.45
1.46
1.47
1.48
1.49
1.50
1.51
1.52
1.53
1.54
1.55
1.56
1.57
1.58
1.59
1.60
1.61
1.61RC
1.62
1.63
1.64
1.65
1.66
1.67
1.68
1.69
1.70
1.71
1.72
1.73
1.74
1.75
1.76
1.77
1.78
1.79
1.80
1.81
1.82
1.83
1.84
1.85
1.86
1.87
1.88
1.90
1.91
1.92
1.93
1.94
1.95
1.96
1.97
1.98
1.99
1;70
2.00
2.01
2.02
2.03
2.04
2.05
2.06
2.07
2.08
2.09
2.10
2.11
2.12
2.13
2.14
2.15
2.16
2.17
2.18
2.19
2.20
2.21
2.22
2.23
2.24
2.25
2.26
2.27
2.28
2.29
2.30
2.31
2.32
2.33
2.34
2.35
2.36
2.37
2.38
2.39
2.40
2.41
2.42
2.43
2.47
2.48
2.49
2.50
2.51
2.52
2.53
2.54
2.55
2.56
2.57
2.58
2.59
2.60
2.61
2.62
2.63
2.64
2.65
2.66
2.67
2.68
2.69
2.70
2.71
2.72
2.73
2.74
2.75
2.76
2.77
2.78
2.79
2.80
2.81
2.82
2.83
2.84
2.85
2.86
2.87
2.88
2.89
2.90
2.91
2.92
2.93
2.94
2.95
2.96
2.97
2.98
2.99
3.00
3.01
3.02
3.03
3.04
3.05
3.06
3.07
3.08
3.09
3.10
3.11
3.12
3.13
3.14
3.15
3.16
3.18
3.19
3.20
3.21
3.22
3.23
3.24
3.25
3.26
3.27
3.28
3.29
3.30
3.31
3.32
3.33
3.34
3.35
3.36
3.37
3.38
3.39
3.40
3.41
3.42
3.43
3.44
3.45
3.46
3.47
3.48
3.50
3.51
3.52
3.53
3.54
3.55
3.56
3.57
3.58
3.59
3.60
3.61
3.62
3.63
3.64
3.65
3.66
3.67
3.68
3.69
3.71
3.72
3.73
3.74
3.75
3.76
3.77
3.78
3.79
3.80
3.81
3.82
3.83
3.84
3.85
3.86
3.87
3.88
3.89
3.90
3.91
3.92
3.93
3.94
3.95
3.96
3.97
3.98
3.99
4.00
4.01
4.02
4.03
4.04
4.05
4.06
4.07
4.08
4.09
4.12
4.13
dev
helm-3.65.1
v0.69
v0.70beta
v3.33
${ noResults }
8 Commits (8d6bcddf60c918f57120031ca27a67d32faab710)
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
f5c666052e
|
feat: add S3 bucket size and object count metrics (#7776)
* feat: add S3 bucket size and object count metrics Adds periodic collection of bucket size metrics: - SeaweedFS_s3_bucket_size_bytes: logical size (deduplicated across replicas) - SeaweedFS_s3_bucket_physical_size_bytes: physical size (including replicas) - SeaweedFS_s3_bucket_object_count: object count (deduplicated) Collection runs every 1 minute via background goroutine that queries filer Statistics RPC for each bucket's collection. Also adds Grafana dashboard panels for: - S3 Bucket Size (logical vs physical) - S3 Bucket Object Count * address PR comments: fix bucket size metrics collection 1. Fix collectCollectionInfoFromMaster to use master VolumeList API - Now properly queries master for topology info - Uses WithMasterClient to get volume list from master - Correctly calculates logical vs physical size based on replication 2. Return error when filerClient is nil to trigger fallback - Changed from 'return nil, nil' to 'return nil, error' - Ensures fallback to filer stats is properly triggered 3. Implement pagination in listBucketNames - Added listBucketPageSize constant (1000) - Uses StartFromFileName for pagination - Continues fetching until fewer entries than limit returned 4. Handle NewReplicaPlacementFromByte error and prevent division by zero - Check error return from NewReplicaPlacementFromByte - Default to 1 copy if error occurs - Add explicit check for copyCount == 0 * simplify bucket size metrics: remove filer fallback, align with quota enforcement - Remove fallback to filer Statistics RPC - Use only master topology for collection info (same as s3.bucket.quota.enforce) - Updated comments to clarify this runs the same collection logic as quota enforcement - Simplified code by removing collectBucketSizeFromFilerStats * use s3a.option.Masters directly instead of querying filer * address PR comments: fix dashboard overlaps and improve metrics collection Grafana dashboard fixes: - Fix overlapping panels 55 and 59 in grafana_seaweedfs.json (moved 59 to y=30) - Fix grid collision in k8s dashboard (moved panel 72 to y=48) - Aggregate bucket metrics with max() by (bucket) for multi-instance S3 gateways Go code improvements: - Add graceful shutdown support via context cancellation - Use ticker instead of time.Sleep for better shutdown responsiveness - Distinguish EOF from actual errors in stream handling * improve bucket size metrics: multi-master failover and proper error handling - Initial delay now respects context cancellation using select with time.After - Use WithOneOfGrpcMasterClients for multi-master failover instead of hardcoding Masters[0] - Properly propagate stream errors instead of just logging them (EOF vs real errors) * improve bucket size metrics: distributed lock and volume ID deduplication - Add distributed lock (LiveLock) so only one S3 instance collects metrics at a time - Add IsLocked() method to LiveLock for checking lock status - Fix deduplication: use volume ID tracking instead of dividing by copyCount - Previous approach gave wrong results if replicas were missing - Now tracks seen volume IDs and counts each volume only once - Physical size still includes all replicas for accurate disk usage reporting * rename lock to s3.leader * simplify: remove StartBucketSizeMetricsCollection wrapper function * fix data race: use atomic operations for LiveLock.isLocked field - Change isLocked from bool to int32 - Use atomic.LoadInt32/StoreInt32 for all reads/writes - Sync shared isLocked field in StartLongLivedLock goroutine * add nil check for topology info to prevent panic * fix bucket metrics: use Ticker for consistent intervals, fix pagination logic - Use time.Ticker instead of time.After for consistent interval execution - Fix pagination: count all entries (not just directories) for proper termination - Update lastFileName for all entries to prevent pagination issues * address PR comments: remove redundant atomic store, propagate context - Remove redundant atomic.StoreInt32 in StartLongLivedLock (AttemptToLock already sets it) - Propagate context through metrics collection for proper cancellation on shutdown - collectAndUpdateBucketSizeMetrics now accepts ctx - collectCollectionInfoFromMaster uses ctx for VolumeList RPC - listBucketNames uses ctx for ListEntries RPC |
2 months ago |
|
|
93d0779318
|
fix: add S3 bucket traffic sent metric tracking (#7774)
* fix: add S3 bucket traffic sent metric tracking The BucketTrafficSent() function was defined but never called, causing the S3 Bucket Traffic Sent Grafana dashboard panel to not display data. Added BucketTrafficSent() calls in the streaming functions: - streamFromVolumeServers: for inline and chunked content - streamFromVolumeServersWithSSE: for encrypted range and full object requests The traffic received metric already worked because BucketTrafficReceived() was properly called in putToFiler() for both regular and multipart uploads. * feat: add S3 API Calls per Bucket panel to Grafana dashboards Added a new panel showing API calls per bucket using the existing SeaweedFS_s3_request_total metric aggregated by bucket. Updated all Grafana dashboard files: - other/metrics/grafana_seaweedfs.json - other/metrics/grafana_seaweedfs_k8s.json - other/metrics/grafana_seaweedfs_heartbeat.json - k8s/charts/seaweedfs/dashboards/seaweedfs-grafana-dashboard.json * address PR comments: use actual bytes written for traffic metrics - Use actual bytes written from w.Write instead of expected size for inline content - Add countingWriter wrapper to track actual bytes for chunked content streaming - Update streamDecryptedRangeFromChunks to return actual bytes written for SSE - Remove redundant nil check that caused linter warning - Fix duplicate panel id 86 in grafana_seaweedfs.json (changed to 90) - Fix overlapping panel positions in grafana_seaweedfs_k8s.json (rebalanced x positions) * fix grafana k8s dashboard: rebalance S3 panels to avoid overlap - Panel 86 (S3 API Calls per Bucket): w:6, x:0, y:15 - Panel 67 (S3 Request Duration 95th): w:6, x:6, y:15 - Panel 68 (S3 Request Duration 80th): w:6, x:12, y:15 - Panel 65 (S3 Request Duration 99th): w:6, x:18, y:15 All four S3 panels now fit in a single row (y:15) with width 6 each. Filer row header at y:22 and subsequent panels remain correctly positioned. * add input validation and clarify comments in adjustRangeForPart - Add validation that partStartOffset <= partEndOffset at function start - Add clarifying comments for suffix-range handling where clientEnd temporarily holds the suffix length before being reassigned * align pluginVersion for panel 86 to 10.3.1 in k8s dashboard * track partial writes for accurate egress traffic accounting - Change condition from 'err == nil' to 'written > 0' for inline content - Move BucketTrafficSent before error check for chunked content streaming - Track traffic even on partial SSE range writes - Track traffic even on partial full SSE object copies This ensures egress traffic is counted even when writes fail partway through, providing more accurate bandwidth metrics. |
2 months ago |
|
|
c7ae969c06
|
Add bucket's traffic metrics (#6444)
* Add bucket's traffic metrics * Add bucket traffic to dashboards * Fix bucket metrics help messages * Fix variable names |
1 year ago |
|
|
75f7893c11
|
feat: Add datasource as variable (#4584)
|
3 years ago |
|
|
28a1f42962
|
Helm chart monitoring improvements (#3406)
|
4 years ago |
|
|
ec51d77dcf
|
grafana: remove incorrect QPS factor
|
5 years ago |
|
|
6f501a9f0e |
add service monitors
|
5 years ago |
|
|
d59446cbb7 |
add grafana seaweedfs dashboard for k8s
|
5 years ago |