Branch:
feature/mq-kafka-gateway-m1
8 Commits (feature/mq-kafka-gateway-m1)
Author | SHA1 | Message | Date |
---|---|---|---|
|
093b5062e4 |
Add detailed debug logging for consumer group coordination
- JoinGroup: request parsing, state transitions, member add/update, response summary
- SyncGroup: pre/post state, leader/non-leader paths, assignment summary
- Heartbeat: parse/validate, member/gen checks, response details
- LeaveGroup: parse, pre/post state, member removal, leader transitions
- OffsetCommit/Fetch: parse errors, request summaries, response counts
- Assignment strategies: topic/member/partition assignment tracing
- Group cleanup: expired member removal, state transitions, stuck rebalance

These logs will help trace timeouts in OffsetManagement by showing exactly where coordination stalls or misaligns with client expectations. |
7 hours ago |
|
b30834cc95 |
kafka: fix deadlock issues in static membership tests
- Fix deadlock in FindStaticMember by adding FindStaticMemberLocked version
- Fix deadlock in RegisterStaticMember by adding RegisterStaticMemberLocked version
- Fix deadlock in UnregisterStaticMember by adding UnregisterStaticMemberLocked version
- Fix GroupInstanceID parsing in parseLeaveGroupRequest method
- All static membership tests now pass without deadlocks:
  - JoinGroup static membership (join, reconnection, dynamic members)
  - LeaveGroup static membership (leave, wrong instance ID validation)
  - DescribeGroups static membership

The deadlocks occurred because protocol handlers were calling GroupCoordinator methods that tried to acquire locks on groups that were already locked by the calling handler. The fix introduces *Locked versions of these methods that assume the group is already locked by the caller. |
4 days ago |
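The *Locked convention described in commit b30834cc95 can be illustrated with a minimal sketch. The method names `FindStaticMember` and `FindStaticMemberLocked` come from the commit message; the struct fields, signatures, and the `handleJoinGroup` helper are assumptions made for illustration and are not taken from the gateway's source.

```go
package main

import "sync"

// ConsumerGroup is a simplified stand-in for the gateway's group type.
// Field names are assumptions for this sketch.
type ConsumerGroup struct {
	Mu            sync.Mutex
	StaticMembers map[string]string // group.instance.id -> member ID
}

// GroupCoordinator is a hypothetical, stateless coordinator.
type GroupCoordinator struct{}

// FindStaticMemberLocked assumes the caller already holds group.Mu.
func (gc *GroupCoordinator) FindStaticMemberLocked(group *ConsumerGroup, instanceID string) (string, bool) {
	id, ok := group.StaticMembers[instanceID]
	return id, ok
}

// FindStaticMember takes the lock itself, then delegates to the Locked variant.
func (gc *GroupCoordinator) FindStaticMember(group *ConsumerGroup, instanceID string) (string, bool) {
	group.Mu.Lock()
	defer group.Mu.Unlock()
	return gc.FindStaticMemberLocked(group, instanceID)
}

// handleJoinGroup models a protocol handler that locks the group first.
// Calling gc.FindStaticMember here would block forever, because
// sync.Mutex is not reentrant; the Locked variant avoids that.
func handleJoinGroup(gc *GroupCoordinator, group *ConsumerGroup, instanceID string) string {
	group.Mu.Lock()
	defer group.Mu.Unlock()
	if id, ok := gc.FindStaticMemberLocked(group, instanceID); ok {
		return id
	}
	id := "static-" + instanceID // hypothetical member ID scheme
	group.StaticMembers[instanceID] = id
	return id
}
```

Callers that do not hold the lock use the plain method; handlers that already hold it use the Locked variant.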
|
de85e9b90d |
Implement incremental cooperative rebalancing protocol
- Add IncrementalCooperativeAssignmentStrategy with two-phase rebalancing:
  * Revocation phase: Members give up partitions that need reassignment
  * Assignment phase: Members receive new partitions after revocation
- Implement IncrementalRebalanceState to track rebalance progress:
  * Phase tracking (None, Revocation, Assignment)
  * Revocation timeout handling with configurable timeouts
  * Partition tracking for revoked and pending assignments
- Add sophisticated assignment logic:
  * Respect member topic subscriptions when distributing partitions
  * Calculate ideal assignments and determine necessary revocations
  * Support multiple topics with different subscription patterns
  * Minimize partition movement while ensuring fairness
- Add comprehensive test coverage:
  * Basic assignment scenarios without rebalancing
  * Rebalance scenarios with revocation and assignment phases
  * Multiple topic scenarios with mixed subscriptions
  * Timeout handling and forced completion
  * State transition verification
- Update GetAssignmentStrategy to support 'incremental-cooperative' protocol
- Implement monitoring methods:
  * IsRebalanceInProgress() for status checking
  * GetRebalanceState() for detailed state inspection
  * ForceCompleteRebalance() for timeout scenarios

This enables advanced rebalancing that reduces 'stop-the-world' effects by allowing consumers to incrementally give up and receive partitions during rebalancing. |
4 days ago |
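The two-phase idea in commit de85e9b90d (revoke first, then assign) can be sketched for a single topic as follows. This is a simplified illustration under stated assumptions: the function names `quotas`, `revocationPhase`, and `assignmentPhase` are hypothetical, subscriptions and timeouts are ignored, and the quota rule (even split, extras to the first members in sorted order) is a stand-in for whatever fairness rule the real strategy uses.

```go
package main

import "sort"

// quotas computes how many partitions each member should own after the
// rebalance: an even split, with any remainder going to the first members
// in sorted order (an assumed fairness rule).
func quotas(members []string, numPartitions int) map[string]int {
	sorted := append([]string(nil), members...)
	sort.Strings(sorted)
	q := make(map[string]int, len(sorted))
	if len(sorted) == 0 {
		return q
	}
	base, extra := numPartitions/len(sorted), numPartitions%len(sorted)
	for i, m := range sorted {
		q[m] = base
		if i < extra {
			q[m]++
		}
	}
	return q
}

// revocationPhase lets each member keep at most its quota and returns the
// partitions that must be given up. Members keep what they already own,
// which limits partition movement.
func revocationPhase(current map[string][]int, quota map[string]int) (map[string][]int, []int) {
	kept := make(map[string][]int, len(quota))
	var revoked []int
	for m, q := range quota {
		owned := current[m]
		if len(owned) > q {
			revoked = append(revoked, owned[q:]...)
			owned = owned[:q]
		}
		kept[m] = append([]int(nil), owned...)
	}
	sort.Ints(revoked)
	return kept, revoked
}

// assignmentPhase hands free partitions (revoked or previously unowned)
// to members that are still below quota.
func assignmentPhase(kept map[string][]int, quota map[string]int, free []int) map[string][]int {
	members := make([]string, 0, len(quota))
	for m := range quota {
		members = append(members, m)
	}
	sort.Strings(members)
	result := make(map[string][]int, len(members))
	next := 0
	for _, m := range members {
		result[m] = append([]int(nil), kept[m]...)
		for len(result[m]) < quota[m] && next < len(free) {
			result[m] = append(result[m], free[next])
			next++
		}
	}
	return result
}
```

With member `a` owning partitions 0 to 3 and a new member `b` joining, `a` keeps two partitions and continues consuming them while the other two are revoked and then assigned to `b`.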
|
c438e0cf33 |
Implement static membership support (group.instance.id)
- Add GroupInstanceID field to GroupMember struct
- Add StaticMembers mapping to ConsumerGroup for instance ID tracking
- Implement static member management methods:
  * FindStaticMember, RegisterStaticMember, UnregisterStaticMember
  * IsStaticMember for checking membership type
- Update JoinGroup handler to support static membership:
  * Check for existing static members by instance ID
  * Register new static members automatically
  * Generate appropriate member IDs for static vs dynamic members
- Update LeaveGroup handler for static member validation:
  * Verify GroupInstanceID matches for static members
  * Return FENCED_INSTANCE_ID error for mismatched instance IDs
  * Unregister static members on successful leave
- Update DescribeGroups to return GroupInstanceID in member info
- Add comprehensive tests for static membership functionality:
  * Basic registration and lookup
  * Member reconnection scenarios
  * Edge cases and error conditions
  * Concurrent access patterns

Static membership enables sticky partition assignments and reduces rebalancing overhead for long-running consumers. |
4 days ago |
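A minimal sketch of the static membership flow from commit c438e0cf33 is shown below. `GroupMember`, `GroupInstanceID`, `ConsumerGroup`, and `StaticMembers` are named in the commit message; the `Join` and `Leave` methods, the member ID scheme, and the remaining fields are assumptions for illustration. The numeric error codes are the standard Kafka protocol values (UNKNOWN_MEMBER_ID = 25, FENCED_INSTANCE_ID = 82).

```go
package main

import "fmt"

// Kafka protocol error codes used in this sketch.
const (
	ErrNone             int16 = 0
	ErrUnknownMemberID  int16 = 25
	ErrFencedInstanceID int16 = 82
)

// GroupMember carries an optional group.instance.id; empty means dynamic.
type GroupMember struct {
	ID              string
	GroupInstanceID string
}

// ConsumerGroup tracks members and the instance ID -> member ID mapping.
type ConsumerGroup struct {
	Members       map[string]*GroupMember
	StaticMembers map[string]string
	nextID        int // hypothetical counter for generating member IDs
}

// Join returns the existing member ID when a static member reconnects,
// otherwise it registers a new member.
func (g *ConsumerGroup) Join(clientID, instanceID string) string {
	if instanceID != "" {
		if id, ok := g.StaticMembers[instanceID]; ok {
			return id // reconnection keeps the same identity
		}
	}
	g.nextID++
	id := fmt.Sprintf("%s-%d", clientID, g.nextID)
	g.Members[id] = &GroupMember{ID: id, GroupInstanceID: instanceID}
	if instanceID != "" {
		g.StaticMembers[instanceID] = id
	}
	return id
}

// Leave validates the instance ID for static members before removing them.
func (g *ConsumerGroup) Leave(memberID, instanceID string) int16 {
	m, ok := g.Members[memberID]
	if !ok {
		return ErrUnknownMemberID
	}
	if m.GroupInstanceID != "" && m.GroupInstanceID != instanceID {
		return ErrFencedInstanceID
	}
	delete(g.Members, memberID)
	if m.GroupInstanceID != "" {
		delete(g.StaticMembers, m.GroupInstanceID)
	}
	return ErrNone
}
```

Because a reconnecting static member gets its previous member ID back, the group does not need to treat the reconnect as a membership change.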
|
f1610b5bac |
fix: LeaveGroup v0 response format for kafka-go compatibility
- LeaveGroup v0 now sends only correlation_id + error_code (6 bytes total)
- Removed throttle_time_ms and members array from v0 response
- Added version-specific response builders for LeaveGroup
- Matches kafka-go LeaveGroup v0 specification exactly

Consumer group protocol progression:
✅ FindCoordinator v0 - fixed
✅ JoinGroup v2 - fixed (empty members array)
✅ SyncGroup v0 - working
✅ LeaveGroup v0 - fixed

Next: Debug remaining 'left 61 unread bytes' error during message fetching |
4 days ago |
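The 6-byte LeaveGroup v0 response from commit f1610b5bac is a 4-byte correlation_id followed by a 2-byte error_code, both big-endian as in the rest of the Kafka wire protocol. The sketch below shows one way to build it; the function name `buildLeaveGroupV0Response` is hypothetical, and the outer message length prefix is assumed to be added elsewhere.

```go
package main

import "encoding/binary"

// buildLeaveGroupV0Response encodes correlation_id (int32) followed by
// error_code (int16) in network byte order. No throttle_time_ms and no
// members array are written for v0.
func buildLeaveGroupV0Response(correlationID uint32, errorCode int16) []byte {
	resp := make([]byte, 6)
	binary.BigEndian.PutUint32(resp[0:4], correlationID)
	binary.BigEndian.PutUint16(resp[4:6], uint16(errorCode))
	return resp
}
```

A client that expects v0 reads exactly these 6 bytes, so any extra field would show up as unread or misparsed data.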
|
297c662191 |
Phase 7: Comprehensive error handling and edge cases
- Added centralized errors.go with complete Kafka error code definitions
- Implemented timeout detection and network error classification
- Enhanced connection handling with configurable timeouts and better error reporting
- Added comprehensive error handling test suite with 21 test cases
- Unified error code usage across all protocol handlers
- Improved request/response timeout handling with graceful fallbacks
- All protocol and E2E tests passing with robust error handling |
6 days ago |
|
6eafc87413 |
fix remaining tests
|
6 days ago |
|
3802106acf |
mq(kafka): Phase 3 Step 3 - Consumer Coordination
- Implement Heartbeat API (key 12) for consumer group liveness
- Implement LeaveGroup API (key 13) for graceful consumer departure
- Add comprehensive consumer coordination with state management:
  * Heartbeat validation with generation and member checks
  * Rebalance state signaling to consumers via heartbeat responses
  * Graceful member departure with automatic rebalancing trigger
  * Leader election when group leader leaves
  * Group state transitions: stable -> rebalancing -> empty
  * Subscription topic updates when members leave
- Update ApiVersions to advertise 13 APIs total (was 11)
- Complete test suite with 12 new test cases covering:
  * Heartbeat success, rebalance signaling, generation validation
  * Member departure, leader changes, empty group handling
  * Error conditions (unknown member, wrong generation, invalid group)
  * End-to-end coordination workflows
  * Request parsing and response building
- All integration tests pass with updated API count (13 APIs)
- E2E tests show '96 bytes' response (increased from 84 bytes)

This completes Phase 3 consumer group implementation, providing full distributed consumer coordination compatible with Kafka client libraries. Consumers can now join groups, coordinate partitions, commit offsets, send heartbeats, and leave gracefully with automatic rebalancing. |
1 week ago |
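The heartbeat checks and leave handling listed in commit 3802106acf can be sketched as below. The type and function names are hypothetical, and choosing the lexicographically smallest member ID as the new leader is an assumed election rule. The error codes are the standard Kafka values (ILLEGAL_GENERATION = 22, UNKNOWN_MEMBER_ID = 25, REBALANCE_IN_PROGRESS = 27).

```go
package main

import "sort"

// Kafka protocol error codes used in this sketch.
const (
	ErrNone                int16 = 0
	ErrIllegalGeneration   int16 = 22
	ErrUnknownMemberID     int16 = 25
	ErrRebalanceInProgress int16 = 27
)

// GroupState models the transitions stable -> rebalancing -> empty.
type GroupState int

const (
	StateEmpty GroupState = iota
	StateStable
	StateRebalancing
)

// ConsumerGroup is a simplified group; field names are assumptions.
type ConsumerGroup struct {
	Generation int32
	State      GroupState
	Leader     string
	Members    map[string]bool
}

// validateHeartbeat checks membership, then generation, then signals a
// pending rebalance to the consumer through the response code.
func validateHeartbeat(g *ConsumerGroup, memberID string, generation int32) int16 {
	if !g.Members[memberID] {
		return ErrUnknownMemberID
	}
	if generation != g.Generation {
		return ErrIllegalGeneration
	}
	if g.State == StateRebalancing {
		return ErrRebalanceInProgress
	}
	return ErrNone
}

// Leave removes a member, elects a new leader if needed, and triggers a
// rebalance, or marks the group empty when the last member departs.
func (g *ConsumerGroup) Leave(memberID string) int16 {
	if !g.Members[memberID] {
		return ErrUnknownMemberID
	}
	delete(g.Members, memberID)
	if len(g.Members) == 0 {
		g.Leader = ""
		g.State = StateEmpty
		return ErrNone
	}
	if g.Leader == memberID {
		ids := make([]string, 0, len(g.Members))
		for id := range g.Members {
			ids = append(ids, id)
		}
		sort.Strings(ids)
		g.Leader = ids[0] // assumed rule: smallest member ID becomes leader
	}
	g.State = StateRebalancing
	return ErrNone
}
```

After a member leaves, the remaining members learn about the rebalance on their next heartbeat and are expected to rejoin the group.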