Phase 3 Continued: Early Channel Closure Detection
Added detection and logging for cases where Sarama's claim.Messages()
channel closes prematurely (indicating broker stream termination).
Changes:
- consumer.go: Distinguish between normal and abnormal channel closures
- Mark partitions whose channel closes after < 10 messages as CRITICAL
- Log the last consumed offset vs. the high-water mark (HWM) on early closure
Current Test Results:
Delivery: 84-87.5% (1974-2055 / 2349-2350)
Missing: 12.5-16% (294-376 messages)
Duplicates: 0 ✅
Errors: 0 ✅
Pattern: 2-3 partitions receive only 1-10 messages before the channel closes
Suggests: Broker or middleware is prematurely closing the subscription
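The percentages above can be reproduced from the raw counts. The sketch below assumes the lowest delivered count (1974) belongs to the run with 2350 messages and the highest (2055) to the run with 2349; that pairing is an inference from the reported ranges, not something stated in the test output. `deliveryPct` is a hypothetical helper.

```go
package main

import "fmt"

// deliveryPct returns delivered/total as a percentage.
func deliveryPct(delivered, total int) float64 {
	return 100 * float64(delivered) / float64(total)
}

func main() {
	// Assumed pairing of delivered counts with totals (see note above).
	runs := []struct{ delivered, total int }{
		{1974, 2350}, // worst run
		{2055, 2349}, // best run
	}
	for _, r := range runs {
		pct := deliveryPct(r.delivered, r.total)
		fmt.Printf("delivered %d/%d = %.1f%%, missing %d (%.1f%%)\n",
			r.delivered, r.total, pct, r.total-r.delivered, 100-pct)
	}
}
```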
Key Observations:
- Most (13/15) partitions work perfectly
- The remaining issue reproduces on the same 2-3 partitions
- The Messages() channel closes after the first few messages
- Could be:
  * Broker connection reset
  * Fetch request error not being surfaced
  * Offset commit failure
  * Rebalancing triggered prematurely
Next Investigation:
- Add Sarama debug logging to see broker errors
- Check if fetch requests are returning errors silently
- Monitor offset commits on affected partitions
- Test with a longer-running consumer
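One possible way to approach the first two items is to route Sarama's internal log output somewhere visible and ask the consumer group to return errors on its Errors() channel instead of only logging them. The following is a sketch, not the project's actual setup: the package name, `newDiagnosticConfig`, and `logGroupErrors` are hypothetical, and the import path assumes the IBM fork (older code imports github.com/Shopify/sarama). It needs the Sarama module and a running broker to exercise, so it is shown as a configuration fragment only.

```go
package kafkadiag // hypothetical package name

import (
	"log"
	"os"

	"github.com/IBM/sarama" // assumption: adjust to the import path the project uses
)

// newDiagnosticConfig returns a config that surfaces consumer errors
// to the application rather than only writing them to sarama.Logger.
func newDiagnosticConfig() *sarama.Config {
	// Sarama discards its internal log output by default; send it to stderr.
	sarama.Logger = log.New(os.Stderr, "[sarama] ", log.LstdFlags)

	cfg := sarama.NewConfig()
	// Deliver consume errors on the group's Errors() channel.
	cfg.Consumer.Return.Errors = true
	return cfg
}

// logGroupErrors drains the Errors() channel until the group is closed.
// Run it in its own goroutine alongside the Consume loop.
func logGroupErrors(group sarama.ConsumerGroup) {
	for err := range group.Errors() {
		log.Printf("consumer group error: %v", err)
	}
}
```

With Consumer.Return.Errors enabled, the Errors() channel must be read continuously; otherwise it can fill up and stall the consumer.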
Going from 0% to 84-87.5% delivery is EXCELLENT PROGRESS.
The remaining 12.5-16% loss is concentrated on the same reproducible partitions.