- Add retry logic to updateLatestVersionInDirectory to handle cases where
.versions directory creation succeeds but is not immediately visible
- Add retry logic to getLatestObjectVersion for the same consistency issue
- Use 3 retries with 50ms delays to handle filer store consistency timing
- Addresses CI failures where 'filer: no entry is found in filer store'
occurs after successful directory creation
- Maintains CI debug logging to track retry attempts and outcomes
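The retry behaviour described above could look roughly like the sketch below. It is not the actual code in updateLatestVersionInDirectory or getLatestObjectVersion: `retryOnNotFound` and `errNotFound` are hypothetical names, and reading "3 retries" as three attempts in total is an assumption.

```go
package main

import (
	"errors"
	"time"
)

// errNotFound is a hypothetical stand-in for the filer store's
// "no entry is found in filer store" error.
var errNotFound = errors.New("filer: no entry is found in filer store")

// retryOnNotFound calls fn up to attempts times, sleeping for delay between
// attempts. It only retries when fn reports errNotFound; any other error, or
// success, is returned immediately.
func retryOnNotFound(attempts int, delay time.Duration, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		err = fn()
		if err == nil || !errors.Is(err, errNotFound) {
			return err
		}
		// Do not sleep after the final attempt.
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return err
}
```

A caller would wrap the directory lookup, for example `retryOnNotFound(3, 50*time.Millisecond, lookup)`, where `lookup` is whatever function reads the .versions entry.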
- Add detailed logging for .versions metadata updates in putVersionedObject
- Add logging for latest version resolution in getLatestObjectVersion
- Add logging for HeadObject latest version requests
- All logs use glog.V(0) with CI-DEBUG prefix for easy filtering
- Will help diagnose timing issues between object creation and retrieval in CI
Debug logs will show:
- When .versions metadata updates start and complete
- When HeadObject tries to read latest version metadata
- Race conditions if HeadObject runs before metadata update completes
- Missing metadata if .versions directory exists but metadata keys are missing
- File access issues if version files exist but can't be accessed
- Different Owner: Always fails with 409 BucketAlreadyExists
  - Checks the s3-identity-id header against the stored bucket owner
  - Returns error immediately if owners don't match
- Same Owner + Conflicting Settings: Fails with 409 BucketAlreadyExists
  - Compares requested Object Lock settings with existing bucket configuration
  - Returns error if settings are incompatible (e.g., trying to enable Object Lock on a bucket that doesn't have it)
- Same Owner + Compatible Settings: Returns 200 OK (idempotent)
  - Applies if the bucket already exists with the same owner and compatible settings
  - Returns success response without recreating the bucket
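The three cases can be summarised as a small decision function. This is a minimal sketch, not the handler's real code: `bucketState` and `decideCreateBucket` are hypothetical names. It assumes the only incompatible combination is the one named above (requesting Object Lock on a bucket that lacks it).

```go
package main

import "net/http"

// bucketState is a hypothetical view of an existing bucket's metadata.
type bucketState struct {
	Owner             string
	ObjectLockEnabled bool
}

// decideCreateBucket returns the HTTP status and S3 error code for a
// CreateBucket request against a bucket that already exists.
// requester is the identity taken from the s3-identity-id header.
func decideCreateBucket(existing bucketState, requester string, wantObjectLock bool) (int, string) {
	// Different owner: always a conflict.
	if existing.Owner != requester {
		return http.StatusConflict, "BucketAlreadyExists"
	}
	// Same owner, conflicting settings (assumed: enabling Object Lock on a
	// bucket that was created without it).
	if wantObjectLock && !existing.ObjectLockEnabled {
		return http.StatusConflict, "BucketAlreadyExists"
	}
	// Same owner, compatible settings: idempotent success, nothing recreated.
	return http.StatusOK, ""
}
```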
Skip long-polling if any requested topic does not exist.
Only long-poll when MinBytes > 0, data isn’t available yet, and all topics exist.
Cap the long-polling wait to 1s in tests to prevent hanging on shutdown.
Busy fetch loop: Implemented basic long-polling in Fetch. If no data is available and both min_bytes > 0 and max_wait_ms > 0, Fetch waits up to max_wait_ms and populates throttle_time_ms accordingly. This stops kafka-go from polling empty partitions in a tight loop.
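The long-polling conditions can be expressed as two small helpers. This is a sketch under assumptions: `shouldLongPoll` and `clampWait` are hypothetical names, and the real Fetch handler may structure the checks differently.

```go
package main

import "time"

// shouldLongPoll reports whether a Fetch request should wait for data:
// the client asked for at least one byte, allowed a non-zero wait, no data
// is available yet, and every requested topic exists.
func shouldLongPoll(minBytes int32, maxWait time.Duration, hasData, allTopicsExist bool) bool {
	return minBytes > 0 && maxWait > 0 && !hasData && allTopicsExist
}

// clampWait caps the wait at one second when running under test so that
// shutdown is not blocked by a long client-supplied max_wait_ms.
func clampWait(maxWait time.Duration, inTest bool) time.Duration {
	if inTest && maxWait > time.Second {
		return time.Second
	}
	return maxWait
}
```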
- Added centralized errors.go with complete Kafka error code definitions
- Implemented timeout detection and network error classification
- Enhanced connection handling with configurable timeouts and better error reporting
- Added comprehensive error handling test suite with 21 test cases
- Unified error code usage across all protocol handlers
- Improved request/response timeout handling with graceful fallbacks
- All protocol and E2E tests passing with robust error handling
- Added flexible_versions.go with utilities for Kafka flexible versions (v3+)
- Implemented ParseRequestHeader for compact string parsing and tagged fields
- Added fallback mechanism in handler.go for backward compatibility
- Updated handleApiVersions to support flexible version responses
- Added comprehensive tests for flexible version utilities
- All protocol tests passing with robust error handling
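For context on the compact string parsing mentioned above: in Kafka's flexible versions, a compact string is encoded as an unsigned varint holding the length plus one, followed by the bytes, with a varint of 0 meaning null. The sketch below illustrates that encoding only. `readCompactString` is a hypothetical helper, not the ParseRequestHeader function from flexible_versions.go.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// readCompactString decodes one Kafka compact string from buf.
// It returns the string, whether it was null, and the number of bytes consumed.
func readCompactString(buf []byte) (s string, isNull bool, n int, err error) {
	// The length prefix is an unsigned varint storing len+1.
	l, m := binary.Uvarint(buf)
	if m <= 0 {
		return "", false, 0, errors.New("invalid varint length")
	}
	if l == 0 {
		// A prefix of 0 encodes a null string.
		return "", true, m, nil
	}
	if l-1 > uint64(len(buf)-m) {
		return "", false, 0, errors.New("compact string truncated")
	}
	size := int(l - 1)
	return string(buf[m : m+size]), false, m + size, nil
}
```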
Multi-batch Fetch support completed:
## Core Features
- **MaxBytes compliance**: Respects fetch request MaxBytes limits to prevent oversized responses
- **Multi-batch concatenation**: Properly concatenates multiple record batches in single response
- **Size estimation**: Pre-estimates batch sizes to optimize MaxBytes usage before construction
- **Kafka-compliant behavior**: Always returns at least one batch even if it exceeds MaxBytes (first batch rule)
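The MaxBytes and first-batch rules can be sketched as a selection loop over pre-estimated batch sizes. `selectBatches` is a hypothetical helper for illustration and does not reflect how MultiBatchFetcher is actually written.

```go
package main

// selectBatches walks the estimated batch sizes in order and returns how many
// batches fit within maxBytes, along with their combined size. The first
// batch is always included, even if it alone exceeds maxBytes.
func selectBatches(sizes []int, maxBytes int) (count, total int) {
	for i, sz := range sizes {
		// Stop before a batch that would overflow, except for the first one.
		if i > 0 && total+sz > maxBytes {
			break
		}
		total += sz
		count++
	}
	return count, total
}
```

With sizes 100, 200 and 300 and a MaxBytes of 350, the first two batches are selected (300 bytes). With sizes 500 and 10 and a MaxBytes of 100, only the oversized first batch is returned.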
## Implementation Details
- **MultiBatchFetcher**: New dedicated type for multi-batch operations
- **Intelligent batching**: Adapts record count per batch based on available space (10-50 records)
- **Proper concatenation format**: Each batch maintains independent headers and structure
- **Fallback support**: Graceful fallback to single batch if multi-batch fails
## Advanced Features
- **Compression ready**: Basic support for compressed record batches (GZIP placeholder)
- **Size tracking**: Tracks total response size and batch count across operations
- **Edge case handling**: Handles large single batches, empty responses, partial batches
## Integration & Testing
- **Fetch API integration**: Seamlessly integrated with existing handleFetch pipeline
- **17 comprehensive tests**: Multi-batch scenarios, size limits, concatenation format validation
- **E2E compatibility**: Sarama tests pass with no regressions
- **Performance validation**: Benchmarks for batch construction and multi-fetch operations
## Performance Improvements
- **Better bandwidth utilization**: Fills available MaxBytes space efficiently
- **Reduced round trips**: Multiple batches in single response
- **Adaptive sizing**: Smaller batches when space limited, larger when space available
Ready for Phase 6: Basic flexible versions support