Kafka Protocol Compatibility Review & TODOs

Overview

This document identifies areas in the current Kafka implementation that need attention for full protocol compatibility, including assumptions, simplifications, and potential issues.

Critical Protocol Issues

HIGH PRIORITY - Protocol Breaking Issues

1. Record Batch Parsing (Produce API)

File: protocol/produce.go

Issues:

  • parseRecordSet() uses simplified parsing logic that doesn't handle the full Kafka record batch format
  • Hardcoded assumptions about record batch structure
  • Missing compression support (gzip, snappy, lz4, zstd)
  • CRC validation is completely missing

TODOs:

// TODO: Implement full Kafka record batch parsing
// - Support all record batch versions (v0, v1, v2)
// - Handle compression codecs (gzip, snappy, lz4, zstd)
// - Validate CRC32 checksums
// - Parse individual record headers, keys, values, timestamps
// - Handle transaction markers and control records
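
A minimal sketch of what fuller parsing could look like, assuming the buffer holds exactly one batch in the v2 (magic 2) format from the Kafka protocol guide; the struct and function names below are illustrative, not the existing parseRecordSet() API:

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

type recordBatchHeader struct {
	BaseOffset           int64
	BatchLength          int32
	PartitionLeaderEpoch int32
	Magic                int8
	CRC                  uint32
	Attributes           int16
	LastOffsetDelta      int32
	FirstTimestamp       int64
	MaxTimestamp         int64
	ProducerID           int64
	ProducerEpoch        int16
	BaseSequence         int32
	RecordCount          int32
}

var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// parseRecordBatchHeader decodes the fixed 61-byte v2 batch header and
// verifies the CRC-32C, which covers everything after the crc field.
func parseRecordBatchHeader(buf []byte) (*recordBatchHeader, error) {
	if len(buf) < 61 {
		return nil, fmt.Errorf("record batch too small: %d bytes", len(buf))
	}
	h := &recordBatchHeader{
		BaseOffset:           int64(binary.BigEndian.Uint64(buf[0:8])),
		BatchLength:          int32(binary.BigEndian.Uint32(buf[8:12])),
		PartitionLeaderEpoch: int32(binary.BigEndian.Uint32(buf[12:16])),
		Magic:                int8(buf[16]),
		CRC:                  binary.BigEndian.Uint32(buf[17:21]),
		Attributes:           int16(binary.BigEndian.Uint16(buf[21:23])),
		LastOffsetDelta:      int32(binary.BigEndian.Uint32(buf[23:27])),
		FirstTimestamp:       int64(binary.BigEndian.Uint64(buf[27:35])),
		MaxTimestamp:         int64(binary.BigEndian.Uint64(buf[35:43])),
		ProducerID:           int64(binary.BigEndian.Uint64(buf[43:51])),
		ProducerEpoch:        int16(binary.BigEndian.Uint16(buf[51:53])),
		BaseSequence:         int32(binary.BigEndian.Uint32(buf[53:57])),
		RecordCount:          int32(binary.BigEndian.Uint32(buf[57:61])),
	}
	if h.Magic != 2 {
		return nil, fmt.Errorf("unsupported magic byte: %d", h.Magic)
	}
	// batchLength counts the bytes after the batchLength field itself.
	end := 12 + int(h.BatchLength)
	if end < 61 || end > len(buf) {
		return nil, fmt.Errorf("record batch length out of range: %d", h.BatchLength)
	}
	if crc32.Checksum(buf[21:end], castagnoli) != h.CRC {
		return nil, fmt.Errorf("record batch CRC mismatch")
	}
	// The low 3 bits of Attributes select the compression codec
	// (0=none, 1=gzip, 2=snappy, 3=lz4, 4=zstd); records follow the header.
	return h, nil
}

The individual records after the header use signed varint fields (see the Fetch section below for a varint sketch).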

2. Request Parsing Assumptions

Files: protocol/offset_management.go, protocol/joingroup.go, protocol/consumer_coordination.go

Issues:

  • Most parsing functions have hardcoded topic/partition assumptions
  • Missing support for array parsing (topics, partitions, group protocols)
  • Simplified request structures that don't match real Kafka clients

TODOs:

// TODO: Fix OffsetCommit/OffsetFetch request parsing
// Currently returns hardcoded "test-topic" with partition 0
// Need to parse actual topics array from request body

// TODO: Fix JoinGroup protocol parsing
// Currently ignores group protocols array and subscription metadata
// Need to extract actual subscribed topics from consumer metadata

// TODO: Add support for batch operations
// OffsetCommit can commit multiple topic-partitions
// LeaveGroup can handle multiple members leaving
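
For reference, a sketch of decoding a topics array in the classic (non-flexible) encoding, where arrays are an INT32 count followed by elements and strings are an INT16 length followed by UTF-8 bytes. The helper names are illustrative, and OffsetCommit partitions carry additional fields (offset, metadata) that this sketch omits:

import (
	"encoding/binary"
	"fmt"
)

// readString decodes an INT16-length-prefixed string; -1 encodes null.
func readString(buf []byte, off int) (string, int, error) {
	if off+2 > len(buf) {
		return "", 0, fmt.Errorf("truncated string length")
	}
	n := int(int16(binary.BigEndian.Uint16(buf[off : off+2])))
	off += 2
	if n < 0 {
		return "", off, nil
	}
	if off+n > len(buf) {
		return "", 0, fmt.Errorf("truncated string body")
	}
	return string(buf[off : off+n]), off + n, nil
}

// readTopicPartitions decodes [topic_name, [partition_index]] pairs, as in
// older OffsetFetch-style requests, instead of assuming a single hardcoded topic.
func readTopicPartitions(buf []byte, off int) (map[string][]int32, int, error) {
	topics := make(map[string][]int32)
	if off+4 > len(buf) {
		return nil, 0, fmt.Errorf("truncated topics count")
	}
	topicCount := int(int32(binary.BigEndian.Uint32(buf[off : off+4])))
	off += 4
	for i := 0; i < topicCount; i++ {
		name, next, err := readString(buf, off)
		if err != nil {
			return nil, 0, err
		}
		off = next
		if off+4 > len(buf) {
			return nil, 0, fmt.Errorf("truncated partitions count")
		}
		partCount := int(int32(binary.BigEndian.Uint32(buf[off : off+4])))
		off += 4
		if off+4*partCount > len(buf) {
			return nil, 0, fmt.Errorf("truncated partitions array")
		}
		for j := 0; j < partCount; j++ {
			topics[name] = append(topics[name], int32(binary.BigEndian.Uint32(buf[off:off+4])))
			off += 4
		}
	}
	return topics, off, nil
}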

3. Fetch Record Construction

File: protocol/fetch.go

Issues:

  • constructRecordBatch() creates fake record batches with dummy data
  • Varint encoding is simplified to single bytes (incorrect)
  • Missing proper record headers, timestamps, and metadata

TODOs:

// TODO: Replace dummy record batch construction with real data
// - Read actual message data from SeaweedMQ/storage
// - Implement proper varint encoding/decoding
// - Support record headers and custom timestamps
// - Handle different record batch versions correctly
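
Record-level fields (record length, offset delta, timestamp delta, key/value lengths, header count) are zigzag-encoded signed varints, the same scheme as Go's encoding/binary, so the standard library can replace the single-byte shortcut. The helper names below are illustrative:

import "encoding/binary"

// appendVarint appends the Kafka signed-varint (zigzag) encoding of v to dst.
func appendVarint(dst []byte, v int64) []byte {
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutVarint(tmp[:], v)
	return append(dst, tmp[:n]...)
}

// readVarint decodes a signed varint, returning the value and the number of
// bytes consumed; binary.Varint reports truncation (0) and overflow (<0).
func readVarint(buf []byte) (int64, int) {
	return binary.Varint(buf)
}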

MEDIUM PRIORITY - Compatibility Issues

4. API Version Support

File: protocol/handler.go

Issues:

  • The ApiVersions response advertises maximum versions that the underlying handlers may not fully support
  • No version-specific handling in most APIs

TODOs:

// TODO: Add API version validation per request
// Different API versions have different request/response formats
// Need to validate apiVersion from request header and respond accordingly

// TODO: Update handleApiVersions to reflect actual supported features
// Current max versions may be too optimistic for partial implementations
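
A minimal sketch of per-request version validation; the version ranges in the table are placeholders, not verified values, and should mirror whatever handleApiVersions actually advertises:

import "fmt"

type versionRange struct{ min, max int16 }

// supportedVersions is keyed by apiKey (0=Produce, 1=Fetch, 18=ApiVersions).
var supportedVersions = map[int16]versionRange{
	0:  {min: 0, max: 7},  // Produce: example range, not verified
	1:  {min: 0, max: 11}, // Fetch: example range, not verified
	18: {min: 0, max: 3},  // ApiVersions
}

func checkApiVersion(apiKey, apiVersion int16) error {
	r, ok := supportedVersions[apiKey]
	if !ok {
		return fmt.Errorf("unsupported api key %d", apiKey)
	}
	if apiVersion < r.min || apiVersion > r.max {
		// Real brokers answer with error code 35 (UNSUPPORTED_VERSION)
		// rather than closing the connection.
		return fmt.Errorf("api key %d: version %d outside supported range [%d,%d]",
			apiKey, apiVersion, r.min, r.max)
	}
	return nil
}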

5. Consumer Group Protocol Metadata

File: protocol/joingroup.go

Issues:

  • Consumer subscription extraction is hardcoded to return ["test-topic"]
  • Group protocol metadata parsing is completely stubbed

TODOs:

// TODO: Implement proper consumer protocol metadata parsing
// Consumer clients send subscription information in protocol metadata
// Need to decode consumer subscription protocol format:
// - version (int16) + subscribed topics (string array) + user_data (bytes)

// TODO: Support multiple assignment strategies properly
// Currently only basic range/roundrobin, need to parse client preferences
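
A sketch of extracting the subscribed topics from the consumer subscription metadata, assuming the classic encoding; newer subscription versions append owned partitions and other fields that this sketch ignores, and the function name is illustrative:

import (
	"encoding/binary"
	"fmt"
)

// parseSubscription decodes version(int16) + topics(string array) from the
// ConsumerProtocolSubscription bytes carried in a JoinGroup request.
func parseSubscription(meta []byte) ([]string, error) {
	if len(meta) < 6 {
		return nil, fmt.Errorf("subscription metadata too short")
	}
	off := 2 // skip the version field; the topics array follows in all versions
	count := int(int32(binary.BigEndian.Uint32(meta[off : off+4])))
	off += 4
	topics := make([]string, 0, count)
	for i := 0; i < count; i++ {
		if off+2 > len(meta) {
			return nil, fmt.Errorf("truncated topic name length")
		}
		n := int(binary.BigEndian.Uint16(meta[off : off+2]))
		off += 2
		if off+n > len(meta) {
			return nil, fmt.Errorf("truncated topic name")
		}
		topics = append(topics, string(meta[off:off+n]))
		off += n
	}
	// user_data (nullable bytes) and newer fields follow; they are not needed
	// to build the subscribed-topics list.
	return topics, nil
}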

6. Error Code Mapping

Files: multiple protocol files

Issues:

  • Some error codes may not match Kafka specifications exactly
  • Missing error codes for edge cases

TODOs:

// TODO: Verify all error codes match Kafka specification
// Check ErrorCode constants against official Kafka protocol docs
// Some custom error codes may not be recognized by clients

// TODO: Add missing error codes for:
// - Network errors, timeout errors
// - Quota exceeded, throttling
// - Security/authorization errors
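
A few error codes clients commonly check, taken from the Kafka protocol guide; values should still be double-checked against https://kafka.apache.org/protocol.html before relying on them:

const (
	errNone                    int16 = 0
	errUnknownTopicOrPartition int16 = 3
	errRequestTimedOut         int16 = 7
	errCoordinatorNotAvailable int16 = 15
	errNotCoordinator          int16 = 16
	errIllegalGeneration       int16 = 22
	errUnknownMemberID         int16 = 25
	errRebalanceInProgress     int16 = 27
	errUnsupportedVersion      int16 = 35
)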

LOW PRIORITY - Implementation Completeness

7. Connection Management

File: protocol/handler.go

Issues:

  • Basic connection handling without connection pooling
  • No support for SASL authentication or SSL/TLS
  • Missing connection metadata (client host, version)

TODOs:

// TODO: Extract client connection metadata
// JoinGroup requests need actual client host instead of "unknown-host"
// Parse client version from request headers for better compatibility

// TODO: Add connection security support
// Support SASL/PLAIN, SASL/SCRAM authentication
// Support SSL/TLS encryption
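
A minimal sketch of deriving the client host from the accepted connection instead of the "unknown-host" placeholder; net.SplitHostPort handles both IPv4 and IPv6 addresses:

import "net"

// clientHost returns the remote host portion of the connection's address.
func clientHost(conn net.Conn) string {
	host, _, err := net.SplitHostPort(conn.RemoteAddr().String())
	if err != nil {
		return conn.RemoteAddr().String()
	}
	return host
}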

8. Record Timestamps and Offsets

Files: protocol/produce.go, protocol/fetch.go

Issues:

  • Simplified timestamp handling
  • Offset assignment may not match Kafka behavior exactly

TODOs:

// TODO: Implement proper offset assignment strategy
// Kafka offsets are partition-specific and strictly increasing
// Current implementation may have gaps or inconsistencies

// TODO: Support timestamp types correctly
// Kafka supports CreateTime vs LogAppendTime
// Need to handle timestamp-based offset lookups properly
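
A sketch of per-partition offset assignment with a mutex-guarded ledger, so offsets stay monotonically increasing per topic-partition the way a Kafka leader stamps them at append time; the type and field names are illustrative:

import "sync"

type partitionKey struct {
	topic     string
	partition int32
}

type offsetLedger struct {
	mu   sync.Mutex
	next map[partitionKey]int64
}

// assign reserves n consecutive offsets for a batch and returns its base offset.
func (l *offsetLedger) assign(topic string, partition int32, n int64) int64 {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.next == nil {
		l.next = make(map[partitionKey]int64)
	}
	k := partitionKey{topic, partition}
	base := l.next[k]
	l.next[k] = base + n
	return base
}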

9. SeaweedMQ Integration Assumptions

File: integration/seaweedmq_handler.go

Issues:

  • Simplified record format conversion
  • Single partition assumption for new topics
  • Missing topic configuration support

TODOs:

// TODO: Implement proper Kafka->SeaweedMQ record conversion
// Currently uses placeholder keys/values
// Need to extract actual record data from Kafka record batches

// TODO: Support configurable partition counts
// Currently hardcoded to 1 partition per topic
// Need to respect CreateTopics partition count requests

// TODO: Add topic configuration support
// Kafka topics have configs like retention, compression, cleanup policy
// Map these to SeaweedMQ topic settings

Testing Compatibility Issues

Missing Integration Tests

TODOs:

// TODO: Add real Kafka client integration tests
// Test with kafka-go, Sarama, and other popular Go clients
// Verify producer/consumer workflows work end-to-end

// TODO: Add protocol conformance tests
// Use Kafka protocol test vectors if available
// Test edge cases and error conditions

// TODO: Add load testing
// Verify behavior under high throughput
// Test with multiple concurrent consumer groups
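
A sketch of an end-to-end round trip with kafka-go (github.com/segmentio/kafka-go); the broker address, topic name, and group ID are placeholders for whatever the gateway listens on:

import (
	"context"
	"testing"

	"github.com/segmentio/kafka-go"
)

func TestKafkaGoRoundTrip(t *testing.T) {
	ctx := context.Background()

	// Produce one message through the gateway.
	w := &kafka.Writer{
		Addr:  kafka.TCP("127.0.0.1:9092"),
		Topic: "e2e-topic",
	}
	defer w.Close()
	if err := w.WriteMessages(ctx, kafka.Message{Key: []byte("k"), Value: []byte("v")}); err != nil {
		t.Fatalf("produce failed: %v", err)
	}

	// Consume it back through a consumer group.
	r := kafka.NewReader(kafka.ReaderConfig{
		Brokers: []string{"127.0.0.1:9092"},
		Topic:   "e2e-topic",
		GroupID: "e2e-group",
	})
	defer r.Close()
	msg, err := r.ReadMessage(ctx)
	if err != nil {
		t.Fatalf("consume failed: %v", err)
	}
	if string(msg.Value) != "v" {
		t.Fatalf("unexpected value: %q", msg.Value)
	}
}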

Protocol Version Testing

TODOs:

// TODO: Test multiple API versions
// Clients may use different API versions
// Ensure backward compatibility

// TODO: Test with different Kafka client libraries
// Java clients, Python clients, etc.
// Different clients may have different protocol expectations

Performance & Scalability TODOs

Memory Management

TODOs:

// TODO: Add memory pooling for large messages
// Avoid allocating large byte slices for each request
// Reuse buffers for protocol encoding/decoding

// TODO: Implement streaming for large record batches
// Don't load entire batches into memory at once
// Stream records directly from storage to client
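
A minimal sketch of buffer reuse with sync.Pool for protocol encoding/decoding; the helper name is illustrative:

import (
	"bytes"
	"sync"
)

var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// withBuffer hands fn a reusable buffer and returns it to the pool afterwards.
func withBuffer(fn func(buf *bytes.Buffer) error) error {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)
	return fn(buf)
}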

Connection Handling

TODOs:

// TODO: Add connection timeout handling
// Implement proper client timeout detection
// Clean up stale connections and consumer group members

// TODO: Add backpressure handling
// Implement flow control for high-throughput scenarios
// Prevent memory exhaustion during load spikes
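
A sketch of idle-client detection by arming a read deadline before each read; the idle duration and function name are illustrative:

import (
	"fmt"
	"net"
	"time"
)

// readWithIdleTimeout reads the next request bytes, failing if the client
// stays silent longer than the idle window so the connection can be cleaned up.
func readWithIdleTimeout(conn net.Conn, buf []byte, idle time.Duration) (int, error) {
	if err := conn.SetReadDeadline(time.Now().Add(idle)); err != nil {
		return 0, err
	}
	n, err := conn.Read(buf)
	if ne, ok := err.(net.Error); ok && ne.Timeout() {
		// Treat as a stale connection: close it and drop the member from
		// consumer group bookkeeping.
		return n, fmt.Errorf("idle client timed out: %w", err)
	}
	return n, err
}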

Immediate Action Items

Phase 4 Priority List:

  1. Fix Record Batch Parsing - Critical for real client compatibility
  2. Implement Proper Request Parsing - Remove hardcoded assumptions
  3. Add Compression Support - Essential for performance
  4. Real SeaweedMQ Integration - Move beyond placeholder data
  5. Consumer Protocol Metadata - Fix subscription handling
  6. API Version Handling - Support multiple protocol versions

Compatibility Validation:

  1. Test with kafka-go library - Most popular Go Kafka client
  2. Test with Sarama library - Alternative popular Go client
  3. Test with Java Kafka clients - Reference implementation
  4. Performance benchmarking - Compare against Apache Kafka

Protocol Standards References

  • Kafka Protocol Guide: https://kafka.apache.org/protocol.html
  • Record Batch Format: Kafka protocol v2 record format specification
  • Consumer Protocol: Group coordination and assignment protocol details
  • API Versioning: How different API versions affect request/response format

Notes on Current State

What Works Well:

  • Basic produce/consume flow for simple cases
  • Consumer group coordination state management
  • In-memory testing mode for development
  • Graceful error handling for most common cases

What Needs Work:

  • Real-world client compatibility (requires fixing parsing issues)
  • Performance under load (needs compression, streaming)
  • Production deployment (needs security, monitoring)
  • Edge case handling (various protocol versions, error conditions)

This review should be updated as protocol implementations improve and more compatibility issues are discovered.