seaweedfs

Commit Graph

Author	SHA1	Message	Date
chrislu	b0ae15eac7	debug	6 days ago
chrislu	1904ba93a8	update	6 days ago
chrislu	5895f22d69	context is cancelled, the server will detect it immediately and exit gracefully	6 days ago
chrislu	ddbbbaa1dd	debug	6 days ago
chrislu	f5871524be	add debug messages	6 days ago
chrislu	8e69446112	fix tests	6 days ago
chrislu	93c3e0c784	fix tests	6 days ago
chrislu	2f040a0fe4	fix undefined method errors	6 days ago
chrislu	9a4ad5047b	Update handler.go Busy fetch loop: Implemented basic long-polling in Fetch. If no data and min_bytes>0 with max_wait_ms>0, we wait up to max_wait_ms, and populate throttle_time_ms accordingly. This stops the rapid loop for kafka-go on empty partitions.	6 days ago
chrislu	297c662191	Phase 7: Comprehensive error handling and edge cases - Added centralized errors.go with complete Kafka error code definitions - Implemented timeout detection and network error classification - Enhanced connection handling with configurable timeouts and better error reporting - Added comprehensive error handling test suite with 21 test cases - Unified error code usage across all protocol handlers - Improved request/response timeout handling with graceful fallbacks - All protocol and E2E tests passing with robust error handling	6 days ago
chrislu	2e2ccbf488	Phase 6: Add basic flexible versions support - Added flexible_versions.go with utilities for Kafka flexible versions (v3+) - Implemented ParseRequestHeader for compact string parsing and tagged fields - Added fallback mechanism in handler.go for backward compatibility - Updated handleApiVersions to support flexible version responses - Added comprehensive tests for flexible version utilities - All protocol tests passing with robust error handling	6 days ago
chrislu	7149b723ec	Phase 4: Implement consumer group protocol metadata parsing Consumer Group Protocol Metadata completed: ## Core Enhancements - ClientHost extraction: Real client IP/host instead of hardcoded 'unknown' - ExtractClientHost() extracts IP from connection context - Populates GroupMember.ClientHost with actual remote address - Enhanced protocol metadata parsing: Robust parsing with error handling - ParseConsumerProtocolMetadata() with validation and graceful fallbacks - Handles malformed metadata, oversized fields, and edge cases - Improved assignment strategy selection: Priority-based protocol selection - SelectBestProtocol() prefers sticky > roundrobin > range - Considers both client capabilities and existing group protocols ## Implementation Details - Connection Context: Added ConnectionContext to Handler for client info - Metadata Analysis: AnalyzeProtocolMetadata() for detailed debugging - Enhanced Subscription Extraction: ExtractTopicsFromMetadata() with fallbacks - Validation: SanitizeConsumerGroupID() prevents malformed group IDs - Graceful Error Handling: Invalid metadata handled without failures ## New Files - : Core metadata parsing and client context logic - : Comprehensive test suite (17 test cases) ## Integration - JoinGroup enhancement: Uses real client host and robust metadata parsing - Backward compatibility: Legacy methods maintained for compatibility - Debug improvements: Enhanced logging shows parsed protocol details ## Testing & Verification - 17 comprehensive tests: Protocol parsing, client host extraction, strategy selection - Edge case coverage: Empty metadata, malformed data, oversized fields - E2E compatibility: Sarama tests pass, no regressions - Performance validation: Benchmark tests for parsing operations Ready for Phase 5: Multi-batch Fetch concatenation support	6 days ago
chrislu	71769da3b4	Phase 3: Fix ApiVersions matrix accuracy and version validation ApiVersions Matrix Accuracy completed: ## Critical Fixes - OffsetFetch API: Updated advertised from v0-v2 to v0-v5 (MAJOR fix) - Implementation already supported v3+ throttle_time_ms and v5+ leader_epoch - Clients can now use advanced OffsetFetch features - CreateTopics API: Updated advertised from v0-v4 to v0-v5 (minor fix) - Implementation already routed v5 requests to v2+ handler - Better client compatibility for v5 CreateTopics requests ## Implementation - handleApiVersions(): Corrected advertised max versions - validateAPIVersion(): Updated validation ranges to match advertisements - Consistency: Eliminated mismatch between advertised vs implemented versions ## Testing & Verification - Comprehensive test suite: 6 new tests in api_versions_test.go - Version validation tests: OffsetFetch v3-v5 and CreateTopics v5 now accepted - End-to-end verification: E2E tests still pass, no regressions - API audit documentation: Complete version matrix in API_VERSION_MATRIX.md ## Impact - Client compatibility: Higher-version clients can now connect properly - Feature utilization: Advanced features like leader epoch, throttle time accessible - Protocol compliance: Advertised versions now match actual implementation - Future-proofing: Clear process for managing API version accuracy Ready for Phase 4: Consumer group protocol metadata parsing	6 days ago
chrislu	5d0c45c9dc	Phase 2: Implement CreateTopics protocol compliance for v0/v1 CreateTopics Protocol Compliance completed: ## Implementation - Implement handleCreateTopicsV0V1() with proper v0/v1 request parsing - Support regular array/string format (not compact) for v0/v1 - Parse topic name, partitions, replication factor, assignments, configs - Handle timeout_ms and validate_only fields correctly - Maintain existing v2+ compact format support - Wire to SeaweedMQ handler for actual topic creation ## Key Features - Full v0-v5 CreateTopics API version support - Proper error handling (TOPIC_ALREADY_EXISTS, INVALID_PARTITIONS, etc.) - Partition count validation and enforcement - Compatible with existing SeaweedMQ topic management ## Tests - Comprehensive unit tests for v0/v1/v2+ parsing - Error condition testing (duplicate topics, invalid partitions) - Multi-topic creation support - Integration tests across all API versions - Performance benchmarks for CreateTopics operations ## Verification - All protocol tests pass (v0-v5 CreateTopics) - E2E Sarama tests continue to work - Real topics created with specified partition counts - Proper error responses for edge cases Ready for Phase 3: ApiVersions matrix accuracy	6 days ago
chrislu	c0b15ed489	refactoring	6 days ago
chrislu	964d1d06e4	fix TestSaramaProduceConsume	6 days ago
chrislu	42aea1dc68	align package decoding	6 days ago
chrislu	445d7343d7	fix v7 samara	6 days ago
chrislu	ba1599b1e9	fix tests	7 days ago
chrislu	8743c5453a	clean up connections	7 days ago
chrislu	9ea6ef0bf8	fix tests	7 days ago
chrislu	ccd48feefb	fix test errors	7 days ago
chrislu	e6f7e7efb5	fix in-memory variables	7 days ago
chrislu	56aa5278af	single mode	7 days ago
chrislu	56ba8ce219	Phase 3: Add comprehensive integration tests - Add end-to-end flow tests for Kafka OffsetCommit to SMQ storage - Test multiple consumer groups with independent offset tracking - Validate SMQ file path and format compatibility - Test error handling and edge cases (negative, zero, max offsets) - Verify offset encoding/decoding matches SMQ broker format - Ensure consumer group isolation and proper key generation	7 days ago
chrislu	ac436eac94	Phase 2: Wire OffsetCommit/OffsetFetch to SMQ storage - Update Kafka protocol handler to use SMQOffsetStorage for consumer offsets - Modify OffsetCommit to save consumer offsets using SMQ's filer format - Modify OffsetFetch to read consumer offsets from SMQ's filer location - Add proper ConsumerOffsetKey creation with consumer group and instance ID - Maintain backward compatibility with in-memory storage fallback - Include comprehensive test coverage for offset handler integration	7 days ago
chrislu	969ca60b6f	change to connect to mq brokers instead of agents	1 week ago
chrislu	4e4e3ce1a8	tests: align ApiVersions test expectations with advertised ranges (ListOffsets v0-2, Fetch v0-7)	1 week ago
chrislu	a5f330ad17	kafka protocol: align advertised and validated API version ranges with implemented handlers (Fetch<=v7, ListOffsets<=v2, FindCoordinator<=v2, OffsetCommit/OffsetFetch<=v2); keep Metadata<=v7, JoinGroup<=v7, SyncGroup<=v5	1 week ago
chrislu	7790155827	kafka gateway: strip client_id in header; align handlers with spec; fix ApiVersions count; correct Metadata/ListOffsets v0 tests; robust Produce v2+ parsing (transactional_id fallback, acks=0 empty response, unknown topic errors); relax record set/test extraction; fix OffsetCommit/OffsetFetch parsing and tests; Fetch returns UNKNOWN_TOPIC_OR_PARTITION for missing topic	1 week ago
chrislu	48a0b49880	protocol: align request parsing with Kafka specs; remove client_id skips; revert OffsetFetch v0-v5 to classic encodings; adjust FindCoordinator parsing; update ApiVersions Metadata max v7; fix tests to pass apiVersion and expectations	1 week ago
chrislu	56608aead3	feat: major consumer group breakthrough - fix FindCoordinator v2 and JoinGroup v5 🎉 MAJOR PROGRESS: - Fixed FindCoordinator v2 response format (added throttle_time, error_code, error_message, node_id) - Fixed JoinGroup v5 request parsing (added GroupInstanceID field parsing) - Consumer group coordination now working: FindCoordinator -> JoinGroup -> SyncGroup - Sarama consumer successfully joins group, gets member ID, calls Setup handler ✅ Working: - FindCoordinator v2: Sarama finds coordinator successfully - JoinGroup v5: Consumer joins group, gets generation 1, member ID assigned - Consumer group session setup called with generation 1 ❌ Current issue: - SyncGroup v3 parsing error: 'invalid member ID length' - Consumer has no partition assignments (Claims: map[]) - Need to fix SyncGroup parsing to complete consumer group flow Next: Fix SyncGroup v3 parsing to enable partition assignment and message consumption	1 week ago
chrislu	687eaddedd	debug: add comprehensive consumer group tests and identify FindCoordinator issue - Created consumer group tests for basic functionality, offset management, and rebalancing - Added debug test to isolate consumer group coordination issues - Root cause identified: Sarama repeatedly calls FindCoordinator but never progresses to JoinGroup - Issue: Connections closed after FindCoordinator, preventing coordinator protocol - Consumer group implementation exists but not being reached by Sarama clients Next: Fix coordinator connection handling to enable JoinGroup protocol	1 week ago
chrislu	5ec751e2e3	feat: fix Sarama consumer compatibility by correcting record batch base offsets 🎉 MAJOR SUCCESS: Both kafka-go and Sarama now fully working! Root Cause: - Individual message batches (from Sarama) had base offset 0 in binary data - When Sarama requested offset 1, it received batch claiming offset 0 - Sarama ignored it as duplicate, never got actual message 1,2 Solution: - Correct base offset in record batch header during StoreRecordBatch - Update first 8 bytes (base_offset field) to match assigned offset - Each batch now has correct internal offset matching storage key Results: ✅ kafka-go: 3/3 produced, 3/3 consumed ✅ Sarama: 3/3 produced, 3/3 consumed Both clients now have full produce-consume compatibility	1 week ago
chrislu	491404b3f6	debug: add detailed logging for Sarama Fetch v5 issue - Added hex dump of record batch content for each offset - Confirmed we're returning different batches correctly (98 bytes each) - Sarama requests offsets 0,1,2 individually but only consumes offset 0 - Issue identified: Fetch v5 (Sarama) vs v10 (kafka-go) response format difference - kafka-go: fully working, Sarama: 1/3 messages consumed Next: Investigate Fetch v5 response format requirements	1 week ago
chrislu	7f9bc31a23	chore: clean up debug messages after kafka-go fix - Removed debug hex dumps and API request logging - kafka-go now fully functional: produces and consumes 3/3 messages - Sarama partially working: produces 3/3, consumes 1/3 messages - Issue identified: Sarama gets stuck after first message in record batch Next: Debug Sarama record batch parsing to consume all messages	1 week ago
chrislu	bab10b6c26	fmt	1 week ago
chrislu	0670ea4690	fix: correct ListOffsets v1 request parsing for kafka-go compatibility - Fixed ListOffsets v1 to parse replica_id field (present in v1+, not v2+) - Fixed ListOffsets v1 response format - now 55 bytes instead of 64 - kafka-go now successfully passes ListOffsets and makes Fetch requests - Identified next issue: Fetch response format has incorrect topic count Progress: kafka-go client now progresses to Fetch API but fails due to Fetch response format mismatch.	1 week ago
chrislu	014db6f999	fix: correct ListOffsets v1 response format for kafka-go compatibility - Fixed throttle_time_ms field: only include in v2+, not v1 - Reduced kafka-go 'unread bytes' error from 60 to 56 bytes - Added comprehensive API request debugging to identify format mismatches - kafka-go now progresses further but still has 56 bytes format issue in some API response Progress: kafka-go client can now parse ListOffsets v1 responses correctly but still fails before making Fetch requests due to remaining API format issues.	1 week ago
chrislu	6c19e548d3	feat: implement working Kafka consumer functionality with stored record batches - Fixed Produce v2+ handler to properly store messages in ledger and update high water mark - Added record batch storage system to cache actual Produce record batches - Modified Fetch handler to return stored record batches instead of synthetic ones - Consumers can now successfully fetch and decode messages with correct CRC validation - Sarama consumer successfully consumes messages (1/3 working, investigating offset handling) Key improvements: - Produce handler now calls AssignOffsets() and AppendRecord() correctly - High water mark properly updates from 0 → 1 → 2 → 3 - Record batches stored during Produce and retrieved during Fetch - CRC validation passes because we return exact same record batch data - Debug logging shows 'Using stored record batch for offset X' TODO: Fix consumer offset handling when fetchOffset == highWaterMark	1 week ago
chrislu	0bb866e57c	fmt	1 week ago
chrislu	ec1317b910	cleanup: remove prominent debug messages from kafka protocol handlers - Removed connection establishment debug messages - Removed API request/response logging that cluttered test output - Removed metadata advertising debug messages - Kept functional error handling and informational messages - Tests still pass with cleaner output The kafka-go writer test now shows much cleaner output while maintaining full functionality.	1 week ago
chrislu	aecc020b14	fix: kafka-go writer compatibility and debug cleanup - Fixed kafka-go writer metadata loop by addressing protocol mismatches: * ApiVersions v0: Removed throttle_time field that kafka-go doesn't expect * Metadata v1: Removed correlation ID from response body (transport handles it) * Metadata v0: Fixed broker ID consistency (node_id=1 matches leader_id=1) * Metadata v4+: Implemented AllowAutoTopicCreation flag parsing and auto-creation * Produce acks=0: Added minimal success response for kafka-go internal state updates - Cleaned up debug messages while preserving core functionality - Verified kafka-go writer works correctly with WriteMessages completing in ~0.15s - Added comprehensive test coverage for kafka-go client compatibility The kafka-go writer now works seamlessly with SeaweedFS Kafka Gateway.	1 week ago
chrislu	bfe15f970b	Fix kafka-go compatibility: - ApiVersions v0 response: remove unsupported throttle_time field - Metadata v1: include correlation ID (kafka-go transport expects it after size) - Metadata v1: ensure broker/partition IDs consistent and format correct Validated: - TestMetadataV6Debug passes (kafka-go ReadPartitions works) - Sarama simple producer unaffected Root cause: correlation ID handling differences and extra footer in ApiVersions.	1 week ago
chrislu	edeb922749	Remove correlation ID from Metadata v1 response for kafka-go compatibility PARTIAL FIX: Remove correlation ID from response struct for kafka-go transport layer ## Root Cause Analysis: - kafka-go handles correlation ID at transport layer (protocol/roundtrip.go) - kafka-go ReadResponse() reads correlation ID separately from response struct - Our Metadata responses included correlation ID in struct, causing parsing errors - Sarama vs kafka-go handle correlation IDs differently ## Changes: - Removed correlation ID from Metadata v1 response struct - Added comment explaining kafka-go transport layer handling - Response size reduced from 92 to 88 bytes (4 bytes = correlation ID) ## Status: - ✅ Correlation ID issue partially fixed - ❌ kafka-go still fails with 'multiple Read calls return no data or error' - ❌ Still uses v1 instead of negotiated v4 (suggests ApiVersions parsing issue) ## Next Steps: - Investigate remaining Metadata v1 format issues - Check if other response fields have format problems - May need to fix ApiVersions response format to enable proper version negotiation This is progress toward full kafka-go compatibility.	1 week ago
chrislu	d6f688a44f	Limit Metadata API to v4 to fix kafka-go client compatibility PARTIAL FIX: Force kafka-go to use Metadata v4 instead of v6 ## Issue Identified: - kafka-go was using Metadata v6 due to ApiVersions advertising v0-v6 - Our Metadata v6 implementation has format issues causing client failures - Sarama works because it uses Metadata v4, not v6 ## Changes: - Limited Metadata API max version from 6 to 4 in ApiVersions response - Added debug test to isolate Metadata parsing issues - kafka-go now uses Metadata v4 (same as working Sarama) ## Status: - ✅ kafka-go now uses v4 instead of v6 - ❌ Still has metadata loops (deeper issue with response format) - ✅ Produce operations work correctly - ❌ ReadPartitions API still fails ## Next Steps: - Investigate why kafka-go keeps requesting metadata even with v4 - Compare exact byte format between working Sarama and failing kafka-go - May need to fix specific fields in Metadata v4 response format This is progress toward full kafka-go compatibility but more investigation needed.	1 week ago
chrislu	755346e0b1	Fix CreateTopics v2 parsing for kafka-go client compatibility CRITICAL FIX: Resolve kafka-go client CreateTopics failures ## Issues Fixed: - CreateTopics handler was missing apiVersion parameter - v2+ compact array/string format parsing was incorrect - Wrong topics count (1274981) due to parsing from incorrect offset - Response format didn't match v2+ compact format requirements ## Implementation: - Added apiVersion parameter to handleCreateTopics - Implemented proper v2+ compact format parsing: - Compact arrays: length + 1 (0 = empty, n+1 = n elements) - Compact strings: length + 1 (0 = null, n+1 = n chars) - Tagged fields support (empty for now) - Separated v0/v1 and v2+ parsing logic - Fixed response format for v2+ with compact strings and tagged fields ## Protocol Details: CreateTopics v2+ request format: - topics_array (compact) + timeout_ms(4) + validate_only(1) + tagged_fields CreateTopics v2+ response format: - correlation_id(4) + throttle_time(4) + topics_array (compact) + tagged_fields Each topic response: - name (compact string) + error_code(2) + error_message (compact nullable string) + tagged_fields ## Testing: - Compilation successful - Debug logging shows proper parsing of topic names and parameters - Should resolve kafka-go client CreateTopics API failures This fix addresses the most critical compatibility issue preventing kafka-go clients from creating topics successfully.	1 week ago
chrislu	a3f569f3b0	Phase C: Wire Produce handler to decode schema and publish RecordValue to mq.broker - Add BrokerClient integration to Handler with EnableBrokerIntegration method - Update storeDecodedMessage to use mq.broker for publishing decoded RecordValue - Add OriginalBytes field to ConfluentEnvelope for complete envelope storage - Integrate schema validation and decoding in Produce path - Add comprehensive unit tests for Produce handler schema integration - Support both broker integration and SeaweedMQ fallback modes - Add proper cleanup in Handler.Close() for broker client resources Key integration points: - Handler.EnableBrokerIntegration: configure mq.broker connection - Handler.IsBrokerIntegrationEnabled: check integration status - processSchematizedMessage: decode and validate Confluent envelopes - storeDecodedMessage: publish RecordValue to mq.broker via BrokerClient - Fallback to SeaweedMQ integration or in-memory mode when broker unavailable Note: Existing protocol tests need signature updates due to apiVersion parameter additions - this is expected and will be addressed in future maintenance.	1 week ago
chrislu	71b2615f4a	fmt	1 week ago
chrislu	7b47ad613b	Phase 4: Integrate schema decoding into Kafka Produce path - Add Schema Manager to coordinate registry, decoders, and validation - Integrate schema management into Handler with enable/disable controls - Add schema processing functions in Produce path for schematized messages - Support both permissive and strict validation modes - Include message extraction and compatibility validation stubs - Add comprehensive Manager tests with mock registry server - Prepare foundation for SeaweedMQ integration in Phase 8 This enables the Kafka Gateway to detect, decode, and process schematized messages.	1 week ago

1 2 3

143 Commits (feature/mq-kafka-gateway-m1)