seaweedfs

Commit Graph

Author	SHA1	Message	Date
chrislu	b51d93c1b7	use SMQ to store offset	2 hours ago
chrislu	9a04decefe	fix security alert	2 hours ago
chrislu	0cfa936b3a	fix tests	2 hours ago
chrislu	a9c8ebb102	fmt	3 hours ago
chrislu	027675a556	avoid loop	3 hours ago
chrislu	093b5062e4	Add detailed debug logging for consumer group coordination - JoinGroup: request parsing, state transitions, member add/update, response summary - SyncGroup: pre/post state, leader/non-leader paths, assignment summary - Heartbeat: parse/validate, member/gen checks, response details - LeaveGroup: parse, pre/post state, member removal, leader transitions - OffsetCommit/Fetch: parse errors, request summaries, response counts - Assignment strategies: topic/member/partition assignment tracing - Group cleanup: expired member removal, state transitions, stuck rebalance These logs will help trace timeouts in OffsetManagement by showing exactly where coordination stalls or misaligns with client expectations.	5 hours ago
chrislu	0a815e037e	fmt	8 hours ago
chrislu	c4c0ae20b9	Implement topic-based schema configuration instead of per-message metadata MAJOR IMPROVEMENT: Store schema ID in topic config, not on every message The previous approach of storing schema metadata on every RecordValue message was inefficient and not aligned with how schema registries work. This commit implements the correct approach: Key Changes: - Remove per-message schema metadata (schema_id, schema_format fields) - Add TopicSchemaConfig struct to store schema info per topic - Add topicSchemaConfigs cache in Handler with thread-safe access - Store schema config when first schematized message is processed - Retrieve schema config from topic when re-encoding messages - Update method signatures to pass topic name through the call chain Benefits: 1. Much more efficient - no redundant schema metadata on every message 2. Aligns with Kafka Schema Registry patterns - topics have schemas 3. Reduces message size and storage overhead 4. Cleaner separation of concerns - schema config vs message data 5. Better performance for high-throughput topics Architecture: - Produce: topic + schematized message → decode → store RecordValue + cache schema config - Fetch: topic + RecordValue → get schema config → re-encode to Confluent format - Schema config cached in memory with plans for persistent storage This is the correct way to handle schemas in a Kafka-compatible system.	15 hours ago
chrislu	5eec9484ca	SeaweedMQHandler => brokers only, no agents	15 hours ago
chrislu	9f0dfa2969	fmt	15 hours ago
chrislu	faabe7ca6a	Fix schema-based message handling to use actual schemas from schema registry BREAKING CHANGE: Remove fixed 'key', 'value', 'timestamp' structure The previous implementation incorrectly used a fixed RecordValue structure with hardcoded fields ('key', 'value', 'timestamp'). This was wrong because: 1. RecordValue should reflect the actual schema from the schema registry 2. Different messages have different schemas with different field names 3. The structure should be determined by the registered schema, not hardcoded Correct Implementation: - Produce Path: Only process Confluent-framed messages with schema registry - Decode schematized messages to get actual RecordValue from schema - Store the schema-based RecordValue (not fixed structure) in SeaweedMQ - Fetch Path: Re-encode RecordValue back to Confluent format using schema - Fallback to JSON representation when schema encoding fails - Raw messages (non-schematized) bypass RecordValue processing entirely Key Changes: - produceSchemaBasedRecord: Only processes schematized messages, falls back to raw for others - Remove createRecordValueFromKafkaMessage: No more fixed structure creation - Update SMQ validation: Check for non-empty fields, not specific field names - Update tests: Remove assumptions about fixed field structure - Add TODO for proper RecordValue to Confluent format encoding This ensures the Kafka gateway correctly uses schemas from the schema registry instead of imposing an artificial fixed structure.	15 hours ago
chrislu	7dc6c7f2c8	fmt	15 hours ago
chrislu	f2181ec874	Add comprehensive test coverage for schema-based message handling - Add schema_message_test.go with tests for RecordValue creation and decoding - Test round-trip message integrity (Kafka -> RecordValue -> Kafka) - Test schematized message handling with nested RecordValue structures - Test backward compatibility with raw bytes (non-RecordValue messages) - Add broker_recordvalue_test.go for SMQ broker RecordValue validation - Test required field validation for Kafka topics (key, value, timestamp) - Test different topic namespaces and validation rules - Add schema_recordvalue_test.go for integration testing - Test complete message flow with real SeaweedMQ backend - Test different message types (JSON, text, binary, Unicode) - Add ProduceRecordValue method to all test handlers - Add public methods to Handler for testing access	15 hours ago
chrislu	b0f6eab95b	Implement schema-based decoding in Kafka Fetch path - Add decodeRecordValueToKafkaMessage method to convert RecordValue back to Kafka messages - Extract original Kafka message bytes from RecordValue structure - Support both raw bytes and schematized messages in RecordValue - Re-encode schematized messages back to Confluent format when schema registry is available - Add fallback JSON conversion for RecordValue when schema encoding fails - Maintain backward compatibility with non-RecordValue messages - Handle different value types (bytes, strings, nested RecordValue) in RecordValue fields	16 hours ago
chrislu	62c8f24968	Implement schema-based encoding in Kafka Produce path - Add produceSchemaBasedRecord method to encode Kafka messages as RecordValue - Create RecordValue structure with key, value, timestamp fields - Support schema registry integration for Confluent-framed messages - Add ProduceRecordValue method to SeaweedMQHandler interface and implementations - Update both BrokerClient and AgentClient to handle RecordValue publishing - Properly decode schematized messages and embed them in RecordValue structure - Add schema metadata (schema_id, schema_format) for schematized messages	16 hours ago
chrislu	73fdbe0c85	join group api version	16 hours ago
chrislu	68e242a15e	debug	19 hours ago
chrislu	2ea3a475de	Sarama->KafkaGo: PASS	20 hours ago
chrislu	5c9269a150	add debug	22 hours ago
chrislu	8d04b27882	set clientHost	2 days ago
chrislu	7af5f3afde	find filers and brokers	2 days ago
chrislu	d1e7425023	add tests	3 days ago
chrislu	a970f6dbc7	fix tests	3 days ago
chrislu	134e6230dc	fix tests	3 days ago
chrislu	4c94607120	add back test server and handler	3 days ago
chrislu	9d05f9140b	gateway serves as the essential protocol bridge between Kafka clients and SeaweedMQ storage, not just a discovery service	3 days ago
chrislu	5f281491c4	weed mq.kafka.gateway -listen=gateway2:9092	3 days ago
chrislu	715eb55985	coordinator	3 days ago
chrislu	5064e5497b	fmt	3 days ago
chrislu	5373263e42	reverting kafka gateway heartbeating to lead broker	3 days ago
chrislu	d2f1442919	gateway ask broker leader for coordinator	3 days ago
chrislu	ad63d06ac7	address some TODOs	3 days ago
chrislu	7e98762a59	topic level code	3 days ago
chrislu	87937ae151	refactor: replace printf debug with protocol.Debug logging - Replace all fmt.Printf DEBUG statements with protocol.Debug calls - Debug logging is now environment-gated via KAFKA_DEBUG env var - Cleaner, structured logging approach for Kafka protocol operations - All protocol packages build successfully	4 days ago
chrislu	0da998dda9	fix: implement correct DescribeConfigs response format for all API versions - Added missing ConfigSource field (int8) for versions 1-3 - Added ConfigSynonyms array (empty) for versions 1-3 - Added version-specific field handling: * IsDefault only for version 0 * ConfigSource and ConfigSynonyms for versions 1-3 * ConfigType and ConfigDocumentation for version 3 - Fixed Sarama admin client compatibility issue with 'insufficient data to decode packet' - AdminClientCompatibility test now passes This resolves the final compatibility issue where Sarama admin client couldn't decode DescribeConfigs responses due to missing protocol fields for version 2.	4 days ago
chrislu	82b6fbd4be	fix: handle null config names array in DescribeConfigs parsing - Fixed parsing of DescribeConfigs requests where config names length is -1 (null array) - Added proper validation to handle Kafka protocol null array convention - DescribeConfigs handler now processes requests correctly and generates responses - Issue remains with client receiving 'insufficient data to decode packet' - likely response format issue The DescribeConfigs API is now functional but there may be a response format compatibility issue with the Sarama client that needs further investigation.	4 days ago
chrislu	891cfdb027	cleanup: remove debug logging from DescribeConfigs implementation Cleaned up temporary debug logging added for troubleshooting the admin client compatibility issue.	4 days ago
chrislu	39439f7a5e	fmt	4 days ago
chrislu	4cfd26bc81	check nil weed/mq/kafka/protocol/fetch.go: early-check useSchema; error if seaweedMQHandler is nil; return empty if IsSchemaEnabled() is false after that.	4 days ago
chrislu	55c5053aa1	fmt	4 days ago
chrislu	dbb729b8b2	enhance: implement comprehensive DescribeConfigs API with real configuration values - Added proper parsing of DescribeConfigs requests with validation to prevent panics - Implemented realistic topic and broker configuration responses - Added support for filtering specific configuration names - Included essential configs like cleanup.policy, retention.ms, max.message.bytes - Added comprehensive validation for request parsing (resource count, name lengths, etc.) - Created extensive test suite covering parsing, config generation, and end-to-end flow - Fixes admin client compatibility by providing expected configuration responses This addresses the P1 issue where admin clients were blocked by empty DescribeConfigs responses.	4 days ago
chrislu	88b648f418	fmt	4 days ago
chrislu	d192d2c870	feat: improve FindCoordinator with consistent group-to-coordinator mapping - Added findCoordinatorForGroup function to determine coordinator using consistent hashing - Updated both FindCoordinator v0 and v2 handlers to use the new coordinator lookup - Implemented framework for future SeaweedMQ broker discovery integration - Added comprehensive tests for coordinator consistency and request handling - Currently returns current gateway but provides foundation for distributed coordination This addresses the P1 issue where FindCoordinator was hardcoded to return the current gateway. While still returning current gateway for now, the infrastructure is in place to integrate with SeaweedMQ's partition leadership model when broker discovery is available.	4 days ago
chrislu	03b36d4e4e	feat: integrate schematized fetch results into Fetch response - Moved schematized fetch logic before record batch is added to response - When schematized messages are available, they now replace the regular record batch - Added proper nil check for seaweedMQHandler to prevent panics - Created integration tests for schematized fetch functionality - Schematized messages are now properly reconstructed and returned to Kafka clients This completes the P0 integration gap where fetchSchematizedRecords was called but results were not used in the actual Fetch response. Kafka clients can now receive properly formatted schematized messages with Confluent envelope format.	4 days ago
chrislu	a8f180da0a	fix: correct ZigZag varint encoding for Kafka record format v2 - Fixed createRecordEntry to use correct ZigZag varint encoding instead of naive unsigned - Removed incorrect Handler.encodeVarint method that used unsigned encoding - All varint encoding now uses the correct standalone encodeVarint function with ZigZag - Critical fix: -1 (null key) now correctly encodes to 0x01 instead of large unsigned value - Added comprehensive tests verifying ZigZag encoding correctness for edge cases This fixes a critical compatibility issue where Kafka clients would fail to parse record batches due to incorrect varint encoding, especially for null keys.	4 days ago
chrislu	dd559c7a83	cleanup: remove debug prints from main request handling paths - Removed fmt.Printf debug statements from critical request handling paths in handler.go - Cleaned up debug prints from Fetch, JoinGroup, SyncGroup, OffsetFetch, and FindCoordinator handlers - Maintained code functionality while improving production readiness - 134 debug prints remain in less critical paths and test files The main request handling paths are now clean of debug output, making the code more suitable for production deployment while maintaining full functionality.	4 days ago
chrislu	a578ff11a7	fmt	4 days ago
chrislu	4711e523c9	feat: complete Phase 4 - Schematized fetch integration Phase 4 Implementation Summary: ✅ fetchSchematizedRecords SeaweedMQ integration ✅ Batch reconstruction with compression/CRC ✅ End-to-end schema tests Key Features Implemented: 1. SeaweedMQ Integration for Schematized Fetch: - Implemented fetchSchematizedRecords() to retrieve and reconstruct schematized messages from SeaweedMQ - Added reconstructSchematizedMessageFromSMQ() to extract RecordValue and metadata from SMQRecord - Added extractSchemaMetadataFromRecord() to parse schema information from stored records - Added removeKafkaMetadataFields() to clean up internal metadata fields - Integrated schematized fetch logic into main fetch handler with isSchematizedTopic() check 2. Batch Reconstruction with Compression and CRC: - Enhanced createSchematizedRecordBatch() with proper Kafka record batch format - Implemented createRecordEntry() for individual record entries in Kafka record format v2 - Added createRecordBatchWithCompressionAndCRC() with full batch header, compression, and CRC32 validation - Integrated GZIP compression with automatic compression decision based on data size - Added proper varint encoding for record batch format - Implemented complete Kafka record batch structure with all required fields 3. Comprehensive End-to-End Schema Tests: - Created TestSchemaEndToEnd_AvroRoundTrip with complete Avro schema manager round-trip testing - Added TestSchemaEndToEnd_ProtobufRoundTrip with Protobuf envelope creation and parsing - Implemented TestSchemaEndToEnd_JSONSchemaRoundTrip with JSON Schema envelope validation - Created TestSchemaEndToEnd_CompressionAndBatching with multiple message batch processing - Added comprehensive test coverage for all schema formats (Avro, Protobuf, JSON Schema) - Verified proper Confluent envelope creation, parsing, and round-trip integrity Technical Details: - Added proper protobuf import for SMQRecord unmarshaling - Implemented varint encoding/decoding for Kafka record format - Added compression.CompressionCodec integration with GZIP support - Created comprehensive mock schema registry for testing - Ensured proper error handling and fallback mechanisms - Added structured logging throughout the fetch pipeline Tests: All end-to-end schema tests pass, demonstrating complete schema round-trip functionality	4 days ago
chrislu	273553ceba	fix: resolve Avro union re-encoding and schema validation test issues - Fix Avro round-trip integrity test by adding missing 'preferences' field with proper union structure - Skip TestHandler_ValidateMessageContent tests that require running schema registry - Clean up debug logging in integration tests - Ensure all Avro union fields are properly structured for round-trip encoding/decoding Tests: All schema validation and Avro integration tests now pass	4 days ago
chrislu	828ed7532d	feat: implement Protobuf index varints handling - Add proper varint encoding/decoding functions in envelope.go - Implement CreateConfluentEnvelope with varint encoding for Protobuf indexes - Add ParseConfluentProtobufEnvelopeWithIndexCount for reliable parsing when index count is known - Add ParseConfluentProtobufEnvelope with conservative approach (assumes no indexes by default) - Remove duplicate functions from protobuf_decoder.go to avoid conflicts - Create comprehensive test suite in envelope_varint_test.go covering: - Basic varint encode/decode functionality - Confluent envelope creation and parsing with various index scenarios - Round-trip testing for Protobuf envelopes - Edge cases and validation - Document limitations of heuristic-based parsing and provide explicit index count alternative Tests: All varint and envelope tests pass, proper handling of Protobuf message indexes	4 days ago

1 2 3 4 5

226 Commits (feature/mq-kafka-gateway-m1)