# SMQ Native Offset Development Plan

## Overview

Add native per-partition sequential offsets to SeaweedMQ to eliminate the need for external offset mapping and provide better interoperability with message queue protocols.

## Architecture Changes

### Data Model

- Add `offset` field (int64) to each record alongside existing `ts_ns`
- Offset domain: per `schema_pb.Partition` (ring range)
- Offsets are strictly monotonic within a partition
- Leader assigns offsets; followers replicate

### Storage

- Use `_index` as hidden SQL table column for offset storage
- Maintain per-partition offset counters in broker state
- Checkpoint offset state periodically for recovery

## Development Phases

### Phase 1: Proto and Data Model Changes

**Scope**: Update protobuf definitions and core data structures

**Tasks**:
1. Update `mq_schema.proto`:
   - Add `offset` field to record storage format
   - Add offset-based `OffsetType` enums
   - Add `offset_value` field to subscription requests
2. Update `mq_agent.proto`:
   - Add `base_offset` and `last_offset` to `PublishRecordResponse`
   - Add `offset` field to `SubscribeRecordResponse`
3. Regenerate protobuf Go code
4. Update core data structures in broker code
5. Add offset field to SQL schema with `_index` column

**Tests**:
- Proto compilation tests
- Data structure serialization tests
- SQL schema migration tests

**Deliverables**:
- Updated proto files
- Generated Go code
- Updated SQL schema
- Basic unit tests

### Phase 2: Offset Assignment Logic

**Scope**: Implement offset assignment in broker

**Tasks**:
1. Add `PartitionOffsetManager` component (a minimal sketch follows this phase):
   - Track `next_offset` per partition
   - Assign sequential offsets to records
   - Handle offset recovery on startup
2. Integrate with existing record publishing flow:
   - Assign offsets before storage
   - Update `PublishRecordResponse` with offset info
3. Add offset persistence to storage layer:
   - Store offset alongside record data
   - Index by offset for efficient lookups
4. Implement offset recovery:
   - Load highest offset on partition leadership
   - Handle clean and unclean restarts

**Tests**:
- Offset assignment unit tests
- Offset persistence tests
- Recovery scenario tests
- Concurrent assignment tests

**Deliverables**:
- `PartitionOffsetManager` implementation
- Integrated publishing with offsets
- Offset recovery logic
- Comprehensive test suite
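To make the Phase 2 assignment flow concrete, here is a minimal sketch of what `PartitionOffsetManager` could look like. Only the type name and the `base_offset`/`last_offset` semantics come from this plan; the field and method names are illustrative assumptions, not existing SeaweedFS code.

```go
package offset

import "sync"

// PartitionOffsetManager assigns sequential offsets for a single partition.
// Minimal sketch of the Phase 2 component; names other than the type itself
// are assumptions for illustration.
type PartitionOffsetManager struct {
	mu         sync.Mutex
	nextOffset int64 // next offset to hand out; seeded during recovery
}

// NewPartitionOffsetManager seeds the counter from storage. highestStored is
// the highest offset found for the partition (-1 if the partition is empty),
// e.g. loaded when this broker acquires partition leadership.
func NewPartitionOffsetManager(highestStored int64) *PartitionOffsetManager {
	return &PartitionOffsetManager{nextOffset: highestStored + 1}
}

// AssignOffsets reserves a contiguous block of n offsets for a record batch
// and returns the first and last offset, mirroring the base_offset/last_offset
// fields planned for PublishRecordResponse.
func (m *PartitionOffsetManager) AssignOffsets(n int64) (baseOffset, lastOffset int64) {
	m.mu.Lock()
	defer m.mu.Unlock()
	baseOffset = m.nextOffset
	m.nextOffset += n
	return baseOffset, m.nextOffset - 1
}
```

For example, with `next_offset` at 42, `AssignOffsets(3)` returns (42, 44) and advances the counter to 45. Assigning per batch rather than per record keeps the critical section to a single counter bump, which is also the natural starting point for the lock-contention work in Phase 6.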
### Phase 3: Subscription by Offset

**Scope**: Enable consumers to subscribe using offsets

**Tasks**:
1. Extend subscription logic:
   - Support `EXACT_OFFSET` and `RESET_TO_OFFSET` modes
   - Add offset-based seeking
   - Maintain backward compatibility with timestamp-based seeks
2. Update `SubscribeRecordResponse`:
   - Include offset in response messages
   - Ensure offset ordering in delivery
3. Add offset validation:
   - Validate requested offsets are within valid range
   - Handle out-of-range offset requests gracefully
4. Implement offset-based filtering and pagination

**Tests**:
- Offset-based subscription tests
- Seek functionality tests
- Out-of-range offset handling tests
- Mixed timestamp/offset subscription tests

**Deliverables**:
- Offset-based subscription implementation
- Updated subscription APIs
- Validation and error handling
- Integration tests

### Phase 4: High Water Mark and Lag Calculation

**Scope**: Implement native offset-based metrics

**Tasks**:
1. Add high water mark tracking:
   - Track highest committed offset per partition
   - Expose via broker APIs
   - Update on successful replication
2. Implement lag calculation:
   - Consumer lag = high_water_mark - consumer_offset
   - Partition lag metrics
   - Consumer group lag aggregation
3. Add offset-based monitoring:
   - Partition offset metrics
   - Consumer position tracking
   - Lag alerting capabilities
4. Update existing monitoring integration

**Tests**:
- High water mark calculation tests
- Lag computation tests
- Monitoring integration tests
- Metrics accuracy tests

**Deliverables**:
- High water mark implementation
- Lag calculation logic
- Monitoring integration
- Metrics and alerting

### Phase 5: Kafka Gateway Integration

**Scope**: Update Kafka gateway to use native SMQ offsets

**Tasks**:
1. Remove offset mapping layer:
   - Delete `kafka-system/offset-mappings` topic usage
   - Remove `PersistentLedger` and `SeaweedMQStorage`
   - Simplify offset translation logic
2. Update Kafka protocol handlers:
   - Use native SMQ offsets in Produce responses
   - Map SMQ offsets directly to Kafka offsets
   - Update ListOffsets and Fetch handlers
3. Simplify consumer group offset management:
   - Store Kafka consumer offsets as SMQ offsets
   - Remove timestamp-based offset translation
4. Update integration tests:
   - Test Kafka client compatibility
   - Verify offset consistency
   - Test long-term disconnection scenarios

**Tests**:
- Kafka protocol compatibility tests
- End-to-end integration tests
- Performance comparison tests
- Migration scenario tests

**Deliverables**:
- Simplified Kafka gateway
- Removed offset mapping complexity
- Updated integration tests
- Performance improvements

### Phase 6: Performance Optimization and Production Readiness

**Scope**: Optimize performance and prepare for production

**Tasks**:
1. Optimize offset assignment performance:
   - Batch offset assignment
   - Reduce lock contention
   - Optimize recovery performance
2. Add offset compaction and cleanup:
   - Implement offset-based log compaction
   - Add retention policies based on offsets
   - Clean up old offset checkpoints
3. Enhance monitoring and observability:
   - Detailed offset metrics
   - Performance dashboards
   - Alerting on offset anomalies
4. Load testing and benchmarking:
   - Compare performance with timestamp-only approach
   - Test under high load scenarios
   - Validate memory usage patterns

**Tests**:
- Performance benchmarks
- Load testing scenarios
- Memory usage tests
- Stress testing under failures

**Deliverables**:
- Optimized offset implementation
- Production monitoring
- Performance benchmarks
- Production deployment guide

## Implementation Guidelines

### Code Organization

```
weed/mq/
├── offset/
│   ├── manager.go          # PartitionOffsetManager
│   ├── recovery.go         # Offset recovery logic
│   └── checkpoint.go       # Offset checkpointing
├── broker/
│   ├── partition_leader.go # Updated with offset assignment
│   └── subscriber.go       # Updated with offset support
└── storage/
    └── offset_store.go     # Offset persistence layer
```
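To make the layout above concrete, the following sketch shows one plausible shape for the persistence layer in `offset_store.go` and the recovery path in `recovery.go`. The interface, method names, and `RecoverNextOffset` helper are assumptions for illustration only, not existing SeaweedFS APIs.

```go
package storage

// OffsetStore is a sketch of the persistence layer behind offset_store.go.
// All names are illustrative, not an existing SeaweedFS interface.
type OffsetStore interface {
	// SaveCheckpoint durably records the highest assigned offset for a
	// partition so recovery does not have to scan the whole log.
	SaveCheckpoint(partitionKey string, offset int64) error

	// LoadCheckpoint returns the last checkpointed offset, with found=false
	// if the partition has never been checkpointed.
	LoadCheckpoint(partitionKey string) (offset int64, found bool, err error)

	// HighestRecordedOffset scans records newer than fromOffset and returns
	// the highest offset found, or fromOffset if no newer records exist.
	HighestRecordedOffset(partitionKey string, fromOffset int64) (int64, error)
}

// RecoverNextOffset derives the next offset to assign when a broker takes
// partition leadership: start from the checkpoint, then scan forward to
// cover offsets assigned after the last checkpoint was written.
func RecoverNextOffset(store OffsetStore, partitionKey string) (int64, error) {
	checkpointed, found, err := store.LoadCheckpoint(partitionKey)
	if err != nil {
		return 0, err
	}
	if !found {
		checkpointed = -1 // empty partition: the first offset will be 0
	}
	highest, err := store.HighestRecordedOffset(partitionKey, checkpointed)
	if err != nil {
		return 0, err
	}
	return highest + 1, nil
}
```

Because checkpoints are periodic, the forward scan from the last checkpoint bounds recovery work after an unclean restart, while a checkpoint written on clean shutdown makes clean restarts nearly free.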
### Testing Strategy

- Unit tests for each component
- Integration tests for cross-component interactions
- Performance tests for offset assignment and recovery
- Compatibility tests with existing SMQ features
- End-to-end tests with Kafka gateway

### Commit Strategy

- One commit per completed task within a phase
- All tests must pass before commit
- No binary files in commits
- Clear commit messages describing changes

### Rollout Plan

1. Deploy to development environment after Phase 2
2. Integration testing after Phase 3
3. Performance testing after Phase 4
4. Kafka gateway migration after Phase 5
5. Production rollout after Phase 6

## Success Criteria

### Phase Completion Criteria

- All tests pass
- Code review completed
- Documentation updated
- Performance benchmarks meet targets

### Overall Success Metrics

- Eliminate external offset mapping complexity
- Maintain or improve performance
- Full Kafka protocol compatibility
- Native SMQ offset support for all protocols
- Simplified consumer group offset management

## Risk Mitigation

### Technical Risks

- **Offset assignment bottlenecks**: Implement batching and optimize locking
- **Recovery performance**: Use checkpointing and incremental recovery
- **Storage overhead**: Optimize offset storage and indexing

### Operational Risks

- **Migration complexity**: Implement gradual rollout with rollback capability
- **Data consistency**: Extensive testing of offset assignment and recovery
- **Performance regression**: Continuous benchmarking and monitoring

## Timeline Estimate

- Phase 1: 1-2 weeks
- Phase 2: 2-3 weeks
- Phase 3: 2-3 weeks
- Phase 4: 1-2 weeks
- Phase 5: 2-3 weeks
- Phase 6: 2-3 weeks

**Total: 10-16 weeks**

## Next Steps

1. Review and approve development plan
2. Set up development branch
3. Begin Phase 1 implementation
4. Establish testing and CI pipeline
5. Regular progress reviews and adjustments