seaweedfs

chrislu 7e755c70ce feat: add in-memory cache for disk chunk reads		2 months ago

This commit adds an LRU cache for disk chunks to optimize repeated reads of historical data. When multiple consumers read the same historical offsets, or a single consumer refetches the same data, the cache eliminates redundant disk I/O.

Cache Design:
- Chunk size: 1000 messages per chunk
- Max chunks: 16 (configurable, ~16K messages cached)
- Eviction policy: LRU (Least Recently Used)
- Thread-safe with RWMutex
- Chunk-aligned offsets for efficient lookups

New Components:
1. DiskChunkCache struct - manages cached chunks
2. CachedDiskChunk struct - stores chunk data with metadata
3. getCachedDiskChunk() - checks cache before disk read
4. cacheDiskChunk() - stores chunks with LRU eviction
5. extractMessagesFromCache() - extracts subset from cached chunk

How It Works:
1. Read request for offset N (e.g., 2500)
2. Calculate chunk start: (2500 / 1000) * 1000 = 2000
3. Check cache for chunk starting at 2000
4. If HIT: Extract messages 2500-2999 from cached chunk
5. If MISS: Read chunk 2000-2999 from disk, cache it, extract 2500-2999
6. If cache full: Evict LRU chunk before caching new one

Benefits:
- Eliminates redundant disk I/O for popular historical data
- Reduces latency for repeated reads (cache hit ~1ms vs disk ~100ms)
- Supports multiple consumers reading same historical offsets
- Automatically evicts old chunks when cache is full
- Zero impact on hot path (in-memory reads unchanged)

Performance Impact:
- Cache HIT: ~99% faster than disk read
- Cache MISS: Same as disk read (with caching overhead ~1%)
- Memory: ~16MB for 16 chunks (16K messages x 1KB avg)

Example Scenario (CI tests):
- Producer writes offsets 0-4
- Data flushes to disk
- Consumer 1 reads 0-4 (cache MISS, reads from disk, caches chunk 0-999)
- Consumer 2 reads 0-4 (cache HIT, served from memory)
- Consumer 1 rebalances, re-reads 0-4 (cache HIT, no disk I/O)

This optimization is especially valuable in CI environments where:
- Small memory buffers cause frequent flushing
- Multiple consumers read the same historical data
- Disk I/O is relatively slow compared to memory access
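
The lookup flow described in the commit message above (align the offset to a chunk boundary, check the cache, evict the least recently used chunk when full) can be sketched roughly as follows. This is a minimal illustration, not the code in log_buffer: the names chunkCache, cachedChunk, get, put, and extract are hypothetical stand-ins for the components listed in the commit, offsets are assumed to be non-negative, and a plain sync.Mutex is used here instead of the RWMutex the commit mentions, because a lookup in this sketch also updates the LRU order.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

const chunkSize = 1000 // messages per chunk, as stated in the commit message

// chunkStart aligns a non-negative offset down to the start of its chunk.
func chunkStart(offset int64) int64 { return (offset / chunkSize) * chunkSize }

// cachedChunk is a hypothetical stand-in for CachedDiskChunk.
type cachedChunk struct {
	start    int64
	messages [][]byte // messages[i] holds the message at offset start+i
}

// extract returns the messages from offset to the end of the chunk.
func (ch *cachedChunk) extract(offset int64) [][]byte {
	i := offset - ch.start
	if i < 0 || i >= int64(len(ch.messages)) {
		return nil
	}
	return ch.messages[i:]
}

// chunkCache is a hypothetical stand-in for DiskChunkCache.
type chunkCache struct {
	mu        sync.Mutex
	maxChunks int
	order     *list.List              // front = most recently used
	items     map[int64]*list.Element // chunk start -> element in order
}

func newChunkCache(maxChunks int) *chunkCache {
	if maxChunks < 1 {
		maxChunks = 1
	}
	return &chunkCache{
		maxChunks: maxChunks,
		order:     list.New(),
		items:     make(map[int64]*list.Element),
	}
}

// get returns the chunk containing offset and marks it most recently used.
func (c *chunkCache) get(offset int64) (*cachedChunk, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	el, ok := c.items[chunkStart(offset)]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*cachedChunk), true
}

// put stores a chunk, evicting the least recently used one if the cache is full.
func (c *chunkCache) put(ch *cachedChunk) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[ch.start]; ok {
		el.Value = ch
		c.order.MoveToFront(el)
		return
	}
	if c.order.Len() >= c.maxChunks {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*cachedChunk).start)
	}
	c.items[ch.start] = c.order.PushFront(ch)
}

func main() {
	cache := newChunkCache(16)
	if _, ok := cache.get(2500); !ok {
		// MISS: a real implementation would read chunk 2000-2999 from disk here.
		cache.put(&cachedChunk{start: chunkStart(2500), messages: make([][]byte, chunkSize)})
	}
	ch, ok := cache.get(2500)
	fmt.Println(ok, ch.start, len(ch.extract(2500))) // prints: true 2000 500
}
```

Keying the map by the chunk start offset means any offset within a chunk resolves to the same entry with one integer division, which is the "chunk-aligned offsets" point in the design list.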
..
buffer_pool	[volume] Reduce the number of buffers for uploading one chunk (#5458)	2 years ago
buffered_queue	ensure head index is within range	2 years ago
buffered_writer	webdav: improve webdav upload speed	5 years ago
chunk_cache	FUSE Mount: enhance disk cache with volume ID and cookie validation (#7269)	3 months ago
constants	Fix #7305: Return 400 BadDigest instead of 500 InternalError for MD5 mismatch (#7306)	3 months ago
fla9	go fmt	2 years ago
grace	master fix interruption through ctrl+c (#3834)	3 years ago
http	S3 API: Add SSE-KMS (#7144)	4 months ago
httpdown	test passed	6 years ago
log_buffer	feat: add in-memory cache for disk chunk reads	2 months ago
mem	remove logs	3 years ago
request_id	Context-based logging with request ID (#6899)	6 months ago
skiplist	S3 API: Advanced IAM System (#7160)	4 months ago
sqlutil	Message Queue: Add sql querying (#7185)	4 months ago
version	3.97	4 months ago
bytes.go	randomizing next file handle id	1 year ago
bytes_test.go	minFreeSpace refactored	5 years ago
cipher.go	move to https://github.com/seaweedfs/seaweedfs	3 years ago
cipher_test.go	refactor(cipher_test): `plantext` -> `plaintext` (#3669)	3 years ago
compression.go	move to https://github.com/seaweedfs/seaweedfs	3 years ago
compression_stream.go	reduce gzip allocation	4 years ago
concurrent_read_map.go	directory structure change to work with glide	10 years ago
cond_wait.go	[volume] refactor and add metrics for flight upload and download data limit condition (#6920)	6 months ago
config.go	"golang.org/x/exp/slices" => "slices" and go fmt	1 year ago
constants_4bytes.go	change version directory	7 months ago
constants_5bytes.go	change version directory	7 months ago
file_util.go	fix: avoid error file name too long when writing a file (#4876)	2 years ago
file_util_non_posix.go	go fmt	4 years ago
file_util_posix.go	go fmt	4 years ago
file_util_test.go	fix: avoid error file name too long when writing a file (#4876)	2 years ago
fullpath.go	Read write directory object (#7003)	5 months ago
inits.go	try showing the first 100 volume ids and an extra ...	6 years ago
inits_test.go	try showing the first 100 volume ids and an extra ...	6 years ago
limited_async_pool.go	add future list	3 years ago
limited_async_pool_test.go	add future list	3 years ago
limited_executor.go	rename	3 years ago
lock_table.go	Fix deadlock in lock table locks (#5566)	2 years ago
lock_table_test.go	adjust logs	2 years ago
minfreespace.go	move to https://github.com/seaweedfs/seaweedfs	3 years ago
minfreespace_test.go	minFreeSpace refactored	5 years ago
net_timeout.go	refactor(various): `Listner` -> `Listener` readability improvements (#3672)	3 years ago
network.go	move to https://github.com/seaweedfs/seaweedfs	3 years ago
parse.go	cloud tier: support for Alibaba Cloud OSS (#6466)	11 months ago
queue.go	"golang.org/x/exp/slices" => "slices" and go fmt	1 year ago
queue_test.go	add a simple test	1 year ago
queue_unbounded.go	refactor(queue_unbounded): `inbountLen` -> `inboundLen` (#3666)	3 years ago
queue_unbounded_test.go	filer: avoid possible timeouts for updates and deletions	6 years ago
reflect.go	show RemoteVolumes/EcVolumes only if it is not empty	5 years ago
retry.go	comment	2 years ago
throttler.go	refactor: extract out the write throttler	7 years ago
time.go	Fix 6181/6182 (#6183)	1 year ago