Browse Source
fix: SeaweedInputStream returning 0 bytes for inline content reads
fix: SeaweedInputStream returning 0 bytes for inline content reads
ROOT CAUSE IDENTIFIED:
In SeaweedInputStream.read(ByteBuffer buf), when reading inline content
(stored directly in the protobuf entry), the code was copying data to
the buffer but NOT updating bytesRead, causing it to return 0.
This caused Parquet's H2SeekableInputStream.readFully() to fail with:
"EOFException: Still have: 78 bytes left"
The readFully() method calls read() in a loop until all requested bytes
are read. When read() returns 0 or -1 prematurely, it throws EOF.
CHANGES:
1. SeaweedInputStream.java:
- Fixed inline content read to set bytesRead = len after copying
- Added debug logging to track position, len, and bytesRead
- This ensures read() always returns the actual number of bytes read
2. SeaweedStreamIntegrationTest.java:
- Added comprehensive testRangeReads() that simulates Parquet behavior:
* Seeks to specific offsets (like reading footer at end)
* Reads specific byte ranges (like reading column chunks)
* Uses readFully() pattern with multiple sequential read() calls
* Tests the exact scenario that was failing (78-byte read at offset 1197)
- This test will catch any future regressions in range read behavior
VERIFICATION:
Local testing showed:
- contentLength correctly set to 1275 bytes
- Chunk download retrieved all 1275 bytes from volume server
- BUT read() was returning -1 before fulfilling Parquet's request
- After fix, test compiles successfully
Related to: Spark integration test failures with Parquet files
pull/7526/head
2 changed files with 122 additions and 11 deletions
-
31other/java/client/src/main/java/seaweedfs/client/SeaweedInputStream.java
-
102other/java/client/src/test/java/seaweedfs/client/SeaweedStreamIntegrationTest.java
Write
Preview
Loading…
Cancel
Save
Reference in new issue