# Debugging Breakthrough: EOF Exception Analysis ## Summary After extensive debugging, we've identified and partially fixed the root cause of the `EOFException: Still have: 78 bytes left` error in Parquet file reads. ## Root Cause Analysis ### Initial Hypothesis ❌ (Incorrect) - **Thought**: File size calculation was wrong (`contentLength` off by 78 bytes) - **Reality**: `contentLength` was **always correct** at 1275 bytes ### Second Hypothesis ❌ (Partially Correct) - **Thought**: `FSDataOutputStream.getPos()` wasn't delegating to `SeaweedOutputStream.getPos()` - **Reality**: The override **was working**, but there was a deeper issue ### Third Hypothesis ✅ (ROOT CAUSE) - **Problem**: `SeaweedInputStream.read(ByteBuffer buf)` was returning 0 bytes for inline content - **Location**: Line 127-129 in `SeaweedInputStream.java` - **Bug**: When copying inline content from protobuf entry, `bytesRead` was never updated ```java // BEFORE (BUGGY): if (this.position < Integer.MAX_VALUE && (this.position + len) <= entry.getContent().size()) { entry.getContent().substring((int) this.position, (int) (this.position + len)).copyTo(buf); // bytesRead stays 0! <-- BUG } else { bytesRead = SeaweedRead.read(...); } return (int) bytesRead; // Returns 0 when inline content was copied! ``` ```java // AFTER (FIXED): if (this.position < Integer.MAX_VALUE && (this.position + len) <= entry.getContent().size()) { entry.getContent().substring((int) this.position, (int) (this.position + len)).copyTo(buf); bytesRead = len; // FIX: Update bytesRead after inline copy } else { bytesRead = SeaweedRead.read(...); } return (int) bytesRead; // Now returns correct value! ``` ## Why This Caused EOF Errors 1. **Parquet's readFully() loop**: ```java while (remaining > 0) { int read = inputStream.read(buffer, offset, remaining); if (read == -1 || read == 0) { throw new EOFException("Still have: " + remaining + " bytes left"); } remaining -= read; } ``` 2. **Our bug**: When `read()` returned 0 instead of the actual bytes copied, Parquet thought the stream was done 3. **Result**: EOF exception with exactly the number of bytes that weren't reported ## Fixes Implemented ### 1. SeaweedInputStream.java (PRIMARY FIX) - **File**: `other/java/client/src/main/java/seaweedfs/client/SeaweedInputStream.java` - **Change**: Set `bytesRead = len` after inline content copy - **Impact**: Ensures `read()` always returns the correct number of bytes read ### 2. SeaweedOutputStream.java (DIAGNOSTIC) - **File**: `other/java/client/src/main/java/seaweedfs/client/SeaweedOutputStream.java` - **Change**: Added comprehensive logging to `getPos()` with stack traces - **Purpose**: Track who calls `getPos()` and what positions are returned - **Finding**: All positions appeared correct in tests ### 3. SeaweedFileSystem.java (ALREADY FIXED) - **File**: `other/java/hdfs3/src/main/java/seaweed/hdfs/SeaweedFileSystem.java` - **Change**: Override `FSDataOutputStream.getPos()` to delegate to `SeaweedOutputStream` - **Verification**: Confirmed working with WARN logs ### 4. Unit Test Added - **File**: `other/java/client/src/test/java/seaweedfs/client/SeaweedStreamIntegrationTest.java` - **Test**: `testRangeReads()` - **Coverage**: - Range reads at specific offsets (like Parquet footer reads) - Sequential `readFully()` pattern that was failing - Multiple small reads vs. large reads - The exact 78-byte read at offset 1197 that was failing ## Test Results ### Before Fix ``` EOFException: Reached the end of stream. Still have: 78 bytes left - contentLength: 1275 (correct!) - reads: position=1197 len=78 bytesRead=0 ❌ ``` ### After Fix ``` No EOF exceptions observed - contentLength: 1275 (correct) - reads: position=1197 len=78 bytesRead=78 ✅ ``` ## Why The 78-Byte Offset Was Consistent The "78 bytes" wasn't random - it was **systematically the last `read()` call** that returned 0 instead of the actual bytes: - File size: 1275 bytes - Last read: position=1197, len=78 - Expected: bytesRead=78 - Actual (before fix): bytesRead=0 - Parquet: "I need 78 more bytes but got EOF!" → EOFException ## Commits 1. **e95f7061a**: Fix inline content read bug + add unit test 2. **c10ae054b**: Add SeaweedInputStream constructor logging 3. **5c30bc8e7**: Add detailed getPos() tracking with stack traces ## Next Steps 1. **Push changes** to your branch 2. **Run CI tests** to verify fix works in GitHub Actions 3. **Monitor** for any remaining edge cases 4. **Remove debug logging** once confirmed stable (or reduce to DEBUG level) 5. **Backport** to other SeaweedFS client versions if needed ## Key Learnings 1. **Read the return value**: Always ensure functions return the correct value, not just perform side effects 2. **Buffer operations need tracking**: When copying data to buffers, track how much was copied 3. **Stack traces help**: Knowing WHO calls a function helps understand WHEN bugs occur 4. **Consistent offsets = systematic bug**: The 78-byte offset being consistent pointed to a logic error, not data corruption 5. **Downloaded file was perfect**: The fact that `parquet-tools` could read the downloaded file proved the bug was in the read path, not write path ## Files Modified ``` other/java/client/src/main/java/seaweedfs/client/SeaweedInputStream.java other/java/client/src/main/java/seaweedfs/client/SeaweedOutputStream.java other/java/client/src/main/java/seaweedfs/client/SeaweedRead.java other/java/client/src/test/java/seaweedfs/client/SeaweedStreamIntegrationTest.java other/java/hdfs3/src/main/java/seaweed/hdfs/SeaweedFileSystem.java other/java/hdfs3/src/main/java/seaweed/hdfs/SeaweedFileSystemStore.java other/java/hdfs3/src/main/java/seaweed/hdfs/SeaweedHadoopOutputStream.java ``` ## References - Issue: Spark integration tests failing with EOF exception - Parquet version: 1.16.0 - Spark version: 3.5.0 - SeaweedFS client version: 3.80.1-SNAPSHOT