Added logging to the early return path in SeaweedInputStream.read() that returns -1 when position >= contentLength.

KEY FINDING: Parquet is trying to read 78 bytes from position 1275, but the file ends at 1275!

This indicates that the Parquet footer metadata has INCORRECT offsets or sizes, making Parquet expect data at bytes [1275-1353), a range that does not exist.

Since getPos() returned correct values during write (383, 1267), the issue is likely one of:
1. Parquet 1.16.0 has a different footer format/calculation
2. There's a mismatch between write-time and read-time offset calculations
3. Column chunk sizes in the footer are off by 78 bytes

Next: Investigate whether downgrading Parquet or fixing the footer size calculations resolves the issue.

pull/7526/head
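The early return path described above can be sketched roughly as follows. This is a minimal illustration, not the actual SeaweedInputStream code: the class name `EofReadSketch`, its fields, and the use of `java.util.logging` are assumptions made to keep the example self-contained (the real class may well use a different logger and read from remote chunks rather than an in-memory array).

```java
import java.util.logging.Logger;

// Hypothetical stand-in for SeaweedInputStream; names and fields are assumptions.
public class EofReadSketch {
    private static final Logger LOG = Logger.getLogger(EofReadSketch.class.getName());

    private final byte[] content; // in-memory stand-in for the file contents
    private long position;

    public EofReadSketch(byte[] content) {
        this.content = content;
    }

    public void seek(long pos) {
        this.position = pos;
    }

    public long getPos() {
        return position;
    }

    public int read(byte[] buf, int off, int len) {
        long contentLength = content.length;
        // Early return path: nothing left to read, so log the request and signal EOF.
        if (position >= contentLength) {
            LOG.warning(String.format(
                "read() at EOF: position=%d contentLength=%d requested=%d",
                position, contentLength, len));
            return -1;
        }
        // Normal path: copy at most the bytes that remain before contentLength.
        int n = (int) Math.min(len, contentLength - position);
        System.arraycopy(content, (int) position, buf, off, n);
        position += n;
        return n;
    }

    public static void main(String[] args) {
        // Reproduces the numbers from the finding: a 1275-byte file,
        // and a 78-byte read requested at position 1275.
        EofReadSketch in = new EofReadSketch(new byte[1275]);
        in.seek(1275);
        System.out.println(in.read(new byte[78], 0, 78));
    }
}
```

Logging the position, content length, and requested size together is what makes the 78-byte overshoot visible: the requested range [1275-1353) starts exactly where the file ends.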
2 changed files with 11 additions and 7 deletions