Added a diagnostic step that downloads and examines the actual Parquet files
when tests fail. It is meant to answer:
1. Is the file complete? (Check PAR1 magic bytes at start/end)
2. What size is it? (Compare actual vs expected)
3. Can parquet-tools read it? (Reader compatibility test)
4. What does the footer contain? (Hex dump last 200 bytes)
Steps performed:
- List files in SeaweedFS
- Download first Parquet file
- Check magic bytes (PAR1 at offset 0 and EOF-4)
- Show file size from filesystem
- Hex dump header (first 100 bytes)
- Hex dump footer (last 200 bytes)
- Run parquet-tools inspect/show
- Upload file as artifact for local analysis
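
The magic-byte and hex-dump checks above can be sketched in Python as follows. This is a minimal illustration, not the workflow's actual script: the helper names `check_magic` and `hex_dump` are hypothetical. The only format fact it relies on is that a complete Parquet file starts and ends with the 4-byte marker PAR1.

```python
import os

MAGIC = b"PAR1"  # Parquet magic bytes, expected at offset 0 and at EOF-4


def check_magic(path):
    """Return (size, head_ok, tail_ok) for the file at `path` (hypothetical helper)."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(4)
        if size >= 4:
            # Jump to the last 4 bytes of the file
            f.seek(-4, os.SEEK_END)
            tail = f.read(4)
        else:
            tail = b""
    return size, head == MAGIC, tail == MAGIC


def hex_dump(data, width=16):
    """Format bytes as offset-prefixed hex lines, similar in spirit to xxd."""
    lines = []
    for off in range(0, len(data), width):
        chunk = data[off:off + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        lines.append(f"{off:08x}  {hex_part}")
    return "\n".join(lines)
```

A file whose `tail_ok` is False while `head_ok` is True would point toward a truncated write rather than a reader problem.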
This will reveal if the issue is:
A) File is incomplete (missing trailer) → SeaweedFS write problem
B) File is complete but unreadable → Parquet format problem
C) File is complete and readable by parquet-tools → SeaweedFS read problem
D) File size doesn't match metadata → Footer offset problem
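
Cases A and D can be checked mechanically, because the last 8 bytes of a Parquet file are a 4-byte little-endian footer length followed by PAR1. The sketch below is one hypothetical way to map a downloaded file onto those cases. `classify` and its return strings are illustrative names, not part of the workflow. It cannot tell B from C, which still needs a real reader such as parquet-tools.

```python
import struct

MAGIC = b"PAR1"


def classify(data: bytes) -> str:
    """Rough triage of a downloaded Parquet file's bytes (hypothetical helper)."""
    # Smallest possible layout: header magic (4) + length field (4) + trailer magic (4)
    if len(data) < 12 or data[:4] != MAGIC or data[-4:] != MAGIC:
        return "A: incomplete or truncated (magic bytes missing)"
    # Footer metadata length is stored little-endian just before the trailing magic
    (footer_len,) = struct.unpack("<I", data[-8:-4])
    if footer_len + 12 > len(data):
        return "D: footer length does not fit in the file size"
    return "complete: run a reader to tell B from C"
```

The size check is only a necessary condition: a footer length that fits in the file can still point at corrupt metadata, which is why the reader test remains part of the diagnosis.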
The downloaded file will be available as the 'failed-parquet-file' artifact.