fix: don't split chunk ID on comma - comma is PART of the ID!

CRITICAL BUG FIX: Chunk ID format is 'volumeId,fileKey' (e.g., '3,0307c52bab')

The problem:
- Log shows: CHUNKS: [3,0307c52bab]
- Script was splitting on comma: IFS=','
- Tried to download: '3' (404) and '0307c52bab' (404)
- Both failed!

The fix:
- Chunk ID is a SINGLE string with embedded comma
- Don't split it!
- Download directly: http://localhost:8080/3,0307c52bab

This should finally work!
pull/7526/head
chrislu 1 week ago
parent
commit
16bd118125
  1. 28
      .github/workflows/spark-integration-tests.yml

@@ -155,21 +155,19 @@ jobs:
   fi
   if [ -n "$CHUNK_IDS" ]; then
-    # Download each chunk (usually just one for small files)
-    IFS=',' read -ra CHUNKS <<< "$CHUNK_IDS"
-    for CHUNK_ID in "${CHUNKS[@]}"; do
-      echo "Downloading chunk from volume server: http://localhost:8080/$CHUNK_ID"
-      curl -o "test.parquet" "http://localhost:8080/$CHUNK_ID"
-      if [ -f test.parquet ] && [ -s test.parquet ]; then
-        FILE_SIZE=$(stat --format=%s test.parquet 2>/dev/null || stat -f%z test.parquet 2>/dev/null)
-        echo "SUCCESS: Downloaded $FILE_SIZE bytes from volume server!"
-        DOWNLOADED=true
-        break
-      else
-        echo "FAILED: Chunk $CHUNK_ID returned 404 or empty"
-      fi
-    done
+    # CHUNK_IDS might have multiple chunks, but usually just one
+    # Format: "3,abc123" or "3,abc123,4,def456" (comma WITHIN each ID!)
+    # We need to split by space or handle single chunk
+    echo "Downloading chunk from volume server: http://localhost:8080/$CHUNK_IDS"
+    curl -o "test.parquet" "http://localhost:8080/$CHUNK_IDS"
+    if [ -f test.parquet ] && [ -s test.parquet ]; then
+      FILE_SIZE=$(stat --format=%s test.parquet 2>/dev/null || stat -f%z test.parquet 2>/dev/null)
+      echo "SUCCESS: Downloaded $FILE_SIZE bytes from volume server!"
+      DOWNLOADED=true
+    else
+      echo "FAILED: Chunk $CHUNK_IDS returned 404 or empty"
+    fi
   else
     echo "ERROR: Could not extract chunk IDs"
   fi
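
The new code handles only the single-chunk case, although its comment mentions a possible multi-chunk value such as "3,abc123,4,def456". One way to cover that case is sketched below. It assumes every ID contains exactly one comma and that IDs are themselves joined by commas. The comment suggests this but the commit does not confirm it. `pair_chunk_ids` is a hypothetical helper, not something the workflow defines.

```sh
# Hypothetical helper: regroup comma-separated fields into "volumeId,fileKey" pairs.
# Assumes each chunk ID has exactly one embedded comma (unconfirmed for multi-chunk input).
pair_chunk_ids() {
  # Subshell keeps the IFS, noglob, and positional-parameter changes local.
  (
    IFS=','
    set -f
    set -- $1
    while [ "$#" -ge 2 ]; do
      printf '%s,%s\n' "$1" "$2"
      shift 2
    done
    # A leftover field means the input did not match the assumed format.
    [ "$#" -eq 0 ]
  )
}

pair_chunk_ids '3,abc123,4,def456'   # prints "3,abc123" then "4,def456"
pair_chunk_ids '3,0307c52bab'        # prints "3,0307c52bab"
```

The function prints one ID per line and returns non-zero on an odd number of fields, so a caller can read the IDs with `while IFS= read -r id` and stop early on malformed input. If a file really does span several chunks, the downloaded pieces would presumably need to be appended in order rather than each written over `test.parquet`.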
