
fix: search for filename in 'Encountered error' message

The issue: the grep pattern was wrong and searched the wrong part of the log
- EOF exception is in the 'Caused by' section
- Filename is in the outer exception message

The fix:
- Search for 'Encountered error while reading file' line
- Extract filename: part-00000-xxx-c000.snappy.parquet
- Fix the regex pattern (it expected '.c000' where the filename has '-c000')

Example from logs:
  'Encountered error while reading file seaweedfs://...part-00000-c5a41896-5221-4d43-a098-d0839f5745f6-c000.snappy.parquet'
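
The pattern change can be checked locally against the sample line above. This is a minimal sketch, assuming GNU grep with `-P` support; `LOG_LINE`, `OLD_MATCH`, and `NEW_MATCH` are hypothetical names used only for illustration, and the `...` in the path stays elided exactly as in the log excerpt.

```shell
#!/usr/bin/env bash
# Sample line copied from the log excerpt; the "..." is elided in the original.
LOG_LINE='Encountered error while reading file seaweedfs://...part-00000-c5a41896-5221-4d43-a098-d0839f5745f6-c000.snappy.parquet'

# Old pattern: expects ".c000", which does not occur in this filename.
OLD_MATCH=$(echo "$LOG_LINE" | grep -oP 'part-[a-f0-9-]+\.c000\.snappy\.parquet' | head -1)

# New pattern: select the outer error line first, then expect "-c000".
NEW_MATCH=$(echo "$LOG_LINE" | grep "Encountered error while reading file" | grep -oP 'part-[a-f0-9-]+-c000\.snappy\.parquet' | head -1)

echo "old pattern: '${OLD_MATCH}'"
echo "new pattern: '${NEW_MATCH}'"
```

On this sample the old pattern should come back empty, while the new one should yield the full part-file name.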

This will finally extract the right filename!
pull/7526/head
chrislu 1 week ago
parent
commit
8e0635b8ba
.github/workflows/spark-integration-tests.yml

@@ -138,11 +138,15 @@ jobs:
 # Get the full log and extract the EXACT file causing the error
 FULL_LOG=$(docker compose logs spark-tests 2>&1)
-# Extract the failing filename from the EOF error message
-# The error message format: "...seaweedfs://seaweedfs-filer:8888/test-spark/employees/part-xxx.parquet..."
-FAILING_FILE=$(echo "$FULL_LOG" | grep -B 5 "EOFException.*78 bytes" | grep "seaweedfs://" | grep -oP 'part-[a-f0-9-]+\.c000\.snappy\.parquet' | head -1)
+# Extract the failing filename from the error message
+# Look for "Encountered error while reading file seaweedfs://...part-xxx-c000.snappy.parquet"
+FAILING_FILE=$(echo "$FULL_LOG" | grep "Encountered error while reading file" | grep -oP 'part-[a-f0-9-]+-c000\.snappy\.parquet' | head -1)
 echo "Failing file: $FAILING_FILE"
+# Also show the full error line for debugging
+echo "Full error context:"
+echo "$FULL_LOG" | grep "Encountered error while reading file" | head -1
 if [ -z "$FAILING_FILE" ]; then
 echo "ERROR: Could not extract failing filename from error message"
 echo "Searching for error message pattern..."
