12254 Commits (785dbc6077f27ff9307db76a9aaedaf991ab8dc7)

Author SHA1 Message Date
chrislu 785dbc6077 clean up tests 1 week ago
chrislu 32118a82bc tests not needed now 1 week ago
chrislu 0fdf5f1a12 fixing hdfs3 1 week ago
chrislu 9b94f970c7 fix: implement flush-on-getPos() - still fails with 78-byte error 1 week ago
chrislu 580a6c1e00 debug: add rename logging - proves metadata IS preserved correctly 1 week ago
chrislu e878431dea docs: final investigation summary - issue is in rename operation 1 week ago
chrislu b44e51fae6 WIP: implement metadata visibility check in close() 1 week ago
chrislu 75f4195f25 docs: comprehensive analysis of I/O comparison findings 1 week ago
chrislu d045624994 test: comprehensive I/O comparison reveals timing/metadata issue 1 week ago
chrislu 6ae8b12917 test: prove I/O operations identical between local and SeaweedFS 1 week ago
chrislu d4d6836139 test: prove Spark CAN read Parquet files (both direct and Spark-written) 1 week ago
chrislu 1d78409440 test: prove Parquet works perfectly when written directly (not via Spark) 1 week ago
chrislu fba35124af experiment: prove chunk count irrelevant to 78-byte EOF error 1 week ago
chrislu f6b0c1e216 docs: comprehensive recommendation for Parquet EOF fix 1 week ago
chrislu 1cdb2fcf07 fix: implement flush-before-getPos() for Parquet compatibility 1 week ago
chrislu b019ec8f08 feat: comprehensive Parquet EOF debugging with multiple fix attempts 1 week ago
chrislu 2bf6e814f0 docs: complete debug session summary and findings 1 week ago
chrislu 9eb71466d8 feat: implement flush-on-getPos() to ensure accurate offsets 1 week ago
chrislu c1b0aa6611 feat: implement virtual position tracking in SeaweedOutputStream 1 week ago
chrislu 2d6b571120 docs: comprehensive analysis of Parquet EOF root cause and fix strategies 1 week ago
chrislu 3e754792a5 feat: add comprehensive debug logging to track Parquet write sequence 1 week ago
chrislu 7d601191a5 docs: complete local reproduction analysis with detailed findings 1 week ago
chrislu 852ca41928 docs: BREAKTHROUGH - found the bug in Spark local reproduction! 1 week ago
chrislu 50a8a3eb11 docs: comprehensive test results showing unit tests PASS but Spark fails 1 week ago
chrislu 80b463b7e4 test: add GetPosBufferTest to reproduce Parquet issue - ALL TESTS PASS! 1 week ago
chrislu 4faa6d55f6 docs: comprehensive issue summary - getPos() buffer flush timing issue 1 week ago
chrislu 8f33f5240d debug: confirmed root cause - Parquet tries to read 78 bytes past EOF 1 week ago
chrislu 16b8cf3e52 debug: add logging to EOF return path - FOUND ROOT CAUSE! 1 week ago
chrislu 216ae856ca docs: add comprehensive debugging analysis for EOF exception fix 1 week ago
chrislu 5c30bc8e7b debug: add detailed getPos() tracking with caller stack trace 1 week ago
chrislu e95f7061a4 fix: SeaweedInputStream returning 0 bytes for inline content reads 1 week ago
chrislu c10ae054b6 debug: add logging to SeaweedInputStream constructor to track contentLength 1 week ago
chrislu 9bb000e150 Update SeaweedOutputStream.java 1 week ago
chrislu d7d4d97098 debug: verify JARs contain latest code before running tests 1 week ago
chrislu 4936f733d1 debug: add WARN logging to SeaweedOutputStream base constructor 1 week ago
chrislu c834e30a72 debug: add logging to SeaweedFileSystemStore.createFile() 1 week ago
chrislu aed16ca9d7 fix: enable DEBUG logging for seaweed.hdfs package 1 week ago
chrislu 6fe5c372ee debug: change logs to WARN level to ensure visibility 1 week ago
chrislu c91175cb97 fix: make path variable final for anonymous inner class 1 week ago
chrislu d6f9234cea debug: add aggressive logging to FSDataOutputStream getPos() override 1 week ago
chrislu 58d4d61f89 docs: push instructions for Parquet EOF fix 1 week ago
chrislu 90aa83dbe4 docs: add detailed analysis of Parquet EOF fix 1 week ago
chrislu 9e7ed48688 fix: Override FSDataOutputStream.getPos() to use SeaweedOutputStream position 1 week ago
chrislu a8491ecd3f Update SeaweedOutputStream.java 1 week ago
chrislu 16bd118125 fix: don't split chunk ID on comma - comma is PART of the ID! 1 week ago
chrislu a1fa949221 feat: extract chunk IDs from write log and download from volume 1 week ago
chrislu c774b807e1 fix: search temporary directories for Parquet files 1 week ago
chrislu 7b9b04cd59 feat: add explicit logging when employees Parquet file is written 1 week ago
chrislu 09b0a2505c fix: poll for files to appear instead of fixed sleep 1 week ago
chrislu 64357e73bf feat: proactive download - grab files BEFORE Spark deletes them 1 week ago