chrislu
a3cf4eb843
debug: track stream lifecycle and total bytes written
Added comprehensive logging to identify why Parquet files fail with
'EOFException: Still have: 78 bytes left'.
Key additions:
1. SeaweedHadoopOutputStream constructor logging with 🔧 marker
- Shows when output streams are created
- Logs path, position, bufferSize, replication
2. totalBytesWritten counter in SeaweedOutputStream
- Tracks cumulative bytes written via write() calls
- Helps identify if Parquet wrote 762 bytes but only 684 reached chunks
3. Enhanced close() logging with 🔒 and ✅ markers
- Shows totalBytesWritten vs position vs buffer.position()
- If totalBytesWritten=762 but position=684, write submission failed
- If buffer.position()=78 at close, buffer wasn't flushed
Expected scenarios in next run:
A) Stream never created → No 🔧 log for .parquet files
B) Write failed → totalBytesWritten=762 but position=684
C) Buffer not flushed → buffer.position()=78 at close
D) All correct → totalBytesWritten=position=684, but Parquet expects 762
This will pinpoint whether the issue is in:
- Stream creation/lifecycle
- Write submission
- Buffer flushing
- Or Parquet's internal state
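The accounting described above can be sketched in isolation. This is a minimal illustration, not the actual SeaweedOutputStream code: the class `StreamAccounting`, its `diagnose` helper, and the scenario labels are hypothetical names chosen to mirror the three counters in this commit message (totalBytesWritten, position, buffer.position()). It assumes that `position` counts bytes already handed off to chunks and that a healthy stream satisfies totalBytesWritten = position + buffer.position(). Scenario A (stream never created) cannot be detected from counters and is left out.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for the counters discussed in the commit message.
class StreamAccounting {
    private final ByteBuffer buffer;
    private long position;          // bytes already handed off to chunks (assumed meaning)
    private long totalBytesWritten; // bytes received through write()

    StreamAccounting(int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        this.buffer = ByteBuffer.allocate(bufferSize);
    }

    // Copy data into the buffer, flushing whenever the buffer fills up.
    void write(byte[] data) {
        totalBytesWritten += data.length;
        int offset = 0;
        while (offset < data.length) {
            int n = Math.min(buffer.remaining(), data.length - offset);
            buffer.put(data, offset, n);
            offset += n;
            if (!buffer.hasRemaining()) {
                flush();
            }
        }
    }

    // Pretend to submit the buffered bytes as a chunk and advance position.
    void flush() {
        position += buffer.position();
        buffer.clear();
    }

    String diagnose() {
        return diagnose(totalBytesWritten, position, buffer.position());
    }

    // Map counter values at close time onto scenarios B, C, and D.
    static String diagnose(long total, long position, int buffered) {
        if (buffered > 0) {
            return "C: buffer not flushed (" + buffered + " bytes pending)";
        }
        if (total != position) {
            return "B: write submission lost " + (total - position) + " bytes";
        }
        return "D: stream accounting is consistent";
    }

    public static void main(String[] args) {
        StreamAccounting s = new StreamAccounting(684);
        s.write(new byte[762]);           // 684 bytes flushed, 78 still buffered
        System.out.println(s.diagnose()); // reports scenario C
        s.flush();
        System.out.println(s.diagnose()); // reports scenario D
    }
}
```

If the counters are consistent at close (scenario D) and the reader still expects more bytes, the remaining suspect is the writer's own view of the file length rather than the stream.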
1 week ago
chrislu
c86177e063
add comments
1 week ago
chrislu
a7f786ac92
NPE
1 week ago
chrislu
c96448f3a5
more flexible replication configuration
1 week ago
dependabot[bot]
c14e513964
chore(deps): bump org.apache.hadoop:hadoop-common from 3.2.4 to 3.4.0 in /other/java/hdfs3 (#7512)
* chore(deps): bump org.apache.hadoop:hadoop-common in /other/java/hdfs3
Bumps org.apache.hadoop:hadoop-common from 3.2.4 to 3.4.0.
---
updated-dependencies:
- dependency-name: org.apache.hadoop:hadoop-common
dependency-version: 3.4.0
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
* add java client unit tests
* Update dependency-reduced-pom.xml
* add java integration tests
* fix
* fix buffer
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: chrislu <chris.lu@gmail.com>
2 weeks ago
orthoxerox
d8cc269294
feature: added ssl support for HCFS (#6699) (#6775)
7 months ago
chrislu
d003bb0166
java 3.13
3 years ago
Chris Lu
e5fc35ed0c
change server address from string to a type
4 years ago
Chris Lu
ad36c7b0d7
refactoring: only expose FilerClient class
5 years ago
Chris Lu
9c1efdf11b
HCFS: 1.5.9
5 years ago
Chris Lu
8f3a51f2b8
Java: 1.5.8 additional fixes
5 years ago
Chris Lu
6f4aab51f9
refactoring SeaweedInputStream
5 years ago
Chris Lu
043c2d7960
refactoring SeaweedOutputStream
5 years ago
Chris Lu
4d2855476c
Hadoop: add BufferedByteBufferReadableInputStream
fix https://github.com/chrislusf/seaweedfs/issues/1645
5 years ago
Chris Lu
3857f9c840
Hadoop: switch to ByteBuffer
fix https://github.com/chrislusf/seaweedfs/issues/1645
5 years ago
Chris Lu
a9efaa6385
HDFS: implement ByteBufferReadable
fix https://github.com/chrislusf/seaweedfs/issues/1645
5 years ago
Chris Lu
f4abd01adf
filer: cache small file to filer store
5 years ago
limd
4737df597d
HCFS:
1. add replication parameter
2. fix close sequence
5 years ago
Chris Lu
c709059b69
HCFS: add close() to SeaweedFileSystem.java
5 years ago
Chris Lu
f375b93aef
renaming
5 years ago
Chris Lu
596d476e3d
HCFS: 1.5.2
5 years ago
Chris Lu
459de70a77
Hadoop: more accurate block size
5 years ago
Chris Lu
912ef2bc53
Hadoop: remove unused variable bufferSize
5 years ago
limd
ac162fc857
hdfs: Hadoop on SeaweedFS: create empty file
5 years ago
Chris Lu
4929d0634e
Hadoop on SeaweedFS: create empty file
fix https://github.com/chrislusf/seaweedfs/issues/1494
5 years ago
limd
95bfec4931
hadoop: filesystem cannot create file
issues: https://github.com/chrislusf/seaweedfs/issues/1494
5 years ago
Chris Lu
5eee4983f3
1.4.7 hdfs configurable fs.seaweed.buffer.size
5 years ago
Chris Lu
13bfe5deef
same logic for reading random access files from Go
5 years ago
Chris Lu
15dc0a704d
Revert "add read ahead input stream"
This reverts commit b3089dcc8e.
5 years ago
Chris Lu
b3089dcc8e
add read ahead input stream
5 years ago
Chris Lu
6b41c5250b
Hadoop file system: 1.4.3
added buffered fs input stream
5 years ago
Chris Lu
703057bff9
mirror changes from hdfs2
5 years ago
Chris Lu
6839f96c0c
simplify
5 years ago
Chris Lu
ae3e6d8244
remove changing buffer size
5 years ago
Chris Lu
1d724ab237
hdfs: support read write chunk manifest
5 years ago
Chris Lu
f90d2c93c9
1.3.9 remove logs
5 years ago
Chris Lu
3abd74b1d7
1.3.8
5 years ago
Chris Lu
2629da2cb9
simplify inputstream
5 years ago
李明达
74456b3d5e
1. Add SeaweedFS implementation of Hadoop AbstractFileSystem. The implementation delegates to the existing SeaweedFS FileSystem and is only necessary for use with Hadoop 2.x/3.x. Configuration example in Hadoop core-site.xml file:
<property>
<name>fs.AbstractFileSystem.seaweedfs.impl</name>
<value>seaweed.hdfs.SeaweedAbstractFileSystem</value>
</property>
2. Fix hiveserver2 startup NullPointerException
5 years ago
Chris Lu
7bca72deed
reuse bytebuffer
5 years ago
Chris Lu
86c8137546
1.3.4
5 years ago
Chris Lu
bc3be0bb37
Hadoop: 1.3.3
improve memory efficiency
5 years ago
Chris Lu
222f93e816
possibly fix concurrent access to entry object in SeaweedOutputStream
6 years ago
Chris Lu
8dfaaeabfd
HCFS: 1.2.8 fix hbase related bugs
1. SeaweedFileSystem.listStatus needs to work with files as well
2. SeaweedRead readChunkView has wrong len
6 years ago
Chris Lu
a999ed94d0
update hdfs client
6 years ago
Chris Lu
ae53f63680
filer: recursive deletion optionally ignoring any errors
fix https://github.com/chrislusf/seaweedfs/issues/1062
6 years ago
Chris Lu
fd509c3844
HCFS: working with HBase
6 years ago
Chris Lu
cb299dfaa2
HCFS: use latest grpc versions, separate hadoop2 and hadoop3
6 years ago
Chris Lu
170ee6ef0f
tmp
6 years ago