
remove hdfs2

pull/7526/head
chrislu, 1 week ago
commit 972c2b12c8
17 changed files:

1. .github/workflows/java_integration_tests.yml (5 lines changed)
2. .github/workflows/java_unit_tests.yml (6 lines changed)
3. .github/workflows/spark-integration-tests.yml (8 lines changed)
4. other/java/examples/pom.xml (2 lines changed)
5. other/java/hdfs2/README.md (199 lines changed)
6. other/java/hdfs2/dependency-reduced-pom.xml (578 lines changed)
7. other/java/hdfs2/pom.xml (195 lines changed)
8. other/java/hdfs2/src/main/java/seaweed/hdfs/BufferedByteBufferReadableInputStream.java (25 lines changed)
9. other/java/hdfs2/src/main/java/seaweed/hdfs/SeaweedAbstractFileSystem.java (35 lines changed)
10. other/java/hdfs2/src/main/java/seaweed/hdfs/SeaweedFileSystem.java (643 lines changed)
11. other/java/hdfs2/src/main/java/seaweed/hdfs/SeaweedFileSystemStore.java (291 lines changed)
12. other/java/hdfs2/src/main/java/seaweed/hdfs/SeaweedHadoopInputStream.java (150 lines changed)
13. other/java/hdfs2/src/main/java/seaweed/hdfs/SeaweedHadoopOutputStream.java (16 lines changed)
14. other/java/hdfs2/src/test/java/seaweed/hdfs/SeaweedFileSystemConfigTest.java (90 lines changed)
15. other/java/hdfs2/src/test/java/seaweed/hdfs/SeaweedFileSystemTest.java (379 lines changed)
16. other/java/hdfs3/src/main/java/seaweed/hdfs/SeaweedAtomicOutputStream.java (109 lines changed)
17. test/java/spark/COMMIT_SUMMARY.md (132 lines changed)

.github/workflows/java_integration_tests.yml (5 lines changed)

@@ -128,11 +128,6 @@ jobs:
run: |
mvn test -Dtest=*IntegrationTest
- name: Run HDFS2 Configuration Tests
working-directory: other/java/hdfs2
run: |
mvn test -Dtest=SeaweedFileSystemConfigTest -Dmaven.javadoc.skip=true -Dgpg.skip=true
- name: Run HDFS3 Configuration Tests
working-directory: other/java/hdfs3
run: |

.github/workflows/java_unit_tests.yml (6 lines changed)

@@ -42,11 +42,6 @@ jobs:
run: |
mvn test -Dtest=SeaweedReadTest,SeaweedCipherTest
- name: Run HDFS2 Configuration Tests
working-directory: other/java/hdfs2
run: |
mvn test -Dtest=SeaweedFileSystemConfigTest -Dmaven.javadoc.skip=true -Dgpg.skip=true
- name: Run HDFS3 Configuration Tests
working-directory: other/java/hdfs3
run: |
@@ -59,6 +54,5 @@ jobs:
name: test-reports-java-${{ matrix.java }}
path: |
other/java/client/target/surefire-reports/
other/java/hdfs2/target/surefire-reports/
other/java/hdfs3/target/surefire-reports/

.github/workflows/spark-integration-tests.yml (8 lines changed)

@@ -4,14 +4,12 @@ on:
push:
paths:
- 'test/java/spark/**'
- 'other/java/hdfs2/**'
- 'other/java/hdfs3/**'
- 'other/java/client/**'
- '.github/workflows/spark-integration-tests.yml'
pull_request:
paths:
- 'test/java/spark/**'
- 'other/java/hdfs2/**'
- 'other/java/hdfs3/**'
- 'other/java/client/**'
- '.github/workflows/spark-integration-tests.yml'
@@ -65,12 +63,6 @@ jobs:
echo "OK Java client built"
cd ../../..
echo "Building HDFS2 client..."
cd other/java/hdfs2
mvn clean install -U -DskipTests -Dgpg.skip=true -Dcentral.publishing.skip=true
echo "OK HDFS2 client built"
cd ../../..
echo "Building HDFS3 client..."
cd other/java/hdfs3
mvn clean install -U -DskipTests -Dgpg.skip=true -Dcentral.publishing.skip=true

other/java/examples/pom.xml (2 lines changed)

@@ -16,7 +16,7 @@
</dependency>
<dependency>
<groupId>com.seaweedfs</groupId>
<artifactId>seaweedfs-hadoop2-client</artifactId>
<artifactId>seaweedfs-hadoop3-client</artifactId>
<version>3.80</version>
<scope>compile</scope>
</dependency>

other/java/hdfs2/README.md (199 lines changed)

@@ -1,199 +0,0 @@
# SeaweedFS Hadoop2 Client
Hadoop FileSystem implementation for SeaweedFS, compatible with Hadoop 2.x/3.x.
## Building
```bash
mvn clean install
```
## Testing
This project includes two types of tests:
### 1. Configuration Tests (No SeaweedFS Required)
These tests verify configuration handling and initialization logic without requiring a running SeaweedFS instance:
```bash
mvn test -Dtest=SeaweedFileSystemConfigTest
```
### 2. Integration Tests (Requires SeaweedFS)
These tests verify actual FileSystem operations against a running SeaweedFS instance.
#### Prerequisites
1. Start SeaweedFS with default ports:
```bash
# Terminal 1: Start master
weed master
# Terminal 2: Start volume server
weed volume -mserver=localhost:9333
# Terminal 3: Start filer
weed filer -master=localhost:9333
```
2. Verify services are running:
- Master: http://localhost:9333
- Filer HTTP: http://localhost:8888
- Filer gRPC: localhost:18888
#### Running Integration Tests
```bash
# Enable integration tests
export SEAWEEDFS_TEST_ENABLED=true
# Run all tests
mvn test
# Run specific test
mvn test -Dtest=SeaweedFileSystemTest
```
### Test Configuration
Integration tests can be configured via environment variables or system properties:
- `SEAWEEDFS_TEST_ENABLED`: Set to `true` to enable integration tests (default: false)
- Tests use these default connection settings:
- Filer Host: localhost
- Filer HTTP Port: 8888
- Filer gRPC Port: 18888
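In practice, an integration test can honor this switch with a JUnit assumption, so it is skipped rather than failed when no SeaweedFS is running. The sketch below only illustrates the pattern; the class name and test body are hypothetical, not the project's actual test code:
```java
import static org.junit.Assume.assumeTrue;

import org.apache.hadoop.conf.Configuration;
import org.junit.Before;
import org.junit.Test;

// Hypothetical example of gating integration tests on SEAWEEDFS_TEST_ENABLED.
public class ExampleSeaweedIntegrationTest {

    @Before
    public void requireSeaweedFs() {
        // Skips every test in this class unless SEAWEEDFS_TEST_ENABLED=true is exported.
        assumeTrue("Skipping test - SEAWEEDFS_TEST_ENABLED not set",
                "true".equalsIgnoreCase(System.getenv("SEAWEEDFS_TEST_ENABLED")));
    }

    @Test
    public void connectsWithDefaultSettings() throws Exception {
        // Default connection settings documented above.
        Configuration conf = new Configuration();
        conf.set("fs.seaweed.filer.host", "localhost");
        conf.setInt("fs.seaweed.filer.port", 8888);
        conf.setInt("fs.seaweed.filer.port.grpc", 18888);
        // ... exercise FileSystem operations against the running filer here ...
    }
}
```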
### Running Tests with Custom Configuration
To test against a different SeaweedFS instance, modify the test code or use Hadoop configuration:
```java
conf.set("fs.seaweed.filer.host", "your-host");
conf.setInt("fs.seaweed.filer.port", 8888);
conf.setInt("fs.seaweed.filer.port.grpc", 18888);
```
## Test Coverage
The test suite covers:
- **Configuration & Initialization**
- URI parsing and configuration
- Default values
- Configuration overrides
- Working directory management
- **File Operations**
- Create files
- Read files
- Write files
- Append to files
- Delete files
- **Directory Operations**
- Create directories
- List directory contents
- Delete directories (recursive and non-recursive)
- **Metadata Operations**
- Get file status
- Set permissions
- Set owner/group
- Rename files and directories
## Usage in Hadoop
1. Copy the built JAR to your Hadoop classpath:
```bash
cp target/seaweedfs-hadoop2-client-*.jar $HADOOP_HOME/share/hadoop/common/lib/
```
2. Configure `core-site.xml`:
```xml
<configuration>
<property>
<name>fs.seaweedfs.impl</name>
<value>seaweed.hdfs.SeaweedFileSystem</value>
</property>
<property>
<name>fs.seaweed.filer.host</name>
<value>localhost</value>
</property>
<property>
<name>fs.seaweed.filer.port</name>
<value>8888</value>
</property>
<property>
<name>fs.seaweed.filer.port.grpc</name>
<value>18888</value>
</property>
<!-- Optional: Replication configuration with three priority levels:
1) If set to non-empty value (e.g. "001") - uses that value
2) If set to empty string "" - uses SeaweedFS filer's default replication
3) If not configured (property not present) - uses HDFS replication parameter
-->
<!-- <property>
<name>fs.seaweed.replication</name>
<value>001</value>
</property> -->
</configuration>
```
3. Use SeaweedFS with Hadoop commands:
```bash
hadoop fs -ls seaweedfs://localhost:8888/
hadoop fs -mkdir seaweedfs://localhost:8888/test
hadoop fs -put local.txt seaweedfs://localhost:8888/test/
```
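The same settings also work programmatically through the standard Hadoop `FileSystem` API. This is an illustrative sketch (class name and file path are made up), assuming a filer on the default ports:
```java
import java.net.URI;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SeaweedFsUsageExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Same settings as the core-site.xml above, set in code instead.
        conf.set("fs.seaweedfs.impl", "seaweed.hdfs.SeaweedFileSystem");
        conf.set("fs.seaweed.filer.host", "localhost");
        conf.setInt("fs.seaweed.filer.port", 8888);
        conf.setInt("fs.seaweed.filer.port.grpc", 18888);

        // Host and port in the URI take precedence over the conf values.
        try (FileSystem fs = FileSystem.get(URI.create("seaweedfs://localhost:8888/"), conf)) {
            Path file = new Path("/test/hello.txt");
            try (FSDataOutputStream out = fs.create(file, true)) {
                out.write("hello from seaweedfs".getBytes(StandardCharsets.UTF_8));
            }
            System.out.println(fs.getFileStatus(file));
        }
    }
}
```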
## Continuous Integration
For CI environments, tests can be run in two modes:
1. **Configuration Tests Only** (default, no SeaweedFS required):
```bash
mvn test -Dtest=SeaweedFileSystemConfigTest
```
2. **Full Integration Tests** (requires SeaweedFS):
```bash
# Start SeaweedFS in CI environment
# Then run:
export SEAWEEDFS_TEST_ENABLED=true
mvn test
```
## Troubleshooting
### Tests are skipped
If you see "Skipping test - SEAWEEDFS_TEST_ENABLED not set":
```bash
export SEAWEEDFS_TEST_ENABLED=true
```
### Connection refused errors
Ensure SeaweedFS is running and accessible:
```bash
curl http://localhost:8888/
```
### gRPC errors
Verify the gRPC port is accessible:
```bash
# Should show the port is listening
netstat -an | grep 18888
```
## Contributing
When adding new features, please include:
1. Configuration tests (no SeaweedFS required)
2. Integration tests (with SEAWEEDFS_TEST_ENABLED guard)
3. Documentation updates

other/java/hdfs2/dependency-reduced-pom.xml (578 lines changed)

@@ -1,578 +0,0 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.seaweedfs</groupId>
<artifactId>seaweedfs-hadoop2-client</artifactId>
<name>SeaweedFS HDFS2 Client</name>
<version>${seaweedfs.client.version}</version>
<description>A java client for SeaweedFS.</description>
<url>https://github.com/seaweedfs/seaweedfs</url>
<developers>
<developer>
<name>Chris Lu</name>
<email>chris.lu@gmail.com</email>
<organization>SeaweedFS</organization>
<organizationUrl>https://seaweedfs.com</organizationUrl>
</developer>
</developers>
<licenses>
<license>
<name>The Apache License, Version 2.0</name>
<url>http://www.apache.org/licenses/LICENSE-2.0.txt</url>
</license>
</licenses>
<scm>
<connection>scm:git:git://github.com/seaweedfs/seaweedfs.git</connection>
<developerConnection>scm:git:ssh://github.com:seaweedfs/seaweedfs.git</developerConnection>
<url>https://github.com/seaweedfs/seaweedfs/tree/master</url>
</scm>
<build>
<plugins>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<source>8</source>
<target>8</target>
<release>8</release>
</configuration>
</plugin>
<plugin>
<artifactId>maven-shade-plugin</artifactId>
<version>3.2.1</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
<exclude>org/slf4j/**</exclude>
<exclude>META-INF/maven/org.slf4j/**</exclude>
</excludes>
</filter>
</filters>
<transformers>
<transformer />
</transformers>
<relocations>
<relocation>
<pattern>com.google</pattern>
<shadedPattern>shaded.com.google</shadedPattern>
</relocation>
<relocation>
<pattern>io.grpc.internal</pattern>
<shadedPattern>shaded.io.grpc.internal</shadedPattern>
</relocation>
<relocation>
<pattern>org.apache.commons</pattern>
<shadedPattern>shaded.org.apache.commons</shadedPattern>
<excludes>
<exclude>org.apache.hadoop</exclude>
<exclude>org.apache.log4j</exclude>
</excludes>
</relocation>
<relocation>
<pattern>org.apache.http</pattern>
<shadedPattern>shaded.org.apache.http</shadedPattern>
</relocation>
</relocations>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<artifactId>maven-gpg-plugin</artifactId>
<version>1.5</version>
<executions>
<execution>
<id>sign-artifacts</id>
<phase>verify</phase>
<goals>
<goal>sign</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.sonatype.central</groupId>
<artifactId>central-publishing-maven-plugin</artifactId>
<version>0.5.0</version>
<extensions>true</extensions>
<configuration>
<publishingServerId>central</publishingServerId>
<autoPublish>true</autoPublish>
</configuration>
</plugin>
<plugin>
<artifactId>maven-source-plugin</artifactId>
<version>2.2.1</version>
<executions>
<execution>
<id>attach-sources</id>
<goals>
<goal>jar-no-fork</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<artifactId>maven-javadoc-plugin</artifactId>
<version>3.0.1</version>
<executions>
<execution>
<id>attach-javadocs</id>
<goals>
<goal>jar</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
<dependencies>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>3.4.0</version>
<scope>provided</scope>
<exclusions>
<exclusion>
<artifactId>hadoop-hdfs-client</artifactId>
<groupId>org.apache.hadoop</groupId>
</exclusion>
<exclusion>
<artifactId>hadoop-yarn-api</artifactId>
<groupId>org.apache.hadoop</groupId>
</exclusion>
<exclusion>
<artifactId>hadoop-yarn-client</artifactId>
<groupId>org.apache.hadoop</groupId>
</exclusion>
<exclusion>
<artifactId>hadoop-mapreduce-client-core</artifactId>
<groupId>org.apache.hadoop</groupId>
</exclusion>
<exclusion>
<artifactId>hadoop-mapreduce-client-jobclient</artifactId>
<groupId>org.apache.hadoop</groupId>
</exclusion>
<exclusion>
<artifactId>hadoop-annotations</artifactId>
<groupId>org.apache.hadoop</groupId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>3.4.0</version>
<scope>provided</scope>
<exclusions>
<exclusion>
<artifactId>hadoop-shaded-protobuf_3_21</artifactId>
<groupId>org.apache.hadoop.thirdparty</groupId>
</exclusion>
<exclusion>
<artifactId>hadoop-shaded-guava</artifactId>
<groupId>org.apache.hadoop.thirdparty</groupId>
</exclusion>
<exclusion>
<artifactId>commons-cli</artifactId>
<groupId>commons-cli</groupId>
</exclusion>
<exclusion>
<artifactId>commons-math3</artifactId>
<groupId>org.apache.commons</groupId>
</exclusion>
<exclusion>
<artifactId>commons-io</artifactId>
<groupId>commons-io</groupId>
</exclusion>
<exclusion>
<artifactId>commons-net</artifactId>
<groupId>commons-net</groupId>
</exclusion>
<exclusion>
<artifactId>commons-collections</artifactId>
<groupId>commons-collections</groupId>
</exclusion>
<exclusion>
<artifactId>javax.servlet-api</artifactId>
<groupId>javax.servlet</groupId>
</exclusion>
<exclusion>
<artifactId>jakarta.activation-api</artifactId>
<groupId>jakarta.activation</groupId>
</exclusion>
<exclusion>
<artifactId>jetty-server</artifactId>
<groupId>org.eclipse.jetty</groupId>
</exclusion>
<exclusion>
<artifactId>jetty-util</artifactId>
<groupId>org.eclipse.jetty</groupId>
</exclusion>
<exclusion>
<artifactId>jetty-servlet</artifactId>
<groupId>org.eclipse.jetty</groupId>
</exclusion>
<exclusion>
<artifactId>jetty-webapp</artifactId>
<groupId>org.eclipse.jetty</groupId>
</exclusion>
<exclusion>
<artifactId>jsp-api</artifactId>
<groupId>javax.servlet.jsp</groupId>
</exclusion>
<exclusion>
<artifactId>jersey-core</artifactId>
<groupId>com.sun.jersey</groupId>
</exclusion>
<exclusion>
<artifactId>jersey-servlet</artifactId>
<groupId>com.sun.jersey</groupId>
</exclusion>
<exclusion>
<artifactId>jersey-json</artifactId>
<groupId>com.github.pjfanning</groupId>
</exclusion>
<exclusion>
<artifactId>jettison</artifactId>
<groupId>org.codehaus.jettison</groupId>
</exclusion>
<exclusion>
<artifactId>jersey-server</artifactId>
<groupId>com.sun.jersey</groupId>
</exclusion>
<exclusion>
<artifactId>reload4j</artifactId>
<groupId>ch.qos.reload4j</groupId>
</exclusion>
<exclusion>
<artifactId>commons-beanutils</artifactId>
<groupId>commons-beanutils</groupId>
</exclusion>
<exclusion>
<artifactId>commons-configuration2</artifactId>
<groupId>org.apache.commons</groupId>
</exclusion>
<exclusion>
<artifactId>commons-lang3</artifactId>
<groupId>org.apache.commons</groupId>
</exclusion>
<exclusion>
<artifactId>commons-text</artifactId>
<groupId>org.apache.commons</groupId>
</exclusion>
<exclusion>
<artifactId>slf4j-reload4j</artifactId>
<groupId>org.slf4j</groupId>
</exclusion>
<exclusion>
<artifactId>avro</artifactId>
<groupId>org.apache.avro</groupId>
</exclusion>
<exclusion>
<artifactId>re2j</artifactId>
<groupId>com.google.re2j</groupId>
</exclusion>
<exclusion>
<artifactId>hadoop-auth</artifactId>
<groupId>org.apache.hadoop</groupId>
</exclusion>
<exclusion>
<artifactId>jsch</artifactId>
<groupId>com.jcraft</groupId>
</exclusion>
<exclusion>
<artifactId>curator-client</artifactId>
<groupId>org.apache.curator</groupId>
</exclusion>
<exclusion>
<artifactId>curator-recipes</artifactId>
<groupId>org.apache.curator</groupId>
</exclusion>
<exclusion>
<artifactId>zookeeper</artifactId>
<groupId>org.apache.zookeeper</groupId>
</exclusion>
<exclusion>
<artifactId>netty-handler</artifactId>
<groupId>io.netty</groupId>
</exclusion>
<exclusion>
<artifactId>netty-transport-native-epoll</artifactId>
<groupId>io.netty</groupId>
</exclusion>
<exclusion>
<artifactId>metrics-core</artifactId>
<groupId>io.dropwizard.metrics</groupId>
</exclusion>
<exclusion>
<artifactId>commons-compress</artifactId>
<groupId>org.apache.commons</groupId>
</exclusion>
<exclusion>
<artifactId>bcprov-jdk15on</artifactId>
<groupId>org.bouncycastle</groupId>
</exclusion>
<exclusion>
<artifactId>kerb-core</artifactId>
<groupId>org.apache.kerby</groupId>
</exclusion>
<exclusion>
<artifactId>jackson-databind</artifactId>
<groupId>com.fasterxml.jackson.core</groupId>
</exclusion>
<exclusion>
<artifactId>stax2-api</artifactId>
<groupId>org.codehaus.woodstox</groupId>
</exclusion>
<exclusion>
<artifactId>woodstox-core</artifactId>
<groupId>com.fasterxml.woodstox</groupId>
</exclusion>
<exclusion>
<artifactId>dnsjava</artifactId>
<groupId>dnsjava</groupId>
</exclusion>
<exclusion>
<artifactId>snappy-java</artifactId>
<groupId>org.xerial.snappy</groupId>
</exclusion>
<exclusion>
<artifactId>hadoop-annotations</artifactId>
<groupId>org.apache.hadoop</groupId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.13.1</version>
<scope>test</scope>
<exclusions>
<exclusion>
<artifactId>hamcrest-core</artifactId>
<groupId>org.hamcrest</groupId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.mockito</groupId>
<artifactId>mockito-core</artifactId>
<version>3.12.4</version>
<scope>test</scope>
<exclusions>
<exclusion>
<artifactId>byte-buddy</artifactId>
<groupId>net.bytebuddy</groupId>
</exclusion>
<exclusion>
<artifactId>byte-buddy-agent</artifactId>
<groupId>net.bytebuddy</groupId>
</exclusion>
<exclusion>
<artifactId>objenesis</artifactId>
<groupId>org.objenesis</groupId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>3.4.0</version>
<type>test-jar</type>
<scope>test</scope>
<exclusions>
<exclusion>
<artifactId>hadoop-shaded-protobuf_3_21</artifactId>
<groupId>org.apache.hadoop.thirdparty</groupId>
</exclusion>
<exclusion>
<artifactId>hadoop-shaded-guava</artifactId>
<groupId>org.apache.hadoop.thirdparty</groupId>
</exclusion>
<exclusion>
<artifactId>commons-cli</artifactId>
<groupId>commons-cli</groupId>
</exclusion>
<exclusion>
<artifactId>commons-math3</artifactId>
<groupId>org.apache.commons</groupId>
</exclusion>
<exclusion>
<artifactId>commons-io</artifactId>
<groupId>commons-io</groupId>
</exclusion>
<exclusion>
<artifactId>commons-net</artifactId>
<groupId>commons-net</groupId>
</exclusion>
<exclusion>
<artifactId>commons-collections</artifactId>
<groupId>commons-collections</groupId>
</exclusion>
<exclusion>
<artifactId>javax.servlet-api</artifactId>
<groupId>javax.servlet</groupId>
</exclusion>
<exclusion>
<artifactId>jakarta.activation-api</artifactId>
<groupId>jakarta.activation</groupId>
</exclusion>
<exclusion>
<artifactId>jetty-server</artifactId>
<groupId>org.eclipse.jetty</groupId>
</exclusion>
<exclusion>
<artifactId>jetty-util</artifactId>
<groupId>org.eclipse.jetty</groupId>
</exclusion>
<exclusion>
<artifactId>jetty-servlet</artifactId>
<groupId>org.eclipse.jetty</groupId>
</exclusion>
<exclusion>
<artifactId>jetty-webapp</artifactId>
<groupId>org.eclipse.jetty</groupId>
</exclusion>
<exclusion>
<artifactId>jsp-api</artifactId>
<groupId>javax.servlet.jsp</groupId>
</exclusion>
<exclusion>
<artifactId>jersey-core</artifactId>
<groupId>com.sun.jersey</groupId>
</exclusion>
<exclusion>
<artifactId>jersey-servlet</artifactId>
<groupId>com.sun.jersey</groupId>
</exclusion>
<exclusion>
<artifactId>jersey-json</artifactId>
<groupId>com.github.pjfanning</groupId>
</exclusion>
<exclusion>
<artifactId>jettison</artifactId>
<groupId>org.codehaus.jettison</groupId>
</exclusion>
<exclusion>
<artifactId>jersey-server</artifactId>
<groupId>com.sun.jersey</groupId>
</exclusion>
<exclusion>
<artifactId>reload4j</artifactId>
<groupId>ch.qos.reload4j</groupId>
</exclusion>
<exclusion>
<artifactId>commons-beanutils</artifactId>
<groupId>commons-beanutils</groupId>
</exclusion>
<exclusion>
<artifactId>commons-configuration2</artifactId>
<groupId>org.apache.commons</groupId>
</exclusion>
<exclusion>
<artifactId>commons-lang3</artifactId>
<groupId>org.apache.commons</groupId>
</exclusion>
<exclusion>
<artifactId>commons-text</artifactId>
<groupId>org.apache.commons</groupId>
</exclusion>
<exclusion>
<artifactId>slf4j-reload4j</artifactId>
<groupId>org.slf4j</groupId>
</exclusion>
<exclusion>
<artifactId>avro</artifactId>
<groupId>org.apache.avro</groupId>
</exclusion>
<exclusion>
<artifactId>re2j</artifactId>
<groupId>com.google.re2j</groupId>
</exclusion>
<exclusion>
<artifactId>hadoop-auth</artifactId>
<groupId>org.apache.hadoop</groupId>
</exclusion>
<exclusion>
<artifactId>jsch</artifactId>
<groupId>com.jcraft</groupId>
</exclusion>
<exclusion>
<artifactId>curator-client</artifactId>
<groupId>org.apache.curator</groupId>
</exclusion>
<exclusion>
<artifactId>curator-recipes</artifactId>
<groupId>org.apache.curator</groupId>
</exclusion>
<exclusion>
<artifactId>zookeeper</artifactId>
<groupId>org.apache.zookeeper</groupId>
</exclusion>
<exclusion>
<artifactId>netty-handler</artifactId>
<groupId>io.netty</groupId>
</exclusion>
<exclusion>
<artifactId>netty-transport-native-epoll</artifactId>
<groupId>io.netty</groupId>
</exclusion>
<exclusion>
<artifactId>metrics-core</artifactId>
<groupId>io.dropwizard.metrics</groupId>
</exclusion>
<exclusion>
<artifactId>commons-compress</artifactId>
<groupId>org.apache.commons</groupId>
</exclusion>
<exclusion>
<artifactId>bcprov-jdk15on</artifactId>
<groupId>org.bouncycastle</groupId>
</exclusion>
<exclusion>
<artifactId>kerb-core</artifactId>
<groupId>org.apache.kerby</groupId>
</exclusion>
<exclusion>
<artifactId>jackson-databind</artifactId>
<groupId>com.fasterxml.jackson.core</groupId>
</exclusion>
<exclusion>
<artifactId>stax2-api</artifactId>
<groupId>org.codehaus.woodstox</groupId>
</exclusion>
<exclusion>
<artifactId>woodstox-core</artifactId>
<groupId>com.fasterxml.woodstox</groupId>
</exclusion>
<exclusion>
<artifactId>dnsjava</artifactId>
<groupId>dnsjava</groupId>
</exclusion>
<exclusion>
<artifactId>snappy-java</artifactId>
<groupId>org.xerial.snappy</groupId>
</exclusion>
<exclusion>
<artifactId>hadoop-annotations</artifactId>
<groupId>org.apache.hadoop</groupId>
</exclusion>
</exclusions>
</dependency>
</dependencies>
<properties>
<seaweedfs.client.version>3.80</seaweedfs.client.version>
<hadoop.version>3.4.0</hadoop.version>
</properties>
</project>

other/java/hdfs2/pom.xml (195 lines changed)

@@ -1,195 +0,0 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<properties>
<seaweedfs.client.version>3.80.1-SNAPSHOT</seaweedfs.client.version>
<hadoop.version>3.4.0</hadoop.version>
</properties>
<groupId>com.seaweedfs</groupId>
<artifactId>seaweedfs-hadoop2-client</artifactId>
<version>${seaweedfs.client.version}</version>
<name>SeaweedFS HDFS2 Client</name>
<description>A java client for SeaweedFS.</description>
<url>https://github.com/seaweedfs/seaweedfs</url>
<licenses>
<license>
<name>The Apache License, Version 2.0</name>
<url>http://www.apache.org/licenses/LICENSE-2.0.txt</url>
</license>
</licenses>
<developers>
<developer>
<name>Chris Lu</name>
<email>chris.lu@gmail.com</email>
<organization>SeaweedFS</organization>
<organizationUrl>https://seaweedfs.com</organizationUrl>
</developer>
</developers>
<scm>
<connection>scm:git:git://github.com/seaweedfs/seaweedfs.git</connection>
<developerConnection>scm:git:ssh://github.com:seaweedfs/seaweedfs.git</developerConnection>
<url>https://github.com/seaweedfs/seaweedfs/tree/master</url>
</scm>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<source>8</source>
<target>8</target>
<release>8</release>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.2.1</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
<exclude>org/slf4j/**</exclude>
<exclude>META-INF/maven/org.slf4j/**</exclude>
</excludes>
</filter>
</filters>
<transformers>
<transformer
implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
</transformers>
<relocations>
<relocation>
<pattern>com.google</pattern>
<shadedPattern>shaded.com.google</shadedPattern>
</relocation>
<relocation>
<pattern>io.grpc.internal</pattern>
<shadedPattern>shaded.io.grpc.internal</shadedPattern>
</relocation>
<relocation>
<pattern>org.apache.commons</pattern>
<shadedPattern>shaded.org.apache.commons</shadedPattern>
<excludes>
<exclude>org.apache.hadoop</exclude>
<exclude>org.apache.log4j</exclude>
</excludes>
</relocation>
<relocation>
<pattern>org.apache.http</pattern>
<shadedPattern>shaded.org.apache.http</shadedPattern>
</relocation>
</relocations>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-gpg-plugin</artifactId>
<version>1.5</version>
<executions>
<execution>
<id>sign-artifacts</id>
<phase>verify</phase>
<goals>
<goal>sign</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.sonatype.central</groupId>
<artifactId>central-publishing-maven-plugin</artifactId>
<version>0.5.0</version>
<extensions>true</extensions>
<configuration>
<publishingServerId>central</publishingServerId>
<autoPublish>true</autoPublish>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-source-plugin</artifactId>
<version>2.2.1</version>
<executions>
<execution>
<id>attach-sources</id>
<goals>
<goal>jar-no-fork</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-javadoc-plugin</artifactId>
<version>3.0.1</version>
<executions>
<execution>
<id>attach-javadocs</id>
<goals>
<goal>jar</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
<dependencies>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>${hadoop.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.seaweedfs</groupId>
<artifactId>seaweedfs-client</artifactId>
<version>${seaweedfs.client.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>${hadoop.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.13.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.mockito</groupId>
<artifactId>mockito-core</artifactId>
<version>3.12.4</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>${hadoop.version}</version>
<scope>test</scope>
<type>test-jar</type>
</dependency>
</dependencies>
</project>

other/java/hdfs2/src/main/java/seaweed/hdfs/BufferedByteBufferReadableInputStream.java (25 lines changed)

@@ -1,25 +0,0 @@
package seaweed.hdfs;
import org.apache.hadoop.fs.*;
import java.io.IOException;
import java.nio.ByteBuffer;
public class BufferedByteBufferReadableInputStream extends BufferedFSInputStream implements ByteBufferReadable {
public BufferedByteBufferReadableInputStream(FSInputStream in, int size) {
super(in, size);
if (!(in instanceof Seekable) || !(in instanceof PositionedReadable)) {
throw new IllegalArgumentException("In is not an instance of Seekable or PositionedReadable");
}
}
@Override
public int read(ByteBuffer buf) throws IOException {
if (this.in instanceof ByteBufferReadable) {
return ((ByteBufferReadable)this.in).read(buf);
} else {
throw new UnsupportedOperationException("Byte-buffer read unsupported by input stream");
}
}
}

other/java/hdfs2/src/main/java/seaweed/hdfs/SeaweedAbstractFileSystem.java (35 lines changed)

@@ -1,35 +0,0 @@
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package seaweed.hdfs;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.DelegateToFileSystem;
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
public class SeaweedAbstractFileSystem extends DelegateToFileSystem {
SeaweedAbstractFileSystem(final URI uri, final Configuration conf)
throws IOException, URISyntaxException {
super(uri, new SeaweedFileSystem(), conf, "seaweedfs", false);
}
}

other/java/hdfs2/src/main/java/seaweed/hdfs/SeaweedFileSystem.java (643 lines changed)

@@ -1,643 +0,0 @@
package seaweed.hdfs;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.fs.permission.AclEntry;
import org.apache.hadoop.fs.permission.AclStatus;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.util.Progressable;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import seaweedfs.client.FilerProto;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.OutputStream;
import java.net.URI;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
public class SeaweedFileSystem extends FileSystem {
public static final String FS_SEAWEED_FILER_HOST = "fs.seaweed.filer.host";
public static final String FS_SEAWEED_FILER_PORT = "fs.seaweed.filer.port";
public static final String FS_SEAWEED_FILER_PORT_GRPC = "fs.seaweed.filer.port.grpc";
public static final int FS_SEAWEED_DEFAULT_PORT = 8888;
public static final String FS_SEAWEED_BUFFER_SIZE = "fs.seaweed.buffer.size";
public static final String FS_SEAWEED_REPLICATION = "fs.seaweed.replication";
public static final String FS_SEAWEED_VOLUME_SERVER_ACCESS = "fs.seaweed.volume.server.access";
public static final int FS_SEAWEED_DEFAULT_BUFFER_SIZE = 4 * 1024 * 1024;
public static final String FS_SEAWEED_FILER_CN = "fs.seaweed.filer.cn";
private static final Logger LOG = LoggerFactory.getLogger(SeaweedFileSystem.class);
private URI uri;
private Path workingDirectory = new Path("/");
private SeaweedFileSystemStore seaweedFileSystemStore;
public URI getUri() {
return uri;
}
public String getScheme() {
return "seaweedfs";
}
@Override
public void initialize(URI uri, Configuration conf) throws IOException { // get
super.initialize(uri, conf);
// get host information from uri (overrides info in conf)
String host = uri.getHost();
host = (host == null) ? conf.get(FS_SEAWEED_FILER_HOST, "localhost") : host;
conf.set(FS_SEAWEED_FILER_HOST, host);
// get port information from uri, (overrides info in conf)
int port = uri.getPort();
port = (port == -1) ? FS_SEAWEED_DEFAULT_PORT : port;
conf.setInt(FS_SEAWEED_FILER_PORT, port);
int grpcPort = conf.getInt(FS_SEAWEED_FILER_PORT_GRPC, port + 10000);
setConf(conf);
this.uri = uri;
String cn = conf.get(FS_SEAWEED_FILER_CN, "");
seaweedFileSystemStore = new SeaweedFileSystemStore(host, port, grpcPort, cn, conf);
}
@Override
public void close() throws IOException {
super.close();
this.seaweedFileSystemStore.close();
}
@Override
public FSDataInputStream open(Path path, int bufferSize) throws IOException {
LOG.debug("open path: {} bufferSize:{}", path, bufferSize);
path = qualify(path);
try {
int seaweedBufferSize = this.getConf().getInt(FS_SEAWEED_BUFFER_SIZE, FS_SEAWEED_DEFAULT_BUFFER_SIZE);
FSInputStream inputStream = seaweedFileSystemStore.openFileForRead(path, statistics);
return new FSDataInputStream(new BufferedByteBufferReadableInputStream(inputStream, 4 * seaweedBufferSize));
} catch (Exception ex) {
LOG.error("Failed to open file: {} bufferSize:{}", path, bufferSize, ex);
throw new IOException("Failed to open file: " + path, ex);
}
}
@Override
public FSDataOutputStream create(Path path, FsPermission permission, final boolean overwrite, final int bufferSize,
final short replication, final long blockSize, final Progressable progress) throws IOException {
LOG.debug("create path: {} bufferSize:{} blockSize:{}", path, bufferSize, blockSize);
path = qualify(path);
try {
// Priority: 1) non-empty FS_SEAWEED_REPLICATION, 2) empty string -> filer
// default, 3) null -> HDFS replication
String replicaPlacement = this.getConf().get(FS_SEAWEED_REPLICATION);
if (replicaPlacement == null) {
// Not configured, use HDFS replication parameter. This creates a "00N"
// replication string,
// placing N (replication-1) extra replicas on different servers in the same
// rack.
replicaPlacement = String.format("%03d", replication - 1);
}
int seaweedBufferSize = this.getConf().getInt(FS_SEAWEED_BUFFER_SIZE, FS_SEAWEED_DEFAULT_BUFFER_SIZE);
OutputStream outputStream = seaweedFileSystemStore.createFile(path, overwrite, permission,
seaweedBufferSize, replicaPlacement);
return new FSDataOutputStream(outputStream, statistics);
} catch (Exception ex) {
LOG.error("Failed to create file: {} bufferSize:{} blockSize:{}", path, bufferSize, blockSize, ex);
throw new IOException("Failed to create file: " + path, ex);
}
}
/**
* {@inheritDoc}
*
* @throws FileNotFoundException if the parent directory is not present -or
* is not a directory.
*/
@Override
public FSDataOutputStream createNonRecursive(Path path,
FsPermission permission,
EnumSet<CreateFlag> flags,
int bufferSize,
short replication,
long blockSize,
Progressable progress) throws IOException {
Path parent = path.getParent();
if (parent != null) {
// expect this to raise an exception if there is no parent
if (!getFileStatus(parent).isDirectory()) {
throw new FileAlreadyExistsException("Not a directory: " + parent);
}
}
int seaweedBufferSize = this.getConf().getInt(FS_SEAWEED_BUFFER_SIZE, FS_SEAWEED_DEFAULT_BUFFER_SIZE);
return create(path, permission,
flags.contains(CreateFlag.OVERWRITE), bufferSize,
replication, seaweedBufferSize, progress);
}
@Override
public FSDataOutputStream append(Path path, int bufferSize, Progressable progressable) throws IOException {
LOG.debug("append path: {} bufferSize:{}", path, bufferSize);
path = qualify(path);
try {
int seaweedBufferSize = this.getConf().getInt(FS_SEAWEED_BUFFER_SIZE, FS_SEAWEED_DEFAULT_BUFFER_SIZE);
OutputStream outputStream = seaweedFileSystemStore.createFile(path, false, null, seaweedBufferSize, "");
return new FSDataOutputStream(outputStream, statistics);
} catch (Exception ex) {
LOG.error("Failed to append to file: {} bufferSize:{}", path, bufferSize, ex);
throw new IOException("Failed to append to file: " + path, ex);
}
}
@Override
public boolean rename(Path src, Path dst) throws IOException {
LOG.debug("rename path: {} => {}", src, dst);
if (src.isRoot()) {
return false;
}
if (src.equals(dst)) {
return true;
}
FilerProto.Entry entry = seaweedFileSystemStore.lookupEntry(dst);
Path adjustedDst = dst;
if (entry != null) {
FileStatus dstFileStatus = getFileStatus(dst);
String sourceFileName = src.getName();
if (!dstFileStatus.isDirectory()) {
return false;
}
adjustedDst = new Path(dst, sourceFileName);
}
Path qualifiedSrcPath = qualify(src);
Path qualifiedDstPath = qualify(adjustedDst);
seaweedFileSystemStore.rename(qualifiedSrcPath, qualifiedDstPath);
return true;
}
@Override
public boolean delete(Path path, boolean recursive) throws IOException {
LOG.debug("delete path: {} recursive:{}", path, recursive);
path = qualify(path);
FilerProto.Entry entry = seaweedFileSystemStore.lookupEntry(path);
if (entry == null) {
return true;
}
FileStatus fileStatus = getFileStatus(path);
return seaweedFileSystemStore.deleteEntries(path, fileStatus.isDirectory(), recursive);
}
@Override
public FileStatus[] listStatus(Path path) throws IOException {
LOG.debug("listStatus path: {}", path);
path = qualify(path);
return seaweedFileSystemStore.listEntries(path);
}
@Override
public Path getWorkingDirectory() {
return workingDirectory;
}
@Override
public void setWorkingDirectory(Path path) {
if (path.isAbsolute()) {
workingDirectory = path;
} else {
workingDirectory = new Path(workingDirectory, path);
}
}
@Override
public boolean mkdirs(Path path, FsPermission fsPermission) throws IOException {
LOG.debug("mkdirs path: {}", path);
path = qualify(path);
FilerProto.Entry entry = seaweedFileSystemStore.lookupEntry(path);
if (entry == null) {
UserGroupInformation currentUser = UserGroupInformation.getCurrentUser();
return seaweedFileSystemStore.createDirectory(path, currentUser,
fsPermission == null ? FsPermission.getDirDefault() : fsPermission,
FsPermission.getUMask(getConf()));
}
FileStatus fileStatus = getFileStatus(path);
if (fileStatus.isDirectory()) {
return true;
} else {
throw new FileAlreadyExistsException("Path is a file: " + path);
}
}
@Override
public FileStatus getFileStatus(Path path) throws IOException {
LOG.debug("getFileStatus path: {}", path);
path = qualify(path);
return seaweedFileSystemStore.getFileStatus(path);
}
/**
* Set owner of a path (i.e. a file or a directory).
* The parameters owner and group cannot both be null.
*
* @param path The path
* @param owner If it is null, the original username remains unchanged.
* @param group If it is null, the original groupname remains unchanged.
*/
@Override
public void setOwner(Path path, final String owner, final String group)
throws IOException {
LOG.debug("setOwner path: {}", path);
path = qualify(path);
seaweedFileSystemStore.setOwner(path, owner, group);
}
/**
* Set permission of a path.
*
* @param path The path
* @param permission Access permission
*/
@Override
public void setPermission(Path path, final FsPermission permission) throws IOException {
LOG.debug("setPermission path: {}", path);
if (permission == null) {
throw new IllegalArgumentException("The permission can't be null");
}
path = qualify(path);
seaweedFileSystemStore.setPermission(path, permission);
}
Path qualify(Path path) {
return path.makeQualified(uri, workingDirectory);
}
/**
* Concat existing files together.
*
* @param trg the path to the target destination.
* @param psrcs the paths to the sources to use for the concatenation.
* @throws IOException IO failure
* @throws UnsupportedOperationException if the operation is unsupported
* (default).
*/
@Override
public void concat(final Path trg, final Path[] psrcs) throws IOException {
throw new UnsupportedOperationException("Not implemented by the " +
getClass().getSimpleName() + " FileSystem implementation");
}
/**
* Truncate the file in the indicated path to the indicated size.
* <ul>
* <li>Fails if path is a directory.</li>
* <li>Fails if path does not exist.</li>
* <li>Fails if path is not closed.</li>
* <li>Fails if new size is greater than current size.</li>
* </ul>
*
* @param f The path to the file to be truncated
* @param newLength The size the file is to be truncated to
* @return <code>true</code> if the file has been truncated to the desired
* <code>newLength</code> and is immediately available to be reused for
* write operations such as <code>append</code>, or
* <code>false</code> if a background process of adjusting the length of
* the last block has been started, and clients should wait for it to
* complete before proceeding with further file updates.
* @throws IOException IO failure
* @throws UnsupportedOperationException if the operation is unsupported
* (default).
*/
@Override
public boolean truncate(Path f, long newLength) throws IOException {
throw new UnsupportedOperationException("Not implemented by the " +
getClass().getSimpleName() + " FileSystem implementation");
}
@Override
public void createSymlink(final Path target, final Path link,
final boolean createParent) throws IOException {
// Supporting filesystems should override this method
throw new UnsupportedOperationException(
"Filesystem does not support symlinks!");
}
public boolean supportsSymlinks() {
return false;
}
/**
* Create a snapshot.
*
* @param path The directory where snapshots will be taken.
* @param snapshotName The name of the snapshot
* @return the snapshot path.
* @throws IOException IO failure
* @throws UnsupportedOperationException if the operation is unsupported
*/
@Override
public Path createSnapshot(Path path, String snapshotName)
throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
+ " doesn't support createSnapshot");
}
/**
* Rename a snapshot.
*
* @param path The directory path where the snapshot was taken
* @param snapshotOldName Old name of the snapshot
* @param snapshotNewName New name of the snapshot
* @throws IOException IO failure
* @throws UnsupportedOperationException if the operation is unsupported
* (default outcome).
*/
@Override
public void renameSnapshot(Path path, String snapshotOldName,
String snapshotNewName) throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
+ " doesn't support renameSnapshot");
}
/**
* Delete a snapshot of a directory.
*
* @param path The directory that the to-be-deleted snapshot belongs to
* @param snapshotName The name of the snapshot
* @throws IOException IO failure
* @throws UnsupportedOperationException if the operation is unsupported
* (default outcome).
*/
@Override
public void deleteSnapshot(Path path, String snapshotName)
throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
+ " doesn't support deleteSnapshot");
}
/**
* Modifies ACL entries of files and directories. This method can add new ACL
* entries or modify the permissions on existing ACL entries. All existing
* ACL entries that are not specified in this call are retained without
* changes. (Modifications are merged into the current ACL.)
*
* @param path Path to modify
* @param aclSpec List&lt;AclEntry&gt; describing modifications
* @throws IOException if an ACL could not be modified
* @throws UnsupportedOperationException if the operation is unsupported
* (default outcome).
*/
@Override
public void modifyAclEntries(Path path, List<AclEntry> aclSpec)
throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
+ " doesn't support modifyAclEntries");
}
/**
* Removes ACL entries from files and directories. Other ACL entries are
* retained.
*
* @param path Path to modify
* @param aclSpec List describing entries to remove
* @throws IOException if an ACL could not be modified
* @throws UnsupportedOperationException if the operation is unsupported
* (default outcome).
*/
@Override
public void removeAclEntries(Path path, List<AclEntry> aclSpec)
throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
+ " doesn't support removeAclEntries");
}
/**
* Removes all default ACL entries from files and directories.
*
* @param path Path to modify
* @throws IOException if an ACL could not be modified
* @throws UnsupportedOperationException if the operation is unsupported
* (default outcome).
*/
@Override
public void removeDefaultAcl(Path path)
throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
+ " doesn't support removeDefaultAcl");
}
/**
* Removes all but the base ACL entries of files and directories. The entries
* for user, group, and others are retained for compatibility with permission
* bits.
*
* @param path Path to modify
* @throws IOException if an ACL could not be removed
* @throws UnsupportedOperationException if the operation is unsupported
* (default outcome).
*/
@Override
public void removeAcl(Path path)
throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
+ " doesn't support removeAcl");
}
/**
* Fully replaces ACL of files and directories, discarding all existing
* entries.
*
* @param path Path to modify
* @param aclSpec List describing modifications, which must include entries
* for user, group, and others for compatibility with permission
* bits.
* @throws IOException if an ACL could not be modified
* @throws UnsupportedOperationException if the operation is unsupported
* (default outcome).
*/
@Override
public void setAcl(Path path, List<AclEntry> aclSpec) throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
+ " doesn't support setAcl");
}
/**
* Gets the ACL of a file or directory.
*
* @param path Path to get
* @return AclStatus describing the ACL of the file or directory
* @throws IOException if an ACL could not be read
* @throws UnsupportedOperationException if the operation is unsupported
* (default outcome).
*/
@Override
public AclStatus getAclStatus(Path path) throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
+ " doesn't support getAclStatus");
}
/**
* Set an xattr of a file or directory.
* The name must be prefixed with the namespace followed by ".". For example,
* "user.attr".
* <p>
* Refer to the HDFS extended attributes user documentation for details.
*
* @param path Path to modify
* @param name xattr name.
* @param value xattr value.
* @param flag xattr set flag
* @throws IOException IO failure
* @throws UnsupportedOperationException if the operation is unsupported
* (default outcome).
*/
@Override
public void setXAttr(Path path, String name, byte[] value,
EnumSet<XAttrSetFlag> flag) throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
+ " doesn't support setXAttr");
}
/**
* Get an xattr name and value for a file or directory.
* The name must be prefixed with the namespace followed by ".". For example,
* "user.attr".
* <p>
* Refer to the HDFS extended attributes user documentation for details.
*
* @param path Path to get extended attribute
* @param name xattr name.
* @return byte[] xattr value.
* @throws IOException IO failure
* @throws UnsupportedOperationException if the operation is unsupported
* (default outcome).
*/
@Override
public byte[] getXAttr(Path path, String name) throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
+ " doesn't support getXAttr");
}
/**
* Get all of the xattr name/value pairs for a file or directory.
* Only those xattrs which the logged-in user has permissions to view
* are returned.
* <p>
* Refer to the HDFS extended attributes user documentation for details.
*
* @param path Path to get extended attributes
* @return Map describing the XAttrs of the file or directory
* @throws IOException IO failure
* @throws UnsupportedOperationException if the operation is unsupported
* (default outcome).
*/
@Override
public Map<String, byte[]> getXAttrs(Path path) throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
+ " doesn't support getXAttrs");
}
/**
* Get all of the xattrs name/value pairs for a file or directory.
* Only those xattrs which the logged-in user has permissions to view
* are returned.
* <p>
* Refer to the HDFS extended attributes user documentation for details.
*
* @param path Path to get extended attributes
* @param names XAttr names.
* @return Map describing the XAttrs of the file or directory
* @throws IOException IO failure
* @throws UnsupportedOperationException if the operation is unsupported
* (default outcome).
*/
@Override
public Map<String, byte[]> getXAttrs(Path path, List<String> names)
throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
+ " doesn't support getXAttrs");
}
/**
* Get all of the xattr names for a file or directory.
* Only those xattr names which the logged-in user has permissions to view
* are returned.
* <p>
* Refer to the HDFS extended attributes user documentation for details.
*
* @param path Path to get extended attributes
* @return List{@literal <String>} of the XAttr names of the file or directory
* @throws IOException IO failure
* @throws UnsupportedOperationException if the operation is unsupported
* (default outcome).
*/
@Override
public List<String> listXAttrs(Path path) throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
+ " doesn't support listXAttrs");
}
/**
* Remove an xattr of a file or directory.
* The name must be prefixed with the namespace followed by ".". For example,
* "user.attr".
* <p>
* Refer to the HDFS extended attributes user documentation for details.
*
* @param path Path to remove extended attribute
* @param name xattr name
* @throws IOException IO failure
* @throws UnsupportedOperationException if the operation is unsupported
* (default outcome).
*/
@Override
public void removeXAttr(Path path, String name) throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
+ " doesn't support removeXAttr");
}
}

other/java/hdfs2/src/main/java/seaweed/hdfs/SeaweedFileSystemStore.java (291 lines changed)

@@ -1,291 +0,0 @@
package seaweed.hdfs;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.security.UserGroupInformation;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import seaweedfs.client.*;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import static seaweed.hdfs.SeaweedFileSystem.*;
public class SeaweedFileSystemStore {
private static final Logger LOG = LoggerFactory.getLogger(SeaweedFileSystemStore.class);
private FilerClient filerClient;
private Configuration conf;
public SeaweedFileSystemStore(String host, int port, int grpcPort, String cn, Configuration conf) {
filerClient = new FilerClient(host, port, grpcPort, cn);
this.conf = conf;
String volumeServerAccessMode = this.conf.get(FS_SEAWEED_VOLUME_SERVER_ACCESS, "direct");
if (volumeServerAccessMode.equals("publicUrl")) {
filerClient.setAccessVolumeServerByPublicUrl();
} else if (volumeServerAccessMode.equals("filerProxy")) {
filerClient.setAccessVolumeServerByFilerProxy();
}
}
public void close() {
try {
this.filerClient.shutdown();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
public static String getParentDirectory(Path path) {
return path.isRoot() ? "/" : path.getParent().toUri().getPath();
}
static int permissionToMode(FsPermission permission, boolean isDirectory) {
int p = permission.toShort();
if (isDirectory) {
p = p | 1 << 31;
}
return p;
}
public boolean createDirectory(final Path path, UserGroupInformation currentUser,
final FsPermission permission, final FsPermission umask) {
LOG.debug("createDirectory path: {} permission: {} umask: {}",
path,
permission,
umask);
return filerClient.mkdirs(
path.toUri().getPath(),
permissionToMode(permission, true),
currentUser.getUserName(),
currentUser.getGroupNames()
);
}
public FileStatus[] listEntries(final Path path) throws IOException {
LOG.debug("listEntries path: {}", path);
FileStatus pathStatus = getFileStatus(path);
if (pathStatus == null) {
return new FileStatus[0];
}
if (!pathStatus.isDirectory()) {
return new FileStatus[]{pathStatus};
}
List<FileStatus> fileStatuses = new ArrayList<FileStatus>();
List<FilerProto.Entry> entries = filerClient.listEntries(path.toUri().getPath());
for (FilerProto.Entry entry : entries) {
FileStatus fileStatus = doGetFileStatus(new Path(path, entry.getName()), entry);
fileStatuses.add(fileStatus);
}
LOG.debug("listEntries path: {} size {}", fileStatuses, fileStatuses.size());
return fileStatuses.toArray(new FileStatus[0]);
}
public FileStatus getFileStatus(final Path path) throws IOException {
FilerProto.Entry entry = lookupEntry(path);
if (entry == null) {
throw new FileNotFoundException("File does not exist: " + path);
}
LOG.debug("doGetFileStatus path:{} entry:{}", path, entry);
FileStatus fileStatus = doGetFileStatus(path, entry);
return fileStatus;
}
public boolean deleteEntries(final Path path, boolean isDirectory, boolean recursive) {
LOG.debug("deleteEntries path: {} isDirectory {} recursive: {}",
path,
String.valueOf(isDirectory),
String.valueOf(recursive));
if (path.isRoot()) {
return true;
}
if (recursive && isDirectory) {
List<FilerProto.Entry> entries = filerClient.listEntries(path.toUri().getPath());
for (FilerProto.Entry entry : entries) {
deleteEntries(new Path(path, entry.getName()), entry.getIsDirectory(), true);
}
}
return filerClient.deleteEntry(getParentDirectory(path), path.getName(), true, recursive, true);
}
private FileStatus doGetFileStatus(Path path, FilerProto.Entry entry) {
FilerProto.FuseAttributes attributes = entry.getAttributes();
long length = SeaweedRead.fileSize(entry);
boolean isDir = entry.getIsDirectory();
int block_replication = 1;
int blocksize = this.conf.getInt(FS_SEAWEED_BUFFER_SIZE, FS_SEAWEED_DEFAULT_BUFFER_SIZE);
long modification_time = attributes.getMtime() * 1000; // milliseconds
long access_time = 0;
FsPermission permission = FsPermission.createImmutable((short) attributes.getFileMode());
String owner = attributes.getUserName();
String group = attributes.getGroupNameCount() > 0 ? attributes.getGroupName(0) : "";
return new FileStatus(length, isDir, block_replication, blocksize,
modification_time, access_time, permission, owner, group, null, path);
}
public FilerProto.Entry lookupEntry(Path path) {
return filerClient.lookupEntry(getParentDirectory(path), path.getName());
}
public void rename(Path source, Path destination) {
LOG.debug("rename source: {} destination:{}", source, destination);
if (source.isRoot()) {
return;
}
LOG.info("rename source: {} destination:{}", source, destination);
FilerProto.Entry entry = lookupEntry(source);
if (entry == null) {
LOG.warn("rename non-existing source: {}", source);
return;
}
filerClient.mv(source.toUri().getPath(), destination.toUri().getPath());
}
public OutputStream createFile(final Path path,
final boolean overwrite,
FsPermission permission,
int bufferSize,
String replication) throws IOException {
permission = permission == null ? FsPermission.getFileDefault() : permission;
LOG.debug("createFile path: {} overwrite: {} permission: {}",
path,
overwrite,
permission.toString());
UserGroupInformation userGroupInformation = UserGroupInformation.getCurrentUser();
long now = System.currentTimeMillis() / 1000L;
FilerProto.Entry.Builder entry = null;
long writePosition = 0;
if (!overwrite) {
FilerProto.Entry existingEntry = lookupEntry(path);
LOG.debug("createFile merged entry path:{} existingEntry:{}", path, existingEntry);
if (existingEntry != null) {
entry = FilerProto.Entry.newBuilder();
entry.mergeFrom(existingEntry);
entry.clearContent();
entry.getAttributesBuilder().setMtime(now);
LOG.debug("createFile merged entry path:{} entry:{} from:{}", path, entry, existingEntry);
writePosition = SeaweedRead.fileSize(existingEntry);
}
}
if (entry == null) {
entry = FilerProto.Entry.newBuilder()
.setName(path.getName())
.setIsDirectory(false)
.setAttributes(FilerProto.FuseAttributes.newBuilder()
.setFileMode(permissionToMode(permission, false))
.setCrtime(now)
.setMtime(now)
.setUserName(userGroupInformation.getUserName())
.clearGroupName()
.addAllGroupName(Arrays.asList(userGroupInformation.getGroupNames()))
);
SeaweedWrite.writeMeta(filerClient, getParentDirectory(path), entry);
}
return new SeaweedHadoopOutputStream(filerClient, path.toString(), entry, writePosition, bufferSize, replication);
}
public FSInputStream openFileForRead(final Path path, FileSystem.Statistics statistics) throws IOException {
LOG.debug("openFileForRead path:{}", path);
FilerProto.Entry entry = lookupEntry(path);
if (entry == null) {
throw new FileNotFoundException("read non-exist file " + path);
}
return new SeaweedHadoopInputStream(filerClient,
statistics,
path.toUri().getPath(),
entry);
}
public void setOwner(Path path, String owner, String group) {
LOG.debug("setOwner path:{} owner:{} group:{}", path, owner, group);
FilerProto.Entry entry = lookupEntry(path);
if (entry == null) {
LOG.debug("setOwner path:{} entry:{}", path, entry);
return;
}
FilerProto.Entry.Builder entryBuilder = entry.toBuilder();
FilerProto.FuseAttributes.Builder attributesBuilder = entry.getAttributes().toBuilder();
if (owner != null) {
attributesBuilder.setUserName(owner);
}
if (group != null) {
attributesBuilder.clearGroupName();
attributesBuilder.addGroupName(group);
}
entryBuilder.setAttributes(attributesBuilder);
LOG.debug("setOwner path:{} entry:{}", path, entryBuilder);
filerClient.updateEntry(getParentDirectory(path), entryBuilder.build());
}
public void setPermission(Path path, FsPermission permission) {
LOG.debug("setPermission path:{} permission:{}", path, permission);
FilerProto.Entry entry = lookupEntry(path);
if (entry == null) {
LOG.debug("setPermission path:{} entry:{}", path, entry);
return;
}
FilerProto.Entry.Builder entryBuilder = entry.toBuilder();
FilerProto.FuseAttributes.Builder attributesBuilder = entry.getAttributes().toBuilder();
attributesBuilder.setFileMode(permissionToMode(permission, entry.getIsDirectory()));
entryBuilder.setAttributes(attributesBuilder);
LOG.debug("setPermission path:{} entry:{}", path, entryBuilder);
filerClient.updateEntry(getParentDirectory(path), entryBuilder.build());
}
}

150
other/java/hdfs2/src/main/java/seaweed/hdfs/SeaweedHadoopInputStream.java

@@ -1,150 +0,0 @@
package seaweed.hdfs;
// based on org.apache.hadoop.fs.azurebfs.services.AbfsInputStream
import org.apache.hadoop.fs.ByteBufferReadable;
import org.apache.hadoop.fs.FSInputStream;
import org.apache.hadoop.fs.FileSystem.Statistics;
import seaweedfs.client.FilerClient;
import seaweedfs.client.FilerProto;
import seaweedfs.client.SeaweedInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
public class SeaweedHadoopInputStream extends FSInputStream implements ByteBufferReadable {
private final SeaweedInputStream seaweedInputStream;
private final Statistics statistics;
public SeaweedHadoopInputStream(
final FilerClient filerClient,
final Statistics statistics,
final String path,
final FilerProto.Entry entry) throws IOException {
this.seaweedInputStream = new SeaweedInputStream(filerClient, path, entry);
this.statistics = statistics;
}
@Override
public int read() throws IOException {
return seaweedInputStream.read();
}
@Override
public int read(final byte[] b, final int off, final int len) throws IOException {
return seaweedInputStream.read(b, off, len);
}
// implement ByteBufferReadable
@Override
public synchronized int read(ByteBuffer buf) throws IOException {
int bytesRead = seaweedInputStream.read(buf);
if (bytesRead > 0) {
if (statistics != null) {
statistics.incrementBytesRead(bytesRead);
}
}
return bytesRead;
}
/**
* Seek to given position in stream.
*
* @param n position to seek to
* @throws IOException if there is an error
* @throws EOFException if attempting to seek past end of file
*/
@Override
public synchronized void seek(long n) throws IOException {
seaweedInputStream.seek(n);
}
@Override
public synchronized long skip(long n) throws IOException {
return seaweedInputStream.skip(n);
}
/**
* Return the size of the remaining available bytes
* if the size is less than or equal to {@link Integer#MAX_VALUE},
* otherwise, return {@link Integer#MAX_VALUE}.
* <p>
* This is to match the behavior of DFSInputStream.available(),
* which some clients may rely on (HBase write-ahead log reading in
* particular).
*/
@Override
public synchronized int available() throws IOException {
return seaweedInputStream.available();
}
/**
* Returns the length of the file that this stream refers to. Note that the length returned is the length
* as of the time the Stream was opened. Specifically, if there have been subsequent appends to the file,
* they won't be reflected in the returned length.
*
* @return length of the file.
* @throws IOException if the stream is closed
*/
public long length() throws IOException {
return seaweedInputStream.length();
}
/**
* Return the current offset from the start of the file
*
* @throws IOException throws {@link IOException} if there is an error
*/
@Override
public synchronized long getPos() throws IOException {
return seaweedInputStream.getPos();
}
/**
* Seeks a different copy of the data. Returns true if
* found a new source, false otherwise.
*
* @throws IOException throws {@link IOException} if there is an error
*/
@Override
public boolean seekToNewSource(long l) throws IOException {
return false;
}
@Override
public synchronized void close() throws IOException {
seaweedInputStream.close();
}
/**
* Not supported by this stream. Throws {@link UnsupportedOperationException}
*
* @param readlimit ignored
*/
@Override
public synchronized void mark(int readlimit) {
throw new UnsupportedOperationException("mark()/reset() not supported on this stream");
}
/**
* Not supported by this stream. Throws {@link UnsupportedOperationException}
*/
@Override
public synchronized void reset() throws IOException {
throw new UnsupportedOperationException("mark()/reset() not supported on this stream");
}
/**
* Gets whether mark and reset are supported by this stream. Always returns false.
*
* @return always {@code false}
*/
@Override
public boolean markSupported() {
return false;
}
}

16
other/java/hdfs2/src/main/java/seaweed/hdfs/SeaweedHadoopOutputStream.java

@@ -1,16 +0,0 @@
package seaweed.hdfs;
// adapted from org.apache.hadoop.fs.azurebfs.services.AbfsOutputStream
import seaweedfs.client.FilerClient;
import seaweedfs.client.FilerProto;
import seaweedfs.client.SeaweedOutputStream;
public class SeaweedHadoopOutputStream extends SeaweedOutputStream {
public SeaweedHadoopOutputStream(FilerClient filerClient, final String path, FilerProto.Entry.Builder entry,
final long position, final int bufferSize, final String replication) {
super(filerClient, path, entry, position, bufferSize, replication);
}
}

90
other/java/hdfs2/src/test/java/seaweed/hdfs/SeaweedFileSystemConfigTest.java

@@ -1,90 +0,0 @@
package seaweed.hdfs;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.junit.Before;
import org.junit.Test;
import static org.junit.Assert.*;
/**
* Unit tests for SeaweedFileSystem configuration that don't require a running SeaweedFS instance.
*
* These tests verify basic properties and constants.
*/
public class SeaweedFileSystemConfigTest {
private SeaweedFileSystem fs;
private Configuration conf;
@Before
public void setUp() {
fs = new SeaweedFileSystem();
conf = new Configuration();
}
@Test
public void testScheme() {
assertEquals("seaweedfs", fs.getScheme());
}
@Test
public void testConstants() {
// Test that constants are defined correctly
assertEquals("fs.seaweed.filer.host", SeaweedFileSystem.FS_SEAWEED_FILER_HOST);
assertEquals("fs.seaweed.filer.port", SeaweedFileSystem.FS_SEAWEED_FILER_PORT);
assertEquals("fs.seaweed.filer.port.grpc", SeaweedFileSystem.FS_SEAWEED_FILER_PORT_GRPC);
assertEquals(8888, SeaweedFileSystem.FS_SEAWEED_DEFAULT_PORT);
assertEquals("fs.seaweed.buffer.size", SeaweedFileSystem.FS_SEAWEED_BUFFER_SIZE);
assertEquals(4 * 1024 * 1024, SeaweedFileSystem.FS_SEAWEED_DEFAULT_BUFFER_SIZE);
assertEquals("fs.seaweed.replication", SeaweedFileSystem.FS_SEAWEED_REPLICATION);
assertEquals("fs.seaweed.volume.server.access", SeaweedFileSystem.FS_SEAWEED_VOLUME_SERVER_ACCESS);
assertEquals("fs.seaweed.filer.cn", SeaweedFileSystem.FS_SEAWEED_FILER_CN);
}
@Test
public void testWorkingDirectoryPathOperations() {
// Test path operations that don't require initialization
Path testPath = new Path("/test/path");
assertTrue("Path should be absolute", testPath.isAbsolute());
assertEquals("/test/path", testPath.toUri().getPath());
Path childPath = new Path(testPath, "child");
assertEquals("/test/path/child", childPath.toUri().getPath());
}
@Test
public void testConfigurationProperties() {
// Test that configuration can be set and read
conf.set(SeaweedFileSystem.FS_SEAWEED_FILER_HOST, "testhost");
assertEquals("testhost", conf.get(SeaweedFileSystem.FS_SEAWEED_FILER_HOST));
conf.setInt(SeaweedFileSystem.FS_SEAWEED_FILER_PORT, 9999);
assertEquals(9999, conf.getInt(SeaweedFileSystem.FS_SEAWEED_FILER_PORT, 0));
conf.setInt(SeaweedFileSystem.FS_SEAWEED_BUFFER_SIZE, 8 * 1024 * 1024);
assertEquals(8 * 1024 * 1024, conf.getInt(SeaweedFileSystem.FS_SEAWEED_BUFFER_SIZE, 0));
conf.set(SeaweedFileSystem.FS_SEAWEED_REPLICATION, "001");
assertEquals("001", conf.get(SeaweedFileSystem.FS_SEAWEED_REPLICATION));
conf.set(SeaweedFileSystem.FS_SEAWEED_VOLUME_SERVER_ACCESS, "publicUrl");
assertEquals("publicUrl", conf.get(SeaweedFileSystem.FS_SEAWEED_VOLUME_SERVER_ACCESS));
conf.set(SeaweedFileSystem.FS_SEAWEED_FILER_CN, "test-cn");
assertEquals("test-cn", conf.get(SeaweedFileSystem.FS_SEAWEED_FILER_CN));
}
@Test
public void testDefaultBufferSize() {
// Test default buffer size constant
int expected = 4 * 1024 * 1024; // 4MB
assertEquals(expected, SeaweedFileSystem.FS_SEAWEED_DEFAULT_BUFFER_SIZE);
}
@Test
public void testDefaultPort() {
// Test default port constant
assertEquals(8888, SeaweedFileSystem.FS_SEAWEED_DEFAULT_PORT);
}
}

379
other/java/hdfs2/src/test/java/seaweed/hdfs/SeaweedFileSystemTest.java

@@ -1,379 +0,0 @@
package seaweed.hdfs;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import java.io.IOException;
import java.net.URI;
import static org.junit.Assert.*;
/**
* Unit tests for SeaweedFileSystem.
*
* These tests verify basic FileSystem operations against a SeaweedFS backend.
* Note: These tests require a running SeaweedFS filer instance.
*
* To run tests, ensure SeaweedFS is running with default ports:
* - Filer HTTP: 8888
* - Filer gRPC: 18888
*
* Set environment variable SEAWEEDFS_TEST_ENABLED=true to enable these tests.
*/
public class SeaweedFileSystemTest {
private SeaweedFileSystem fs;
private Configuration conf;
private static final String TEST_ROOT = "/test-hdfs2";
private static final boolean TESTS_ENABLED =
"true".equalsIgnoreCase(System.getenv("SEAWEEDFS_TEST_ENABLED"));
@Before
public void setUp() throws Exception {
if (!TESTS_ENABLED) {
return;
}
conf = new Configuration();
conf.set("fs.seaweed.filer.host", "localhost");
conf.setInt("fs.seaweed.filer.port", 8888);
conf.setInt("fs.seaweed.filer.port.grpc", 18888);
fs = new SeaweedFileSystem();
URI uri = new URI("seaweedfs://localhost:8888/");
fs.initialize(uri, conf);
// Clean up any existing test directory
Path testPath = new Path(TEST_ROOT);
if (fs.exists(testPath)) {
fs.delete(testPath, true);
}
}
@After
public void tearDown() throws Exception {
if (!TESTS_ENABLED || fs == null) {
return;
}
// Clean up test directory
Path testPath = new Path(TEST_ROOT);
if (fs.exists(testPath)) {
fs.delete(testPath, true);
}
fs.close();
}
@Test
public void testInitialization() throws Exception {
if (!TESTS_ENABLED) {
System.out.println("Skipping test - SEAWEEDFS_TEST_ENABLED not set");
return;
}
assertNotNull(fs);
assertEquals("seaweedfs", fs.getScheme());
assertNotNull(fs.getUri());
assertEquals("/", fs.getWorkingDirectory().toUri().getPath());
}
@Test
public void testMkdirs() throws Exception {
if (!TESTS_ENABLED) {
System.out.println("Skipping test - SEAWEEDFS_TEST_ENABLED not set");
return;
}
Path testDir = new Path(TEST_ROOT + "/testdir");
assertTrue("Failed to create directory", fs.mkdirs(testDir));
assertTrue("Directory should exist", fs.exists(testDir));
FileStatus status = fs.getFileStatus(testDir);
assertTrue("Path should be a directory", status.isDirectory());
}
@Test
public void testCreateAndReadFile() throws Exception {
if (!TESTS_ENABLED) {
System.out.println("Skipping test - SEAWEEDFS_TEST_ENABLED not set");
return;
}
Path testFile = new Path(TEST_ROOT + "/testfile.txt");
String testContent = "Hello, SeaweedFS!";
// Create and write to file
FSDataOutputStream out = fs.create(testFile, FsPermission.getDefault(),
false, 4096, (short) 1, 4 * 1024 * 1024, null);
assertNotNull("Output stream should not be null", out);
out.write(testContent.getBytes());
out.close();
// Verify file exists
assertTrue("File should exist", fs.exists(testFile));
// Read and verify content
FSDataInputStream in = fs.open(testFile, 4096);
assertNotNull("Input stream should not be null", in);
byte[] buffer = new byte[testContent.length()];
int bytesRead = in.read(buffer);
in.close();
assertEquals("Should read all bytes", testContent.length(), bytesRead);
assertEquals("Content should match", testContent, new String(buffer));
}
@Test
public void testFileStatus() throws Exception {
if (!TESTS_ENABLED) {
System.out.println("Skipping test - SEAWEEDFS_TEST_ENABLED not set");
return;
}
Path testFile = new Path(TEST_ROOT + "/statustest.txt");
String content = "test content";
FSDataOutputStream out = fs.create(testFile);
out.write(content.getBytes());
out.close();
FileStatus status = fs.getFileStatus(testFile);
assertNotNull("FileStatus should not be null", status);
assertFalse("Should not be a directory", status.isDirectory());
assertTrue("Should be a file", status.isFile());
assertEquals("File length should match", content.length(), status.getLen());
assertNotNull("Path should not be null", status.getPath());
}
@Test
public void testListStatus() throws Exception {
if (!TESTS_ENABLED) {
System.out.println("Skipping test - SEAWEEDFS_TEST_ENABLED not set");
return;
}
Path testDir = new Path(TEST_ROOT + "/listtest");
fs.mkdirs(testDir);
// Create multiple files
for (int i = 0; i < 3; i++) {
Path file = new Path(testDir, "file" + i + ".txt");
FSDataOutputStream out = fs.create(file);
out.write(("content" + i).getBytes());
out.close();
}
FileStatus[] statuses = fs.listStatus(testDir);
assertNotNull("List should not be null", statuses);
assertEquals("Should have 3 files", 3, statuses.length);
}
@Test
public void testRename() throws Exception {
if (!TESTS_ENABLED) {
System.out.println("Skipping test - SEAWEEDFS_TEST_ENABLED not set");
return;
}
Path srcFile = new Path(TEST_ROOT + "/source.txt");
Path dstFile = new Path(TEST_ROOT + "/destination.txt");
String content = "rename test";
// Create source file
FSDataOutputStream out = fs.create(srcFile);
out.write(content.getBytes());
out.close();
assertTrue("Source file should exist", fs.exists(srcFile));
// Rename
assertTrue("Rename should succeed", fs.rename(srcFile, dstFile));
// Verify
assertFalse("Source file should not exist", fs.exists(srcFile));
assertTrue("Destination file should exist", fs.exists(dstFile));
// Verify content preserved
FSDataInputStream in = fs.open(dstFile);
byte[] buffer = new byte[content.length()];
in.read(buffer);
in.close();
assertEquals("Content should be preserved", content, new String(buffer));
}
@Test
public void testDelete() throws Exception {
if (!TESTS_ENABLED) {
System.out.println("Skipping test - SEAWEEDFS_TEST_ENABLED not set");
return;
}
Path testFile = new Path(TEST_ROOT + "/deletetest.txt");
// Create file
FSDataOutputStream out = fs.create(testFile);
out.write("delete me".getBytes());
out.close();
assertTrue("File should exist before delete", fs.exists(testFile));
// Delete
assertTrue("Delete should succeed", fs.delete(testFile, false));
assertFalse("File should not exist after delete", fs.exists(testFile));
}
@Test
public void testDeleteDirectory() throws Exception {
if (!TESTS_ENABLED) {
System.out.println("Skipping test - SEAWEEDFS_TEST_ENABLED not set");
return;
}
Path testDir = new Path(TEST_ROOT + "/deletedir");
Path testFile = new Path(testDir, "file.txt");
// Create directory with file
fs.mkdirs(testDir);
FSDataOutputStream out = fs.create(testFile);
out.write("content".getBytes());
out.close();
assertTrue("Directory should exist", fs.exists(testDir));
assertTrue("File should exist", fs.exists(testFile));
// Recursive delete
assertTrue("Recursive delete should succeed", fs.delete(testDir, true));
assertFalse("Directory should not exist after delete", fs.exists(testDir));
assertFalse("File should not exist after delete", fs.exists(testFile));
}
@Test
public void testAppend() throws Exception {
if (!TESTS_ENABLED) {
System.out.println("Skipping test - SEAWEEDFS_TEST_ENABLED not set");
return;
}
Path testFile = new Path(TEST_ROOT + "/appendtest.txt");
String initialContent = "initial";
String appendContent = " appended";
// Create initial file
FSDataOutputStream out = fs.create(testFile);
out.write(initialContent.getBytes());
out.close();
// Append
FSDataOutputStream appendOut = fs.append(testFile, 4096, null);
assertNotNull("Append stream should not be null", appendOut);
appendOut.write(appendContent.getBytes());
appendOut.close();
// Verify combined content
FSDataInputStream in = fs.open(testFile);
byte[] buffer = new byte[initialContent.length() + appendContent.length()];
int bytesRead = in.read(buffer);
in.close();
String expected = initialContent + appendContent;
assertEquals("Should read all bytes", expected.length(), bytesRead);
assertEquals("Content should match", expected, new String(buffer));
}
@Test
public void testSetWorkingDirectory() throws Exception {
if (!TESTS_ENABLED) {
System.out.println("Skipping test - SEAWEEDFS_TEST_ENABLED not set");
return;
}
Path originalWd = fs.getWorkingDirectory();
assertEquals("Original working directory should be /", "/", originalWd.toUri().getPath());
Path newWd = new Path(TEST_ROOT);
fs.mkdirs(newWd);
fs.setWorkingDirectory(newWd);
Path currentWd = fs.getWorkingDirectory();
assertTrue("Working directory should be updated",
currentWd.toUri().getPath().contains(TEST_ROOT));
}
@Test
public void testSetPermission() throws Exception {
if (!TESTS_ENABLED) {
System.out.println("Skipping test - SEAWEEDFS_TEST_ENABLED not set");
return;
}
Path testFile = new Path(TEST_ROOT + "/permtest.txt");
// Create file
FSDataOutputStream out = fs.create(testFile);
out.write("permission test".getBytes());
out.close();
// Set permission
FsPermission newPerm = new FsPermission((short) 0644);
fs.setPermission(testFile, newPerm);
FileStatus status = fs.getFileStatus(testFile);
assertNotNull("Permission should not be null", status.getPermission());
}
@Test
public void testSetOwner() throws Exception {
if (!TESTS_ENABLED) {
System.out.println("Skipping test - SEAWEEDFS_TEST_ENABLED not set");
return;
}
Path testFile = new Path(TEST_ROOT + "/ownertest.txt");
// Create file
FSDataOutputStream out = fs.create(testFile);
out.write("owner test".getBytes());
out.close();
// Set owner - the call should not throw even if ownership is not fully enforced
fs.setOwner(testFile, "testuser", "testgroup");
// Just verify the call doesn't throw an exception
FileStatus status = fs.getFileStatus(testFile);
assertNotNull("FileStatus should not be null", status);
}
@Test
public void testRenameToExistingDirectory() throws Exception {
if (!TESTS_ENABLED) {
System.out.println("Skipping test - SEAWEEDFS_TEST_ENABLED not set");
return;
}
Path srcFile = new Path(TEST_ROOT + "/movefile.txt");
Path dstDir = new Path(TEST_ROOT + "/movedir");
// Create source file and destination directory
FSDataOutputStream out = fs.create(srcFile);
out.write("move test".getBytes());
out.close();
fs.mkdirs(dstDir);
// Rename file to existing directory (should move file into directory)
assertTrue("Rename to directory should succeed", fs.rename(srcFile, dstDir));
// File should be moved into the directory
Path expectedLocation = new Path(dstDir, srcFile.getName());
assertTrue("File should exist in destination directory", fs.exists(expectedLocation));
assertFalse("Source file should not exist", fs.exists(srcFile));
}
}

109
other/java/hdfs3/src/main/java/seaweed/hdfs/SeaweedAtomicOutputStream.java

@@ -1,109 +0,0 @@
package seaweed.hdfs;
import org.apache.hadoop.fs.Syncable;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import seaweedfs.client.FilerClient;
import seaweedfs.client.FilerProto;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
/**
* Atomic output stream for Parquet files.
*
* Buffers all writes in memory and writes atomically on close().
* This ensures that getPos() always returns accurate positions that match
* the final file layout, which is required for Parquet's footer metadata.
*/
public class SeaweedAtomicOutputStream extends SeaweedHadoopOutputStream implements Syncable {
private static final Logger LOG = LoggerFactory.getLogger(SeaweedAtomicOutputStream.class);
private final ByteArrayOutputStream memoryBuffer;
private final String filePath;
private boolean closed = false;
public SeaweedAtomicOutputStream(FilerClient filerClient, String path, FilerProto.Entry.Builder entry,
long position, int maxBufferSize, String replication) {
super(filerClient, path, entry, position, maxBufferSize, replication);
this.filePath = path;
this.memoryBuffer = new ByteArrayOutputStream(maxBufferSize);
LOG.info("[ATOMIC] Created atomic output stream for: {} (maxBuffer={})", path, maxBufferSize);
}
@Override
public synchronized void write(int b) throws IOException {
if (closed) {
throw new IOException("Stream is closed");
}
memoryBuffer.write(b);
}
@Override
public synchronized void write(byte[] b, int off, int len) throws IOException {
if (closed) {
throw new IOException("Stream is closed");
}
memoryBuffer.write(b, off, len);
}
@Override
public synchronized long getPos() throws IOException {
// Return the current size of the memory buffer
// This is always accurate since nothing is flushed until close()
long pos = memoryBuffer.size();
// Log getPos() calls around the problematic positions
if (pos >= 470 && pos <= 476) {
LOG.error("[ATOMIC-GETPOS] getPos() returning pos={}", pos);
}
return pos;
}
@Override
public synchronized void flush() throws IOException {
// No-op for atomic writes - everything is flushed on close()
LOG.debug("[ATOMIC] flush() called (no-op for atomic writes)");
}
@Override
public synchronized void hsync() throws IOException {
// No-op for atomic writes
LOG.debug("[ATOMIC] hsync() called (no-op for atomic writes)");
}
@Override
public synchronized void hflush() throws IOException {
// No-op for atomic writes
LOG.debug("[ATOMIC] hflush() called (no-op for atomic writes)");
}
@Override
public synchronized void close() throws IOException {
if (closed) {
return;
}
try {
byte[] data = memoryBuffer.toByteArray();
int size = data.length;
LOG.info("[ATOMIC] Closing atomic stream: {} ({} bytes buffered)", filePath, size);
if (size > 0) {
// Write all data at once using the parent's write method
super.write(data, 0, size);
}
// Now close the parent stream which will flush and write metadata
super.close();
LOG.info("[ATOMIC] Successfully wrote {} bytes atomically to: {}", size, filePath);
} finally {
closed = true;
memoryBuffer.reset();
}
}
}

132
test/java/spark/COMMIT_SUMMARY.md

@@ -1,132 +0,0 @@
# Fix Parquet EOF Error by Removing ByteBufferReadable Interface
## Summary
Fixed the `EOFException: Reached the end of stream. Still have: 78 bytes left` error when reading Parquet files with complex schemas in Spark.
## Root Cause
`SeaweedHadoopInputStream` declared the `ByteBufferReadable` interface but did not implement it correctly, which caused an incorrect buffering strategy and position-tracking issues during positioned reads (critical for Parquet).
## Solution
Removed the `ByteBufferReadable` interface from `SeaweedHadoopInputStream` to match the pattern of Hadoop's `RawLocalFileSystem`, which uses `BufferedFSInputStream` for proper position tracking.
## Changes
### Core Fix
1. **`SeaweedHadoopInputStream.java`**:
- Removed `ByteBufferReadable` interface
- Removed `read(ByteBuffer)` method
- Cleaned up debug logging
- Added documentation explaining the design choice
2. **`SeaweedFileSystem.java`**:
- Changed from `BufferedByteBufferReadableInputStream` to `BufferedFSInputStream`
- Applies to all streams uniformly
- Cleaned up debug logging
3. **`SeaweedInputStream.java`**:
- Cleaned up debug logging
### Cleanup
4. **Deleted debug-only files**:
- `DebugDualInputStream.java`
- `DebugDualInputStreamWrapper.java`
- `DebugDualOutputStream.java`
- `DebugMode.java`
- `LocalOnlyInputStream.java`
- `ShadowComparisonStream.java`
5. **Reverted**:
- `SeaweedFileSystemStore.java` (removed all debug mode logic)
6. **Cleaned**:
- `docker-compose.yml` (removed debug environment variables)
- All `.md` documentation files in `test/java/spark/`
## Testing
All Spark integration tests pass:
- ✅ `SparkSQLTest.testCreateTableAndQuery` (complex 4-column schema)
- ✅ `SimpleOneColumnTest` (basic operations)
- ✅ All other Spark integration tests
## Technical Details
### Why This Works
Hadoop's `RawLocalFileSystem` uses the exact same pattern (a minimal sketch follows this list):
- Does NOT implement `ByteBufferReadable`
- Uses `BufferedFSInputStream` for buffering
- Properly handles positioned reads with automatic position restoration
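For context, here is a minimal sketch of that wrapping pattern, assuming a store method that hands back a plain `FSInputStream`; this is an illustration only, not the actual `SeaweedFileSystem.open()` code, and the 4x multiplier is just an assumption echoing the buffering note under "Performance Impact" below:
```java
import org.apache.hadoop.fs.BufferedFSInputStream;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSInputStream;

public final class BufferedOpenSketch {
    // rawStream stands in for whatever FSInputStream the store returns
    // (e.g. a SeaweedHadoopInputStream that no longer claims ByteBufferReadable).
    static FSDataInputStream wrap(FSInputStream rawStream, int bufferSize) {
        // BufferedFSInputStream keeps Seekable/PositionedReadable semantics intact,
        // so FSDataInputStream can delegate positioned reads with position restoration.
        return new FSDataInputStream(new BufferedFSInputStream(rawStream, 4 * bufferSize));
    }
}
```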
### Position Tracking
`BufferedFSInputStream` implements positioned reads correctly:
```java
public int read(long position, byte[] buffer, int offset, int length) {
long oldPos = getPos();
try {
seek(position);
return read(buffer, offset, length);
} finally {
seek(oldPos); // Restores position!
}
}
```
This ensures buffered reads don't permanently change the stream position, which is critical for Parquet's random access pattern.
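On the consumer side, a positioned read leaves `getPos()` untouched. The path and offsets below are hypothetical, just to show the access pattern a Parquet footer read boils down to:
```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class PositionedReadSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical file; a Parquet reader fetches the footer tail this way.
        Path file = new Path("seaweedfs://localhost:8888/warehouse/part-00000.parquet");
        FileSystem fs = file.getFileSystem(conf);
        try (FSDataInputStream in = fs.open(file)) {
            long len = fs.getFileStatus(file).getLen();
            byte[] tail = new byte[8];             // footer length + magic bytes sit at the end
            in.readFully(len - tail.length, tail); // positioned read; stream position is restored
            System.out.println("pos after readFully: " + in.getPos()); // still 0
        }
    }
}
```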
### Performance Impact
Minimal to none:
- Network latency dominates for remote storage
- Buffering is still active (4x buffer size)
- Extra byte[] copy is negligible compared to network I/O
## Commit Message
```
Fix Parquet EOF error by removing ByteBufferReadable interface
SeaweedHadoopInputStream incorrectly declared ByteBufferReadable interface
without proper implementation, causing position tracking issues during
positioned reads. This resulted in "78 bytes left" EOF errors when reading
Parquet files with complex schemas in Spark.
Solution: Remove ByteBufferReadable and use BufferedFSInputStream (matching
Hadoop's RawLocalFileSystem pattern) which properly handles position
restoration for positioned reads.
Changes:
- Remove ByteBufferReadable interface from SeaweedHadoopInputStream
- Change SeaweedFileSystem to use BufferedFSInputStream for all streams
- Clean up debug logging
- Delete debug-only classes and files
Tested: All Spark integration tests pass
```
## Files Changed
### Modified
- `other/java/hdfs3/src/main/java/seaweed/hdfs/SeaweedHadoopInputStream.java`
- `other/java/hdfs3/src/main/java/seaweed/hdfs/SeaweedFileSystem.java`
- `other/java/client/src/main/java/seaweedfs/client/SeaweedInputStream.java`
- `test/java/spark/docker-compose.yml`
### Reverted
- `other/java/hdfs3/src/main/java/seaweed/hdfs/SeaweedFileSystemStore.java`
### Deleted
- `other/java/hdfs3/src/main/java/seaweed/hdfs/DebugDualInputStream.java`
- `other/java/hdfs3/src/main/java/seaweed/hdfs/DebugDualInputStreamWrapper.java`
- `other/java/hdfs3/src/main/java/seaweed/hdfs/DebugDualOutputStream.java`
- `other/java/hdfs3/src/main/java/seaweed/hdfs/DebugMode.java`
- `other/java/hdfs3/src/main/java/seaweed/hdfs/LocalOnlyInputStream.java`
- `other/java/hdfs3/src/main/java/seaweed/hdfs/ShadowComparisonStream.java`
- All `.md` files in `test/java/spark/` (debug documentation)