From 14c863dbffc02835b1ba5276c1e1b341787f0693 Mon Sep 17 00:00:00 2001
From: Chris Lu
Date: Mon, 16 Feb 2026 00:09:31 -0800
Subject: [PATCH] docs(volume-server): refocus plan on native rust parity

---
 rust/volume_server/DEV_PLAN.md | 183 +++++++++++++++++++++++----------
 test/volume_server/DEV_PLAN.md |  13 +++
 2 files changed, 142 insertions(+), 54 deletions(-)

diff --git a/rust/volume_server/DEV_PLAN.md b/rust/volume_server/DEV_PLAN.md
index de695f05c..7bb5d85a9 100644
--- a/rust/volume_server/DEV_PLAN.md
+++ b/rust/volume_server/DEV_PLAN.md
@@ -1,72 +1,147 @@
-# Rust Volume Server Rewrite Dev Plan
+# Rust Volume Server Parity Implementation Plan
 
-## Goal
-Build a Rust implementation of SeaweedFS volume server that is behavior-compatible with the current Go implementation and can pass the existing integration suites under `/Users/chris/dev/seaweedfs2/test/volume_server/http` and `/Users/chris/dev/seaweedfs2/test/volume_server/grpc`.
+## Objective
+Implement a native Rust volume server that replicates Go volume-server behavior for HTTP and gRPC APIs, so it can become a drop-in replacement validated by existing integration suites.
 
-## Compatibility Target
-- CLI compatibility for volume-server startup flags used by integration harness.
-- HTTP and gRPC behavioral parity for tested paths.
-- Drop-in process integration with current Go master in transition phases.
+## Current Focus (2026-02-16)
+- Program focus is now Rust implementation parity, not broad test expansion.
+- `test/volume_server` is treated as the parity gate.
+- Existing Rust launcher modes (`exec`, `proxy`) are transition tools; they are not the final target.
 
-## Phases
-
-### Phase 0: Bootstrap and Harness Integration
-- [x] Add Rust volume-server crate.
-- [x] Implement Rust launcher that can run as a volume-server process entrypoint.
-- [x] Add launcher execution modes (`exec` and `proxy`) behind `VOLUME_SERVER_RUST_MODE`.
-- [x] Add integration harness switches so tests can run with:
+## Current Status
+- Rust crate and launcher are in place.
+- Integration harness can run:
   - Go master + Go volume (default)
-  - Go master + Rust volume (`VOLUME_SERVER_IMPL=rust` or `VOLUME_SERVER_BINARY=...`)
-- [x] Add CI smoke coverage for Rust volume-server mode.
+  - Go master + Rust launcher (`VOLUME_SERVER_IMPL=rust`)
+- Rust launcher `proxy` mode has full-suite integration pass while delegating backend handlers to Go.
+- Native Rust API/storage logic is not implemented yet.
+
+## Parity Exit Criteria
+1. Native mode passes:
+   - `env VOLUME_SERVER_IMPL=rust VOLUME_SERVER_RUST_MODE=native go test -count=1 ./test/volume_server/http`
+   - `env VOLUME_SERVER_IMPL=rust VOLUME_SERVER_RUST_MODE=native go test -count=1 ./test/volume_server/grpc`
+2. CI runs native Rust mode integration coverage (at least smoke, then expanded shards).
+3. Rust mode defaults to native behavior for integration harness.
+4. Go-backend delegation is removed (or retained only as explicit fallback mode).
+
+## Architecture Workstreams
 
-### Phase 1: Native Rust Control Plane Skeleton
-- [ ] Native Rust HTTP server with admin endpoints:
+### A. Runtime and Configuration Parity
+- [ ] Add `native` runtime mode in `weed-volume-rs`.
+- [ ] Parse and honor volume-server CLI/config flags used by integration harness:
+  - [ ] network/bind ports (`-ip`, `-port`, `-port.grpc`, `-port.public`)
+  - [ ] master target/config dir/read mode/throttling/JWT-related config
+  - [ ] size/timeout controls and maintenance state defaults
+- [ ] Implement graceful lifecycle behavior (signals, shutdown, readiness).
+
+### B. Native HTTP Surface
+- [ ] Admin/control endpoints:
   - [ ] `GET /status`
   - [ ] `GET /healthz`
-  - [ ] static/UI endpoints used by tests
-- [ ] Native Rust gRPC server with basic lifecycle/state RPCs:
+  - [ ] static/UI endpoints currently exercised
+- [ ] Data read path parity:
+  - [ ] fid parsing/path variants
+  - [ ] conditional headers (`If-Modified-Since`, `If-None-Match`)
+  - [ ] range handling (single/multi/invalid)
+  - [ ] deleted reads, auth checks, read-mode branches
+  - [ ] chunk-manifest and compression/image transformation branches
+- [ ] Data write/delete parity:
+  - [ ] write success/unchanged/error paths
+  - [ ] replication and file-size-limit paths
+  - [ ] delete and chunk-manifest delete branches
+- [ ] Method/CORS/public-port parity for split admin/public behavior.
+
+### C. Native gRPC Surface
+- [ ] Control-plane RPCs:
   - [ ] `GetState`, `SetState`, `VolumeServerStatus`, `Ping`, `VolumeServerLeave`
-- [ ] Flag/config parser parity for currently exercised startup options.
-
-### Phase 2: Native Data Path (HTTP + core gRPC)
-- [ ] HTTP read/write/delete parity:
-  - [ ] path variants, conditional headers, ranges, auth, throttling
-  - [ ] chunk manifest read/delete behavior
-  - [ ] image and compression transform branches
-- [ ] gRPC data RPC parity:
+  - [ ] admin lifecycle: allocate/mount/unmount/delete/configure/readonly/writable
+- [ ] Data RPCs:
   - [ ] `ReadNeedleBlob`, `ReadNeedleMeta`, `WriteNeedleBlob`
   - [ ] `BatchDelete`, `ReadAllNeedles`
-  - [ ] copy/receive/sync baseline
-
-### Phase 3: Advanced gRPC Surface
-- [ ] Vacuum RPC family.
-- [ ] Tail sender/receiver.
-- [ ] Erasure coding family.
-- [ ] Tiering/remote fetch family.
-- [ ] Query/Scrub family.
-
-### Phase 4: Hardening and Cutover
-- [ ] Determinism/flake hardening in integration runtime.
-- [ ] Performance and resource-baseline checks versus Go.
-- [ ] Optional dual-run diff tooling for payload/header parity.
-- [ ] Default harness/CI mode switch to Rust volume server once parity threshold is met.
-
-## Integration Test Mapping
-- HTTP suite: `/Users/chris/dev/seaweedfs2/test/volume_server/http`
-- gRPC suite: `/Users/chris/dev/seaweedfs2/test/volume_server/grpc`
-- Harness: `/Users/chris/dev/seaweedfs2/test/volume_server/framework`
+  - [ ] sync/copy/receive and status endpoints
+- [ ] Stream RPCs:
+  - [ ] tail sender/receiver
+  - [ ] vacuum streams
+  - [ ] query streams
+- [ ] Advanced families:
+  - [ ] erasure coding RPC set
+  - [ ] tiering/remote fetch
+  - [ ] scrub/query mode matrix
+
+### D. Storage Compatibility Layer
+- [ ] Implement volume data/index handling compatible with Go on-disk format.
+- [ ] Preserve cookie/checksum/timestamp semantics used by tests.
+- [ ] Match read/write/delete consistency and error mapping behavior.
+- [ ] Ensure EC metadata/data-path compatibility with existing files.
+
+### E. Operational Hardening
+- [ ] Deterministic startup/readiness and shutdown semantics.
+- [ ] Log/error parity sufficient for debugging and CI triage.
+- [ ] Concurrency/timeout behavior alignment for throttling and streams.
+- [ ] Performance baseline checks vs Go for key flows.
+
+## Milestone Plan
+
+### M0 (Completed): Harness + Launcher Transition
+- [x] Rust launcher integrated into harness.
+- [x] Proxy mode full-suite validation with Go backend delegation.
+
+### M1: Native Skeleton (Control Plane First)
+- [ ] `native` mode boots and serves:
+  - [ ] `/status`, `/healthz`
+  - [ ] `GetState`, `SetState`, `VolumeServerStatus`, `Ping`, `VolumeServerLeave`
+- Gate:
+  - targeted HTTP/grpc control tests pass in `native` mode.
+
+### M2: Native Core Data Paths
+- [ ] Native HTTP read/write/delete baseline parity.
+- [ ] Native gRPC data baseline parity (`Read/WriteNeedle*`, `BatchDelete`, `ReadAllNeedles`).
+- Gate:
+  - core HTTP and gRPC data suites pass in `native` mode.
+
+### M3: Native Stream + Copy/Sync
+- [ ] Tail/copy/receive/sync paths in native mode.
+- Gate:
+  - stream/copy families pass in `native` mode.
+
+### M4: Native Advanced Feature Families
+- [ ] EC, tiering, scrub/query advanced branches.
+- Gate:
+  - full `/test/volume_server/http` and `/test/volume_server/grpc` pass in `native` mode.
+
+### M5: CI/Cutover
+- [ ] Add/expand native-mode CI jobs.
+- [ ] Make native mode default for Rust integration runs.
+- [ ] Keep `exec`/`proxy` only as explicit fallback modes during rollout.
+
+## Immediate Next Steps
+1. Introduce `VOLUME_SERVER_RUST_MODE=native` and wire native server startup skeleton.
+2. Implement `/status` and `/healthz` with parity headers/payload fields.
+3. Implement minimal gRPC state/ping RPCs.
+4. Run targeted integration tests in native mode and iterate on mismatches.
+
+## Risk Register
+- On-disk format mismatch risk:
+  - Mitigation: implement format-level compatibility tests early (idx/dat/needle encoding).
+- Behavioral drift in edge branches:
+  - Mitigation: use integration suite failures as primary truth; only add tests for newly discovered untracked branches.
+- Stream/concurrency semantic mismatch:
+  - Mitigation: stabilize with focused interruption/timeout parity tests.
 
 ## Progress Log
 - Date: 2026-02-15
-- Change: Created Rust volume-server crate (`weed-volume-rs`) as compatibility launcher and wired harness binary selection (`VOLUME_SERVER_IMPL`/`VOLUME_SERVER_BINARY`).
-- Validation: local Rust-mode smoke and full-suite runs passed:
-  - `VOLUME_SERVER_IMPL=rust go test ./test/volume_server/http ./test/volume_server/grpc`
+- Change: Added Rust launcher integration (`exec`) and harness wiring.
+- Validation: Rust launcher mode passed smoke and full integration suites while delegating to Go backend.
 - Commits: `7beab85c2`, `880c2e1da`, `63d08e8a9`, `d402573ea`, `3bd20e6a1`, `6ce4d7ede`
 
 - Date: 2026-02-15
-- Change: Added Rust proxy supervisor mode (`VOLUME_SERVER_RUST_MODE=proxy`) with front-side TCP listeners for HTTP/public/gRPC and managed Go backend process.
+- Change: Added Rust proxy supervisor mode and validated full integration suite.
 - Validation:
-  - `env VOLUME_SERVER_IMPL=rust VOLUME_SERVER_RUST_MODE=proxy go test -count=1 -timeout=200m ./test/volume_server/http`
-  - `env VOLUME_SERVER_IMPL=rust VOLUME_SERVER_RUST_MODE=proxy go test -count=1 -timeout=240m ./test/volume_server/grpc`
-  - Result: both suites pass end-to-end in proxy mode.
+  - `env VOLUME_SERVER_IMPL=rust VOLUME_SERVER_RUST_MODE=proxy go test -count=1 ./test/volume_server/http`
+  - `env VOLUME_SERVER_IMPL=rust VOLUME_SERVER_RUST_MODE=proxy go test -count=1 ./test/volume_server/grpc`
 - Commits: `a7f50d23b`, `548b3d9a3`
+
+- Date: 2026-02-16
+- Change: Re-focused plan from test expansion to native Rust implementation parity.
+- Validation basis: latest Rust proxy full-suite pass keeps regression baseline stable while native implementation starts.
+- Commits: pending
diff --git a/test/volume_server/DEV_PLAN.md b/test/volume_server/DEV_PLAN.md
index 4f42327a0..934065c7b 100644
--- a/test/volume_server/DEV_PLAN.md
+++ b/test/volume_server/DEV_PLAN.md
@@ -3,6 +3,12 @@
 ## Goal
 Create a Go integration test suite under `test/volume_server` that validates **drop-in behavior parity** for the Volume Server HTTP and gRPC APIs, so a Rust rewrite can be verified against the current Go behavior.
 
+## Current Program Focus (2026-02-16)
+- Primary execution focus has shifted to implementing native Rust volume-server parity.
+- This integration suite is now the parity gate for Rust implementation work.
+- New tests should be added only when native Rust implementation reveals uncovered Go behavior that is not yet captured.
+- Rust implementation roadmap lives in `/Users/chris/dev/seaweedfs2/rust/volume_server/DEV_PLAN.md`.
+
 ## Hard Requirements
 - Tests live under `test/volume_server`.
 - Tests are written in Go.
@@ -1260,3 +1266,10 @@ Update this section during implementation:
 - Profiles covered: P1.
 - Gaps introduced/remaining: deleted-read parity now covers both `GET` and `HEAD` semantics on local-volume path.
 - Commit: `cc80ad364`
+
+- Date: 2026-02-16
+- Change: Shifted planning priority to native Rust implementation parity.
+- APIs covered: no new API additions in this plan entry; integration suite remains the validation gate.
+- Profiles covered: unchanged.
+- Gaps introduced/remaining: primary remaining gap is native Rust handler/storage/RPC implementation replacing Go backend delegation.
+- Commit: pending