From 5e2cfd1eefa2ffffa13c33176ebb4e504dc66c62 Mon Sep 17 00:00:00 2001
From: Chris Lu
Date: Sat, 7 Mar 2026 19:19:45 -0800
Subject: [PATCH] docs: add rust volume parity plan

---
 seaweed-volume/PARITY_PLAN.md | 202 ++++++++++++++++++++++++++++++++++
 1 file changed, 202 insertions(+)
 create mode 100644 seaweed-volume/PARITY_PLAN.md

diff --git a/seaweed-volume/PARITY_PLAN.md b/seaweed-volume/PARITY_PLAN.md
new file mode 100644
index 000000000..919bd66e0
--- /dev/null
+++ b/seaweed-volume/PARITY_PLAN.md
@@ -0,0 +1,202 @@
+# Rust Volume Server Parity Plan
+
+Generated: 2026-03-08
+
+## Goal
+
+Make `seaweed-volume` a drop-in replacement for the Go volume server by:
+
+- comparing every Go volume-server code path against the Rust implementation,
+- recording file-level ownership and verification status,
+- closing verified behavior gaps one logic change per commit,
+- extending tests so regressions are caught by Go parity suites and Rust unit/integration tests.
+
+## Ground Truth
+
+Primary Go sources:
+
+- `weed/server/volume_server.go`
+- `weed/server/volume_server_handlers*.go`
+- `weed/server/volume_grpc_*.go`
+- `weed/server/constants/volume.go`
+- `weed/storage/store*.go`
+- `weed/storage/disk_location*.go`
+- `weed/storage/volume*.go`
+- `weed/storage/needle/*.go`
+- `weed/storage/idx/*.go`
+- `weed/storage/needle_map*.go`
+- `weed/storage/needle_map/*.go`
+- `weed/storage/super_block/*.go`
+- `weed/storage/erasure_coding/*.go`
+
+Supporting Go dependencies that affect drop-in behavior:
+
+- `weed/command/volume.go`
+- `weed/security/*.go`
+- `weed/images/*.go`
+- `weed/stats/*.go`
+
+Primary Rust sources:
+
+- `seaweed-volume/src/main.rs`
+- `seaweed-volume/src/config.rs`
+- `seaweed-volume/src/security.rs`
+- `seaweed-volume/src/images.rs`
+- `seaweed-volume/src/server/*.rs`
+- `seaweed-volume/src/storage/*.rs`
+- `seaweed-volume/src/storage/needle/*.rs`
+- `seaweed-volume/src/storage/idx/*.rs`
+- `seaweed-volume/src/storage/erasure_coding/*.rs`
+- `seaweed-volume/src/remote_storage/*.rs`
+
+## Audit Method
+
+For each Go file:
+
+1. Map it to the Rust file or files that should own the same behavior.
+2. Compare exported entry points, helper functions, state transitions, wire fields, and persistence side effects.
+3. Mark each file `implemented`, `partial`, `missing`, or `needs verification`.
+4. Link each behavior to an existing test or add a missing test.
+5. Only treat a gap as closed after code review plus local verification.
+
+## Acceptance Criteria
+
+The Rust server is a drop-in replacement only when all of these hold:
+
+- HTTP routes, status codes, headers, and body semantics match Go.
+- gRPC RPCs match Go request validation, response fields, streaming behavior, and maintenance/read-only semantics.
+- Master heartbeat and topology metadata match Go closely enough that the Go master treats Rust and Go volume servers the same.
+- On-disk volume behavior matches Go for normal volumes, EC shards, tiering metadata, and readonly persistence.
+- Startup flags and operational endpoints that affect production deployment behave equivalently or are explicitly documented as unsupported.
+- Existing Go integration suites pass with `VOLUME_SERVER_IMPL=rust`.
+
+## File Matrix
+
+### HTTP server surface
+
+| Go file | Rust counterpart | Status | Comparison focus |
+| --- | --- | --- | --- |
+| `weed/server/volume_server.go` | `seaweed-volume/src/main.rs`, `seaweed-volume/src/server/volume_server.rs`, `seaweed-volume/src/server/heartbeat.rs` | partial | startup wiring, routers, heartbeat, shutdown, metrics/debug listeners |
+| `weed/server/volume_server_handlers.go` | `seaweed-volume/src/server/volume_server.rs`, `seaweed-volume/src/server/handlers.rs` | needs verification | method dispatch, OPTIONS behavior, public/admin split |
+| `weed/server/volume_server_handlers_admin.go` | `seaweed-volume/src/server/handlers.rs` | implemented | `/status`, `/healthz`, stats, server headers |
+| `weed/server/volume_server_handlers_helper.go` | `seaweed-volume/src/server/handlers.rs` | needs verification | JSON encoding, request parsing, helper parity |
+| `weed/server/volume_server_handlers_read.go` | `seaweed-volume/src/server/handlers.rs` | needs verification | JWT, conditional reads, range reads, proxy/redirect, chunk manifests, image transforms |
+| `weed/server/volume_server_handlers_ui.go` | `seaweed-volume/src/server/handlers.rs`, embedded assets | partial | UI payload and HTML parity |
+| `weed/server/volume_server_handlers_write.go` | `seaweed-volume/src/server/handlers.rs`, `seaweed-volume/src/images.rs` | needs verification | multipart parsing, metadata, compression, ts, delete semantics |
+| `weed/server/constants/volume.go` | `seaweed-volume/src/server/heartbeat.rs`, config defaults | needs verification | heartbeat timing, constants parity |
+
+### gRPC server surface
+
+| Go file | Rust counterpart | Status | Comparison focus |
+| --- | --- | --- | --- |
+| `weed/server/volume_grpc_admin.go` | `seaweed-volume/src/server/grpc_server.rs` | needs verification | readonly/writable, allocate/delete/configure/mount/unmount |
+| `weed/server/volume_grpc_batch_delete.go` | `seaweed-volume/src/server/grpc_server.rs` | implemented | batch delete, EC delete path |
+| `weed/server/volume_grpc_client_to_master.go` | `seaweed-volume/src/server/heartbeat.rs` | partial | heartbeat fields, leader changes, metrics settings from master |
+| `weed/server/volume_grpc_copy.go` | `seaweed-volume/src/server/grpc_server.rs` | needs verification | full copy streams |
+| `weed/server/volume_grpc_copy_incremental.go` | `seaweed-volume/src/server/grpc_server.rs` | needs verification | incremental copy binary search, timestamps |
+| `weed/server/volume_grpc_erasure_coding.go` | `seaweed-volume/src/server/grpc_server.rs`, `seaweed-volume/src/storage/erasure_coding/*.rs` | needs verification | shard read/write/delete/mount/unmount/rebuild |
+| `weed/server/volume_grpc_query.go` | `seaweed-volume/src/server/grpc_server.rs` | needs verification | query validation and error parity |
+| `weed/server/volume_grpc_read_all.go` | `seaweed-volume/src/server/grpc_server.rs` | needs verification | read-all ordering and tail semantics |
+| `weed/server/volume_grpc_read_write.go` | `seaweed-volume/src/server/grpc_server.rs`, `seaweed-volume/src/storage/*.rs` | needs verification | blob/meta/page reads, write blob semantics |
+| `weed/server/volume_grpc_remote.go` | `seaweed-volume/src/server/grpc_server.rs`, `seaweed-volume/src/remote_storage/*.rs` | needs verification | remote fetch/write and tier metadata |
+| `weed/server/volume_grpc_scrub.go` | `seaweed-volume/src/server/grpc_server.rs`, `seaweed-volume/src/storage/*.rs` | needs verification | scrub result semantics |
+| `weed/server/volume_grpc_state.go` | `seaweed-volume/src/server/grpc_server.rs` | implemented | GetState/SetState/Status |
+| `weed/server/volume_grpc_tail.go` | `seaweed-volume/src/server/grpc_server.rs` | needs verification | tail streaming and idle timeout |
+| `weed/server/volume_grpc_tier_download.go` | `seaweed-volume/src/server/grpc_server.rs`, `seaweed-volume/src/remote_storage/*.rs` | needs verification | tier download stream/error paths |
+| `weed/server/volume_grpc_tier_upload.go` | `seaweed-volume/src/server/grpc_server.rs`, `seaweed-volume/src/remote_storage/*.rs` | needs verification | tier upload stream/error paths |
+| `weed/server/volume_grpc_vacuum.go` | `seaweed-volume/src/server/grpc_server.rs`, `seaweed-volume/src/storage/*.rs` | needs verification | compact/commit/cleanup progress and readonly transitions |
+
+### Storage and persistence surface
+
+| Go file group | Rust counterpart | Status | Comparison focus |
+| --- | --- | --- | --- |
+| `weed/storage/store.go`, `store_state.go` | `seaweed-volume/src/storage/store.rs`, `seaweed-volume/src/server/heartbeat.rs` | partial | topology metadata, disk tags, server id, state persistence |
+| `weed/storage/store_vacuum.go` | `seaweed-volume/src/storage/store.rs`, `seaweed-volume/src/storage/volume.rs` | needs verification | vacuum sequencing |
+| `weed/storage/store_ec.go`, `store_ec_delete.go`, `store_ec_scrub.go` | `seaweed-volume/src/storage/store.rs`, `seaweed-volume/src/storage/erasure_coding/*.rs` | needs verification | EC lifecycle and scrub behavior |
+| `weed/storage/disk_location.go`, `disk_location_ec.go` | `seaweed-volume/src/storage/disk_location.rs`, `seaweed-volume/src/storage/store.rs` | partial | directory UUIDs, tags, load rules, disk space checks |
+| `weed/storage/volume.go`, `volume_loading.go` | `seaweed-volume/src/storage/volume.rs` | needs verification | load/reload/readonly/remote metadata |
+| `weed/storage/volume_super_block.go` | `seaweed-volume/src/storage/super_block.rs`, `seaweed-volume/src/storage/volume.rs` | implemented | super block parity |
+| `weed/storage/volume_read.go`, `volume_read_all.go` | `seaweed-volume/src/storage/volume.rs`, `seaweed-volume/src/server/handlers.rs` | needs verification | full/meta/page reads, TTL, streaming |
+| `weed/storage/volume_write.go` | `seaweed-volume/src/storage/volume.rs`, `seaweed-volume/src/server/write_queue.rs` | needs verification | dedup, sync/async writes, metadata flags |
+| `weed/storage/volume_vacuum.go` | `seaweed-volume/src/storage/volume.rs` | needs verification | compact and commit parity |
+| `weed/storage/volume_backup.go` | `seaweed-volume/src/storage/volume.rs`, `seaweed-volume/src/server/grpc_server.rs` | needs verification | backup/search logic |
+| `weed/storage/volume_checking.go` | `seaweed-volume/src/storage/volume.rs`, `seaweed-volume/src/storage/idx/mod.rs`, `seaweed-volume/src/server/grpc_server.rs` | needs verification | scrub and integrity checks |
+| `weed/storage/volume_info.go`, `volume_info/volume_info.go`, `volume_tier.go` | `seaweed-volume/src/storage/volume.rs`, `seaweed-volume/src/remote_storage/*.rs` | needs verification | `.vif` format and tiered file metadata |
+| `weed/storage/needle/*.go` | `seaweed-volume/src/storage/needle/*.rs` | needs verification | needle parsing, CRC, TTL, multipart metadata |
+| `weed/storage/idx/*.go` | `seaweed-volume/src/storage/idx/*.rs` | needs verification | index walking and binary search |
+| `weed/storage/needle_map*.go`, `needle_map/*.go` | `seaweed-volume/src/storage/needle_map.rs` | needs verification | map kind parity, persistence, memory behavior |
+| `weed/storage/super_block/*.go` | `seaweed-volume/src/storage/super_block.rs` | implemented | replica placement and TTL metadata |
+| `weed/storage/erasure_coding/*.go` | `seaweed-volume/src/storage/erasure_coding/*.rs` | needs verification | EC shard placement, encode/decode, journal deletes |
+
+### Supporting runtime surface
+
+| Go file | Rust counterpart | Status | Comparison focus |
+| --- | --- | --- | --- |
+| `weed/command/volume.go` | `seaweed-volume/src/config.rs`, `seaweed-volume/src/main.rs` | partial | flags, metrics/debug listeners, startup behavior |
+| `weed/security/*.go` | `seaweed-volume/src/security.rs`, `seaweed-volume/src/main.rs` | implemented | JWT and TLS loading |
+| `weed/images/*.go` | `seaweed-volume/src/images.rs`, `seaweed-volume/src/server/handlers.rs` | implemented | JPEG orientation and transforms |
+| `weed/stats/*.go` | `seaweed-volume/src/metrics.rs`, `seaweed-volume/src/server/handlers.rs` | partial | metrics endpoints, push-gateway integration |
+
+## Verified Gaps As Of 2026-03-08
+
+These are real code gaps found in the current Rust source, not stale items from older docs.
+
+1. Heartbeat metadata parity
+   Go sends stable server `Id`, `LocationUuids`, per-disk `DiskTags`, and correct `HasNoVolumes`.
+   Rust currently omits those fields, even though `--id` and `--tags` are parsed.
+   Risk: Go master may not treat Rust volume servers identically for identity, placement, and duplicate-directory detection.
+
+2. Dedicated metrics/debug listener parity
+   Go honors `--metricsPort`, `--metricsIp`, `--pprof`, and `--debug`.
+   Rust parses these flags but currently serves metrics only on the admin router and does not expose equivalent debug listeners.
+   Risk: deployment scripts that rely on dedicated metrics or profiling ports will not behave the same.
+
+3. Master-provided metrics push settings
+   Go updates `metricsAddress` and `metricsIntervalSec` from master heartbeat responses and pushes metrics.
+   Rust does not consume those fields.
+   Risk: Prometheus push-gateway setups configured via master are silently ignored.
+
+4. Slow-read tuning parity
+   Go uses `hasSlowRead` and `readBufferSizeMB` in read paths and storage locking behavior.
+   Rust accepts both flags but the values are not currently wired into the read implementation.
+   Risk: production latency and lock behavior can diverge under large-file read pressure.
+
+## Execution Plan
+
+### Batch 1: startup and heartbeat
+
+- Compare `weed/command/volume.go`, `weed/server/volume_server.go`, `weed/server/volume_grpc_client_to_master.go`, `weed/storage/store.go`, and `weed/storage/disk_location.go`.
+- Close metadata and startup parity gaps that affect master registration and deployment compatibility.
+- Add Rust unit tests for heartbeat payloads and config wiring.
+
+### Batch 2: HTTP read path
+
+- Compare `volume_server_handlers_read.go`, `volume_server_handlers_helper.go`, and related storage read functions line by line.
+- Verify JWT, path parsing, proxy/redirect, ranges, streaming, chunk manifests, image transforms, and response-header overrides.
+- Extend `test/volume_server/http/...` and Rust handler tests where parity is not covered.
+
+### Batch 3: HTTP write/delete path
+
+- Compare `volume_server_handlers_write.go` and write-related storage functions.
+- Verify multipart behavior, metadata, md5, compression, unchanged writes, delete edge cases, and timestamp handling.
+
+### Batch 4: gRPC admin and lifecycle
+
+- Compare `volume_grpc_admin.go`, `volume_grpc_state.go`, and `volume_grpc_vacuum.go`.
+- Verify readonly/writable flows, maintenance mode, status payloads, mount/unmount/delete/configure, and vacuum transitions.
+
+### Batch 5: gRPC data movement
+
+- Compare `volume_grpc_read_write.go`, `copy*.go`, `read_all.go`, `tail.go`, `remote.go`, and `query.go`.
+- Verify stream framing, binary search, idle timeout, and remote-storage semantics.
+
+### Batch 6: storage internals
+
+- Compare all `weed/storage` volume, needle, idx, needle map, and EC files line by line.
+- Focus on persistence rules, readonly semantics, TTL, recovery/scrub, backup, and memory/disk map behavior.
+
+## Commit Strategy
+
+- One commit for the audit/plan document if the document itself changes.
+- One commit per logic fix.
+- Every logic commit must include the smallest test addition that proves the new parity claim.
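
Review note: the commit strategy asks for the smallest test that proves each parity claim. For gap 1 (heartbeat metadata parity), such a test might look roughly like the sketch below. Everything here is hypothetical: `VolumeConfig`, `DiskInfo`, `HeartbeatPayload`, and `build_heartbeat` are illustrative stand-ins and not the actual `seaweed-volume` types or the generated protobuf message. The rule used for `has_no_volumes` is likewise an assumption that would need checking against the Go code.

```rust
// Hypothetical sketch only: these types are stand-ins invented for illustration.
// The real seaweed-volume structs and the protobuf Heartbeat message may differ.

/// Assumed subset of parsed startup flags that feed heartbeat identity.
#[derive(Debug, Clone)]
struct VolumeConfig {
    /// Value of `--id`.
    id: String,
    /// Value of `--tags`, assumed to be one tag list per data directory.
    disk_tags: Vec<Vec<String>>,
}

/// Assumed per-directory state reported by the store.
#[derive(Debug, Clone)]
struct DiskInfo {
    directory_uuid: String,
    volume_count: usize,
}

/// Stand-in for the heartbeat fields named in gap 1.
#[derive(Debug, Default, PartialEq)]
struct HeartbeatPayload {
    id: String,
    location_uuids: Vec<String>,
    disk_tags: Vec<Vec<String>>,
    has_no_volumes: bool,
}

/// Builds the payload from config and disk state.
/// Assumption: `has_no_volumes` is true when no directory holds any volume.
fn build_heartbeat(config: &VolumeConfig, disks: &[DiskInfo]) -> HeartbeatPayload {
    HeartbeatPayload {
        id: config.id.clone(),
        location_uuids: disks.iter().map(|d| d.directory_uuid.clone()).collect(),
        disk_tags: config.disk_tags.clone(),
        has_no_volumes: disks.iter().all(|d| d.volume_count == 0),
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn heartbeat_carries_id_uuids_and_tags() {
        let config = VolumeConfig {
            id: "vs-1".to_string(),
            disk_tags: vec![vec!["ssd".to_string()], vec!["hdd".to_string()]],
        };
        let disks = vec![
            DiskInfo { directory_uuid: "uuid-a".to_string(), volume_count: 2 },
            DiskInfo { directory_uuid: "uuid-b".to_string(), volume_count: 0 },
        ];

        let hb = build_heartbeat(&config, &disks);

        assert_eq!(hb.id, "vs-1");
        assert_eq!(hb.location_uuids, vec!["uuid-a".to_string(), "uuid-b".to_string()]);
        assert_eq!(hb.disk_tags, config.disk_tags);
        assert!(!hb.has_no_volumes);
    }
}
```

A test of this shape stays small because it exercises only the mapping from parsed flags and disk state to heartbeat fields, without starting a server or a master. Whether the real fix can be isolated this cleanly depends on how `heartbeat.rs` is structured.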