docs: ARCHITECTURE_COMPARISON.md as living source file

Per J's request: move the parallel-runtime comparison from reports/cutover/ (where it lived as cutover-prep evidence) into docs/ as the source-of-truth file. J will keep updating it as fixes ship on either side.

Restructured for living-document use:
- Status header (last refresh date, owner, update triggers)
- 'How to update this doc' section with explicit dos and don'ts
- Decisions tracker at top: actioned items with commit refs + open backlog with LOC estimates
- Each comparison section now has 'Last verified' columns where numbers are time-sensitive
- Change log section at bottom for one-line entries on every meaningful refresh

The original at reports/cutover/architecture_comparison.md gains a 'THIS IS A SNAPSHOT' header pointing at the docs/ source. Kept as historical record but no longer the place to update.

Sister pointer file in /home/profit/lakehouse/docs/ARCHITECTURE_COMPARISON.md so the doc is reachable from either repo side. That file explicitly says the source lives in golangLAKEHOUSE and warns against authoritative content in the pointer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

parent b03521a506
commit 2a974d6dea
321
docs/ARCHITECTURE_COMPARISON.md
Normal file
@ -0,0 +1,321 @@
# Lakehouse: Rust vs Go architecture comparison

> **Status**: Living document · primary source for the parallel-runtime
> comparison.
> **Owner**: J. Update this when either side ships a fix that changes
> the table values, or when a new architectural axis surfaces.
> **Last meaningful refresh**: 2026-05-01 (post-Rust-cache + Go-validator-port)

This document compares the two parallel implementations of the lakehouse
substrate: Rust at `/home/profit/lakehouse/` (production today), Go at
`/home/profit/golangLAKEHOUSE/` (cutover-prep, Bun `/_go/*` slice live).
The goal of running both lines is to find where each architecture is
weak vs strong, address those gaps, and make the keep/maintain
decision based on real evidence rather than preference.

A snapshot of this document at any point in time is also captured at
`reports/cutover/architecture_comparison.md`. The version in `docs/`
is the source of truth; `reports/cutover/` is the historical record.

---

## How to update this doc

Three triggers:

1. **A fix lands on either side that moves a table value.** Update the
   number, append a one-line entry to the change log at the bottom,
   commit alongside the fix.
2. **A new architectural axis surfaces.** Add a section. Match the
   shape of existing sections (table + read paragraph).
3. **A keep/maintain decision is made.** Update the Recommendation
   section + change log.

Don't:
- Delete sections without recording the reason in the change log.
- Embed unverified claims. Every "Rust is X" or "Go is X" should
  point to a load-test number, a code reference (`crate/file:line`),
  or an explicit "asserted, not measured" caveat.

---

## Decisions tracker

| Date | Decision | Effect |
|---|---|---|
| 2026-05-01 | Add LRU embed cache to Rust aibridge | 236× RPS gain on `/ai/embed` (128 → 30,279 RPS), closing the pre-cache gap to Go. **DONE** (commit `150cc3b` in lakehouse). |
| 2026-05-01 | Port FillValidator + EmailValidator to Go | Production safety net Go was missing. **DONE** (commit `b03521a` in golangLAKEHOUSE). |
| _open_ | Drop Python sidecar from Rust aibridge | Universal-win architectural cleanup. ~200 LOC, removes 1 runtime + 1 process. |
| _open_ | Port Rust materializer to Go (transforms.ts) | Unblocks Go-only end-to-end pipeline. ~500-800 LOC. |
| _open_ | Port Rust replay tool to Go | Closes audit-FULL phase 7 live invocation. ~400-600 LOC. |
| _open_ | Decide on Lance vector backend | Defer until corpus exceeds ~5M rows. |
| _open_ | Pick Go primary vs Rust primary | Both viable. Rust has the warm-cache perf edge after `150cc3b`, plus production deploy + producer-side completeness; Go has process isolation and ~46% of the code volume. |

---

## Code volume

| | Lines | Last verified |
|---|---:|---|
| Rust `crates/` (15 crates) | 35,447 | 2026-05-01 |
| Rust `sidecar/` (Python) | 1,237 | 2026-05-01 |
| Go `internal/` (20 packages) | 11,896 (+ validator 1,190) | 2026-05-01 |
| Go `cmd/` (14 binaries) | 3,232 | 2026-05-01 |
| **Go total** | **~16,300** | 2026-05-01 |

Go is ~46% the size of Rust's `crates/` on like-for-like surface
(post-validator-port). About half of the gap is `vectord` (Rust 11,005
lines vs Go 804): Rust's vectord implements HNSW + Lance-format storage
+ benchmarking; Go's wraps `coder/hnsw` and stops there.

---

## Process model

| | Rust | Go |
|---|---|---|
| Binaries running | **1** mega-process (gateway PID 1241, 14.9G RSS, 374% CPU under load) | **11** dedicated daemons (~100-300MB RSS each) |
| Inter-component comms | In-process axum.nest (no network) | HTTP between daemons |
| Crash blast radius | Whole system if any subsystem panics | One daemon dies, rest survive |
| Horizontal scale | One unit only; can't scale individual components | Each daemon scales independently |
| Deploy unit | Single binary | 11 systemd units |

**Reading**: Rust's mega-binary is simpler to operate at small scale
(one thing to start, one log to tail). Go's daemons are simpler to
operate at production scale (kill the misbehaving one, restart it,
others stay up). Go also lets you tune per-daemon resource limits via
systemd.

---

## Python dependency (the load-bearing axis)

This is the architectural difference that drove the original perf gap.
Both call Ollama at `:11434`, but the path differs:

```
Rust embed:  gateway → HTTP → Python sidecar :3200 → HTTP → Ollama :11434
Go embed:    gateway → HTTP → Go embedd :4216 → HTTP → Ollama :11434
```

The Python sidecar (`sidecar/sidecar/main.py`, 1,237 lines) is a
FastAPI wrapper around Ollama. It does pydantic validation + request
shaping; **no fundamental compute** that Ollama can't do directly.

### Performance impact (load-tested 2026-05-01, 6 rotating bodies, 10 concurrency, 30s)

| Path | Pre-cache | Post-cache (`150cc3b`) | Δ |
|---|---:|---:|---:|
| **Rust /ai/embed** (via gateway) | 128 RPS · p50 78ms · p99 124ms | **30,279 RPS · p50 129µs · p99 5ms** | +236× RPS |
| **Go /v1/embed** (via gateway → embedd) | 8,119 RPS · p50 0.79ms · p99 3ms | _unchanged_ | (already cached) |

Rust now beats Go by ~3.7× in RPS on cache-warm workloads (30,279 vs
8,119). The cache being in-process inside Rust's gateway (no HTTP hop
to a separate daemon) gives it the edge once both sides have caching.
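
The ratios quoted in this section can be re-derived from the RPS column. The sketch below only redoes that arithmetic; the `ratio` helper is illustrative and the inputs are copied from the table, not re-measured.

```go
package main

import "fmt"

// ratio returns how many times larger a is than b.
func ratio(a, b float64) float64 { return a / b }

func main() {
	// RPS figures copied from the load-test table (2026-05-01).
	const (
		rustPre  = 128.0   // Rust /ai/embed before the cache
		rustPost = 30279.0 // Rust /ai/embed after 150cc3b
		goRPS    = 8119.0  // Go /v1/embed (already cached)
	)
	fmt.Printf("Rust post-cache vs Rust pre-cache: %.1fx\n", ratio(rustPost, rustPre))
	fmt.Printf("Go vs Rust pre-cache:              %.1fx\n", ratio(goRPS, rustPre))
	fmt.Printf("Rust post-cache vs Go:             %.1fx\n", ratio(rustPost, goRPS))
}
```

The middle line is the pre-cache gap between the two runtimes, a figure the table implies but does not state directly.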

### What the cache fix did NOT do

The Python sidecar is still in the Rust path on cache misses. Cold
queries pay the full Python+Ollama tax. Dropping the sidecar
(rewriting aibridge to call Ollama directly) is the next universal-win
item; it is open in the Decisions tracker.
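
To make the hit/miss distinction concrete, here is a minimal sketch of an LRU cache wrapped around an embedding backend. It is not the code from `150cc3b` or the Go `CachedProvider`; the `Embedder` interface, `lruEmbedder`, and `countingEmbedder` are hypothetical names introduced for illustration. Only misses reach the backend, which is why cold queries still pay the full downstream cost.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

// Embedder is a hypothetical interface for anything that turns text into a vector.
type Embedder interface {
	Embed(text string) ([]float32, error)
}

type entry struct {
	key string
	vec []float32
}

// lruEmbedder wraps an Embedder with a fixed-capacity LRU cache keyed by input text.
type lruEmbedder struct {
	mu       sync.Mutex
	inner    Embedder
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // text -> element holding *entry
}

func newLRUEmbedder(inner Embedder, capacity int) *lruEmbedder {
	return &lruEmbedder{inner: inner, capacity: capacity, order: list.New(), items: make(map[string]*list.Element)}
}

func (c *lruEmbedder) Embed(text string) ([]float32, error) {
	c.mu.Lock()
	if el, ok := c.items[text]; ok { // hit: refresh recency, skip the backend
		c.order.MoveToFront(el)
		vec := el.Value.(*entry).vec
		c.mu.Unlock()
		return vec, nil
	}
	c.mu.Unlock()

	// Miss: call the slow backend without holding the lock.
	vec, err := c.inner.Embed(text)
	if err != nil {
		return nil, err
	}

	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[text]; ok { // another goroutine filled it meanwhile
		c.order.MoveToFront(el)
		return el.Value.(*entry).vec, nil
	}
	c.items[text] = c.order.PushFront(&entry{key: text, vec: vec})
	if c.order.Len() > c.capacity { // evict the least recently used entry
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
	return vec, nil
}

// countingEmbedder is a stand-in backend that counts how often it is called.
type countingEmbedder struct{ calls int }

func (e *countingEmbedder) Embed(text string) ([]float32, error) {
	e.calls++
	return []float32{float32(len(text))}, nil
}

func main() {
	backend := &countingEmbedder{}
	cache := newLRUEmbedder(backend, 2)
	for _, q := range []string{"a", "b", "a", "c", "b"} {
		cache.Embed(q)
	}
	fmt.Println("backend calls:", backend.calls)
}
```

With capacity 2, the second `"a"` is a hit, while the final `"b"` misses because `"c"` evicted it. A load test with few rotating bodies therefore runs almost entirely on the hit path.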

---

## Vector storage

| | Rust | Go |
|---|---|---|
| HNSW lib | `hnsw_rs` (mature) | `coder/hnsw` (newer, smaller) |
| Code size | 11,005 lines (`vectord` + `vectord-lance`) | 804 lines |
| Lance-format storage | Yes (`vectord-lance` crate) | No |
| Persistence | LanceDB or in-memory | MinIO + JSON envelope (v2 envelope as of `eb0dfdf`) |
| Distance functions | cosine, euclidean, dot product | cosine, euclidean |

**Reading**: Rust has the deeper substrate. Lance-format gives columnar
persistence + zero-copy reads + Apache Arrow integration. For
staffing-domain corpus sizes (5K-500K vectors) both work fine; for
multi-million-row indexes Rust would have a real edge. **Defer the Go
Lance port until corpus growth demands it.**
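
For reference on the "Distance functions" row, the three metrics are small to express. The sketch below uses plain functions and makes no claim about how `hnsw_rs` or `coder/hnsw` implement or expose them; it assumes equal-length, non-zero vectors.

```go
package main

import (
	"fmt"
	"math"
)

// dot returns the inner product of a and b (assumes len(a) == len(b)).
func dot(a, b []float32) float64 {
	var s float64
	for i := range a {
		s += float64(a[i]) * float64(b[i])
	}
	return s
}

// euclidean returns the L2 distance between a and b.
func euclidean(a, b []float32) float64 {
	var s float64
	for i := range a {
		d := float64(a[i]) - float64(b[i])
		s += d * d
	}
	return math.Sqrt(s)
}

// cosineDistance returns 1 - cos(a, b); 0 means the vectors point the same way.
func cosineDistance(a, b []float32) float64 {
	return 1 - dot(a, b)/(math.Sqrt(dot(a, a))*math.Sqrt(dot(b, b)))
}

func main() {
	a := []float32{1, 0}
	b := []float32{0.6, 0.8} // unit length
	fmt.Printf("dot=%.2f euclidean=%.3f cosine=%.2f\n", dot(a, b), euclidean(a, b), cosineDistance(a, b))
}
```

For unit-normalized embeddings, cosine similarity equals the dot product, so the missing dot-product metric on the Go side only matters for unnormalized vectors.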

---

## Distillation pipeline (porting status)

| Phase | Rust source | Go port |
|---|---|---|
| Materializer (transforms.ts) | TS, full | ❌ NOT YET PORTED |
| Scorer | TS + Go | ✅ Ported |
| Score categories + firewall | Pinned | ✅ Ported (`SftNever`) |
| SFT export (synthesis) | TS, full (8 source classes) | ✅ Fully ported, 4-decimal byte-equal |
| RAG export | TS | ❌ NOT YET PORTED |
| Preference export | TS | ❌ NOT YET PORTED |
| Audit-baselines | TS | ✅ Fully ported, byte-equal verified |
| Audit-FULL phase 0/3/4 | TS | ✅ Ported |
| Audit-FULL phase 1 (schema) | bun test | ✅ Via `go test` exec |
| Audit-FULL phase 2 (materializer) | TS | ✅ Observer mode (read-only) |
| Audit-FULL phase 5 (run summaries) | TS | ✅ Observer mode (read-only) |
| Audit-FULL phase 6 (acceptance) | TS fixture harness | ❌ Skipped (TS-only deps) |
| Audit-FULL phase 7 (replay) | TS | ✅ Observer mode (read-only) |
| Replay tool | TS | ❌ NOT YET PORTED |
| Quarantine writer | TS | ❌ NOT YET PORTED |

**Reading**: Go has the substrate for everything observable (read
paths) and SFT export end-to-end. The producer side (materializer,
replay) is still Rust-only. To run the full pipeline from Go alone,
the materializer + replay need porting.

---

## Production validators

| | Rust | Go |
|---|---|---|
| FillValidator | `crates/validator/src/staffing/fill.rs` (12 unit tests) | ✅ **Ported 2026-05-01** (`internal/validator/fill.go` + 13 tests) |
| EmailValidator | `crates/validator/src/staffing/email.rs` (12 tests) | ✅ **Ported 2026-05-01** (`internal/validator/email.go` + 11 tests) |
| `/v1/validate` endpoint | Yes | ❌ NOT YET PORTED (validator network surface) |
| `/v1/iterate` endpoint | Yes (gen→validate→correct→retry loop) | ❌ NOT YET PORTED |
| Production validators load `workers_500k.parquet` at startup | Yes (75MB resident) | N/A: Go uses a `WorkerLookup` interface (in-memory or adapter) |

**Reading**: With today's port, Go has the load-bearing validators.
The network surface (`/v1/validate`, `/v1/iterate`) is the next
piece. The in-memory validators already work in-process; exposing them
as HTTP endpoints adds the production-shape access pattern.

---

## Substrate features unique to each side

### Go has, Rust doesn't

- **chatd 5-provider dispatcher** (kimi / opencode / openrouter / ollama_cloud / ollama).
- **Cross-role gate** in matrix retrieve (real_001 fix). Verified by reality tests real_001..005.
- **Multi-corpus matrix indexer** (Spec §3.4 component 2).
- **Pathway memory** (Mem0-style versioned traces).
- **Observer fail-safe semantics** (ADR-005 Decision 5.1).
- **In-process embed cache** (CachedProvider + LRU). _Note: Rust got this 2026-05-01 too._
- **LLM-based role extractor** (regex + qwen2.5 fallback).
- **Persistent stack 3-layer isolation** (`scripts/cutover/start_go_stack.sh`).
- **Cutover slice** (Bun `/_go/*` route, opt-in via systemd drop-in).
- **Production load test** (`scripts/cutover/loadgen/`) with Bun-frontend + direct comparison.

### Rust has, Go doesn't

- **Lance-format vector storage** (vectord-lance crate, 605 lines).
- **`truth` crate** (970 lines). Cross-source claim reconciliation.
- **`journald` crate** (455 lines). Structured event journal.
- **`/v1/validate` + `/v1/iterate` endpoints** (network surface).
- **`ui` crate (Dioxus, 1,509 lines)**. Native desktop/web UI.
- **Materializer + replay tools** (the "produce evidence" side).
- **Acceptance harness** (22 invariants over fixtures, TS).
- **Production deployment** (devop.live/lakehouse/* serves through Rust today).

---

## Strengths and weaknesses

### Rust strengths

- Mature, in production, serving real demo traffic.
- Single deploy unit; one binary, one systemd service, one log.
- Type system + memory safety; fewer runtime bugs in hot paths.
- Mature library ecosystem (axum, tokio, polars, arrow, hnsw_rs, lance).
- Native distillation pipeline; Go is the porter.
- Production validators (now also in Go but Rust authored them).
- Lance vector storage scales beyond 5M rows.
- **In-process embed cache (post-`150cc3b`) makes Rust the fastest path on warm workloads.**

### Rust weaknesses

- **Python sidecar dependency**: every cache-miss AI call goes through Python, which adds 1 runtime + 1 process to ops. ~200 LOC to fix.
- **Mega-binary blast radius**: the gateway is a single 14.9G RSS process, so any panic takes down the whole production system.
- **Tail latency cliff under uncached load**: single async runtime serializes I/O completions.
- **Compile times**: slow iteration vs Go's per-package builds.
- **Coupling**: adding a feature touches gateway/v1/ and ripples across crates.

### Go strengths

- **Process isolation**: daemons crash independently; ops can `systemctl restart vectord` without touching gateway.
- **Per-daemon scale**: embed cache lives in embedd; vectord shards independently. Hot daemons scale horizontally.
- **No Python dependency**: every daemon talks to peers in HTTP/JSON. Native Go down to Ollama.
- **In-process embed cache** at the daemon level (was the perf lever pre-Rust-cache).
- **Smaller, denser code**: 16,300 lines vs Rust's 35,447 + 1,237 sidecar (~46% of the crates alone, ~44% including the sidecar).
- **Faster iteration**: `go build` of all 14 binaries is ~3-5s; Rust full rebuild is minutes.
- **Cross-runtime artifact compatibility verified**: audit_baselines.jsonl, scored-runs JSONL, sft_export.jsonl all round-trip byte-equal.

### Go weaknesses

- **Distillation pipeline incomplete**: materializer + replay + RAG export + preference export still Rust-only.
- **Validator network surface missing**: the in-memory validators work, but the `/v1/validate` HTTP endpoint is not yet ported, so operators can't call validators over the wire from Go.
- **Vector storage HNSW-only**: no Lance equivalent. Fine for current scale.
- **Less production-tested**: the cutover slice is live but has seen no real coordinator traffic yet.
- **HTTP between daemons**: every cross-daemon call is a network round-trip. Latency is fine on localhost (microseconds), but each hop adds more tail latency than Rust's in-process composition.
- **`coder/hnsw` is newer** than Rust's `hnsw_rs`. Less battle-tested.

---

## Cross-cutting abstractions to address

The list below is a working backlog. When an item is actioned, move it
to the Decisions tracker (at top) with a commit reference.

### Universal wins (apply regardless of primary line)

1. ✅ **Embed cache in Rust aibridge**: DONE 2026-05-01 (`150cc3b`).
2. ✅ **FillValidator + EmailValidator in Go**: DONE 2026-05-01 (`b03521a`).
3. **Drop Python sidecar from Rust**: rewrite aibridge to call Ollama at `:11434/api/embed` and `/api/generate` directly. Removes 1 runtime + 1 process from ops. ~200 LOC.
4. **Cross-runtime contract tests**: pin shared JSONL schemas (audit_baselines, scored_run, sft_sample) as canonical specs in `auditor/schemas/` with Go-side validators consuming the same definitions.
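
For item 3, the shape of a direct embed call is small. The sketch below is written in Go for consistency with the other examples in this doc; the Rust rewrite would have the same shape. The request and response field names (`model`, `input`, `embeddings`) reflect the author's understanding of Ollama's `/api/embed` endpoint and should be checked against the Ollama API docs. `embedDirect`, `fakeOllama`, and the model name are hypothetical; the fake server exists only so the sketch runs without Ollama installed.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// embedRequest and embedResponse model the assumed /api/embed JSON shapes.
type embedRequest struct {
	Model string   `json:"model"`
	Input []string `json:"input"`
}

type embedResponse struct {
	Embeddings [][]float32 `json:"embeddings"`
}

// embedDirect POSTs to {baseURL}/api/embed and returns one vector per input.
func embedDirect(client *http.Client, baseURL, model string, inputs []string) ([][]float32, error) {
	payload, err := json.Marshal(embedRequest{Model: model, Input: inputs})
	if err != nil {
		return nil, err
	}
	resp, err := client.Post(baseURL+"/api/embed", "application/json", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("ollama returned %s", resp.Status)
	}
	var out embedResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	if len(out.Embeddings) != len(inputs) {
		return nil, fmt.Errorf("got %d embeddings for %d inputs", len(out.Embeddings), len(inputs))
	}
	return out.Embeddings, nil
}

// fakeOllama stands in for a real server so the sketch runs offline.
// It returns a 1-dimensional "embedding" holding each input's length.
func fakeOllama() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var req embedRequest
		if r.URL.Path != "/api/embed" || json.NewDecoder(r.Body).Decode(&req) != nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		vecs := make([][]float32, len(req.Input))
		for i, s := range req.Input {
			vecs[i] = []float32{float32(len(s))}
		}
		json.NewEncoder(w).Encode(embedResponse{Embeddings: vecs})
	}))
}

func main() {
	srv := fakeOllama()
	defer srv.Close()
	vecs, err := embedDirect(srv.Client(), srv.URL, "some-embed-model", []string{"hi", "hello"})
	fmt.Println(vecs, err)
}
```

Against a real deployment, `baseURL` would be `http://localhost:11434` and the model would be whatever embedding model is pulled locally. The validation and request shaping the sidecar does today would move into this call site.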

### If keeping Go primary

5. **Port materializer** (highest leverage, unblocks full Go pipeline). ~500-800 LOC.
6. **Port replay tool** (closes audit-FULL phase 7 live invocation). ~400-600 LOC.
7. **Port `/v1/validate` + `/v1/iterate` HTTP surface** for the now-Go-side validators. ~200 LOC.
8. **Skip Lance** until corpus growth demands it (>5M rows).
9. **Keep chatd, observer fail-safe, role gate, multi-corpus matrix**: real Go wins worth preserving.

### If keeping Rust primary

10. **Port chatd's 5-provider dispatcher to Rust**: unified cloud LLM access.
11. **Port the cross-role gate to Rust matrix retrieve**: production safety on the matrix layer (verified by Go reality tests real_001..005).
12. **Consider process splitting**: even partial decomposition (split out vectord into its own process) would help with the mega-binary blast radius.

---

## Recommendation (working hypothesis)

**Go for the primary line, Rust for production-bridge maintenance.**

Reasons:
1. **Operations**: process isolation is genuinely simpler at production scale than a 14.9G mega-binary.
2. **Code volume**: Go does the same job in ~46% of the lines.
3. **Cross-runtime parity verified**: every artifact round-trips byte-equal between runtimes.
4. **The 4 missing pieces are bounded**: materializer + replay + validators-network + RAG/preference exports are concrete porting targets, not research questions.
5. **Performance is no longer a deciding factor** post-`150cc3b`: Rust is faster on warm cache, but both are well above staffing-domain demand levels (<1 RPS typical).

But **don't abandon Rust**:
1. devop.live/lakehouse/ runs through Rust today; cutover is multi-week.
2. Several Go improvements would be downstream of Rust patterns. Keeping Rust live means anything new there is a porting opportunity for Go.
3. The Python sidecar drop + cross-role gate port are valuable Rust improvements regardless of which line is primary.

---

## Change log

Append entries here when this doc gets updated. One-line entries; link to commits.

- 2026-05-01: Initial draft (`b3ad148` golangLAKEHOUSE).
- 2026-05-01: Recorded Rust embed cache shipping (`150cc3b` lakehouse), updated Python-dependency section + table.
- 2026-05-01: Recorded Go validator port shipping (`b03521a` golangLAKEHOUSE), updated production-validators section.
- 2026-05-01: Reframed as living document in `docs/`, added Decisions tracker + Update guidance + Change log sections.

---

## See also

- **`reports/cutover/architecture_comparison.md`**: historical snapshot (matched this doc as of the date stamp at top).
- **`docs/SPEC.md`**: Go-side architectural spec.
- **`docs/DECISIONS.md`**: Go-side ADRs.
- **`/home/profit/lakehouse/docs/DECISIONS.md`**: Rust-side ADRs.
- **`/home/profit/lakehouse/docs/go-rewrite/`**: Rust-side notes on the rewrite.
- **`reports/cutover/SUMMARY.md`**: running log of cross-runtime parity probes.
- **`reports/cutover/g5_load_test.md`**: load-test methodology + numbers.

@ -1,4 +1,9 @@
# Lakehouse: Rust vs Go architecture comparison
# Lakehouse: Rust vs Go architecture comparison (snapshot)

> **THIS IS A SNAPSHOT, NOT THE SOURCE OF TRUTH.**
> The living document is at **`docs/ARCHITECTURE_COMPARISON.md`**.
> Update there; this file is a frozen historical record.
> Snapshot date: 2026-05-01.

Produced 2026-05-01 to inform the keep/maintain decision and surface
abstractions that should be addressed regardless of which side is the