golangLAKEHOUSE/docs/RUST_PATHWAY_MEMORY_NOTE.md
Claw f07668064e docs: seed PRD + SPEC for the Go-direction rewrite
Two documents only — no Go code yet. PRD restates the problem and
preserves the Rust PRD's invariants verbatim, then maps the locked
stack to Go libraries and surfaces four hard problems (DuckDB-via-cgo
for the query engine, Lance dropped, Dioxus → HTMX, arrow-go maturity).
SPEC walks each Rust crate + TS surface and tags the port with library
choice / effort estimate / risk + a 5-phase migration plan from
skeleton (Phase G0) to demo parity (Phase G5).

Six open questions remain that gate Phase G0:
- DuckDB cgo OK?
- HTMX vs React for the UI?
- Repo location?
- Distillation v1.0.0 port verbatim or rebuild?
- Pathway memory data — port 88 traces or start clean?
- Auditor lineage — port audit_baselines.jsonl or restart?

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 06:35:23 -05:00


# Rust Pathway Memory — Historical Reference
**Status:** Reference-only. The Go Lakehouse does NOT load these
traces (per ADR-001 §1.5). This note exists so that a future Go
engineer knows what the Rust era accumulated, where it lives, and why
it was left in place.

---
## What was there
By the time of the rewrite cutoff (commit `dcf4c9a`,
2026-04-28), the Rust pathway memory held:
- **88 traces** at `/home/profit/lakehouse/data/_pathway_memory/state.json`
- **11/11 successful replays** as of the most recent verification (the
"probation gate crossed" signal from the lakehouse `STATE_OF_PLAY.md`)
- Active scrum-cycle compounding: each scrum loop iteration appended
new traces and re-ran replays against existing pathway fingerprints,
so that review prompts could be primed with warnings such as "this
file pattern has produced bug X before"
## Where it lives (Rust repo)
```
lakehouse/
├── crates/vectord/src/pathway_memory.rs ← implementation
├── data/_pathway_memory/state.json ← 88 traces, JSON
└── docs/DECISIONS.md ADR-021 ← matrix-correctness layer design
```
The TS-side mirror lived in
`tests/real-world/scrum_master_pipeline.ts` (functions
`computePathwayId`, `buildPathwayVec`). The Rust and TS implementations
produced byte-identical bucket vectors.
## Why this matters for the Go port
The pathway memory's *algorithm* is portable — 32-bucket SHA256-keyed
token hash, JSON state file, replay logic. The pathway memory's
*signal value* is not — those 88 traces represent months of scrum
loops on Rust code, with bug fingerprints anchored to Rust file
prefixes (`crates/queryd/`, `crates/vectord/`, etc.) that don't exist
in the Go repo.
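To make the "32-bucket SHA256-keyed token hash" concrete, here is a minimal Go sketch of one way such a bucket vector could be built. It is not a transcription of `pathway_memory.rs`: the whitespace tokenization, the rule for picking a bucket from the digest, and the names `bucketVec` and `encode` are all assumptions made for illustration. The real rules must be taken from the Rust source and the SPEC G3.4.B golden input.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"strings"
)

// numBuckets matches the "32-bucket" figure quoted in this note.
const numBuckets = 32

// bucketVec hashes each whitespace-separated token with SHA-256 and
// increments the bucket selected by the first four digest bytes.
// ASSUMPTION: tokenization and bucket selection are illustrative only.
func bucketVec(text string) [numBuckets]uint32 {
	var vec [numBuckets]uint32
	for _, tok := range strings.Fields(text) {
		sum := sha256.Sum256([]byte(tok))
		idx := binary.BigEndian.Uint32(sum[:4]) % numBuckets
		vec[idx]++
	}
	return vec
}

// encode serializes the vector as fixed-width big-endian bytes so that
// two implementations can be compared byte for byte.
func encode(vec [numBuckets]uint32) []byte {
	out := make([]byte, 0, numBuckets*4)
	for _, v := range vec {
		out = binary.BigEndian.AppendUint32(out, v)
	}
	return out
}

func main() {
	vec := bucketVec("crates/queryd/src/main.rs unwrap on None")
	fmt.Println(vec)
	fmt.Printf("%x\n", encode(vec))
}
```

A fixed-width byte encoding is used here because a byte-match comparison needs a serialization that does not depend on language-specific formatting.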
Per ADR-001 §1.5, the Go pathway memory:
1. Reimplements the algorithm (SPEC §3.4 G3.4.B is the byte-match
correctness gate).
2. Starts with zero traces. The 88 Rust traces are NOT migrated.
3. Builds its own signal over Go-era scrum cycles.
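The byte-match gate in item 1 amounts to comparing the Go output against a stored golden byte sequence. The sketch below shows one way to report where two sequences diverge. The helper name `firstMismatch` and the sample bytes are hypothetical; the actual golden input and expected output are defined by SPEC §3.4 G3.4.B, not here.

```go
package main

import (
	"bytes"
	"fmt"
)

// firstMismatch returns -1 when got and want are byte-identical.
// Otherwise it returns the offset of the first differing byte, or the
// length of the shorter slice when one is a prefix of the other.
func firstMismatch(got, want []byte) int {
	if bytes.Equal(got, want) {
		return -1
	}
	n := len(got)
	if len(want) < n {
		n = len(want)
	}
	for i := 0; i < n; i++ {
		if got[i] != want[i] {
			return i
		}
	}
	return n
}

func main() {
	// Placeholder bytes only; the real golden vector comes from the SPEC.
	golden := []byte{0, 0, 0, 2, 0, 0, 0, 1}
	got := []byte{0, 0, 0, 2, 0, 0, 0, 3}
	if off := firstMismatch(got, golden); off >= 0 {
		fmt.Printf("byte-match FAILED at offset %d\n", off)
		return
	}
	fmt.Println("byte-match OK")
}
```

Reporting the offset rather than a bare pass/fail makes it easier to tell an encoding difference (early mismatch) from a tokenization difference (scattered mismatches).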
## What to do if the Go pathway memory underperforms
If, after Phase G3, the Go pathway memory delivers noticeably less
lift than the Rust era's "11/11 successful replays" baseline:
1. **First** — verify the Go algorithm byte-matches the Rust one on
the SPEC G3.4.B golden input. If yes, the algorithm is correct and
the gap is data-volume, not implementation.
2. **Second** — the Rust traces exist; if needed, re-prefix file paths
from `crates/queryd/` style to `cmd/queryd/` style, run a
compatibility check, and seed the Go pathway memory selectively. But
only after the algorithm is proven byte-match correct.
3. **Third** — accept that the first ~3 months of Go scrum cycles need
to rebuild the signal naturally. This is the cost of the clean
restart per ADR-001 §1.5.
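The re-prefixing mentioned in the second step could look roughly like the sketch below. It rewrites only the leading `crates/` segment to `cmd/` and reports whether a path qualified, so that non-matching traces can be skipped rather than seeded. The function name `reprefix` is hypothetical, and whether the remainder of a Rust path has any counterpart in the Go repo is exactly what the compatibility check would still need to establish.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	rustPrefix = "crates/" // Rust-era layout, e.g. crates/queryd/
	goPrefix   = "cmd/"    // Go-era layout, e.g. cmd/queryd/
)

// reprefix maps a Rust-era path to the Go-style prefix. The boolean is
// false when the path does not start with rustPrefix; in that case the
// path is returned unchanged and the caller should skip the trace.
func reprefix(path string) (string, bool) {
	if !strings.HasPrefix(path, rustPrefix) {
		return path, false
	}
	return goPrefix + strings.TrimPrefix(path, rustPrefix), true
}

func main() {
	for _, p := range []string{
		"crates/queryd/src/main.rs",
		"crates/vectord/src/pathway_memory.rs",
		"docs/DECISIONS.md",
	} {
		out, ok := reprefix(p)
		fmt.Printf("%-40s -> %-40s migrate=%v\n", p, out, ok)
	}
}
```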
## Historical baseline (frozen reference)
| Metric | Rust value at cutoff | Source |
|---|---|---|
| Total traces | 88 | `data/_pathway_memory/state.json` |
| Successful replays | 11/11 | scrum loop log circa 2026-04-26 |
| Distinct file prefixes | TBD — query the state file | n/a |
| Distinct semantic_flag variants used | 9 (per ADR-021) | `pathway_memory.rs` |
| Distinct bug_fingerprint hashes | TBD | `pathway_memory.rs` |
When the Go pathway memory reaches comparable numbers, it has caught
up to the Rust era and can be considered a full replacement.
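The "TBD" rows in the table could be filled in by querying the state file. This note does not document the schema of `state.json`, so the sketch below assumes a hypothetical layout (a top-level `traces` array whose entries carry a `file` path) purely to show the shape of such a query. The struct fields and the `distinctPrefixes` name are assumptions and must be adjusted to the real file.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// state mirrors a HYPOTHETICAL layout of state.json; the real field
// names are not given in this note.
type state struct {
	Traces []struct {
		File string `json:"file"`
	} `json:"traces"`
}

// distinctPrefixes counts traces per two-segment path prefix
// (for example "crates/queryd/"). Paths with fewer than two segments
// are counted under the full path.
func distinctPrefixes(data []byte) (map[string]int, error) {
	var s state
	if err := json.Unmarshal(data, &s); err != nil {
		return nil, err
	}
	counts := make(map[string]int)
	for _, t := range s.Traces {
		parts := strings.SplitN(t.File, "/", 3)
		if len(parts) < 2 {
			counts[t.File]++
			continue
		}
		counts[parts[0]+"/"+parts[1]+"/"]++
	}
	return counts, nil
}

func main() {
	// Inline sample data standing in for the real state file.
	sample := []byte(`{"traces":[
		{"file":"crates/queryd/src/a.rs"},
		{"file":"crates/queryd/src/b.rs"},
		{"file":"crates/vectord/src/c.rs"}]}`)
	counts, err := distinctPrefixes(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(len(counts), "distinct prefixes:", counts)
}
```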