Rust Pathway Memory — Historical Reference
Status: Reference-only. The Go Lakehouse does NOT load these traces (per ADR-001 §1.5). This note exists so that a future Go engineer knows what the Rust era accumulated, where it lives, and why it was left in place.
What was there
By the time of the rewrite cutoff (commit dcf4c9a,
2026-04-28), the Rust pathway memory held:
- 88 traces at /home/profit/lakehouse/data/_pathway_memory/state.json
- 11/11 successful replays as of the most recent verification (the "probation gate crossed" signal from the lakehouse STATE_OF_PLAY.md)
- Active scrum-cycle compounding: each scrum loop iteration appended new traces and re-ran replays against existing pathway fingerprints to preempt review prompts with "this file pattern has produced bug X before"
Where it lives (Rust repo)
lakehouse/
├── crates/vectord/src/pathway_memory.rs ← implementation
├── data/_pathway_memory/state.json ← 88 traces, JSON
└── docs/DECISIONS.md ADR-021 ← matrix-correctness layer design
The TS-side mirror lived in
tests/real-world/scrum_master_pipeline.ts (functions
computePathwayId, buildPathwayVec). Both implementations
byte-matched on bucket vectors.
Why this matters for the Go port
The pathway memory's algorithm is portable — 32-bucket SHA256-keyed
token hash, JSON state file, replay logic. The pathway memory's
signal value is not — those 88 traces represent months of scrum
loops on Rust code, with bug fingerprints anchored to Rust file
prefixes (crates/queryd/, crates/vectord/, etc.) that don't exist
in the Go repo.
Per ADR-001 §1.5, the Go pathway memory:
- Reimplements the algorithm (SPEC §3.4 G3.4.B is the byte-match correctness gate).
- Starts with zero traces. The 88 Rust traces are NOT migrated.
- Builds its own signal over Go-era scrum cycles.
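To make the "32-bucket SHA256-keyed token hash" concrete, here is a minimal Go sketch of one plausible shape for it. The tokenization (whitespace split), the choice of digest bytes, and the name bucketVector are assumptions for illustration only; the authoritative definition is pathway_memory.rs, and only the SPEC §3.4 G3.4.B golden input can confirm a byte match.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"strings"
)

// numBuckets matches the 32-bucket width named in this note.
const numBuckets = 32

// bucketVector hashes each whitespace-separated token with SHA-256 and
// increments one of 32 counters.
// ASSUMPTION: splitting on whitespace and using the first 4 digest bytes
// (big-endian) mod 32 are illustrative choices, not the Rust behavior.
func bucketVector(text string) [numBuckets]uint32 {
	var vec [numBuckets]uint32
	for _, tok := range strings.Fields(text) {
		sum := sha256.Sum256([]byte(tok))
		idx := binary.BigEndian.Uint32(sum[:4]) % numBuckets
		vec[idx]++
	}
	return vec
}

func main() {
	a := bucketVector("crates/queryd/src/lib.rs unwrap on None")
	b := bucketVector("crates/queryd/src/lib.rs unwrap on None")
	// Arrays are comparable in Go, so a golden-vector check is a plain ==.
	fmt.Println("deterministic:", a == b)
}
```

Returning a fixed-size array rather than a slice keeps the comparison against a golden vector a single equality test, which is convenient for a byte-match gate.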
What to do if the Go pathway memory underperforms
If, after Phase G3, the Go pathway memory delivers noticeably less lift than the Rust era's "11/11 successful replays" baseline, work through these steps in order:
- First: verify the Go algorithm byte-matches the Rust one on the SPEC G3.4.B golden input. If yes, the algorithm is correct and the gap is data-volume, not implementation.
- Second: the Rust traces exist; if needed, re-prefix file paths from crates/queryd/-style to cmd/queryd/-style, run a compatibility check, and seed the Go pathway memory selectively. But only after the algorithm is proven byte-match correct.
- Third: accept that the first ~3 months of Go scrum cycles need to rebuild the signal naturally. This is the cost of the clean restart per ADR-001 §1.5.
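The re-prefixing in the second step could look roughly like the sketch below. It assumes the only change needed is swapping the leading crates/ directory for cmd/; the helper name rePrefix is hypothetical, and the compatibility check mentioned above (for example, whether the rewritten path exists in the Go repo) is deliberately left out.

```go
package main

import (
	"fmt"
	"strings"
)

// rePrefix rewrites a Rust-era path such as "crates/queryd/src/lib.rs"
// to the Go-era layout "cmd/queryd/src/lib.rs".
// ASSUMPTION: only the top-level directory changes; file names and
// extensions are left alone and still need a separate compatibility check.
// The bool reports whether the path carried the Rust-era prefix at all.
func rePrefix(path string) (string, bool) {
	const rustPrefix = "crates/"
	const goPrefix = "cmd/"
	if !strings.HasPrefix(path, rustPrefix) {
		return path, false
	}
	return goPrefix + strings.TrimPrefix(path, rustPrefix), true
}

func main() {
	for _, p := range []string{"crates/queryd/src/lib.rs", "docs/DECISIONS.md"} {
		out, changed := rePrefix(p)
		fmt.Printf("%s -> %s (changed=%v)\n", p, out, changed)
	}
}
```

Returning the changed flag lets a seeding script skip traces whose paths never lived under crates/, which supports the "selectively" part of the step.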
Historical baseline (frozen reference)
| Metric | Rust value at cutoff | Source |
|---|---|---|
| Total traces | 88 | data/_pathway_memory/state.json |
| Successful replays | 11/11 | scrum loop log circa 2026-04-26 |
| Distinct file prefixes | TBD — query the state file | n/a |
| Distinct semantic_flag variants used | 9 (per ADR-021) | pathway_memory.rs |
| Distinct bug_fingerprint hashes | TBD | pathway_memory.rs |
When the Go pathway memory reaches comparable numbers, it has caught up to the Rust era and can be considered a full replacement.
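The "Distinct file prefixes" row is still TBD. Once the file paths have been pulled out of state.json, the count could be derived along these lines. This is a sketch only: the state file's schema is not shown in this note, so path extraction is omitted, and treating the first two path segments as "the prefix" is an assumption based on the crates/queryd/ examples above.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// distinctPrefixes returns the sorted set of two-segment directory
// prefixes (e.g. "crates/queryd/") found in paths.
// ASSUMPTION: a "file prefix" means the first two path segments; paths
// with fewer than three segments are skipped.
func distinctPrefixes(paths []string) []string {
	seen := map[string]struct{}{}
	for _, p := range paths {
		parts := strings.SplitN(p, "/", 3)
		if len(parts) < 3 {
			continue
		}
		seen[parts[0]+"/"+parts[1]+"/"] = struct{}{}
	}
	out := make([]string, 0, len(seen))
	for k := range seen {
		out = append(out, k)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Hypothetical sample paths, not taken from the real state file.
	sample := []string{
		"crates/queryd/src/a.rs",
		"crates/vectord/src/b.rs",
		"crates/queryd/src/c.rs",
	}
	prefixes := distinctPrefixes(sample)
	fmt.Println(len(prefixes), prefixes)
}
```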