golangLAKEHOUSE/docs/RUST_PATHWAY_MEMORY_NOTE.md
Claw f07668064e docs: seed PRD + SPEC for the Go-direction rewrite
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 06:35:23 -05:00

Rust Pathway Memory — Historical Reference

Status: Reference-only. The Go Lakehouse does NOT load these traces (per ADR-001 §1.5). This note exists so that a future Go engineer knows what the Rust era accumulated, where it lives, and why it was left in place.


What was there

By the time of the rewrite cutoff (commit dcf4c9a, 2026-04-28), the Rust pathway memory held:

  • 88 traces at /home/profit/lakehouse/data/_pathway_memory/state.json
  • 11/11 successful replays as of the most recent verification (the "probation gate crossed" signal from the lakehouse STATE_OF_PLAY.md)
  • Active scrum-cycle compounding: each scrum loop iteration appended new traces and re-ran replays against existing pathway fingerprints to preempt review prompts with "this file pattern has produced bug X before"

Where it lives (Rust repo)

lakehouse/
├── crates/vectord/src/pathway_memory.rs      ← implementation
├── data/_pathway_memory/state.json            ← 88 traces, JSON
└── docs/DECISIONS.md ADR-021                  ← matrix-correctness layer design

The TS-side mirror lived in tests/real-world/scrum_master_pipeline.ts (functions computePathwayId, buildPathwayVec). Both implementations byte-matched on bucket vectors.

Why this matters for the Go port

The pathway memory's algorithm is portable — 32-bucket SHA256-keyed token hash, JSON state file, replay logic. The pathway memory's signal value is not — those 88 traces represent months of scrum loops on Rust code, with bug fingerprints anchored to Rust file prefixes (crates/queryd/, crates/vectord/, etc.) that don't exist in the Go repo.
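To make the "portable algorithm" claim concrete, here is a minimal Go sketch of a 32-bucket, SHA-256-keyed token hash. Only the bucket count and the use of SHA-256 come from this note. The whitespace tokenization, the choice of the first four digest bytes, and the plain-count semantics are assumptions for illustration; the authoritative behavior is whatever pathway_memory.rs does, and the SPEC G3.4.B byte-match gate is what decides correctness. The name bucketVector is hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"strings"
)

// numBuckets matches the "32-bucket" figure quoted in this note.
const numBuckets = 32

// bucketVector hashes each whitespace-separated token with SHA-256 and
// increments the bucket selected by the first 4 bytes of the digest.
// ASSUMPTION: tokenization, byte selection, and count semantics are
// illustrative only; they are not taken from pathway_memory.rs.
func bucketVector(text string) [numBuckets]uint32 {
	var vec [numBuckets]uint32
	for _, tok := range strings.Fields(text) {
		sum := sha256.Sum256([]byte(tok))
		idx := binary.BigEndian.Uint32(sum[:4]) % numBuckets
		vec[idx]++
	}
	return vec
}

func main() {
	a := bucketVector("crates/queryd/src/lib.rs unwrap on None")
	b := bucketVector("crates/queryd/src/lib.rs unwrap on None")
	// Identical input must always yield an identical vector.
	fmt.Println(a == b)
}
```

Because the function is pure and depends only on standard-library hashing, the same input can be fed to the Rust, TS, and Go versions and the resulting vectors compared byte for byte.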

Per ADR-001 §1.5, the Go pathway memory:

  1. Reimplements the algorithm (SPEC §3.4 G3.4.B is the byte-match correctness gate).
  2. Starts with zero traces. The 88 Rust traces are NOT migrated.
  3. Builds its own signal over Go-era scrum cycles.

What to do if the Go pathway memory underperforms

If, after Phase G3, the Go pathway memory falls noticeably short of the Rust era's "11/11 successful replays" baseline:

  1. First, verify that the Go algorithm byte-matches the Rust one on the SPEC G3.4.B golden input. If it does, the algorithm is correct and the gap is data volume, not implementation.
  2. Second, and only once the algorithm is proven byte-match correct, consider the Rust traces, which still exist: re-prefix file paths from crates/queryd/ style to cmd/queryd/ style, run a compatibility check, and seed the Go pathway memory selectively.
  3. Third, accept that the first ~3 months of Go scrum cycles need to rebuild the signal naturally. This is the cost of the clean restart per ADR-001 §1.5.
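The re-prefix and compatibility check in step 2 could look roughly like the Go sketch below. The function names rePrefix and seedCandidate, the knownDirs allow-list, and the cmd/vectord/ entry are hypothetical. The note specifies only the crates/queryd/ to cmd/queryd/ style change, not what the compatibility check must verify, so the directory-existence test here is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// rePrefix rewrites a Rust-era path such as "crates/queryd/src/lib.rs"
// to "cmd/queryd/src/lib.rs". It returns ok=false, and the path
// unchanged, when the path does not start with "crates/".
func rePrefix(path string) (string, bool) {
	const oldPrefix, newPrefix = "crates/", "cmd/"
	if !strings.HasPrefix(path, oldPrefix) {
		return path, false
	}
	return newPrefix + strings.TrimPrefix(path, oldPrefix), true
}

// seedCandidate reports whether a re-prefixed path falls under a
// directory known to exist in the Go repo.
// ASSUMPTION: knownDirs is a hypothetical allow-list; the real
// compatibility check is not defined in this note.
func seedCandidate(goPath string, knownDirs []string) bool {
	for _, d := range knownDirs {
		if strings.HasPrefix(goPath, d) {
			return true
		}
	}
	return false
}

func main() {
	known := []string{"cmd/queryd/", "cmd/vectord/"} // illustrative only
	for _, p := range []string{"crates/queryd/src/lib.rs", "docs/DECISIONS.md"} {
		goPath, ok := rePrefix(p)
		fmt.Println(goPath, ok && seedCandidate(goPath, known))
	}
}
```

Keeping the rewrite and the filter as separate functions makes selective seeding auditable: every rejected trace can be logged with the reason it was skipped.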

Historical baseline (frozen reference)

| Metric | Rust value at cutoff | Source |
|---|---|---|
| Total traces | 88 | data/_pathway_memory/state.json |
| Successful replays | 11/11 | scrum loop log circa 2026-04-26 |
| Distinct file prefixes | TBD (query the state file) | n/a |
| Distinct semantic_flag variants used | 9 (per ADR-021) | pathway_memory.rs |
| Distinct bug_fingerprint hashes | TBD | pathway_memory.rs |

When the Go pathway memory reaches comparable numbers, it has caught up to the Rust era and can be considered a full replacement.
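The "Distinct file prefixes" figure is still TBD. Once the trace file paths have been read out of state.json, the count could be computed along the lines of the Go sketch below. The schema of state.json is not described in this note, so extraction of the paths is deliberately left out. The function name distinctPrefixes and the definition of a prefix as the first two path segments (for example crates/queryd/) are assumptions.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// distinctPrefixes returns the sorted set of two-segment directory
// prefixes (e.g. "crates/queryd/") found in paths. Paths with fewer
// than two directory segments are skipped.
// ASSUMPTION: "prefix" means the first two path segments.
func distinctPrefixes(paths []string) []string {
	seen := map[string]struct{}{}
	for _, p := range paths {
		parts := strings.SplitN(p, "/", 3)
		if len(parts) < 3 {
			continue
		}
		seen[parts[0]+"/"+parts[1]+"/"] = struct{}{}
	}
	out := make([]string, 0, len(seen))
	for k := range seen {
		out = append(out, k)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Illustrative paths only; real input would come from state.json.
	paths := []string{
		"crates/queryd/src/a.rs",
		"crates/queryd/src/b.rs",
		"crates/vectord/src/c.rs",
	}
	fmt.Println(len(distinctPrefixes(paths)))
}
```

The same helper can be run against Go-era traces later, so that the Rust and Go figures in the table are computed in the same way.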