root 511083ae40 docs: SPEC §3.9 (chatd) + §3.10 (local-review-harness sibling)
- SPEC §1 component table: add chatd row marked DONE; replaces
  Rust gateway's v1::ollama_cloud / openrouter / opencode adapters
  + the aibridge crate.
- SPEC §3.9 — chatd shipped: 5-provider routing (ollama, ollama_cloud,
  openrouter, opencode, kimi) by model-name prefix or :cloud suffix.
  Captures the Anthropic 4.7 temperature-deprecation quirk + the
  local-Ollama think=false default that the playbook_lift judge
  needed. Mentions scrum_review.sh as the reusable cross-lineage
  vehicle eating chatd's own /v1/chat.
- SPEC §3.10 — local-review-harness sibling tool: separate repo at
  git.agentview.dev/profit/local-review-harness, MVP shipped today.
  Documents the cross-pollination plan for when both substrates
  stabilize (chatd as the harness's LLM backend; harness findings
  as Lakehouse pathway-memory drift signal; .memory/known-risks
  as a matrix corpus). Explicit "don't re-port" so future Claudes
  don't try to absorb the harness into Lakehouse.
- STATE_OF_PLAY.md: SIBLING TOOLS section with 1-line summary
  + pointer to SPEC §3.10.

No code changes. just verify still PASS — touched only docs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 01:01:23 -05:00


SPEC: Lakehouse-Go Component Port Plan

Status: DRAFT — companion to PRD.md. Component-by-component port plan with library choices, effort estimates, and acceptance gates. Created: 2026-04-28 · Owner: J

This spec answers: for each piece of the Rust Lakehouse, what Go library carries it, what the effort looks like, and what gate proves the port is real.

Effort scale (one engineer-week = ~40h focused work):

  • S — 1-3 days
  • M — 1 engineer-week
  • L — 2-3 engineer-weeks
  • XL — 1+ months
  • HARD — open research, see PRD §Hard problems

§1. Component port table — Rust crates

Crate Rust deps that mattered Go target Library Effort Risk
gateway axum, tokio, tonic, tower cmd/gateway chi + stdlib net/http + google.golang.org/grpc L low — Go's strongest domain
catalogd parquet-rs, arrow, sqlite cmd/catalogd apache/arrow-go/v18, mattn/go-sqlite3 L low
storaged object_store, aws-sdk cmd/storaged aws-sdk-go-v2, minio-go for MinIO-specific paths M low
queryd datafusion, arrow cmd/queryd duckdb/duckdb-go/v2 (cgo, official) HARD high — see §3
ingestd csv, json, lopdf, postgres cmd/ingestd stdlib encoding/csv, encoding/json, pdfcpu/pdfcpu, jackc/pgx/v5 L low
vectord hora, arrow, hnsw cmd/vectord coder/hnsw, apache/arrow-go/v18 L medium — re-validate HNSW recall
matrix indexer (emergent in Rust — mode.rs + build_*_corpus.ts + observer /relevance) scripts/build_*_corpus.ts, crates/gateway/src/v1/mode.rs, mcp-server/observer.ts internal/matrix/ + gateway routes (/v1/matrix/*) stdlib + vectord client L medium — see §3.4. Corpus-as-shard composer; relevance filter; strong-model downgrade gate; multi-corpus retrieve+merge. The learning-loop layer that lifts vectord from "static index" to "meta-index that learns from playbooks."
vectord-lance lance DROPPED n/a n/a n/a — Parquet+HNSW only
journald parquet, arrow cmd/journald apache/arrow-go/v18 M low
aibridge reqwest library net/http + connection pool · anthropics/anthropic-sdk-go available for direct Claude calls (currently routed via opencode) S low
chatd (Phase 4 — multi-provider LLM dispatcher) crates/gateway/src/v1/{ollama_cloud,openrouter,opencode}.rs cmd/chatd + internal/chat/ stdlib net/http only DONE medium — see §3.9. 5-provider routing (ollama / ollama_cloud / openrouter / opencode / kimi) by model-name prefix or :cloud suffix. Replaces the Rust gateway's v1:: adapters.
validator parquet, custom library apache/arrow-go/v18 parquet reader M low — port the 24 unit tests as gates
truth tomli, custom DSL library pelletier/go-toml/v2 M low
proto tonic-build proto/ + protoc-gen-go buf + protoc-gen-go-grpc S low
shared serde, anyhow library stdlib encoding/json, errors S low
ui dioxus, wasm REPLACED html/template + HTMX L medium — see §3
lance-bench criterion n/a — dropped with Lance n/a n/a n/a

Total Rust crate port effort: ~12-18 engineer-weeks (3-4 months for one engineer; 6-8 weeks for two).


§2. Component port table — TypeScript surfaces

TS surface Current location Go target Library Effort Risk
mcp-server/index.ts Bun, :3700 cmd/mcp modelcontextprotocol/go-sdk (official Go SDK, v1.5.0, Google-collab) L medium — MCP semantics
mcp-server/observer.ts Bun, :3800 cmd/observer stdlib net/http, slog M low
mcp-server/tracing.ts Bun, Langfuse client library go.opentelemetry.io/otel + Langfuse Go client (or hand-roll) M low — Langfuse Go OSS support varies
auditor/*.ts TS, runs as systemd cmd/auditor stdlib + gitea API client L medium — auditor cross-lineage logic is intricate
tests/real-world/scrum_master_pipeline.ts TS, ad-hoc cmd/scrum stdlib L medium — chunking + embed + ladder logic
tests/real-world/scrum_applier.ts TS, ad-hoc cmd/scrum-apply stdlib + git CLI shell-out M medium
bot/propose.ts TS cmd/bot stdlib S low
Search demo HTML/JS static static (no port) n/a n/a n/a — copied as-is

Total TS port effort: ~6-10 engineer-weeks.


§3. Hard problem details

§3.1 — Query engine (DuckDB via cgo)

Library: github.com/duckdb/duckdb-go/v2 — official Go bindings via cgo. (Replaces the legacy marcboeker/go-duckdb, which was deprecated when the DuckDB team and Marc Boeker jointly relocated maintenance to the DuckDB org at v2.5.0. Migration is a one-line gofmt -r rewrite of import paths.) Current version v2.10502.0 (April 2026), DuckDB v1.5.2 compat. Statically links default extensions: ICU, JSON, Parquet, Autocomplete.

API shape (replaces the DataFusion SessionContext pattern):

// driver registration assumed via a blank import: _ "github.com/duckdb/duckdb-go/v2"
db, _ := sql.Open("duckdb", "") // empty DSN = in-memory database
defer db.Close()
db.Exec("CREATE VIEW workers AS SELECT * FROM read_parquet('s3://bucket/workers/*.parquet')")
rows, _ := db.Query("SELECT role, count(*) FROM workers WHERE state='IL' GROUP BY role")
defer rows.Close()

Acceptance gates:

  • G3.1.A — SELECT * FROM read_parquet('workers_500k.parquet') LIMIT 1 returns a row with the expected schema. Establishes Parquet read works.
  • G3.1.B — Hybrid SQL+vector query (the POST /vectors/hybrid surface) returns same workers as the Rust path on the same input, ranked the same way modulo embedding precision.
  • G3.1.C — Hot-cache merge-on-read: register a base table + a delta Parquet, query, observe both rows merged with the delta winning on conflict.

Fallback if cgo is rejected: run DuckDB as an external process (duckdb -json -c '...' shelled or HTTP via a thin Go wrapper). Adds operational surface; preserves SQL model.

§3.2 — HNSW index

Library: coder/hnsw — pure-Go HNSW, in-process. Supports add / delete / search / persist.

Open question: does coder/hnsw match the recall@10 we measured on the Rust hora path? Need a calibration test:

  • Rebuild lakehouse_arch_v1 (the 1086-chunk arch corpus) in Go.
  • Compare recall@10 on a fixed query set to the Rust baseline (a measurement sketch follows this list).
  • Acceptance: ≤2% drop or we switch library / parameters.
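
A minimal sketch of the calibration measurement, assuming the Rust hora baseline and the Go build each produce per-query top-k ID lists over the same fixed query set; names are illustrative, not shipped code.

// recallAtK returns the fraction of baseline (Rust hora) top-k IDs that the
// Go index also returns, averaged over the query set. baseline[i] and
// candidate[i] are the top-k ID lists for query i.
func recallAtK(baseline, candidate [][]string) float64 {
	var hits, total int
	for i := range baseline {
		truth := make(map[string]bool, len(baseline[i]))
		for _, id := range baseline[i] {
			truth[id] = true
		}
		for _, id := range candidate[i] {
			if truth[id] {
				hits++
			}
		}
		total += len(baseline[i])
	}
	if total == 0 {
		return 0
	}
	return float64(hits) / float64(total)
}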

Persistence format: TBD — coder/hnsw has its own snapshot format; ADR equivalent of ADR-008 (Parquet for embeddings + sidecar HNSW file) needs revisiting in Go to confirm the sidecar format we ship.

Acceptance gates:

  • G3.2.A — Build HNSW from a Parquet of 100K vectors in <60s
  • G3.2.B — Search 100K vectors at k=10 in <50ms p50
  • G3.2.C — Recall@10 within 2% of Rust baseline on lakehouse_arch_v1

§3.4 — Matrix indexer (corpus-as-shard composer)

What it is. The matrix indexer is the layer above vectord that turns a fleet of single-corpus HNSW indexes into a learning meta-index. In the Rust system this is emergent — split between corpus builders (scripts/build_*_corpus.ts), the mode runner (crates/gateway/src/v1/mode.rs), the observer relevance endpoint (mcp-server/observer.ts), and the strong-model downgrade gate (mode.rs::execute). In Go we name it explicitly so future sessions don't reduce it to "vectord."

Why corpus-as-shard, not shard-by-id. Sharding a single index by hash(id) is a pure throughput hack with a recall tax. Sharding by corpus is the existing retrieval shape — lakehouse_arch_v1, lakehouse_symbols_v1, scrum_findings_v1, lakehouse_answers_v1, kb_team_runs_v1, successful_playbooks_live, etc. — each with distinct topology and a distinct retrieval intent. Concurrent Adds parallelize naturally because they go to different corpora; the matrix layer's job is to retrieve+merge across them, filter for relevance, and downgrade composition when strong models prove the matrix is anti-additive.

Components to port (in dependency order):

  1. Corpus builders — Go equivalents of scripts/build_*_corpus.ts. For each named corpus, a builder that reads source, splits into chunks per the corpus's schema, embeds via /v1/embed, and adds to a vectord index of the same name. Effort: M for the first builder, S for each subsequent.

  2. Multi-corpus retrieve+merge (internal/matrix/retrieve.go) — given a query and a list of corpus names, search each at top_k=K, merge by score, return top N globally. Match Rust's pattern: top_k=6 per corpus, top 8 globally before relevance filter (a minimal sketch follows this list).

  3. Relevance filter (internal/matrix/relevance.go) — port the threshold-based filter from mcp-server/observer.ts:/relevance. Drops adjacency-pollution chunks that share a corpus with the hit but aren't actually about the query. LH_RELEVANCE_FILTER / LH_RELEVANCE_THRESHOLD env knobs preserved.

  4. Strong-model downgrade gate (internal/matrix/downgrade.go) — port is_weak_model + the codereview_lakehouse → codereview_isolation flip from mode.rs::execute. Pass5 proved composed corpora lose 5/5 vs isolation on grok-4.1-fast (p=0.031); the gate is load-bearing for paid-model retrieval quality.

  5. Learning-loop integration — write outcomes back to a playbook-memory corpus (probably lakehouse_answers_v1 analogue). This is what makes the matrix INDEX a learning system rather than static retrieval. Per feedback_meta_index_vision.md: this is the north star, not the data structure.
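
A minimal sketch of the retrieve+merge shape for internal/matrix/retrieve.go, assuming a vectord client exposing a Search method that returns scored hits; VectordClient, Hit, and retrieveMerge are illustrative names, not the shipped API.

import (
	"context"
	"fmt"
	"sort"
)

// VectordClient is whatever client shape vectord exposes; only Search is
// needed here (illustrative interface, not the shipped one).
type VectordClient interface {
	Search(ctx context.Context, corpus string, query []float32, k int) ([]Hit, error)
}

// Hit is one scored chunk, tagged with the corpus it came from so the
// gateway can attribute results per corpus (gate G3.4.A).
type Hit struct {
	Corpus string
	ID     string
	Score  float64
	Text   string
}

// retrieveMerge searches each corpus at perCorpusK, merges by score
// descending, and truncates to the global top-N before the relevance
// filter runs. Rust parity target: perCorpusK=6, globalN=8.
func retrieveMerge(ctx context.Context, vc VectordClient, query []float32, corpora []string, perCorpusK, globalN int) ([]Hit, error) {
	var merged []Hit
	for _, corpus := range corpora {
		hits, err := vc.Search(ctx, corpus, query, perCorpusK)
		if err != nil {
			return nil, fmt.Errorf("search %s: %w", corpus, err)
		}
		merged = append(merged, hits...)
	}
	sort.Slice(merged, func(i, j int) bool { return merged[i].Score > merged[j].Score })
	if len(merged) > globalN {
		merged = merged[:globalN]
	}
	return merged, nil
}

Concurrent Adds parallelize because each corpus owns its own index; the merge step is read-only fan-out.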

Gateway routes: /v1/matrix/search (multi-corpus retrieve+merge), /v1/matrix/corpora (list + metadata), /v1/matrix/relevance (filter endpoint, used by both internal callers and external tooling).

Acceptance gates:

  • G3.4.A — /v1/matrix/search against ≥3 corpora returns merged top-N with corpus attribution per result.
  • G3.4.B — Relevance filter drops at least the threshold-margin chunks on a known adjacency-pollution test case.
  • G3.4.C — Strong-model downgrade gate flips composed→isolation when the model is non-weak; bypassed when caller sets force_mode.
  • G3.4.D — Concurrent Adds across N=4 corpora parallelize (no shared write-lock); Add throughput scales near-linearly with corpus count.

Persistence: each corpus's vectord index persists via the existing G1P LHV1 format. The matrix layer is stateless above that — corpus list lives in catalog, retrieval params in config.

Why this is its own §3.x: in Rust the matrix indexer was emergent and got reduced to "we have vectord" in earlier port-planning. The SPEC names it explicitly so the port preserves the multi-corpus retrieval shape AND the learning loop, not just the HNSW substrate.

§3.5 — Drift quantification (loop 5 of the PRD)

What it is. PRD names "drift" as the 5th loop: quantify when historical decisions stop matching current reality. Distinct from the rating+distillation loop because drift is MEASUREMENT, not LEARNING. The learning loop says "this match worked, remember it"; the drift loop says "this 4-month-old playbook entry — does it still match what the substrate would surface today?"

What's shipped (commit be65f85):

  • SCORER drift: re-runs current distillation.ScoreRecord over historical (EvidenceRecord, persisted_category) pairs and reports mismatches + a sorted shift matrix
  • internal/drift/drift.go — pure-function ComputeScorerDrift (report shape sketched after this list)
  • 6 unit tests covering no-drift, shift detection, multi-shift sorted-by-count, includeEntries flag, empty input, scorer-version stamping
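
The shapes involved, sketched from the description above; a hedged sketch only — field and type names beyond Drifted and ShiftMatrix are assumptions, not the shipped internal/drift API.

// ScorerDriftReport summarizes re-running the current scorer over historical
// (EvidenceRecord, persisted_category) pairs. ShiftMatrix is ordered by
// count desc, ties alphabetical, so JSON output is stable (gate G3.5.B).
type ScorerDriftReport struct {
	ScorerVersion string       `json:"scorer_version"`
	Total         int          `json:"total"`
	Drifted       int          `json:"drifted"`
	ShiftMatrix   []ShiftEntry `json:"shift_matrix"`
}

// ShiftEntry counts how many records moved from a persisted category to a
// different category under the current scorer.
type ShiftEntry struct {
	From  string `json:"from"`
	To    string `json:"to"`
	Count int    `json:"count"`
}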

Future drift shapes (not shipped):

  • PLAYBOOK drift: re-run playbook queries through current matrix-search; recorded answer not in top-K = drift
  • EMBEDDING drift: KS-test on vector distribution at T1 vs T2
  • AUDIT BASELINE drift: matches Rust audit_baselines.jsonl longitudinal signal

Acceptance gates:

  • G3.5.A — A scorer-version bump triggers a non-zero Drifted count on a corpus of historical ScoredRuns where the new logic produces different categories than the persisted ones.
  • G3.5.B — ScorerDriftReport.ShiftMatrix is deterministic-ordered (count desc, ties broken alphabetically) so JSON output is stable across runs.

§3.6 — Staffing-side structured filter

What it is. Reality tests on the candidates + workers corpora (commits 0d1553c, a97881d) surfaced that pure semantic retrieval can't gate by location/status/availability — the matrix indexer returns Production Workers for a Forklift+OSHA-30 query because nomic-embed-text's geometry doesn't separate the role labels well. Structured filtering is the addressable piece: pre-filter the candidate set on metadata fields BEFORE semantic ranking.

What's shipped (commit b199093):

  • SearchRequest.MetadataFilter — map[string]any of metadata field → expected value (single value, or list of values for OR semantics within a key; AND across keys). Matching semantics are sketched after this list.
  • Post-retrieval filter applied before top-K truncation in internal/matrix/retrieve.go
  • SearchResponse.MetadataFilterDropped for telemetry on filter aggressiveness
  • 7 unit tests covering nil filter, missing metadata, exact match, AND across keys, OR within list, bool match, malformed JSON
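
A minimal sketch of the matching semantics (AND across keys, OR within a list value); illustrative, not necessarily the shipped retrieve.go code.

// matchesFilter: every filter key must be satisfied (AND across keys); a
// list value is satisfied by any one element (OR within a key). A result
// missing a filtered metadata key is dropped.
func matchesFilter(meta map[string]any, filter map[string]any) bool {
	for key, want := range filter {
		got, ok := meta[key]
		if !ok {
			return false
		}
		switch w := want.(type) {
		case []any: // OR within a key (JSON arrays decode to []any)
			matched := false
			for _, candidate := range w {
				if got == candidate {
					matched = true
					break
				}
			}
			if !matched {
				return false
			}
		default: // single value: exact match (string, bool, number)
			if got != want {
				return false
			}
		}
	}
	return true
}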

Deferred:

  • Pre-retrieval SQL gate via queryd (the actual hybrid). The post-retrieval filter is an MVP that helps when the candidate set is mostly relevant; for aggressive filters that drop most results, a SQL pre-filter into matrix retrieval would surface the right candidates with less wasted embedding work.
  • Filter language richer than equality (e.g. range, prefix, regex).

Acceptance gates:

  • G3.6.A — MetadataFilter: {"state": "IL"} against a mixed-state corpus drops every non-IL result; MetadataFilterDropped reports the count.
  • G3.6.B — List filter {"state": ["IL", "WI"]} keeps both states, drops the rest (OR within key).
  • G3.6.C — Multi-key filter is AND: a result missing any key is dropped, no exception.

§3.7 — Operational rating wiring

What it is. PRD loop 4 (rating + distillation) needs real inflows to be a learning system rather than a substrate. The playbook-record endpoint (06e7152) takes one (query, answer, score) per call; productizing it into actual signal sources is what makes the system get smarter with use.

What's shipped (commit 6392772):

  • POST /v1/matrix/playbooks/bulk — bulk-record N successes; per-entry success/failure response so callers can see which rows of a 4,701-row historical placement import succeeded and which failed validation (request/response shape sketched after this list).
  • Single-record path from 06e7152 unchanged.
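
What a bulk call might look like on the wire, assuming the single-record (query, answer, score) fields carry over; type and field names are illustrative, not the shipped handler types.

// BulkPlaybookRequest carries N (query, answer, score) records in one call.
type BulkPlaybookRequest struct {
	Entries []PlaybookEntry `json:"entries"`
}

type PlaybookEntry struct {
	Query  string  `json:"query"`
	Answer string  `json:"answer"`
	Score  float64 `json:"score"`
}

// BulkPlaybookResponse reports per-entry outcomes so one bad row doesn't
// abort the batch (gate G3.7.A).
type BulkPlaybookResponse struct {
	Recorded int              `json:"recorded"`
	Failed   int              `json:"failed"`
	Results  []PlaybookResult `json:"results"`
}

type PlaybookResult struct {
	ID    string `json:"id,omitempty"`
	Error string `json:"error,omitempty"`
}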

Deferred:

  • UI shim for click-tracking (no Go demo UI yet — the Bun demo at devop.live/lakehouse/ is still serving the public surface). When the Go UI lands or a feedback API is added to the Bun UI, every coordinator click → bulk-batched POST → playbook entry.
  • Negative feedback (this match didn't work). Currently only positive scores are recorded; a rejection signal would help the learning loop avoid pushing bad matches.
  • Time-decay on playbook scores so stale recommendations attenuate.

Acceptance gates:

  • G3.7.A — Bulk POST of N entries returns {recorded, failed, results[]} with per-entry IDs/errors, no single-entry failure aborting the batch.
  • G3.7.B — Each recorded entry surfaces in /v1/matrix/search with use_playbook=true after a re-query.

§3.8 — Observer-KB workflow runner (Archon-style multi-pass)

What it is. The architectural pattern documented in the Rust observer-kb branch (10 commits ahead of main, never merged) and proven by /home/profit/external/Archon's workflow engine. Multiple mode passes processing data, with each pass an objective measurement that contributes to the KB:

Raw data
   ↓ Mode: EXTRACT       structured facts/entities/relationships
   ↓ Mode: VALIDATOR     fact-check, confidence 1-10
   ↓ Mode: HALLUCINATION verify each claim, flag likely fabrications
   ↓ Mode: CONSENSUS     multiple passes until extraction converges
   ↓ Mode: REDTEAM       attack what survived, patch what fails
   ↓ Mode: PIPELINE      clean → Q&A structure → topic group → rank
   ↓ RENDER              curated doc anchored on questions

This is the orchestrator missing from §3.4 components 1-5: each SPEC §3.4 piece (relevance, downgrade, scorer, drift) is a "mode"; what's missing is the workflow engine that chains them.

Why it matters. Per the PRD's product vision: the observer should make actionable decisions based on watching what's successful. The workflow runner is how observers compose modes into multi-pass pipelines that score outcomes rigorously enough to feed the KB and inform the playbook substrate.

Reference materials on the system:

  • /home/profit/lakehouse/.archon/workflows/lakehouse-architect-review.yaml (committed 69919d9 in main) — proves Archon-via-Lakehouse works with a 3-node shape → weakness → improvement workflow
  • /home/profit/external/Archon — the upstream workflow engine (cloned 2026-04-26); packages/providers/src/community/pi/provider.ts has the local Lakehouse-routing mod committed locally as 3f2afc8 (not pushed to upstream coleam00/Archon)
  • Rust observer-kb branch (10 commits, +4338/-55506 LoC) — apps/observer-kb/docs/PRD.md documents the multi-pass architecture; scripts/{deep_analysis,extract_knowledge,process_knowledge}.py are the Python prototypes that proved it on real ChatGPT/Claude PDF data (496 topics, 300 decisions, 100 insights extracted)

Components to port (in dependency order):

  1. Workflow definition (internal/workflow/types.go) — YAML schema matching Archon's shape: name, description, provider, model, list of nodes each with id, prompt, allowed_tools, effort, idle_timeout, depends_on. The depends_on edges form a DAG; the runner resolves topologically (a types sketch follows this list).

  2. Node executor (internal/workflow/runner.go) — given a workflow and a starting context, walks the DAG, executes each node by dispatching to the configured backend (matrix.Search, distillation.ScoreRecord, drift.ComputeScorerDrift, or a generic prompt-against-LLM via gateway /v1/chat), captures per-node output, makes it available as $<node_id>.output in subsequent nodes.

  3. Provenance recording — every node execution lands an ObservedOp (via the observerd substrate from bc9ab93) with source: "workflow", the workflow name + node ID, input/output summaries, and timing. The ring buffer + JSONL log become the substrate for the rating+distillation loop's KB feed.

  4. Mode catalog (internal/workflow/modes.go) — registry of the modes the runner can dispatch to. Each mode is a Go function matching a uniform func(ctx, input map[string]any) (map[string]any, error) signature so workflows can compose them. Initial modes from §3.4: matrix.search, matrix.relevance, matrix.downgrade, playbook.record, playbook.lookup, distillation.score, drift.scorer. Plus llm.chat for free-form mode prompts.

  5. HTTP surface — POST /v1/observer/workflow/run accepts a workflow YAML body + a starting context; returns the per-node results + the chain of ObservedOps generated. GET /v1/observer/workflow/list lists workflows in a known directory for operator discoverability.
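
A sketch of internal/workflow/types.go plus the uniform mode signature from component 4, assuming the YAML fields listed in component 1; struct and field names are illustrative.

import "context"

// Workflow mirrors the Archon-style YAML: a named DAG of nodes.
type Workflow struct {
	Name        string `yaml:"name"`
	Description string `yaml:"description"`
	Provider    string `yaml:"provider"`
	Model       string `yaml:"model"`
	Nodes       []Node `yaml:"nodes"`
}

// Node is one pass; DependsOn edges form the DAG the runner resolves
// topologically. Prompt may reference $<node_id>.output of prior nodes.
type Node struct {
	ID           string   `yaml:"id"`
	Prompt       string   `yaml:"prompt"`
	AllowedTools []string `yaml:"allowed_tools"`
	Effort       string   `yaml:"effort"`
	IdleTimeout  string   `yaml:"idle_timeout"`
	DependsOn    []string `yaml:"depends_on"`
}

// ModeFunc is the uniform signature every catalog mode implements so
// workflows can compose matrix.search, drift.scorer, llm.chat, etc.
type ModeFunc func(ctx context.Context, input map[string]any) (map[string]any, error)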

Why integrate into observerd, not a new service. The observer is the system resource that watches and records. Workflows ARE observation patterns — multi-step processes whose every step is recorded. Putting the runner inside observerd keeps the "measurement → KB feed" wiring tight; a separate service would re-implement the recording layer.

Acceptance gates:

  • G3.8.A — Load a workflow YAML matching the Archon lakehouse-architect-review.yaml shape; runner executes the 3-node DAG topologically.
  • G3.8.B — Each node execution lands an ObservedOp with source: "workflow" and the node's input/output. Stats endpoint shows the workflow ops.
  • G3.8.C — A node referencing $<prior_node>.output in its prompt resolves correctly; missing reference is a clear error not a silent empty string.
  • G3.8.D — Mode catalog dispatches matrix.search invocation to the matrixd backend without going through HTTP (in-process function call when matrixd is co-resident).

Status: PORT TARGET, not yet started. SPEC commits the design; implementation is its own wave (estimated L effort given the DAG runner + mode dispatch + provenance recording).

§3.9 — chatd (multi-provider LLM dispatcher) — SHIPPED 2026-04-30

Status: done at commit 05273ac (Phase 4 wave) + scrum-hardened at 0efc736. Composite port: 1,624 LoC, 19+ tests, 6/6 chatd_smoke.

What: cmd/chatd on :3220 routes POST /chat to a provider selected by model-name prefix or :cloud suffix:

ollama/<m>            → local Ollama at :11434 (no auth)
ollama_cloud/<m>      → ollama.com /api/generate (Bearer)
<m>:cloud             → ollama_cloud (suffix variant)
openrouter/<v>/<m>    → openrouter.ai (OpenAI-compat, Bearer)
opencode/<m>          → opencode.ai/zen/v1 (OpenAI-compat, Bearer)
kimi/<m>              → api.kimi.com/coding/v1 (OpenAI-compat, Bearer)
bare names            → ollama (default)
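
A minimal sketch of the prefix / :cloud-suffix routing rule above; illustrative, not the shipped cmd/chatd code.

import "strings"

// resolveProvider maps a caller-supplied model name to (provider, upstream
// model). For openrouter, trimming the prefix leaves "<vendor>/<model>",
// which is the ID OpenRouter expects. Bare names default to local ollama.
func resolveProvider(model string) (provider, upstream string) {
	switch {
	case strings.HasPrefix(model, "ollama_cloud/"):
		return "ollama_cloud", strings.TrimPrefix(model, "ollama_cloud/")
	case strings.HasPrefix(model, "ollama/"):
		return "ollama", strings.TrimPrefix(model, "ollama/")
	case strings.HasPrefix(model, "openrouter/"):
		return "openrouter", strings.TrimPrefix(model, "openrouter/")
	case strings.HasPrefix(model, "opencode/"):
		return "opencode", strings.TrimPrefix(model, "opencode/")
	case strings.HasPrefix(model, "kimi/"):
		return "kimi", strings.TrimPrefix(model, "kimi/")
	case strings.HasSuffix(model, ":cloud"):
		return "ollama_cloud", strings.TrimSuffix(model, ":cloud")
	default:
		return "ollama", model
	}
}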

Provider key resolution: env var first (OPENROUTER_API_KEY, OPENCODE_API_KEY, KIMI_API_KEY, OLLAMA_CLOUD_KEY); then /etc/lakehouse/<provider>.env fallback (mode 0600). Empty key → provider stays unregistered (404 at first call instead of 503).
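
A sketch of the env-first / file-fallback key lookup, assuming the fallback file holds KEY=VALUE lines; function name and parsing details are assumptions.

import (
	"os"
	"strings"
)

// resolveKey: the env var wins; otherwise read /etc/lakehouse/<provider>.env
// and look for a KEY=VALUE line with the same variable name. Empty string
// means the provider stays unregistered (404 at first call, not 503).
func resolveKey(envVar, provider string) string {
	if v := os.Getenv(envVar); v != "" {
		return v
	}
	data, err := os.ReadFile("/etc/lakehouse/" + provider + ".env")
	if err != nil {
		return ""
	}
	for _, line := range strings.Split(string(data), "\n") {
		if k, v, ok := strings.Cut(strings.TrimSpace(line), "="); ok && k == envVar {
			return strings.Trim(v, `"`)
		}
	}
	return ""
}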

Companion to the model tier registry in lakehouse.toml [models] (local_fast / local_judge / cloud_judge / frontier_review / etc.), which maps tier names to model IDs. Callers reference cfg.Models.LocalJudge instead of literal strings. Bumping a tier is a 1-line config edit.

Quirks captured by today's scrum:

  • Request.Temperature is *float64 (pointer) — Anthropic 4.7 (via OpenCode) rejects the field entirely with "temperature is deprecated." The pointer lets us omit the field when the caller didn't set it explicitly (struct sketch after this list).
  • Local Ollama defaults think=false — qwen3.5:latest is reasoning-capable, but the inner-loop hot path wants direct answers, not reasoning traces consuming the token budget.
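
How the pointer plays out on the wire, as a minimal sketch; the field set here is illustrative, not the full shipped request type.

type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// Request: a nil Temperature is dropped from the JSON body entirely, so
// providers that reject the field (Anthropic 4.7 via OpenCode) never see it;
// callers that do set it get the explicit value, including 0.
type Request struct {
	Model       string    `json:"model"`
	Messages    []Message `json:"messages"`
	Temperature *float64  `json:"temperature,omitempty"`
}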

Replaces the Rust adapters:

  • crates/gateway/src/v1/ollama_cloud.rs → internal/chat/ollama_cloud.go
  • crates/gateway/src/v1/openrouter.rs → internal/chat/openai_compat.go (shared with opencode + kimi)
  • crates/gateway/src/v1/opencode.rs → same shared helper
  • crates/aibridge/* → internal/chat/ (cleaner abstraction)

Reusable downstream: scripts/scrum_review.sh runs a 3-lineage cross-review (Opus + Kimi + Qwen3-coder) by POST-ing each diff to chatd's /v1/chat. Same vehicle the harness's own scrum-hardening used to find its own bugs.

§3.10 — local-review-harness (sibling tool, separate repo)

Where: git.agentview.dev/profit/local-review-harness (also SMB-mounted at /home/profit/share/local-review-harness-full-md/).

What: a local-first code review harness — walks a target repo, runs evidence-bearing static checks (12 analyzers covering hardcoded paths, raw SQL, wildcard CORS, secrets, exec/spawn, large files, TODO/FIXME, missing tests, .env files committed, exposed mutation endpoints, hardcoded private IPs), produces Scrum-style markdown reports + JSON receipts. No cloud deps. Single static Go binary.

Status: Phase A (skeleton) + Phase B (MVP — static-only path) shipped at first commit f3ee472. 5 acceptance gates green plus self-review (the harness reviews its own repo).

Phases C-E pending: local-Ollama LLM review, validation cross-check, append-only .memory/, diff/rules subcommands.

Why a sibling tool, not a Lakehouse module: PROMPT.md "Strategic Goal" — the harness eventually plugs into OpenClaw / MCP tools / Lakehouse memory / playbook sealing / observer review loop. But first it has to be reliable, inspectable, and evidence-driven on its own. Lakehouse-Go integration is post-Phase-E (probably E+1: write the harness's findings into Lakehouse's playbook substrate via /v1/matrix/playbooks/record).

Cross-pollination opportunities (post-MVP):

  • Replace internal/llm/ollama.go in the harness with a thin client pointed at chatd's /v1/chat — frontier judges become a config toggle (a thin-client sketch follows this list).
  • Feed harness findings into Lakehouse-Go's pathway memory as a drift signal (which static checks fired this run vs last).
  • Use the harness's .memory/known-risks.json as a corpus the matrix indexer can retrieve from when the same risk pattern appears.
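
What the thin-client swap could look like, assuming chatd accepts an OpenAI-style model + messages body at /v1/chat; the request shape and function name are assumptions, not the harness's shipped code.

import (
	"bytes"
	"context"
	"encoding/json"
	"io"
	"net/http"
)

// chatdReview posts one review prompt to chatd and returns the raw response
// body; parsing the provider reply is left to the caller. Switching judges
// becomes a model-name config change.
func chatdReview(ctx context.Context, baseURL, model, prompt string) (string, error) {
	payload, err := json.Marshal(map[string]any{
		"model":    model,
		"messages": []map[string]string{{"role": "user", "content": prompt}},
	})
	if err != nil {
		return "", err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, baseURL+"/v1/chat", bytes.NewReader(payload))
	if err != nil {
		return "", err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	out, err := io.ReadAll(resp.Body)
	return string(out), err
}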

§3.3 — UI (HTMX)

Approach: server-rendered Go templates using html/template, HTMX for partial-page swaps, Alpine.js for client-side interactivity where needed. Single binary serves API + UI.
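
A minimal sketch of the Ask tab's partial-swap handler, assuming an askRAG helper that wraps the RAG endpoint; names and markup are illustrative, not the shipped web/ code.

import (
	"html/template"
	"net/http"
)

// answerTmpl is the fragment HTMX swaps into #answer; the surrounding page
// would carry something like <form hx-post="/ask" hx-target="#answer">.
var answerTmpl = template.Must(template.New("answer").Parse(
	`<div id="answer" class="card">{{.}}</div>`))

// askHandler serves gate G3.3.A: question in, answer fragment out, no full
// page reload. askRAG is a placeholder for the internal RAG call.
func askHandler(w http.ResponseWriter, r *http.Request) {
	q := r.FormValue("q")
	answer, err := askRAG(r.Context(), q)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadGateway)
		return
	}
	answerTmpl.Execute(w, answer)
}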

Acceptance gates:

  • G3.3.A — Ask tab: type natural-language question, get answer from RAG endpoint, render in-page without full reload
  • G3.3.B — Explore tab: paginated dataset list with hot-swap badge rendering
  • G3.3.C — SQL tab: textarea → submit → tabular result rendered in-page
  • G3.3.D — System tab: live tail of /storage/errors and /hnsw/trials via HTMX polling

Fallback if HTMX feels limiting: split repo golangLAKEHOUSE-ui with Vite + React, served as static files by Go gateway. Costs an extra repo + build chain.

§3.4 — Pathway memory port

Constraint: the Rust pathway_memory and TS implementations were byte-matching by ADR-021. The byte contract was verified by running both implementations on the same input tokens and asserting matching bucket indices.

Go port plan:

  • Port the 32-bucket SHA256-keyed token hash exactly. Verify on a golden input that Go produces the same bucket vector as Rust (an illustrative bucket-reduction sketch follows this list).
  • Port the JSON state file format verbatim — the existing 88 traces in data/_pathway_memory/state.json reload as-is into the Go implementation.
  • Port the matrix-correctness layer (ADR-021's SemanticFlag, BugFingerprint, TypeHint) — these are pure value types, trivially portable.
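
Illustration only: the exact keyed-hash construction lives in ADR-021 and the port must byte-match the Rust implementation, not this sketch. A plausible 32-bucket reduction might look like the following.

import (
	"crypto/sha256"
	"encoding/binary"
)

// bucketFor is ILLUSTRATIVE ONLY: SHA-256 the token, take the first 4 bytes,
// reduce mod 32. The real port must copy ADR-021's exact construction so the
// golden-input byte-match gate passes.
func bucketFor(token string) int {
	sum := sha256.Sum256([]byte(token))
	return int(binary.BigEndian.Uint32(sum[:4]) % 32)
}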

Acceptance gates:

  • G3.4.A — Load existing state.json, run replay on the same 11 prior successful pathways, all 11 succeed (matching the Rust 11/11 baseline).
  • G3.4.B — Bucket vector for a fixed test input byte-matches the Rust output.

§4. Phase plan

Phase G0 — Skeleton (Week 1-3)

Scope: smallest end-to-end ingest + query path working in Go.

Component Deliverable
cmd/gateway HTTP on :3100, /health, /v1/chat proxy stub
cmd/catalogd In-memory registry + Parquet manifest persistence
cmd/storaged Single-bucket S3 / local FS, no error journal yet
cmd/ingestd CSV → Parquet, schema inference, register-on-ingest
cmd/queryd DuckDB-backed POST /sql endpoint

Acceptance: upload a CSV via POST /ingest, query it via POST /sql with a SELECT, get rows back. Single-bucket. No vector, no profile, no UI.

Phase G1 — Vector + RAG (Week 4-6)

Component Deliverable
cmd/vectord Embed-on-ingest (calls Python sidecar), HNSW build, POST /search
cmd/gateway Add POST /rag (embed → search → retrieve → generate via aibridge)
cmd/aibridge HTTP client to existing Python sidecar

Acceptance: ingest 15K resumes (the original Phase 7 fixture), ask "find me a forklift operator with OSHA-10 in IL", get ranked results with LLM-generated explanation grounded in the retrieved chunks.

Phase G2 — Federation + profiles (Week 7-8)

Component Deliverable
cmd/storaged Multi-bucket registry, rescue bucket, error journal at primary://_errors/
Profile system Per-reader profile bound to bucket + vector index
Hot-swap Atomic pointer swap for index generations

Acceptance: two profiles bound to two buckets, queries scoped correctly, hot-swap a vector index without query interruption, rollback works.

Phase G3 — Pathway memory + distillation (Week 9-11)

Component Deliverable
cmd/vectord Pathway memory module ported, 88 traces reloaded
Distillation pipeline SFT export, contamination firewall, scorer
Audit baselines audit_baselines.jsonl longitudinal signal port

Acceptance: replay 11 prior successful pathways, all 11 succeed. Re-run distillation acceptance on the frozen fixture set, 22/22 pass.

Phase G4 — TS surfaces → Go (Week 12-14)

Component Deliverable
cmd/mcp MCP server (replaces Bun) — /v1/chat, intelligence endpoints
cmd/observer Autonomous iteration loop, op recording
cmd/auditor PR audit pipeline (kimi/haiku/opus rotation)
cmd/scrum Scrum master pipeline (replaces TS)

Acceptance: open a test PR, auditor cycles within 90s, emits verdict to data/_auditor/kimi_verdicts/, behavior matches Rust+TS era within tolerance.

Phase G5 — UI + demo parity (Week 15-16)

Component Deliverable
cmd/gateway Serves HTMX templates + static demo HTML
Demo at devop.live/lakehouse/ Parity with current Bun demo
Staffer console at /console Parity

Acceptance: devop.live/lakehouse/ cuts over from Bun to Go gateway. Section ① / ② / ③ all render. Compact contract cards still expand with Project Index. Fill-probability bars still paint.


§5. Repo layout

golangLAKEHOUSE/
├── docs/
│   ├── PRD.md                    ← this PRD
│   ├── SPEC.md                   ← this spec
│   ├── DECISIONS.md              ← Go-era ADRs (start fresh, reference Rust ADRs by number)
│   └── ADR-XXX-*.md              ← per-ADR detail
├── cmd/
│   ├── gateway/                  ← main HTTP/gRPC ingress
│   ├── catalogd/
│   ├── storaged/
│   ├── queryd/
│   ├── ingestd/
│   ├── vectord/
│   ├── journald/
│   ├── mcp/
│   ├── observer/
│   ├── auditor/
│   └── scrum/
├── internal/                     ← shared packages, not exported
│   ├── aibridge/
│   ├── validator/
│   ├── truth/
│   ├── shared/
│   ├── proto/                    ← generated protobuf
│   └── pathway/
├── pkg/                          ← public Go packages (none initially)
├── web/                          ← UI (HTMX templates + static)
│   ├── templates/
│   └── static/
├── scripts/                      ← cold-start, smoke, distill scripts
├── tests/                        ← golden files, integration tests
├── go.mod
├── go.sum
└── README.md

Single Go module. All commands and internal packages live under golangLAKEHOUSE/. No nested modules unless a package needs an independent release cadence (none expected).

Build: go build ./cmd/... produces all binaries.


§6. Migration data plan

What ports verbatim

  • Parquet datasets at data/datasets/*.parquet — read by Go directly.
  • Catalog manifests — Parquet, ports as data not code.
  • Pathway memory state — JSON, ports if §3.4 byte-matching gate passes.

What rebuilds

  • HNSW indexes — rebuild from Parquet embeddings on first Go startup.
  • Auditor verdicts on PRs — old PRs won't be re-audited; lineage starts fresh on the new repo's PRs.

What's archived

  • The Rust crates/ tree — preserved in the original repo at the cutover commit, tagged pre-go-rewrite-2026-04-28 for reference.
  • TS surfaces (mcp-server/, auditor/, etc.) — preserved in the original repo at the same tag.
  • Distillation v1.0.0 substrate (tag distillation-v1.0.0, e7636f2) — kept as the historical reference; Go re-implementation ports the LOGIC but not the bit-identical-reproducibility property unless an ADR re-establishes it.

What's discarded

  • crates/vectord-lance/ (Lance backend, see PRD §Hard problems §2)
  • crates/lance-bench/ (criterion benchmarks specific to Lance)

§7. Acceptance: when is the rewrite done?

The Go Lakehouse reaches feature parity when:

  1. All 12 Rust PRD invariants hold (object-storage source of truth, catalog metadata authority, idempotent ingest, hot-swap atomicity, profiles, etc.).
  2. The 16 distillation acceptance gates pass (re-run ./scripts/distill audit-full against the Go pipeline).
  3. The 22/22 acceptance fixtures from tests/fixtures/distillation/acceptance/ pass under the Go implementation.
  4. The 145 unit tests of distillation v1.0.0 are ported and pass.
  5. devop.live/lakehouse/ demo cuts over to Go gateway with no visible UI regressions.
  6. Auditor emits Kimi/Haiku/Opus verdicts on a test PR, matching the cross-lineage rotation behavior.
  7. The 88 pathway traces replay with 11/11 prior successes reproduced.

At that point the Rust repo enters maintenance-only mode (security fixes), and the Go repo becomes the live system.


§8. Ratified — Phase G0 unblocked (2026-04-28, J)

# Decision Spec impact
1 DuckDB via cgo (marcboeker/go-duckdb) §3.1 option A — proceed
2 HTMX + html/template + Alpine.js §3.3 option A — proceed
3 git.agentview.dev/profit/golangLAKEHOUSE repo location locked
4 Distillation rebuilt in Go (no bit-identical port) §6 — port logic, not fixtures
5 Pathway memory starts empty; old traces noted §3.4 G3.4.A is now "build initial state from scratch in Phase G3"; G3.4.B (byte-match) preserved as the porting correctness gate when the algorithm is reimplemented
6 Auditor longitudinal signal restarts new audit_baselines.jsonl lineage starts on first Go-era PR

See docs/DECISIONS.md ADR-001 for full rationale and docs/RUST_PATHWAY_MEMORY_NOTE.md for where the legacy 88 traces live.

Phase G0 is now unblocked. Next step: bootstrap the Go module skeleton + push to Gitea, then begin §4 Phase G0 implementation.