Per the 2026-04-29 scope-discipline pause: the wave shipped four
pieces beyond SPEC §3.4 component scope, and one architectural
pattern surfaced (Archon-style multi-pass workflow runner) that's
the observer's natural growth path. Document them as port targets
so the next scrum review has authoritative SPEC components.
§3.5 — Drift quantification (loop 5 of the PRD)
Names the SCORER drift work shipped in be65f85 + the deferred
shapes (PLAYBOOK drift, EMBEDDING drift, AUDIT BASELINE drift).
Acceptance gates G3.5.A–B.
§3.6 — Staffing-side structured filter
Names the metadata-filter MVP shipped in b199093 + the deferred
pre-retrieval SQL gate via queryd. Acceptance gates G3.6.A–C.
§3.7 — Operational rating wiring
Names the bulk playbook-record endpoint shipped in 6392772 + the
deferred UI shim, negative-feedback path, and time-decay.
Acceptance gates G3.7.A–B.
§3.8 — Observer-KB workflow runner (Archon-style multi-pass) —
PORT TARGET, not yet started
Documents the architecture J was working on across the Rust
observer-kb branch (10 commits ahead of main, never merged) and
the local Archon mod (committed 2026-04-29 as 3f2afc8 in
/home/profit/external/Archon, not pushed to coleam00/Archon).
The pattern: multi-pass mode chain (extract → validator →
hallucination → consensus → redteam → pipeline → render) where
each pass is a deterministic measurement. The observer is the
natural home — workflows ARE observation patterns whose every
step is recorded. Five components in dependency order: workflow
definition (YAML), node executor (DAG runner), provenance
recording (ObservedOps), mode catalog (matrix.search,
distillation.score, drift.scorer, llm.chat), HTTP surface
(/v1/observer/workflow/run).
Reference materials on the system (preserved, not lost):
- /home/profit/lakehouse/.archon/workflows/lakehouse-architect-review.yaml
(Rust main, 69919d9) — 3-node Archon-via-Lakehouse proof
- /home/profit/external/Archon dev branch — upstream engine
with local pi/provider.ts mod (3f2afc8) for Lakehouse routing
- Rust observer-kb branch — apps/observer-kb/docs/PRD.md +
Python prototypes proven on real ChatGPT/Claude PDF data
Acceptance gates G3.8.A–D. Estimated effort: L.
PRD updated with "Observer as system resource (clarified
2026-04-29)" section pointing at §3.8 as the architectural growth
path. The bare-bones observerd in bc9ab93 is the substrate; the
workflow runner is what makes it the "objective measurement engine"
the small-model pipeline needs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# SPEC: Lakehouse-Go Component Port Plan

**Status:** DRAFT — companion to `PRD.md`. Component-by-component port
plan with library choices, effort estimates, and acceptance gates.
**Created:** 2026-04-28
**Owner:** J

This spec answers: for each piece of the Rust Lakehouse, what Go
library carries it, what the effort looks like, and what gate proves
the port is real.

Effort scale (one engineer-week = ~40h focused work):
- **S** — 1–3 days
- **M** — 1 engineer-week
- **L** — 2–3 engineer-weeks
- **XL** — 1+ months
- **HARD** — open research, see PRD §Hard problems

---

## §1. Component port table — Rust crates

| Crate | Rust deps that mattered | Go target | Library | Effort | Risk |
|---|---|---|---|---|---|
| `gateway` | axum, tokio, tonic, tower | `cmd/gateway` | `chi` + stdlib `net/http` + `google.golang.org/grpc` | **L** | low — Go's strongest domain |
| `catalogd` | parquet-rs, arrow, sqlite | `cmd/catalogd` | `apache/arrow-go/v18`, `mattn/go-sqlite3` | **L** | low |
| `storaged` | object_store, aws-sdk | `cmd/storaged` | `aws-sdk-go-v2`, `minio-go` for MinIO-specific paths | **M** | low |
| `queryd` | datafusion, arrow | `cmd/queryd` | **`duckdb/duckdb-go/v2`** (cgo, official) | **HARD** | high — see §3.1 |
| `ingestd` | csv, json, lopdf, postgres | `cmd/ingestd` | stdlib `encoding/csv`, `encoding/json`, `pdfcpu/pdfcpu`, `jackc/pgx/v5` | **L** | low |
| `vectord` | hora, arrow, hnsw | `cmd/vectord` | `coder/hnsw`, `apache/arrow-go/v18` | **L** | medium — re-validate HNSW recall |
| **matrix indexer** (emergent in Rust — `mode.rs` + `build_*_corpus.ts` + observer `/relevance`) | scripts/build_*_corpus.ts, crates/gateway/src/v1/mode.rs, mcp-server/observer.ts | `internal/matrix/` + gateway routes (`/v1/matrix/*`) | stdlib + vectord client | **L** | medium — see §3.4. Corpus-as-shard composer; relevance filter; strong-model downgrade gate; multi-corpus retrieve+merge. The learning-loop layer that lifts vectord from "static index" to "meta-index that learns from playbooks." |
| `vectord-lance` | lance | **DROPPED** | n/a | n/a | n/a — Parquet+HNSW only |
| `journald` | parquet, arrow | `cmd/journald` | `apache/arrow-go/v18` | **M** | low |
| `aibridge` | reqwest | library | `net/http` + connection pool · `anthropics/anthropic-sdk-go` available for direct Claude calls (currently routed via opencode) | **S** | low |
| `validator` | parquet, custom | library | `apache/arrow-go/v18` parquet reader | **M** | low — port the 24 unit tests as gates |
| `truth` | tomli, custom DSL | library | `pelletier/go-toml/v2` | **M** | low |
| `proto` | tonic-build | `proto/` + `protoc-gen-go` | `buf` + `protoc-gen-go-grpc` | **S** | low |
| `shared` | serde, anyhow | library | stdlib `encoding/json`, `errors` | **S** | low |
| `ui` | dioxus, wasm | **REPLACED** | `html/template` + HTMX | **L** | medium — see §3.3 |
| `lance-bench` | criterion | n/a — dropped with Lance | n/a | n/a | n/a |

**Total Rust crate port effort:** ~12–18 engineer-weeks (3–4 months for
one engineer; 6–8 weeks for two).

---

## §2. Component port table — TypeScript surfaces

| TS surface | Current location | Go target | Library | Effort | Risk |
|---|---|---|---|---|---|
| `mcp-server/index.ts` | Bun, :3700 | `cmd/mcp` | **`modelcontextprotocol/go-sdk`** (official Go SDK, v1.5.0, Google-collab) | **L** | medium — MCP semantics |
| `mcp-server/observer.ts` | Bun, :3800 | `cmd/observer` | stdlib `net/http`, `slog` | **M** | low |
| `mcp-server/tracing.ts` | Bun, Langfuse client | library | `go.opentelemetry.io/otel` + Langfuse Go client (or hand-roll) | **M** | low — Langfuse Go OSS support varies |
| `auditor/*.ts` | TS, runs as systemd | `cmd/auditor` | stdlib + `gitea API client` | **L** | medium — auditor cross-lineage logic is intricate |
| `tests/real-world/scrum_master_pipeline.ts` | TS, ad-hoc | `cmd/scrum` | stdlib | **L** | medium — chunking + embed + ladder logic |
| `tests/real-world/scrum_applier.ts` | TS, ad-hoc | `cmd/scrum-apply` | stdlib + git CLI shell-out | **M** | medium |
| `bot/propose.ts` | TS | `cmd/bot` | stdlib | **S** | low |
| Search demo HTML/JS | static | static (no port) | n/a | n/a | n/a — copied as-is |

**Total TS port effort:** ~6–10 engineer-weeks.

---

## §3. Hard problem details

### §3.1 — Query engine (DuckDB via cgo)

**Library:** `github.com/duckdb/duckdb-go/v2` — official Go bindings via
cgo. (Replaces the legacy `marcboeker/go-duckdb`, which was deprecated
when the DuckDB team and Marc Boeker jointly relocated maintenance to
the DuckDB org at v2.5.0. Migration is a one-line `gofmt -r` rewrite of
import paths.) Current version v2.10502.0 (April 2026), DuckDB v1.5.2
compat. Statically links default extensions: ICU, JSON, Parquet,
Autocomplete.

**API shape** (replaces the DataFusion `SessionContext` pattern):
```go
// "duckdb" driver registered via a blank import of github.com/duckdb/duckdb-go/v2
db, _ := sql.Open("duckdb", "")
defer db.Close()
db.Exec("CREATE VIEW workers AS SELECT * FROM read_parquet('s3://bucket/workers/*.parquet')")
rows, _ := db.Query("SELECT role, count(*) FROM workers WHERE state='IL' GROUP BY role")
```

**Acceptance gates:**
- G3.1.A — `SELECT * FROM read_parquet('workers_500k.parquet') LIMIT 1`
  returns a row with the expected schema. Establishes that Parquet reads
  work.
- G3.1.B — Hybrid SQL+vector query (the `POST /vectors/hybrid`
  surface) returns the same workers as the Rust path on the same input,
  ranked the same way modulo embedding precision.
- G3.1.C — Hot-cache merge-on-read: register a base table + a delta
  Parquet, query, observe both rows merged with the delta winning on
  conflict (one possible SQL shape is sketched below).
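
A minimal sketch of one way G3.1.C could be expressed through the
`database/sql` handle from the snippet above. The bucket layout and the
`id` key column are illustrative assumptions, not the real schema:

```go
// Merge-on-read sketch for G3.1.C: delta rows shadow base rows with the same id.
func registerMergedView(db *sql.DB) error {
	stmts := []string{
		`CREATE OR REPLACE VIEW workers_base AS
		   SELECT * FROM read_parquet('s3://bucket/workers/base/*.parquet')`,
		`CREATE OR REPLACE VIEW workers_delta AS
		   SELECT * FROM read_parquet('s3://bucket/workers/delta/*.parquet')`,
		// Every delta row, plus only those base rows whose id is absent from the delta.
		`CREATE OR REPLACE VIEW workers AS
		   SELECT * FROM workers_delta
		   UNION ALL
		   SELECT b.* FROM workers_base b
		   WHERE NOT EXISTS (SELECT 1 FROM workers_delta d WHERE d.id = b.id)`,
	}
	for _, stmt := range stmts {
		if _, err := db.Exec(stmt); err != nil {
			return err
		}
	}
	return nil
}
```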

**Fallback if cgo is rejected:** run DuckDB as an external process
(`duckdb -json -c '...'` shelled out, or HTTP via a thin Go wrapper).
Adds operational surface; preserves the SQL model.
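
A minimal sketch of the shell-out fallback, assuming the `duckdb` CLI is
on PATH and that the `-json` and `-c` flags behave as the note above
describes (re-verify against the shipped CLI version):

```go
package duckdbproc

import (
	"bytes"
	"encoding/json"
	"fmt"
	"os/exec"
)

// QueryJSON runs one SQL statement through the duckdb CLI and decodes the JSON rows.
// Fallback path only: one extra process to operate, same SQL model as the cgo path.
func QueryJSON(query string) ([]map[string]any, error) {
	cmd := exec.Command("duckdb", "-json", "-c", query)
	var out, stderr bytes.Buffer
	cmd.Stdout = &out
	cmd.Stderr = &stderr
	if err := cmd.Run(); err != nil {
		return nil, fmt.Errorf("duckdb cli: %v: %s", err, stderr.String())
	}
	var rows []map[string]any
	if err := json.Unmarshal(out.Bytes(), &rows); err != nil {
		return nil, err
	}
	return rows, nil
}
```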

### §3.2 — HNSW index

**Library:** `coder/hnsw` — pure-Go HNSW, in-process. Supports add /
delete / search / persist.

**Open question:** does `coder/hnsw` match the recall@10 we measured
on the Rust `hora` path? Need a calibration test (sketched below):
- Rebuild `lakehouse_arch_v1` (the 1086-chunk arch corpus) in Go.
- Compare recall@10 on a fixed query set to the Rust baseline.
- Acceptance: ≤2% drop or we switch library / parameters.
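
A minimal sketch of the calibration harness, assuming the Rust baseline
is exported as query-ID → top-10 chunk IDs and the Go index sits behind
a small search interface (both the export format and the interface are
assumptions for illustration):

```go
package recallcal

// Searcher abstracts the Go-side index (e.g. a coder/hnsw graph inside vectord).
type Searcher interface {
	Search(query []float32, k int) []string // chunk IDs, best first
}

// RecallAt10 measures how much of the Rust hora top-10 the Go index reproduces,
// averaged over the fixed query set. G3.2.C passes when the result is ≥ 0.98.
func RecallAt10(idx Searcher, queries map[string][]float32, baseline map[string][]string) float64 {
	var total float64
	var counted int
	for qid, vec := range queries {
		truth := make(map[string]bool, len(baseline[qid]))
		for _, id := range baseline[qid] {
			truth[id] = true
		}
		if len(truth) == 0 {
			continue // no baseline recorded for this query
		}
		hits := 0
		for _, id := range idx.Search(vec, 10) {
			if truth[id] {
				hits++
			}
		}
		total += float64(hits) / float64(len(truth))
		counted++
	}
	if counted == 0 {
		return 0
	}
	return total / float64(counted)
}
```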

**Persistence format:** TBD — `coder/hnsw` has its own snapshot format;
the Go equivalent of ADR-008 (Parquet for embeddings + sidecar HNSW
file) needs revisiting to confirm the sidecar format we ship.

**Acceptance gates:**
- G3.2.A — Build HNSW from a Parquet of 100K vectors in <60s
- G3.2.B — Search 100K vectors at k=10 in <50ms p50
- G3.2.C — Recall@10 within 2% of Rust baseline on
  `lakehouse_arch_v1`

### §3.4 — Matrix indexer (corpus-as-shard composer)

**What it is.** The matrix indexer is the layer above `vectord` that
turns a fleet of single-corpus HNSW indexes into a learning meta-index.
In the Rust system this is emergent — split between corpus builders
(`scripts/build_*_corpus.ts`), the mode runner (`crates/gateway/src/v1/mode.rs`),
the observer relevance endpoint (`mcp-server/observer.ts`), and the
strong-model downgrade gate (`mode.rs::execute`). In Go we name it
explicitly so future sessions don't reduce it to "vectord."

**Why corpus-as-shard, not shard-by-id.** Sharding a single index by
hash(id) is a pure throughput hack with a recall tax. Sharding by
corpus is the existing retrieval shape — `lakehouse_arch_v1`,
`lakehouse_symbols_v1`, `scrum_findings_v1`, `lakehouse_answers_v1`,
`kb_team_runs_v1`, `successful_playbooks_live`, etc. — each with
distinct topology and a distinct retrieval intent. Concurrent Adds
parallelize naturally because they go to different corpora; the
matrix layer's job is to retrieve+merge across them, filter for
relevance, and downgrade composition when strong models prove the
matrix is anti-additive.

**Components to port (in dependency order):**

1. **Corpus builders** — Go equivalents of `scripts/build_*_corpus.ts`.
   For each named corpus, a builder that reads the source, splits it
   into chunks per the corpus's schema, embeds via `/v1/embed`, and adds
   to a vectord index of the same name. Effort: **M** for the first
   builder, **S** for each subsequent one.

2. **Multi-corpus retrieve+merge** (`internal/matrix/retrieve.go`) —
   given a query and a list of corpus names, search each at top_k=K,
   merge by score, return top N globally. Match Rust's pattern:
   top_k=6 per corpus, top 8 globally before the relevance filter.
   (A sketch of this merge plus the relevance filter follows this list.)

3. **Relevance filter** (`internal/matrix/relevance.go`) — port the
   threshold-based filter from `mcp-server/observer.ts:/relevance`.
   Drops adjacency-pollution chunks that share a corpus with the hit
   but aren't actually about the query. `LH_RELEVANCE_FILTER` /
   `LH_RELEVANCE_THRESHOLD` env knobs preserved.

4. **Strong-model downgrade gate** (`internal/matrix/downgrade.go`) —
   port `is_weak_model` + the `codereview_lakehouse → codereview_isolation`
   flip from `mode.rs::execute`. Pass5 proved composed corpora lose
   5/5 vs isolation on grok-4.1-fast (p=0.031); the gate is
   load-bearing for paid-model retrieval quality.

5. **Learning-loop integration** — write outcomes back to a
   playbook-memory corpus (probably a `lakehouse_answers_v1` analogue).
   This is what makes the matrix INDEX a learning system rather than
   static retrieval. Per `feedback_meta_index_vision.md`: this is the
   north star, not the data structure.
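
A minimal sketch of components 2 and 3, assuming a per-corpus vectord
search client and a scored-hit shape like the one below (both are
illustrative; the real types land in `internal/matrix/`):

```go
package matrix

import (
	"context"
	"sort"
)

// Hit is an assumed result shape; corpus attribution is what G3.4.A checks for.
type Hit struct {
	Corpus  string
	ChunkID string
	Score   float64 // higher is better
}

// CorpusSearcher abstracts the per-corpus vectord search call.
type CorpusSearcher interface {
	Search(ctx context.Context, corpus string, query []float32, topK int) ([]Hit, error)
}

// RetrieveMerge searches each corpus at perCorpusK, merges by score, and keeps
// the global top N — mirroring the Rust pattern (top_k=6 per corpus, top 8 globally).
func RetrieveMerge(ctx context.Context, s CorpusSearcher, corpora []string, query []float32, perCorpusK, globalN int) ([]Hit, error) {
	var merged []Hit
	for _, corpus := range corpora {
		hits, err := s.Search(ctx, corpus, query, perCorpusK)
		if err != nil {
			return nil, err
		}
		merged = append(merged, hits...)
	}
	sort.Slice(merged, func(i, j int) bool { return merged[i].Score > merged[j].Score })
	if len(merged) > globalN {
		merged = merged[:globalN]
	}
	return merged, nil
}

// FilterRelevance is the Go analogue of the observer /relevance gate: drop hits
// below the LH_RELEVANCE_THRESHOLD score (the comparison direction is an assumption).
func FilterRelevance(hits []Hit, threshold float64) []Hit {
	kept := hits[:0]
	for _, h := range hits {
		if h.Score >= threshold {
			kept = append(kept, h)
		}
	}
	return kept
}
```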

**Gateway routes:** `/v1/matrix/search` (multi-corpus retrieve+merge),
`/v1/matrix/corpora` (list + metadata), `/v1/matrix/relevance` (filter
endpoint, used by both internal callers and external tooling).

**Acceptance gates:**
- G3.4.A — `/v1/matrix/search` against ≥3 corpora returns merged top-N
  with corpus attribution per result.
- G3.4.B — Relevance filter drops at least the threshold-margin chunks
  on a known adjacency-pollution test case.
- G3.4.C — Strong-model downgrade gate flips composed→isolation when
  the model is non-weak; bypassed when caller sets `force_mode`.
- G3.4.D — Concurrent Adds across N=4 corpora parallelize (no shared
  write-lock); Add throughput scales near-linearly with corpus count.

**Persistence:** each corpus's vectord index persists via the existing
G1P LHV1 format. The matrix layer is stateless above that — corpus
list lives in catalog, retrieval params in config.

**Why this is its own §3.x:** in Rust the matrix indexer was emergent
and got reduced to "we have vectord" in earlier port-planning. The
SPEC names it explicitly so the port preserves the multi-corpus
retrieval shape AND the learning loop, not just the HNSW substrate.

### §3.5 — Drift quantification (loop 5 of the PRD)

**What it is.** The PRD names "drift" as the 5th loop: quantify when
historical decisions stop matching current reality. Distinct from
the rating+distillation loop because drift is MEASUREMENT, not
LEARNING. The learning loop says "this match worked, remember it";
the drift loop says "this 4-month-old playbook entry — does it
still match what the substrate would surface today?"

**What's shipped (commit `be65f85`):**
- SCORER drift: re-runs the current `distillation.ScoreRecord` over
  historical (EvidenceRecord, persisted_category) pairs and
  reports mismatches + a sorted shift matrix
- `internal/drift/drift.go` — pure-function `ComputeScorerDrift`
  (shift-matrix ordering sketched below)
- 6 unit tests covering no-drift, shift detection, multi-shift
  sorted-by-count, includeEntries flag, empty input, scorer-version
  stamping
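
A minimal sketch of the deterministic ordering G3.5.B asks for; the
`Shift` shape is an assumption to be read against `internal/drift/drift.go`:

```go
package drift

import "sort"

// Shift is one (persisted category → current category) transition and its count.
type Shift struct {
	From  string
	To    string
	Count int
}

// SortShifts orders the shift matrix count-descending, ties broken alphabetically
// by From then To, so the serialized report is stable across runs (G3.5.B).
func SortShifts(shifts []Shift) {
	sort.Slice(shifts, func(i, j int) bool {
		a, b := shifts[i], shifts[j]
		if a.Count != b.Count {
			return a.Count > b.Count
		}
		if a.From != b.From {
			return a.From < b.From
		}
		return a.To < b.To
	})
}
```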

**Future drift shapes (not shipped):**
- PLAYBOOK drift: re-run playbook queries through current
  matrix-search; recorded answer not in top-K = drift
- EMBEDDING drift: KS-test on vector distribution at T1 vs T2
- AUDIT BASELINE drift: matches the Rust `audit_baselines.jsonl`
  longitudinal signal

**Acceptance gates:**
- G3.5.A — A scorer-version bump triggers a non-zero `Drifted` count
  on a corpus of historical ScoredRuns where the new logic produces
  different categories than the persisted ones.
- G3.5.B — `ScorerDriftReport.ShiftMatrix` is deterministically ordered
  (count desc, ties broken alphabetically) so JSON output is stable
  across runs.

### §3.6 — Staffing-side structured filter

**What it is.** Reality tests on the candidates + workers corpora
(commits `0d1553c`, `a97881d`) surfaced that pure semantic retrieval
can't gate by location/status/availability — the matrix indexer
returns Production Workers for a Forklift+OSHA-30 query because
nomic-embed-text's geometry doesn't separate the role labels well.
Structured filtering is the addressable piece: pre-filter the
candidate set on metadata fields BEFORE semantic ranking.

**What's shipped (commit `b199093`):**
- `SearchRequest.MetadataFilter` — `map[string]any` of metadata
  field → expected value (single value, or list-of-values for OR
  semantics within a key, AND across keys — sketched below)
- Post-retrieval filter applied before top-K truncation in
  `internal/matrix/retrieve.go`
- `SearchResponse.MetadataFilterDropped` for telemetry on filter
  aggressiveness
- 7 unit tests covering nil filter, missing metadata, exact match,
  AND across keys, OR within list, bool match, malformed JSON
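
A minimal sketch of the filter semantics (equality only; AND across
keys, OR within a list value). The metadata map argument is an
assumption about the stored result shape:

```go
package matrix

// MatchesMetadata reports whether a result's metadata satisfies the filter:
// every filter key must be present and match (AND across keys); a []any value
// matches when any element matches (OR within a key). Equality only — range,
// prefix, and regex operators are deferred.
func MatchesMetadata(meta map[string]any, filter map[string]any) bool {
	for key, want := range filter {
		got, ok := meta[key]
		if !ok {
			return false // missing key drops the result (G3.6.C)
		}
		switch w := want.(type) {
		case []any:
			matched := false
			for _, candidate := range w {
				if got == candidate {
					matched = true
					break
				}
			}
			if !matched {
				return false
			}
		default:
			if got != want {
				return false
			}
		}
	}
	return true
}
```

Applied per hit before top-K truncation; the number of hits it removes
feeds `MetadataFilterDropped`.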

**Deferred:**
- Pre-retrieval SQL gate via `queryd` (the actual hybrid). The
  post-retrieval filter is an MVP that helps when the candidate
  set is mostly relevant; for aggressive filters that drop most
  results, a SQL pre-filter into matrix retrieval would surface
  the right candidates with less wasted embedding work.
- Filter language richer than equality (e.g. range, prefix, regex).

**Acceptance gates:**
- G3.6.A — `MetadataFilter: {"state": "IL"}` against a mixed-state
  corpus drops every non-IL result; `MetadataFilterDropped` reports
  the count.
- G3.6.B — List filter `{"state": ["IL", "WI"]}` keeps both states,
  drops the rest (OR within key).
- G3.6.C — Multi-key filter is AND: a result missing any key is
  dropped, no exception.

### §3.7 — Operational rating wiring

**What it is.** PRD loop 4 (rating + distillation) needs real
inflows to be a learning system rather than a substrate. The
playbook-record endpoint (`06e7152`) takes one (query, answer,
score) per call; productizing it into actual signal sources is what
makes the system get smarter with use.

**What's shipped (commit `6392772`):**
- `POST /v1/matrix/playbooks/bulk` — bulk-record N successes;
  per-entry success/failure response so callers can see which of
  a 4,701-row historical placement import succeeded vs which
  failed validation (wire shapes sketched below).
- Single-record path from `06e7152` unchanged.
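
A minimal sketch of the bulk wire shapes implied by G3.7.A; field names
beyond `recorded`, `failed`, and `results[]` are assumptions:

```go
package playbook

// BulkRecordRequest is the POST /v1/matrix/playbooks/bulk body: one entry per
// (query, answer, score) success to record.
type BulkRecordRequest struct {
	Entries []PlaybookEntry `json:"entries"`
}

type PlaybookEntry struct {
	Query  string  `json:"query"`
	Answer string  `json:"answer"`
	Score  float64 `json:"score"`
}

// BulkRecordResponse reports per-entry outcomes so a 4,701-row import can see
// exactly which rows failed validation; one bad entry never aborts the batch.
type BulkRecordResponse struct {
	Recorded int           `json:"recorded"`
	Failed   int           `json:"failed"`
	Results  []EntryResult `json:"results"`
}

// EntryResult carries either the assigned playbook entry ID or the validation error.
type EntryResult struct {
	ID    string `json:"id,omitempty"`
	Error string `json:"error,omitempty"`
}
```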

**Deferred:**
- UI shim for click-tracking (no Go demo UI yet — the Bun demo at
  `devop.live/lakehouse/` is still serving the public surface).
  When the Go UI lands or a feedback API is added to the Bun UI,
  every coordinator click → bulk-batched POST → playbook entry.
- Negative feedback ("this match didn't work"). Currently only
  positive scores are recorded; a rejection signal would help the
  learning loop avoid pushing bad matches.
- Time-decay on playbook scores so stale recommendations attenuate.

**Acceptance gates:**
- G3.7.A — Bulk POST of N entries returns `{recorded, failed,
  results[]}` with per-entry IDs/errors, no single-entry failure
  aborting the batch.
- G3.7.B — Each recorded entry surfaces in `/v1/matrix/search` with
  `use_playbook=true` after a re-query.

### §3.8 — Observer-KB workflow runner (Archon-style multi-pass)

**What it is.** The architectural pattern documented in the Rust
`observer-kb` branch (10 commits ahead of main, never merged) and
proven by `/home/profit/external/Archon`'s workflow engine. Multiple
mode passes processing data, with each pass an objective measurement
that contributes to the KB:

```
Raw data
  ↓ Mode: EXTRACT        structured facts/entities/relationships
  ↓ Mode: VALIDATOR      fact-check, confidence 1-10
  ↓ Mode: HALLUCINATION  verify each claim, flag likely fabrications
  ↓ Mode: CONSENSUS      multiple passes until extraction converges
  ↓ Mode: REDTEAM        attack what survived, patch what fails
  ↓ Mode: PIPELINE       clean → Q&A structure → topic group → rank
  ↓ RENDER               curated doc anchored on questions
```

This is the *orchestrator* missing from §3.4 components 1–5: each
SPEC §3.4 piece (relevance, downgrade, scorer, drift) is a "mode";
what's missing is the workflow engine that chains them.

**Why it matters.** Per the PRD's product vision: the observer
should make actionable decisions based on watching what's
successful. The workflow runner is how observers compose modes
into multi-pass pipelines that score outcomes rigorously enough
to feed the KB and inform the playbook substrate.

**Reference materials on the system:**
- `/home/profit/lakehouse/.archon/workflows/lakehouse-architect-review.yaml`
  (committed `69919d9` in main) — proves Archon-via-Lakehouse
  works with a 3-node `shape → weakness → improvement` workflow
- `/home/profit/external/Archon` — the upstream workflow engine
  (cloned 2026-04-26); `packages/providers/src/community/pi/provider.ts`
  has the local Lakehouse-routing mod committed locally as
  `3f2afc8` (not pushed to upstream `coleam00/Archon`)
- Rust `observer-kb` branch (10 commits, +4338/-55506 LoC) —
  `apps/observer-kb/docs/PRD.md` documents the multi-pass
  architecture; `scripts/{deep_analysis,extract_knowledge,process_knowledge}.py`
  are the Python prototypes that proved it on real ChatGPT/Claude
  PDF data (496 topics, 300 decisions, 100 insights extracted)

**Components to port (in dependency order):**

1. **Workflow definition** (`internal/workflow/types.go`) — YAML
   schema matching Archon's shape: `name`, `description`, `provider`,
   `model`, list of `nodes` each with `id`, `prompt`, `allowed_tools`,
   `effort`, `idle_timeout`, `depends_on`. The depends_on edges form
   a DAG; the runner resolves it topologically. (A sketch of the types,
   the mode signature, and the topological walk follows this list.)

2. **Node executor** (`internal/workflow/runner.go`) — given a
   workflow and a starting context, walks the DAG, executes each
   node by dispatching to the configured backend (matrix.Search,
   distillation.ScoreRecord, drift.ComputeScorerDrift, or a generic
   prompt-against-LLM via gateway `/v1/chat`), captures per-node
   output, and makes it available as `$<node_id>.output` in subsequent
   nodes.

3. **Provenance recording** — every node execution lands an
   ObservedOp (via the observerd substrate from `bc9ab93`) with
   `source: "workflow"`, the workflow name + node ID, input/output
   summaries, and timing. The ring buffer + JSONL log become the
   substrate for the rating+distillation loop's KB feed.

4. **Mode catalog** (`internal/workflow/modes.go`) — registry of
   the modes the runner can dispatch to. Each mode is a Go function
   matching a uniform `func(ctx context.Context, input map[string]any) (map[string]any, error)`
   signature so workflows can compose them. Initial modes from
   §3.4: `matrix.search`, `matrix.relevance`, `matrix.downgrade`,
   `playbook.record`, `playbook.lookup`, `distillation.score`,
   `drift.scorer`. Plus `llm.chat` for free-form mode prompts.

5. **HTTP surface** — `POST /v1/observer/workflow/run` accepts a
   workflow YAML body + a starting context; returns the per-node
   results + the chain of ObservedOps generated. `GET
   /v1/observer/workflow/list` lists workflows in a known directory
   for operator discoverability.
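
A minimal sketch tying components 1, 2, and 4 together — types
matching the YAML shape above, the uniform mode signature, and the
topological resolution of `depends_on` edges. Field handling and the
dispatch details are assumptions to be checked against the Archon
engine and `apps/observer-kb/docs/PRD.md`:

```go
package workflow

import (
	"context"
	"fmt"
)

// Node and Workflow mirror the YAML shape listed in component 1; the yaml tags
// assume the field names carry over one-to-one.
type Node struct {
	ID           string   `yaml:"id"`
	Prompt       string   `yaml:"prompt"`
	AllowedTools []string `yaml:"allowed_tools"`
	Effort       string   `yaml:"effort"`
	IdleTimeout  string   `yaml:"idle_timeout"`
	DependsOn    []string `yaml:"depends_on"`
}

type Workflow struct {
	Name        string `yaml:"name"`
	Description string `yaml:"description"`
	Provider    string `yaml:"provider"`
	Model       string `yaml:"model"`
	Nodes       []Node `yaml:"nodes"`
}

// Mode is the uniform signature from component 4; matrix.search, drift.scorer,
// playbook.record, llm.chat, etc. all register under a name with this shape.
type Mode func(ctx context.Context, input map[string]any) (map[string]any, error)

// Catalog is the mode registry the runner dispatches against.
type Catalog map[string]Mode

// TopoOrder resolves depends_on edges into an execution order (Kahn-style);
// the runner then executes nodes in this order, exposing each node's output
// as $<node_id>.output to later nodes (G3.8.C).
func TopoOrder(wf Workflow) ([]string, error) {
	indeg := make(map[string]int, len(wf.Nodes))
	for _, n := range wf.Nodes {
		indeg[n.ID] = len(n.DependsOn)
	}
	order := make([]string, 0, len(wf.Nodes))
	for len(order) < len(wf.Nodes) {
		progressed := false
		for _, n := range wf.Nodes {
			if indeg[n.ID] != 0 {
				continue // already scheduled (-1) or still has unmet dependencies
			}
			indeg[n.ID] = -1 // mark scheduled
			order = append(order, n.ID)
			for _, m := range wf.Nodes {
				for _, dep := range m.DependsOn {
					if dep == n.ID {
						indeg[m.ID]--
					}
				}
			}
			progressed = true
		}
		if !progressed {
			return nil, fmt.Errorf("workflow %q: cycle or unknown depends_on", wf.Name)
		}
	}
	return order, nil
}
```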

**Why integrate into observerd, not a new service.** The observer
is the system resource that watches and records. Workflows ARE
observation patterns — multi-step processes whose every step is
recorded. Putting the runner inside observerd keeps the
"measurement → KB feed" wiring tight; a separate service would
re-implement the recording layer.

**Acceptance gates:**
- G3.8.A — Load a workflow YAML matching the Archon `lakehouse-architect-review.yaml`
  shape; the runner executes the 3-node DAG topologically.
- G3.8.B — Each node execution lands an ObservedOp with
  `source: "workflow"` and the node's input/output. The stats endpoint
  shows the workflow ops.
- G3.8.C — A node referencing `$<prior_node>.output` in its prompt
  resolves correctly; a missing reference is a clear error, not a
  silent empty string.
- G3.8.D — Mode catalog dispatches a `matrix.search` invocation to
  the matrixd backend without going through HTTP (in-process
  function call when matrixd is co-resident).

**Status:** PORT TARGET, not yet started. The SPEC commits the design;
implementation is its own wave (estimated **L** effort given the
DAG runner + mode dispatch + provenance recording).

### §3.3 — UI (HTMX)

**Approach:** server-rendered Go templates using `html/template`,
HTMX for partial-page swaps, Alpine.js for client-side interactivity
where needed. Single binary serves API + UI.
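
A minimal sketch of the pattern for the `Ask` tab (G3.3.A), assuming an
`ask_result.html` partial and a RAG client function; the names are
illustrative:

```go
package web

import (
	"html/template"
	"net/http"
)

// askPartial renders only the answer fragment that HTMX swaps into the page.
var askPartial = template.Must(template.ParseFiles("web/templates/ask_result.html"))

// handleAsk backs a <form hx-post="/ask" hx-target="#answer"> submission:
// the question goes to the RAG path and only the #answer fragment re-renders,
// no full page reload.
func handleAsk(rag func(question string) (string, error)) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		answer, err := rag(r.FormValue("question"))
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadGateway)
			return
		}
		_ = askPartial.Execute(w, map[string]string{"Answer": answer})
	}
}
```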

**Acceptance gates:**
- G3.3.A — `Ask` tab: type natural-language question, get answer
  from RAG endpoint, render in-page without full reload
- G3.3.B — `Explore` tab: paginated dataset list with hot-swap badge
  rendering
- G3.3.C — `SQL` tab: textarea → submit → tabular result rendered
  in-page
- G3.3.D — `System` tab: live tail of `/storage/errors` and
  `/hnsw/trials` via HTMX polling

**Fallback if HTMX feels limiting:** split repo `golangLAKEHOUSE-ui`
with Vite + React, served as static files by Go gateway. Costs an
extra repo + build chain.

### §3.9 — Pathway memory port

**Constraint:** the Rust `pathway_memory` and TS implementations were
byte-matching per ADR-021. The byte contract was verified by running
both implementations on the same input tokens and asserting matching
bucket indices.

**Go port plan:**
- Port the 32-bucket SHA256-keyed token hash exactly. Verify on a
  golden input that Go produces the same bucket vector as Rust.
  (One possible shape is sketched after this list, as illustration only.)
- Port the JSON state file format verbatim — the existing 88 traces in
  `data/_pathway_memory/state.json` reload as-is into the Go
  implementation.
- Port the matrix-correctness layer (ADR-021's `SemanticFlag`,
  `BugFingerprint`, `TypeHint`) — these are pure value types,
  trivially portable.
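
One possible shape of the 32-bucket token hash, purely illustrative —
how the SHA-256 digest reduces to a bucket index (which bytes, counts
vs. flags) is not specified here and must be copied from the Rust/TS
implementations, then proven by the byte-match gate:

```go
package pathway

import (
	"crypto/sha256"
	"encoding/binary"
)

const numBuckets = 32

// BucketVector is a sketch only. The reduction used here (first 4 digest bytes,
// big-endian, mod 32, counted per token) is an assumption; the port must
// replicate the Rust/TS reduction exactly and verify it via G3.9.B.
func BucketVector(tokens []string) [numBuckets]uint32 {
	var buckets [numBuckets]uint32
	for _, tok := range tokens {
		sum := sha256.Sum256([]byte(tok))
		idx := binary.BigEndian.Uint32(sum[:4]) % numBuckets
		buckets[idx]++
	}
	return buckets
}
```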

**Acceptance gates:**
- G3.9.A — Load existing `state.json`, run `replay` on the same 11
  prior successful pathways, all 11 succeed (matching the Rust 11/11
  baseline).
- G3.9.B — Bucket vector for a fixed test input byte-matches the
  Rust output.

---

## §4. Phase plan

### Phase G0 — Skeleton (Week 1–3)

**Scope:** smallest end-to-end ingest + query path working in Go.

| Component | Deliverable |
|---|---|
| `cmd/gateway` | HTTP on :3100, `/health`, `/v1/chat` proxy stub |
| `cmd/catalogd` | In-memory registry + Parquet manifest persistence |
| `cmd/storaged` | Single-bucket S3 / local FS, no error journal yet |
| `cmd/ingestd` | CSV → Parquet, schema inference, register-on-ingest |
| `cmd/queryd` | DuckDB-backed `POST /sql` endpoint |

**Acceptance:** upload a CSV via `POST /ingest`, query it via
`POST /sql` with a SELECT, get rows back. Single-bucket. No vector,
no profile, no UI.

### Phase G1 — Vector + RAG (Week 4–6)

| Component | Deliverable |
|---|---|
| `cmd/vectord` | Embed-on-ingest (calls Python sidecar), HNSW build, `POST /search` |
| `cmd/gateway` | Add `POST /rag` (embed → search → retrieve → generate via aibridge) |
| `cmd/aibridge` | HTTP client to existing Python sidecar |

**Acceptance:** ingest 15K resumes (the original Phase 7 fixture),
ask "find me a forklift operator with OSHA-10 in IL", get ranked
results with LLM-generated explanation grounded in the retrieved
chunks.

### Phase G2 — Federation + profiles (Week 7–8)

| Component | Deliverable |
|---|---|
| `cmd/storaged` | Multi-bucket registry, rescue bucket, error journal at `primary://_errors/` |
| Profile system | Per-reader profile bound to bucket + vector index |
| Hot-swap | Atomic pointer swap for index generations |

**Acceptance:** two profiles bound to two buckets, queries scoped
correctly, hot-swap a vector index without query interruption,
rollback works.

### Phase G3 — Pathway memory + distillation (Week 9–11)

| Component | Deliverable |
|---|---|
| `cmd/vectord` | Pathway memory module ported, 88 traces reloaded |
| Distillation pipeline | SFT export, contamination firewall, scorer |
| Audit baselines | `audit_baselines.jsonl` longitudinal signal port |

**Acceptance:** replay 11 prior successful pathways, all 11 succeed.
Re-run distillation acceptance on the frozen fixture set, 22/22 pass.

### Phase G4 — TS surfaces → Go (Week 12–14)

| Component | Deliverable |
|---|---|
| `cmd/mcp` | MCP server (replaces Bun) — `/v1/chat`, intelligence endpoints |
| `cmd/observer` | Autonomous iteration loop, op recording |
| `cmd/auditor` | PR audit pipeline (kimi/haiku/opus rotation) |
| `cmd/scrum` | Scrum master pipeline (replaces TS) |

**Acceptance:** open a test PR, auditor cycles within 90s, emits
verdict to `data/_auditor/kimi_verdicts/`, behavior matches Rust+TS
era within tolerance.

### Phase G5 — UI + demo parity (Week 15–16)

| Component | Deliverable |
|---|---|
| `cmd/gateway` | Serves HTMX templates + static demo HTML |
| Demo at `devop.live/lakehouse/` | Parity with current Bun demo |
| Staffer console at `/console` | Parity |

**Acceptance:** `devop.live/lakehouse/` cuts over from Bun to Go
gateway. Section ① / ② / ③ all render. Compact contract cards still
expand with Project Index. Fill-probability bars still paint.

---

## §5. Repo layout

```
golangLAKEHOUSE/
├── docs/
│   ├── PRD.md           ← this PRD
│   ├── SPEC.md          ← this spec
│   ├── DECISIONS.md     ← Go-era ADRs (start fresh, reference Rust ADRs by number)
│   └── ADR-XXX-*.md     ← per-ADR detail
├── cmd/
│   ├── gateway/         ← main HTTP/gRPC ingress
│   ├── catalogd/
│   ├── storaged/
│   ├── queryd/
│   ├── ingestd/
│   ├── vectord/
│   ├── journald/
│   ├── mcp/
│   ├── observer/
│   ├── auditor/
│   └── scrum/
├── internal/            ← shared packages, not exported
│   ├── aibridge/
│   ├── validator/
│   ├── truth/
│   ├── shared/
│   ├── proto/           ← generated protobuf
│   └── pathway/
├── pkg/                 ← public Go packages (none initially)
├── web/                 ← UI (HTMX templates + static)
│   ├── templates/
│   └── static/
├── scripts/             ← cold-start, smoke, distill scripts
├── tests/               ← golden files, integration tests
├── go.mod
├── go.sum
└── README.md
```

**Single Go module.** All commands and internal packages live under
`golangLAKEHOUSE/`. No nested modules unless a package needs an
independent release cadence (none expected).

**Build:** `go build ./cmd/...` produces all binaries.

---

## §6. Migration data plan

### What ports verbatim
- Parquet datasets at `data/datasets/*.parquet` — read by Go directly.
- Catalog manifests — Parquet, ports as data not code.
- Pathway memory state — JSON, ports if the §3.9 byte-matching gate
  (G3.9.B) passes.

### What rebuilds
- HNSW indexes — rebuild from Parquet embeddings on first Go startup.
- Auditor verdicts on PRs — old PRs won't be re-audited; lineage starts
  fresh on the new repo's PRs.

### What's archived
- The Rust `crates/` tree — preserved in the original repo at the
  cutover commit, tagged `pre-go-rewrite-2026-04-28` for reference.
- TS surfaces (`mcp-server/`, `auditor/`, etc.) — preserved in the
  original repo at the same tag.
- Distillation v1.0.0 substrate (`tag distillation-v1.0.0`,
  `e7636f2`) — kept as the historical reference; Go re-implementation
  ports the LOGIC but not the bit-identical-reproducibility property
  unless an ADR re-establishes it.

### What's discarded
- `crates/vectord-lance/` (Lance backend, see PRD §Hard problems §2)
- `crates/lance-bench/` (criterion benchmarks specific to Lance)

---

## §7. Acceptance: when is the rewrite done?

The Go Lakehouse reaches **feature parity** when:

1. **All 12 Rust PRD invariants hold** (object-storage source of truth,
   catalog metadata authority, idempotent ingest, hot-swap atomicity,
   profiles, etc.).
2. **The 16 distillation acceptance gates pass** (re-run
   `./scripts/distill audit-full` against the Go pipeline).
3. **The 22/22 acceptance fixtures from `tests/fixtures/distillation/acceptance/`
   pass** under the Go implementation.
4. **The 145 unit tests of distillation v1.0.0 are ported and pass.**
5. **`devop.live/lakehouse/` demo cuts over to Go gateway** with no
   visible UI regressions.
6. **Auditor emits Kimi/Haiku/Opus verdicts** on a test PR, matching
   the cross-lineage rotation behavior.
7. **The 88 pathway traces replay** with 11/11 prior successes
   reproduced.

At that point the Rust repo enters maintenance-only mode (security
fixes), and the Go repo becomes the live system.

---

## §8. Ratified — Phase G0 unblocked (2026-04-28, J)

| # | Decision | Spec impact |
|---|---|---|
| 1 | DuckDB via cgo (`marcboeker/go-duckdb`) | §3.1 option A — proceed |
| 2 | HTMX + `html/template` + Alpine.js | §3.3 option A — proceed |
| 3 | `git.agentview.dev/profit/golangLAKEHOUSE` | repo location locked |
| 4 | Distillation rebuilt in Go (no bit-identical port) | §6 — port logic, not fixtures |
| 5 | Pathway memory starts empty; old traces noted | §3.9 G3.9.A is now "build initial state from scratch in Phase G3"; G3.9.B (byte-match) preserved as the porting correctness gate when the algorithm is reimplemented |
| 6 | Auditor longitudinal signal restarts | new `audit_baselines.jsonl` lineage starts on first Go-era PR |

See `docs/DECISIONS.md` ADR-001 for full rationale and
`docs/RUST_PATHWAY_MEMORY_NOTE.md` for where the legacy 88 traces live.

**Phase G0 is now unblocked.** Next step: bootstrap the Go module
skeleton + push to Gitea, then begin §4 Phase G0 implementation.
|