# STATE OF PLAY — Lakehouse-Go

Last verified: 2026-04-30 ~01:00 CDT
Verified by: live probes + `just verify` PASS, not memory.

Read this FIRST. When the user says "we're working on lakehouse," default to the Go rewrite (this repo); the Rust legacy at `/home/profit/lakehouse/` is maintenance-only. If memory contradicts this file, this file wins. Update it when something is verified working — not when a phase finishes.
## VERIFIED WORKING RIGHT NOW

### Substrate (G0 + G1 family)

13 service binaries under `cmd/` plus 2 driver scripts under `scripts/staffing_*` build into `bin/`. 18 smoke scripts all PASS. `just verify` (vet + 30 packages × short tests + 9 core smokes) green in ~31s wall.
| Binary | Port | What |
|---|---|---|
| gateway | 3110 | reverse proxy, single OpenAI-compat-style edge |
| storaged | 3211 | S3 GET/PUT/LIST/DELETE w/ per-prefix PUT cap (ADR-002) |
| catalogd | 3212 | Parquet manifests, ADR-020 idempotent register |
| ingestd | 3213 | CSV → Parquet → catalogd, content-addressed keys |
| queryd | 3214 | DuckDB SELECT over Parquet via httpfs |
| vectord | 3215 | HNSW indexes (coder/hnsw), persistence to storaged |
| embedd | 3216 | Ollama-backed embedder w/ LRU cache |
| pathwayd | 3217 | Mem0 ops (Add/Update/Revise/Retire/History/Search) |
| matrixd | 3218 | Multi-corpus retrieve+merge + relevance + downgrade + playbook |
| observerd | 3219 | Witness loop, workflow runner with DAG executor |
| chatd | 3220 | LLM dispatcher: ollama / ollama_cloud / openrouter / opencode / kimi |
| mcpd | — | MCP SDK port (Bun mcp-server replacement) |
| fake_ollama | — | Test fixture (used by g2_smoke_fixtures.sh) |
### Matrix indexer — all 5 SPEC §3.4 components shipped

- Corpus builders (`internal/corpusingest`)
- Multi-corpus retrieve+merge (`matrixd` `/matrix/search`)
- Relevance filter (`internal/matrix/relevance.go`, 376 LoC + 289 LoC test)
- Strong-model downgrade gate (`internal/matrix/downgrade.go`, reads `cfg.Models.WeakModels` after Phase 2)
- Playbook memory + boost (`internal/matrix/playbook.go`, learning loop)
### Pathway memory (Mem0 substrate)

Full ADR-004 surface shipped. Cycle detection + retired-trace exclusion proven by tests: `TestHistory_CycleDetected`, `TestRetire_ExcludedFromSearch`, `TestRevise_ChainOfThree_BackwardWalk`. JSONL append-only persistence with corruption tolerance.
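The corruption-tolerance property above can be sketched as follows. This is illustrative only — the `trace` struct and `loadJSONL` helper here are hypothetical stand-ins, not the real `internal/pathway/` schema; the point is that a torn final write must not poison the rest of the log.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
)

// trace is an illustrative versioned Mem0-style record with opaque content,
// mirroring ADR-004's json.RawMessage choice (field names are assumptions).
type trace struct {
	ID      string          `json:"id"`
	Op      string          `json:"op"` // add | update | revise | retire
	Content json.RawMessage `json:"content"`
}

// loadJSONL replays an append-only JSONL log, skipping any line that fails
// to parse (e.g. a torn trailing write) instead of aborting the whole load.
func loadJSONL(data []byte) []trace {
	var out []trace
	sc := bufio.NewScanner(bytes.NewReader(data))
	for sc.Scan() {
		var t trace
		if err := json.Unmarshal(sc.Bytes(), &t); err != nil {
			continue // corruption tolerance: drop the bad line, keep the rest
		}
		out = append(out, t)
	}
	return out
}

func main() {
	log := []byte(`{"id":"t1","op":"add","content":{"note":"first"}}
{"id":"t1","op":"revise","content":{"note":"second"}}
{"id":"t1","op":"rev`) // torn final write
	fmt.Println(len(loadJSONL(log))) // 2 — the torn line is skipped
}
```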
### Observer + workflow runner

- `observerd` ring buffer + JSONL persistence
- Workflow DAG executor (Archon-style) with 5 native modes wired: `matrix.relevance`, `matrix.downgrade`, `matrix.search`, `distillation.score`, `drift.scorer`. Plus `fixture.echo` / `fixture.upper` for runner-mechanics smokes.
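A minimal sketch of the registry-of-modes pattern the runner uses, assuming a map of named step functions and linear-chain execution (the real observerd executor handles a full DAG; `nodeFunc`, `node`, and `run` here are hypothetical names for illustration):

```go
package main

import (
	"fmt"
	"strings"
)

// nodeFunc is the assumed shape of a native workflow mode: input in, output out.
type nodeFunc func(string) string

// node is one workflow step: a registered mode plus the node it consumes.
type node struct {
	name string
	mode string
	dep  string // "" means the node takes the workflow input directly
}

// modes mirrors the fixture modes named above; real modes like
// matrix.relevance would be registered the same way.
var modes = map[string]nodeFunc{
	"fixture.echo":  func(s string) string { return s },
	"fixture.upper": strings.ToUpper,
}

// run executes nodes in order, feeding each node its dependency's output.
// A linear chain stands in for full DAG scheduling.
func run(nodes []node, input string) map[string]string {
	out := map[string]string{}
	for _, n := range nodes {
		in := input
		if n.dep != "" {
			in = out[n.dep]
		}
		out[n.name] = modes[n.mode](in)
	}
	return out
}

func main() {
	res := run([]node{
		{name: "a", mode: "fixture.echo"},
		{name: "b", mode: "fixture.upper", dep: "a"},
	}, "hello")
	fmt.Println(res["b"]) // HELLO
}
```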
### Distillation + drift

- E (partial) at `57d0df1` — scorer + contamination firewall ported from Rust v1.0.0 (logic only per ADR-001 §1.4; not bit-identical).
- F (first slice) at `be65f85` — drift quantification, scorer drift first.
### chatd — Phase 4 (shipped 2026-04-30, scrum-hardened same day)

Multi-provider LLM dispatcher routing `/v1/chat` by model-name prefix or `:cloud` suffix:
| Prefix / suffix | Provider | Auth |
|---|---|---|
| `ollama/<m>` or bare | ollama (local) | none |
| `ollama_cloud/<m>` or `<m>:cloud` | ollama_cloud | Bearer (`OLLAMA_CLOUD_KEY`) |
| `openrouter/<v>/<m>` | openrouter | Bearer (`OPENROUTER_API_KEY`) |
| `opencode/<m>` | opencode | Bearer (`OPENCODE_API_KEY`) |
| `kimi/<m>` | kimi | Bearer (`KIMI_API_KEY`) |
The four cloud-provider keys live in `/etc/lakehouse/{ollama_cloud,openrouter,opencode,kimi}.env` files (mode 0600); local ollama needs none. An empty or missing file leaves that provider unregistered (404 at first call instead of 503). Test request: `POST /v1/chat {"model":"opencode/claude-opus-4-7","messages":[{"role":"user","content":"hi"}],"max_tokens":8}`.
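The routing rules in the table above reduce to a prefix/suffix match. A minimal sketch — `resolveProvider` is a hypothetical helper, not the shipped chatd router, which also strips the prefix and tracks registration state:

```go
package main

import (
	"fmt"
	"strings"
)

// resolveProvider maps a model name to a provider label using the
// prefix / suffix rules from the table above.
func resolveProvider(model string) string {
	// Check ollama_cloud before ollama so the longer prefix wins.
	for _, p := range []string{"ollama_cloud", "openrouter", "opencode", "kimi", "ollama"} {
		if strings.HasPrefix(model, p+"/") {
			return p
		}
	}
	if strings.HasSuffix(model, ":cloud") {
		return "ollama_cloud"
	}
	return "ollama" // bare model names default to local ollama
}

func main() {
	for _, m := range []string{
		"qwen3.5:latest",
		"kimi-k2.6:cloud",
		"openrouter/anthropic/claude-opus-4-7",
		"opencode/claude-opus-4-7",
	} {
		fmt.Println(m, "→", resolveProvider(m))
	}
}
```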
`Request.Temperature` is a `*float64` (pointer) — Anthropic 4.7 deprecates `temperature` entirely, so the field is omitted when the caller doesn't set it.
### Model tier registry

`lakehouse.toml` `[models]` names model IDs by tier so swaps are 1-line:

```toml
local_fast      = "qwen3.5:latest"
local_judge     = "qwen3.5:latest"
cloud_judge     = "kimi-k2.6:cloud"
cloud_review    = "qwen3-coder:480b"
frontier_review = "openrouter/anthropic/claude-opus-4-7"
frontier_arch   = "openrouter/moonshotai/kimi-k2-0905"
frontier_free   = "opencode/claude-opus-4-7"
weak_models     = ["qwen3.5:latest", "qwen3:latest"] # matrix.downgrade bypass
```

Callers read `cfg.Models.LocalJudge` etc. instead of literal strings. The playbook_lift harness, matrix.downgrade, and observerd's `MatrixDowngradeWithWeakList` factory have all migrated.
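A sketch of the caller side, assuming a struct shape for `cfg.Models` (the field names mirror the TOML keys above but are a guess at the real config type; `isWeak` is an illustrative stand-in for the matrix.downgrade check):

```go
package main

import "fmt"

// modelsConfig mirrors the [models] table: one field per tier plus the
// weak-model list that matrix.downgrade consults.
type modelsConfig struct {
	LocalJudge string
	WeakModels []string
}

// isWeak reports whether a model is on the configured weak list, i.e.
// whether the downgrade gate lets it through without a strong-model pass.
func isWeak(cfg modelsConfig, model string) bool {
	for _, w := range cfg.WeakModels {
		if w == model {
			return true
		}
	}
	return false
}

func main() {
	cfg := modelsConfig{
		LocalJudge: "qwen3.5:latest",
		WeakModels: []string{"qwen3.5:latest", "qwen3:latest"},
	}
	// Callers reference cfg fields, never string literals, so a tier swap
	// in lakehouse.toml propagates everywhere in one line.
	fmt.Println(isWeak(cfg, cfg.LocalJudge))                            // true
	fmt.Println(isWeak(cfg, "openrouter/anthropic/claude-opus-4-7"))    // false
}
```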
### Code health

- `go vet ./...` → 0 warnings, 0 errors
- `go test -short ./...` → all green, 349 test functions
- `just verify` → PASS (vet + tests + 9 smokes) in ~31s
- 18 smoke scripts (9 core gating verify + 9 domain smokes for new daemons)
### Latest scrum: 2026-04-30 cross-lineage wave

Composite 50/60 at scrum2 head `c7e3124` (was 35 baseline → 43 R1 → 50 R2). Today's chatd wave was reviewed by Opus + Kimi + Qwen3-coder via chatd's own `/v1/chat`; 2 BLOCKs + 2 WARNs landed as fixes (`0efc736`); reusable driver at `scripts/scrum_review.sh`.
## DO NOT RELITIGATE

### Ratified ADRs (docs/DECISIONS.md)

- ADR-001: DuckDB via cgo, HTMX UI, Gitea hosting, distillation rebuilt-not-ported, pathway memory clean start, auditor longitudinal signal restarts. 6 sub-decisions, all final.
- ADR-002: storaged per-prefix PUT cap (4 GiB for `_vectors/`, 256 MiB elsewhere) — implemented at `423a381`. An operator-config bump, not a constant change, is the documented path if 4 GiB is ever insufficient.
- ADR-003: Inter-service auth = Bearer + IP allowlist, opt-in via `cfg.Auth.Token`. Wiring deferred to Sprint 1 but the design is locked — alternatives (mTLS, JWT, OAuth2, IP-only) were all considered and rejected.
- ADR-004: Pathway memory = Mem0 versioned traces, JSONL append-only persistence, opaque `json.RawMessage` content. Implemented in `internal/pathway/`.
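The ADR-002 cap rule is simple enough to state in a few lines. A sketch, assuming a helper named `capFor` (the constants come from the ADR text above; the function name and signature are illustrative, not the storaged API):

```go
package main

import (
	"fmt"
	"strings"
)

// capFor returns the PUT size cap in bytes for a storage key per ADR-002:
// 4 GiB under _vectors/, 256 MiB everywhere else.
func capFor(key string) int64 {
	const (
		gib = int64(1) << 30
		mib = int64(1) << 20
	)
	if strings.HasPrefix(key, "_vectors/") {
		return 4 * gib
	}
	return 256 * mib
}

func main() {
	fmt.Println(capFor("_vectors/hnsw.idx")) // 4294967296
	fmt.Println(capFor("tables/x.parquet"))  // 268435456
}
```

Per the ADR, raising either number is an operator-config change, so a real implementation would read these values from config rather than hard-coding them as this sketch does.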
### Today's scrum dispositions (2026-04-30)

Verbatim verdicts at `reports/scrum/_evidence/2026-04-30/verdicts/`. Disposition table: `reports/scrum/_evidence/2026-04-30/disposition.md`.

Real findings, all fixed in `0efc736`:

- B-1 (Opus+Kimi convergent): `ResolveKey` 3-arg API → 2-arg
- B-2 (Opus+Kimi convergent): `handleProviders` direct map lookup, drop synthesis-via-Resolve
- B-3 (Opus single, trace-verified): `OllamaCloud.Chat` strips the `ollama_cloud/` prefix correctly
- B-4 (Opus single): Ollama `done_reason` surfaced to FinishReason
False positives dismissed (3, documented):

- FP-A1: Kimi misread the `TestMaybeDowngrade_WithConfigList` assertion
- FP-A2: Qwen claimed a nil-deref in `MaybeDowngrade` that doesn't exist
- FP-C1: Opus claimed `qwen3.5:latest` doesn't exist on the Ollama hub (it is installed locally on this box)
### Session frame (don't redo)

- The Rust legacy is maintenance-only until Go reaches feature parity. Don't propose ports of components already shipped here.
- The matrix indexer's 5/5 components are shipped. Don't propose to "build the matrix indexer" — it's done.
- `qwen3.5:latest` IS available locally on this box. Opus's hub-only knowledge is a known-stale signal; the chatd smoke uses it daily.
- `temperature` is omitted for Anthropic 4.7 (handled by `Request.Temperature *float64`); don't re-add it.
- The chatd smoke runs with all cloud providers disabled intentionally so the suite doesn't depend on API keys; that's why it can't catch B-3-class bugs (those need a fake-server fixture, see Sprint 0 follow-up).
## OPEN — what's not done yet

| Item | What | When to act |
|---|---|---|
| Reality test for the 5-loop substrate | `playbook_lift_001.json` exists at `reports/reality-tests/` but the harness hasn't been run against real queries yet (J held it). Driver: `scripts/playbook_lift.sh`. Needs J's 20+ staffing queries in `tests/reality/playbook_lift_queries.txt` first (5 placeholders shipped). | When J supplies queries OR explicitly green-lights running with placeholders. |
| `cmd/{matrixd,observerd,pathwayd}/main_test.go` absent | 3 new daemons each mount ≥4 routes with no wiring test. Original 6 binaries all closed via `0f79bce`. New gap reopens R-005. | ~1 hr pattern-match against `cmd/storaged/main_test.go`. Cheap. |
| Sprint 4 — deployment | No REPLICATION.md, secrets-go.toml.example, deploy/systemd/&lt;bin&gt;.service, Dockerfile. Largest open Sprint. Required input for any G5 cutover plan. | When G5 cutover is on the table. |
| ADR-005 — observer fail-safe semantics | Observer ported but the upstream "verdict:accept on crash" anti-pattern still has no Go-side decision locked. Doc-only, ~30 min. | Before observer is wired into production paths. |
| ADR-006 — auth posture for non-loopback deploy | Locks R-001 + R-007 from "opt-in middleware exists" to "wired-by-default for X, opt-in for Y." Doc-only, ~1 hr. | Required before any Go binary binds non-loopback in prod. |
| chatd fixture-mode storage half | `g2_smoke_fixtures.sh` closed the embed half via fake_ollama; the storage half (mock S3) is still deferred. Closes R-006 fully. | When a CI box without MinIO is needed. |
| Distillation full port | `57d0df1` shipped scorer + contamination firewall (E partial); SFT export pipeline + audit_baselines lineage not yet ported. | When distillation is needed for production. |
| Drift full quantification | `be65f85` is "scorer drift first." The full distribution-drift signal is underspecified everywhere — a research gap, not a port. | Open research item. |
## RECENT VERIFIED WAVE (2026-04-30)

`05273ac..e4ee002` — 4 phases + scrum + tooling, all gate-tested.

| SHA | What |
|---|---|
| `ec1d031` | Phase 1: `[models]` tier config (additive, no callers migrate) |
| `622e124` | Phase 2: matrix.downgrade reads `cfg.Models.WeakModels` |
| `848cbf5` | Phase 3: playbook_lift harness defaults from config |
| `05273ac` | Phase 4: chatd + 5 providers (1,624 LoC) |
| `0efc736` | Scrum: 4 fixes (B-1..B-4) + 2 INFOs from cross-lineage review |
| `e4ee002` | `scripts/scrum_review.sh` — reusable 3-lineage driver |
Plus, on the Rust side (`8de94eb`, `3d06868`): qwen2.5 → `qwen3.5:latest` backport in active defaults; distillation acceptance reports regenerated (run_hash refresh; the reproducibility property still holds).
## RUNTIME CHEATSHEET

```bash
# Verify everything green
cd /home/profit/golangLAKEHOUSE
just verify   # vet + tests + 9 core smokes (~31s)
just doctor   # dep probe (go/gcc/minio/ollama/secrets)

# Boot the chat dispatcher (Phase 4)
nohup ./bin/chatd -config lakehouse.toml > /tmp/chatd.log 2>&1 & disown
nohup ./bin/gateway -config lakehouse.toml > /tmp/gateway.log 2>&1 & disown
curl -sf http://127.0.0.1:3110/v1/chat/providers | jq   # all 5 providers should report true

# Test a chat call to each lineage
for m in "qwen3.5:latest" "opencode/claude-opus-4-7" "openrouter/moonshotai/kimi-k2-0905"; do
  curl -sS -X POST http://127.0.0.1:3110/v1/chat \
    -H 'Content-Type: application/json' \
    -d "{\"model\":\"$m\",\"messages\":[{\"role\":\"user\",\"content\":\"reply: OK\"}],\"max_tokens\":8}" \
    | jq -c '{model,provider,content}'
done

# Run the scrum on a diff
./scripts/scrum_review.sh path/to/bundle.diff bundle_label
ls reports/scrum/_evidence/$(date +%Y-%m-%d)/verdicts/

# Domain smokes (not in `just verify`)
for s in chatd matrix observer pathway playbook relevance downgrade workflow; do
  bash scripts/${s}_smoke.sh > /tmp/${s}.log 2>&1 && echo "$s ✓" || echo "$s ✗"
done
```
## VISION — what we're actually building

J's framing (canonical at `/root/.claude/projects/-home-profit/memory/project_small_model_pipeline_vision.md`): a small-model-driven autonomous pipeline that gets better with each run. Frontier APIs (Opus, Kimi, GPT-5) are too expensive and too rate-limited for the inner loop — they live in audit/oversight via the `frontier_*` tier. The hot path runs on local `qwen3.5:latest` given:
- Pathway memory — what we tried before, how it went (Mem0 substrate ✓)
- Matrix indexer — multi-corpus retrieve+merge giving the small model the right slice for this task (5/5 components ✓)
- Observer — watches each run, refines configs (not prompts) toward good pathways
Successful runs get rated and distilled back into the playbook. Each iteration the playbook gets denser, runs get cheaper, results get better. Drift in the distilled playbook is a measured signal, not vibes.
The single load-bearing gate: "the playbook + matrix indexer must give the results we're looking for." Throughput, scaling, code elegance are all secondary. The playbook_lift reality test is the regression gate before Enterprise cutover (where real contracts + live profile updates land).
When evaluating any Go workstream, ask: which of the 5 loops does this advance? Strong workstreams advance at least one; weak workstreams are infrastructure for its own sake.