docs: SPEC §3.9 (chatd) + §3.10 (local-review-harness sibling)

- SPEC §1 component table: add chatd row marked DONE; replaces
  Rust gateway's v1::ollama_cloud / openrouter / opencode adapters
  + the aibridge crate.
- SPEC §3.9 — chatd shipped: 5-provider routing (ollama, ollama_cloud,
  openrouter, opencode, kimi) by model-name prefix or :cloud suffix.
  Captures the Anthropic 4.7 temperature-deprecation quirk + the
  local-Ollama think=false default that the playbook_lift judge
  needed. Mentions scrum_review.sh as the reusable cross-lineage
  vehicle consuming chatd's own /v1/chat.
- SPEC §3.10 — local-review-harness sibling tool: separate repo at
  git.agentview.dev/profit/local-review-harness, MVP shipped today.
  Documents the cross-pollination plan for when both substrates
  stabilize (chatd as the harness's LLM backend; harness findings
  as Lakehouse pathway-memory drift signal; .memory/known-risks
  as a matrix corpus). Explicit "don't re-port" so future Claudes
  don't try to absorb the harness into Lakehouse.
- STATE_OF_PLAY.md: SIBLING TOOLS section with 1-line summary
  + pointer to SPEC §3.10.

No code changes; `just verify` still PASS — touched only docs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
root 2026-04-30 01:01:23 -05:00
parent c5c31b6ca6
commit 511083ae40
2 changed files with 98 additions and 0 deletions


@@ -209,3 +209,16 @@ Successful runs get **rated and distilled back into the playbook**. Each iterati
**The single load-bearing gate:** *"the playbook + matrix indexer must give the results we're looking for."* Throughput, scaling, code elegance are all secondary. The `playbook_lift` reality test is the regression gate before Enterprise cutover (where real contracts + live profile updates land).
When evaluating any Go workstream, ask: which of the 5 loops does this advance? Strong workstreams advance ≥1; weak workstreams sit in infra-for-its-own-sake.
---
## SIBLING TOOLS (separate repos, intentional integration target later)
**`local-review-harness`** at `git.agentview.dev/profit/local-review-harness` (also SMB-mounted at `/home/profit/share/local-review-harness-full-md/`). Local-first code review harness — 12 evidence-bearing static analyzers, Scrum-style reports, no cloud deps. Phase A + B (MVP) shipped 2026-04-30. Phases C-E (Ollama LLM review, validation, memory) pending.
**Cross-pollination plan when both stabilize:**
- Replace harness's `internal/llm/ollama.go` with a chatd `/v1/chat` client → frontier judges via config toggle
- Feed harness findings into Lakehouse pathway memory as a drift signal
- Treat harness's `.memory/known-risks.json` as a matrix-indexer corpus
Detail at `docs/SPEC.md` §3.10. Don't re-port harness functionality into Lakehouse-Go — the standalone tool is the design.


@@ -32,6 +32,7 @@ Effort scale (one engineer-week = ~40h focused work):
| `vectord-lance` | lance | **DROPPED** | n/a | n/a | n/a — Parquet+HNSW only |
| `journald` | parquet, arrow | `cmd/journald` | `apache/arrow-go/v18` | **M** | low |
| `aibridge` | reqwest | library | `net/http` + connection pool · `anthropics/anthropic-sdk-go` available for direct Claude calls (currently routed via opencode) | **S** | low |
| **chatd** (Phase 4 — multi-provider LLM dispatcher) | crates/gateway/src/v1/{ollama_cloud,openrouter,opencode}.rs | `cmd/chatd` + `internal/chat/` | stdlib `net/http` only | **DONE** | medium — see §3.9. 5-provider routing (ollama / ollama_cloud / openrouter / opencode / kimi) by model-name prefix or `:cloud` suffix. Replaces the Rust gateway's `v1::` adapters. |
| `validator` | parquet, custom | library | `apache/arrow-go/v18` parquet reader | **M** | low — port the 24 unit tests as gates |
| `truth` | tomli, custom DSL | library | `pelletier/go-toml/v2` | **M** | low |
| `proto` | tonic-build | `proto/` + `protoc-gen-go` | `buf` + `protoc-gen-go-grpc` | **S** | low |
@@ -398,6 +399,90 @@ re-implement the recording layer.
implementation is its own wave (estimated **L** effort given the
DAG runner + mode dispatch + provenance recording).
### §3.9 — chatd (multi-provider LLM dispatcher) — SHIPPED 2026-04-30
**Status:** done at commit `05273ac` (Phase 4 wave) + scrum-hardened
at `0efc736`. Composite port: 1,624 LoC, 19+ tests, 6/6 chatd_smoke.
**What:** `cmd/chatd` on `:3220` routes `POST /chat` to a provider
selected by model-name prefix or `:cloud` suffix:
```
ollama/<m> → local Ollama at :11434 (no auth)
ollama_cloud/<m> → ollama.com /api/generate (Bearer)
<m>:cloud → ollama_cloud (suffix variant)
openrouter/<v>/<m> → openrouter.ai (OpenAI-compat, Bearer)
opencode/<m> → opencode.ai/zen/v1 (OpenAI-compat, Bearer)
kimi/<m> → api.kimi.com/coding/v1 (OpenAI-compat, Bearer)
bare names → ollama (default)
```
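A minimal sketch of that dispatch rule, assuming a helper like `resolveProvider` (the name and return shape are illustrative, not chatd's actual internals in `internal/chat/`):

```go
package chat

import "strings"

// resolveProvider maps an incoming model name to (provider, upstream model).
// Illustrative only; the order of checks mirrors the table above.
func resolveProvider(model string) (provider, upstream string) {
	switch {
	case strings.HasPrefix(model, "ollama_cloud/"):
		return "ollama_cloud", strings.TrimPrefix(model, "ollama_cloud/")
	case strings.HasPrefix(model, "ollama/"):
		return "ollama", strings.TrimPrefix(model, "ollama/")
	case strings.HasPrefix(model, "openrouter/"):
		// keeps the "<vendor>/<model>" remainder that openrouter.ai expects
		return "openrouter", strings.TrimPrefix(model, "openrouter/")
	case strings.HasPrefix(model, "opencode/"):
		return "opencode", strings.TrimPrefix(model, "opencode/")
	case strings.HasPrefix(model, "kimi/"):
		return "kimi", strings.TrimPrefix(model, "kimi/")
	case strings.HasSuffix(model, ":cloud"):
		// suffix variant; whether chatd strips ":cloud" before forwarding
		// is not specified in this section
		return "ollama_cloud", model
	default:
		// bare names fall through to local Ollama
		return "ollama", model
	}
}
```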
**Provider key resolution:** env var first (`OPENROUTER_API_KEY`,
`OPENCODE_API_KEY`, `KIMI_API_KEY`, `OLLAMA_CLOUD_KEY`); then
`/etc/lakehouse/<provider>.env` fallback (mode 0600). Empty key →
provider stays unregistered (404 at first call instead of 503).
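A sketch of that resolution order; the helper name is made up, and treating the `.env` file as plain `KEY=VALUE` lines is an assumption about its format:

```go
package chat

import (
	"os"
	"strings"
)

// resolveKey returns the API key for a provider, or "" if none is configured.
// Order per this section: process env var first, then /etc/lakehouse/<provider>.env.
func resolveKey(provider, envVar string) string {
	if k := os.Getenv(envVar); k != "" {
		return k
	}
	raw, err := os.ReadFile("/etc/lakehouse/" + provider + ".env")
	if err != nil {
		return "" // no file: provider stays unregistered
	}
	for _, line := range strings.Split(string(raw), "\n") {
		if key, val, ok := strings.Cut(line, "="); ok && strings.TrimSpace(key) == envVar {
			return strings.TrimSpace(val)
		}
	}
	return ""
}
```

An empty result leaves the provider unregistered, matching the 404-instead-of-503 behaviour above.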
**Companion to the model tier registry** in `lakehouse.toml [models]`
(local_fast / local_judge / cloud_judge / frontier_review / etc.),
which maps tier names to model IDs. Callers reference
`cfg.Models.LocalJudge` instead of literal strings. Bumping a tier
is a 1-line config edit.
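As a sketch, the registry could map onto a config struct like the one below; the field names and `toml` tags are assumptions (only the tier names and the `cfg.Models.LocalJudge` call pattern come from this section):

```go
package config

// Models maps tier names from lakehouse.toml [models] to concrete model IDs.
// Illustrative shape; tags assume a go-toml-style decoder.
type Models struct {
	LocalFast      string `toml:"local_fast"`      // e.g. a local Ollama model
	LocalJudge     string `toml:"local_judge"`
	CloudJudge     string `toml:"cloud_judge"`
	FrontierReview string `toml:"frontier_review"` // e.g. an opencode/... model
}
```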
**Quirks captured by today's scrum:**
- `Request.Temperature` is `*float64` (a pointer) — Anthropic 4.7 (via
OpenCode) rejects the field entirely with "temperature is deprecated."
The pointer lets us omit the field when the caller didn't set it
explicitly (see the sketch after this list).
- Local Ollama defaults to `think=false` — qwen3.5:latest is
reasoning-capable, but the inner-loop hot path wants direct answers,
not reasoning traces consuming the token budget.
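A sketch of how both quirks could surface in the request struct; everything except the `Temperature` pointer and the `think` default is an assumed field:

```go
package chat

// Request is an illustrative shape, not chatd's exact struct.
type Request struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
	// nil + omitempty drops the field from the JSON entirely, which is what
	// keeps Anthropic 4.7 (via OpenCode) from rejecting the request.
	Temperature *float64 `json:"temperature,omitempty"`
	// The local-Ollama path sends think=false by default so reasoning-capable
	// models answer directly instead of spending the token budget on traces.
	Think bool `json:"think"`
}
```

A caller that does want a temperature sets it through the pointer, e.g. `t := 0.2; req.Temperature = &t`.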
**Replaces the Rust adapters:**
- `crates/gateway/src/v1/ollama_cloud.rs` → `internal/chat/ollama_cloud.go`
- `crates/gateway/src/v1/openrouter.rs` → `internal/chat/openai_compat.go` (shared with opencode + kimi)
- `crates/gateway/src/v1/opencode.rs` → same shared helper
- `crates/aibridge/*` → `internal/chat/` (cleaner abstraction)
**Reusable downstream:** `scripts/scrum_review.sh` runs a 3-lineage
cross-review (Opus + Kimi + Qwen3-coder) by POST-ing each diff to
chatd's `/v1/chat`. Same vehicle the harness's own scrum-hardening
used to find its own bugs.
### §3.10 — local-review-harness (sibling tool, separate repo)
**Where:** `git.agentview.dev/profit/local-review-harness` (also
SMB-mounted at `/home/profit/share/local-review-harness-full-md/`).
**What:** a local-first code review harness — walks a target repo,
runs evidence-bearing static checks (12 analyzers covering hardcoded
paths, raw SQL, wildcard CORS, secrets, exec/spawn, large files,
TODO/FIXME, missing tests, .env files committed, exposed mutation
endpoints, hardcoded private IPs), produces Scrum-style markdown
reports + JSON receipts. **No cloud deps.** Single static Go binary.
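Purely to illustrate the shape such a harness implies (none of these names come from the harness repo), an evidence-bearing analyzer could reduce to something like:

```go
package review

// Finding is one evidence-bearing result; fields are hypothetical,
// not the harness's actual report or receipt schema.
type Finding struct {
	Analyzer string // e.g. "hardcoded-paths", "wildcard-cors"
	File     string
	Line     int
	Evidence string // the offending snippet, so reports carry proof
	Severity string
}

// Analyzer is a hypothetical interface for the 12 static checks: each one
// inspects a file's contents and returns zero or more findings.
type Analyzer interface {
	Name() string
	Check(path string, src []byte) []Finding
}
```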
**Status:** Phase A (skeleton) + Phase B (MVP — static-only path)
shipped at first commit `f3ee472`. 5 acceptance gates green plus
self-review (the harness reviews its own repo).
**Phases C-E pending:** local-Ollama LLM review, validation
cross-check, append-only `.memory/`, diff/rules subcommands.
**Why a sibling tool, not a Lakehouse module:** PROMPT.md "Strategic
Goal" — the harness eventually plugs into OpenClaw / MCP tools /
Lakehouse memory / playbook sealing / observer review loop. But
first it has to be reliable, inspectable, and evidence-driven on
its own. Lakehouse-Go integration is post-Phase-E (probably E+1:
write the harness's findings into Lakehouse's playbook substrate
via `/v1/matrix/playbooks/record`).
**Cross-pollination opportunities (post-MVP):**
- Replace `internal/llm/ollama.go` in the harness with a thin client
pointed at chatd's `/v1/chat` — frontier judges become a config
toggle (see the sketch after this list).
- Feed harness findings into Lakehouse-Go's pathway memory as a
drift signal (which static checks fired this run vs last).
- Use the harness's `.memory/known-risks.json` as a corpus the
matrix indexer can retrieve from when the same risk pattern appears.
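A minimal sketch of that thin client, assuming an OpenAI-style message shape on chatd's `/v1/chat` and the `:3220` listen address from §3.9 (the request/response field names are assumptions, not chatd's documented schema):

```go
package llm

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// ChatdClient is an illustrative stand-in for the harness's local Ollama
// client: same kind of entry point, but the model string decides the provider
// (a bare name stays on local Ollama; "opencode/..." reaches a frontier judge).
type ChatdClient struct {
	BaseURL string // e.g. "http://localhost:3220"
	Model   string
}

func (c *ChatdClient) Review(prompt string) (string, error) {
	body, err := json.Marshal(map[string]any{
		"model": c.Model,
		"messages": []map[string]string{
			{"role": "user", "content": prompt},
		},
	})
	if err != nil {
		return "", err
	}
	resp, err := http.Post(c.BaseURL+"/v1/chat", "application/json", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("chatd: %s", resp.Status)
	}
	var out struct {
		Content string `json:"content"` // assumed field name
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	return out.Content, nil
}
```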
### §3.3 — UI (HTMX)
**Approach:** server-rendered Go templates using `html/template`,