# Cohesion integration plan — the "smarter DB" loop

**Written:** 2026-04-22, after J flagged that the system has good parts but they don't compose into the self-improving loop promised in the control-plane thesis (`project_control_plane_thesis.md` memory: 0→85% via hyperfocus-then-escalate).

## The gap

Each piece works in isolation. What's not wired:

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  observer   │──✓──│ data/_kb/   │──?──│  auditor    │
│  :3800      │     │ outcomes.   │     │  kb_query   │
│             │     │ jsonl       │     │  check      │
└─────────────┘     └─────────────┘     └─────────────┘
       ▲                                       │
       │?                                      ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  auditor    │─────│ data/_audit │     │  cloud      │
│  verdicts   │     │or/verdicts/ │     │  inference  │
│             │     │             │     │             │
└─────────────┘     └─────────────┘     └─────────────┘
                                               │?
                                               ▼
                                        ┌─────────────┐
                                        │ hybrid_srch │
                                        │ playbook_mem│
                                        │ context7    │
                                        └─────────────┘
```

- **✓** solid: observer captures scenarios, writes `_kb/outcomes.jsonl`.
- **?** missing:
  - auditor verdicts go to `_auditor/verdicts/` — not to observer or KB
  - auditor kb_query reads `_kb/outcomes.jsonl` but doesn't query hybrid_search, playbook_memory, or context7
  - auditor inference sends claim+diff to cloud — but WITHOUT KB neighbors, WITHOUT drift context, WITHOUT tool-aware retrieval
- Result: every audit is stateless-ish. The DB doesn't get smarter across audits.
## The target loop

```
PR opened
    ↓
auditor fetches diff + claims
    ↓
auditor enriches with:
    — KB neighbors: past verdicts on similar claim-hashes
    — hybrid_search: playbooks for task classes the diff touches
    — context7 drift status: for any tool named in commit messages
    — MCP tools: agent-style queries if relevant
    ↓
cloud inference sees: diff + claims + enrichment
    ↓
verdict posted to Gitea
    ↓
verdict ALSO persisted to:
    — observer :3800/event (source=auditor)
    — _kb/outcomes.jsonl (kind=audit, sig_hash=pr+sha)
    ↓
next audit on similar PR:
    — KB neighbors find this prior verdict
    — cloud sees "similar PRs ended in request_changes for reason X"
    — verdict is more calibrated
```

## Phases of closure

Building this in order, least invasive first:

### Phase A — Verdict indexing (shipped in this PR)

After every `auditPr()` completes, in addition to persisting to `_auditor/verdicts/`:

1. POST the verdict to observer `:3800/event` with `source: "auditor"`, `ok: verdict === "approve"`, `event_kind: "audit"`, `sig_hash: `.
2. Append a simplified outcome to `data/_kb/outcomes.jsonl` with a special `kind: "audit"` row so the KB surface treats audits alongside scenarios.

Minimal surgery. Doesn't change the verdict itself. Just makes it visible to downstream consumers.

**Test:** after this PR lands, re-run the auditor once; observer stats should show `by_source.auditor > 0`; `_kb/outcomes.jsonl` should have one new row per audited SHA.

### Phase B — KB query sees auditor history

Extend `auditor/checks/kb_query.ts` to:

1. Read `_kb/outcomes.jsonl`, filter `kind === "audit"`, find prior audits with matching `sig_hash` (same PR across SHAs) or similar claim hashes.
2. Emit finding: "prior N audits on this PR ended in [approve, block, request_changes], last reason was X."

Gives the auditor memory across re-audits of the same PR.
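
The Phase A append and the Phase B lookup can be sketched together as pure functions over the JSONL text. This is a minimal sketch, not the auditor's actual code: the row shape (`sha`, `verdict`, `reason`, `ts`) and the function names `toOutcomeLine`, `priorAudits`, and `summarizePriorAudits` are hypothetical, and how `sig_hash` is derived is left open because the plan does not pin it down. Only `kind: "audit"` and the `sig_hash` field come from the text above.

```typescript
// Hypothetical row shape; the real schema of _kb/outcomes.jsonl is not given in this plan.
type Verdict = "approve" | "request_changes" | "block";

interface AuditOutcomeRow {
  kind: "audit";
  sig_hash: string; // derivation is an open question in the plan; treated as opaque here
  sha: string;
  verdict: Verdict;
  reason: string;
  ts: string;
}

// Phase A: serialize one audit as a single JSONL line, ready to append.
function toOutcomeLine(row: AuditOutcomeRow): string {
  return JSON.stringify(row) + "\n";
}

// Phase B, step 1: keep only audit rows whose sig_hash matches.
// Non-audit rows (e.g. scenarios) and malformed lines are skipped, not fatal.
function priorAudits(jsonl: string, sigHash: string): AuditOutcomeRow[] {
  const rows: AuditOutcomeRow[] = [];
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      continue;
    }
    if (typeof parsed !== "object" || parsed === null) continue;
    const r = parsed as Partial<AuditOutcomeRow>;
    if (r.kind === "audit" && r.sig_hash === sigHash) {
      rows.push(r as AuditOutcomeRow);
    }
  }
  return rows;
}

// Phase B, step 2: render the finding text, or null when there is no history.
function summarizePriorAudits(rows: AuditOutcomeRow[]): string | null {
  if (rows.length === 0) return null;
  const verdicts = rows.map((r) => r.verdict).join(", ");
  const last = rows[rows.length - 1];
  return `prior ${rows.length} audits on this PR ended in [${verdicts}], last reason was ${last.reason}`;
}
```

Keeping these free of file I/O means the same functions can be exercised in the Phase F self-test against an in-memory log; the real check would read `_kb/outcomes.jsonl` and pass its contents in.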
### Phase C — Hybrid search in kb_query

When a PR touches `crates/vectord/src/playbook_memory.rs` (task-class matching), auditor calls `POST /vectors/hybrid` to find playbooks semantically related to the diff. Surfaces as "N playbooks in production rely on this code path, consider backward-compat."

Requires:

- Task-class extractor from diff paths
- Cloud-free hybrid_search call (we have local Ollama for embeddings)

### Phase D — Context7 drift awareness in auditor

If the diff / commit message names a tool that's in any playbook's `doc_refs`, auditor calls the context7 bridge to check current drift status. Surfaces as "tool X has drifted since N playbooks referenced it; this PR may need to update those."

### Phase E — Inference sees enrichment

The cloud inference check currently sends `diff + claims`. Extend to send `diff + claims + kb_neighbors + drift_context + hybrid_search_matches`. The prompt becomes context-rich — exactly what makes the cloud model competent (per the control-plane thesis).

### Phase F — Full integration test

An auditor self-test that:

1. Creates a synthetic PR with known-good and known-bad claims
2. Runs the enriched auditor
3. Asserts the verdict found the planted issues
4. Runs a second identical-claim PR
5. Asserts the SECOND verdict references the FIRST audit (via KB neighbor retrieval)
6. "Smarter DB" proof: two runs, measurable context gain.

## Sequence

Phase A lands in this PR alongside the inventory. Phases B-F are follow-up PRs, each with its own auditor gate. Order matters: A → B (reads A's output) → C+D (in parallel) → E (consumes B/C/D) → F (asserts E's behavior).

## Not in this plan (deliberate)

- **Rewriting the auditor to use a different architecture.** Current 4-check model stays; we just enrich the checks.
- **Tuning cloud inference precision.** The false-positive rate is a prompt-engineering concern; this plan is about context enrichment, which is separate.
- **Branch protection enforcement.** Stays off until Phase F passes.

The overall bet: this is the "putting it all together coherently" J said was a real problem. Six phases over however many PRs it takes. Each one ships one wire of the loop; no single PR tries to do them all.
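
To make the "one wire at a time" sequencing concrete for Phase E, here is a minimal sketch of assembling the inference input so that each enrichment source is optional. Everything in it is an assumption for illustration: the `Enrichment` shape, the function names `renderSection` and `buildInferenceInput`, and the section headings are hypothetical. The plan only says the payload grows from `diff + claims` to `diff + claims + kb_neighbors + drift_context + hybrid_search_matches`.

```typescript
// Hypothetical container for the three enrichment sources named in Phase E.
interface Enrichment {
  kbNeighbors: string[];         // Phase B output
  driftContext: string[];        // Phase D output
  hybridSearchMatches: string[]; // Phase C output
}

// Render one titled section, or nothing at all when there are no items.
function renderSection(title: string, items: string[]): string {
  if (items.length === 0) return "";
  return `## ${title}\n` + items.map((item) => `- ${item}`).join("\n") + "\n\n";
}

// Build the text sent to the cloud inference check. With an empty Enrichment
// the result contains only claims and diff, i.e. today's behavior.
function buildInferenceInput(diff: string, claims: string[], e: Enrichment): string {
  return (
    renderSection("Claims", claims) +
    renderSection("KB neighbors", e.kbNeighbors) +
    renderSection("Drift context", e.driftContext) +
    renderSection("Hybrid search matches", e.hybridSearchMatches) +
    `## Diff\n${diff}\n`
  );
}
```

Because empty sources render as nothing, a builder like this could land before Phases B, C, and D are all finished without changing what the cloud model sees until each source starts returning data.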