root 5a3364f539 matrix: judge-gated Shape B inject — closes lift-suite tail issues
Lift suite run #004 left two unresolved tail issues:
- Q6 ("Forklift loader") ↔ Q7 ("Hazmat warehouse, cold storage")
  swap recordings as warm top-1 because their embeddings are within
  0.20 cosine of each other. Distance gate can't tell them apart.
- Q9 + Q15 lose paraphrase recovery when qwen2.5 rephrases past the
  0.20 threshold. Distance says "drift too far"; sometimes the drift
  is real (skip), sometimes the paraphrase is still on-domain (don't
  want to skip).

Multi-coord run #008's judge re-rating proved the LLM can
distinguish: Q3 crane case landed at distance 0.23 (looks tight)
but rating 1 (irrelevant). The judge sees domain mismatch the
embedder doesn't.

This commit lifts that pattern into the matrix substrate. Shape B
inject now optionally routes every candidate through a judge gate
before the rank insert lands. Distance + judge BOTH have to approve.

internal/matrix/playbook.go:
- InjectPlaybookMisses signature gains a query string + an
  optional InjectGate. nil gate preserves pre-judge-gating
  behavior (current tests already pass with nil).
- New InjectGate interface + InjectGateFunc adapter for tests
  and non-LLM callers.
- Per-candidate gate.Approve(query, hit) call inserted between
  the dedup and the inject. Rejected candidates skip silently;
  injected count reflects post-gate decision.

internal/matrix/judge.go (new, ~140 lines):
- LLMJudgeGate calls an Ollama-shape /api/chat endpoint with the
  same 1-5 staffing-rubric prompt that worked in multi_coord
  run #008. fail-closed on HTTP/JSON errors (don't inject if
  judge can't speak — better miss than wrong-domain).
- NewLLMJudgeGate returns nil when URL or Model is empty,
  matching InjectGate's nil-means-no-judge semantics.

internal/matrix/retrieve.go:
- SearchRequest gains JudgeURL, JudgeModel, JudgeMinRating
  fields. Run() builds an LLMJudgeGate when set; passes nil
  otherwise. Backward compatible — existing callers see no
  behavior change.

Tests:
- TestInjectPlaybookMisses_GateRejectsCandidate (rejectAll → 0
  injected, even with tight distance)
- TestInjectPlaybookMisses_GateApprovesCandidate (approveAll →
  same as nil-gate behavior)
- TestInjectPlaybookMisses_GateSeesCorrectQuery (gate receives
  CURRENT query + RECORDED query separately so it can score
  the (current, candidate) pair)
- All 5 existing inject tests updated to new signature

go test ./internal/matrix → all 8 inject tests pass.
go test ./internal/matrix ./internal/shared ./cmd/{matrixd,
queryd,pathwayd,observerd} → all green.

STATE_OF_PLAY:
- OPEN item #1 (judge-gated injection) closed.
- DO NOT RELITIGATE adds the substrate-level judge-gate lock.
- OPEN list now 5 rows (was 6).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 19:38:12 -05:00
..