4 Commits

Author SHA1 Message Date
root
06e71520c4 matrix: playbook memory + boost — SPEC §3.4 component 5 of 5 (LEARNING LOOP)
Closes SPEC §3.4. The matrix indexer is now a learning meta-index per
feedback_meta_index_vision.md — every successful (query → answer)
pair recorded via /matrix/playbooks/record boosts that answer for
future similar queries.

This is the architectural piece that lifts vectord from "static
hybrid search" to the meta-index J originally framed in Phase 19 of
the Rust system.

What's new:
  - internal/matrix/playbook.go — PlaybookEntry, PlaybookHit,
    ApplyPlaybookBoost. Pure-function boost math:
      distance' = distance * (1 - 0.5 * score)
    Score 0 = no boost (factor 1.0); score 1 = halve distance
    (factor 0.5). The factor is floored at 0.5 deliberately so a
    single high-confidence playbook can't dominate the base ranking
    forever (runaway-feedback-loop guard).
  - Retriever.Record(entry, corpus) — embeds query_text, ensures
    playbook corpus exists (idempotent), upserts via deterministic
    sha256-derived ID (last score wins on re-record of same triple).
  - Retriever.Search extended with UsePlaybook + PlaybookCorpus +
    PlaybookTopK + PlaybookMaxDistance. Reuses the query vector —
    no extra embed call. Missing-corpus 404 = no-op (cold-start
    state before any Record call), not an error.
  - POST /v1/matrix/playbooks/record (matrixd) — caller submits
    {query_text, answer_id, answer_corpus, score, tags?}; gets
    {playbook_id} back.
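
The boost math above can be sketched as a pure function. This is a
minimal illustration, not the contents of internal/matrix/playbook.go:
the function names and the clamping of out-of-range scores are
assumptions.

```go
package main

import "fmt"

// BoostFactor maps a playbook score to a distance multiplier using
// factor = 1 - 0.5 * score. Clamping the score to [0, 1] is an
// assumption of this sketch; it keeps the factor within [0.5, 1.0].
func BoostFactor(score float64) float64 {
	if score < 0 {
		score = 0
	}
	if score > 1 {
		score = 1
	}
	return 1 - 0.5*score
}

// BoostDistance applies distance' = distance * (1 - 0.5 * score).
func BoostDistance(distance, score float64) float64 {
	return distance * BoostFactor(score)
}

func main() {
	// Score 1 halves the distance; score 0 leaves it unchanged.
	fmt.Printf("%.4f\n", BoostDistance(0.6566, 1.0))
	fmt.Printf("%.4f\n", BoostDistance(0.6566, 0.0))
}
```

Because the factor never drops below 0.5, repeated searches cannot
shrink a result's distance below half of its base value.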

Storage: a vectord index named "playbook_memory" (configurable per
request) with embed(query_text) as the vector and the
PlaybookEntry JSON as metadata. Just another corpus — observable
from /vectors/index, persistable through G1P, etc.
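
A sketch of what one stored record could look like. The JSON field
names come from the record endpoint's request body; the Go struct
layout, the choice of (query_text, answer_id, answer_corpus) as the
identity triple, and the NUL separator are assumptions made for
illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// PlaybookEntry mirrors the request fields; this layout is assumed.
type PlaybookEntry struct {
	QueryText    string   `json:"query_text"`
	AnswerID     string   `json:"answer_id"`
	AnswerCorpus string   `json:"answer_corpus"`
	Score        float64  `json:"score"`
	Tags         []string `json:"tags,omitempty"`
}

// PlaybookID hashes the assumed identity triple. Score is left out so
// that re-recording the same triple yields the same ID and the upsert
// overwrites the earlier entry (last score wins).
func PlaybookID(e PlaybookEntry) string {
	h := sha256.New()
	for _, part := range []string{e.QueryText, e.AnswerID, e.AnswerCorpus} {
		h.Write([]byte(part))
		h.Write([]byte{0}) // separator so ("ab", "c") differs from ("a", "bc")
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// Example values are hypothetical.
	e := PlaybookEntry{QueryText: "blue widget", AnswerID: "widget-c", AnswerCorpus: "widgets", Score: 1.0}
	meta, err := json.Marshal(e)
	if err != nil {
		panic(err)
	}
	fmt.Println("id:      ", PlaybookID(e))
	fmt.Println("metadata:", string(meta))
}
```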

Match key for boost: (AnswerID, AnswerCorpus). Cross-corpus ID
collisions don't false-match — verified by
TestApplyPlaybookBoost_CorpusAttributionRespected.
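
The composite match key can be sketched as follows. The type layouts
and the ApplyBoost name are hypothetical stand-ins, and every value
except widget-c's 0.6566 distance is made up for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// Result is one search hit with its corpus of origin (assumed layout).
type Result struct {
	ID       string
	Corpus   string
	Distance float64
}

// PlaybookHit is a recorded answer matched for this query (assumed layout).
type PlaybookHit struct {
	AnswerID     string
	AnswerCorpus string
	Score        float64
}

type matchKey struct{ id, corpus string }

// ApplyBoost scales the distance of every result whose (ID, Corpus)
// pair matches a playbook hit, then re-sorts ascending. On duplicate
// hits the highest score wins. Hits with score 0 are ignored, since
// they would not change the distance. Returns the number boosted.
func ApplyBoost(results []Result, hits []PlaybookHit) int {
	best := make(map[matchKey]float64)
	for _, h := range hits {
		k := matchKey{h.AnswerID, h.AnswerCorpus}
		if h.Score > best[k] {
			best[k] = h.Score
		}
	}
	boosted := 0
	for i := range results {
		if s, ok := best[matchKey{results[i].ID, results[i].Corpus}]; ok {
			results[i].Distance *= 1 - 0.5*s
			boosted++
		}
	}
	sort.SliceStable(results, func(i, j int) bool {
		return results[i].Distance < results[j].Distance
	})
	return boosted
}

func main() {
	results := []Result{
		{"widget-a", "widgets", 0.20},
		{"widget-b", "widgets", 0.50},
		{"widget-c", "widgets", 0.6566},
	}
	hits := []PlaybookHit{
		{"widget-c", "widgets", 1.0},
		{"widget-a", "gadgets", 1.0}, // same ID, other corpus: must not match
	}
	n := ApplyBoost(results, hits)
	fmt.Println("boosted:", n)
	for i, r := range results {
		fmt.Printf("#%d %s/%s %.4f\n", i+1, r.Corpus, r.ID, r.Distance)
	}
}
```

Keying on the pair rather than the bare ID is what keeps an entry
recorded against one corpus from boosting an unrelated document that
happens to share its ID in another.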

End-to-end smoke (scripts/playbook_smoke.sh, all assertions PASS):
  - Baseline search: widget-c at distance 0.6566 (rank 3)
  - Record playbook: query → widget-c, score=1.0
  - Re-search with use_playbook=true:
      widget-c distance: 0.3283 (rank 2)
      ratio: 0.5 EXACTLY (matches boost math precisely)
      playbook_boosted: 1
  - widget-c jumped from #3 to #2 — learning loop visible

Tests:
  - 8 unit tests in internal/matrix/playbook_test.go covering
    Validate, BoostFactor (5 cases), the no-boost identity, the
    boost-moves-result-up scenario, highest-score wins on duplicate
    matches, cross-corpus attribution, JSON round-trip, and
    rejection of empty metadata
  - scripts/playbook_smoke.sh integration test (3 assertions PASS)

15-smoke regression sweep all green (D1-D6, G1, G1P, G2,
storaged_cap, pathway, matrix, relevance, downgrade, playbook).

SPEC §3.4 NOW COMPLETE: 5 of 5 components shipped. The matrix
indexer port is complete as a substrate; remaining work is operational
(rating signal sources, telemetry, and eventual structured filtering
for staffing data, none of which is in §3.4).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 19:34:24 -05:00
root
3968ec8a7b matrix: strong-model downgrade gate — SPEC §3.4 component 4 of 5
Pure-Go port of mode.rs::execute's pass5 downgrade gate (Rust
2026-04-26). Adds POST /v1/matrix/downgrade endpoint via matrixd.

The gate captures the pass5 finding: composing matrix corpora into
codereview_lakehouse on a strong model LOST 5/5 head-to-head reps
against matrix-free codereview_isolation on grok-4.1-fast (p=0.031).
Strong models have enough native capacity that bug fingerprints +
adversarial framing + file content carry them; matrix chunks
displace depth-of-analysis.

Logic (matches Rust mode.rs:614-632):
  if mode == codereview_lakehouse
     && !forced_mode
     && !LH_FORCE_FULL_ENRICHMENT
     && !is_weak_model(model)
  → flip to codereview_isolation, record downgraded_from

is_weak_model captures the empirical weak-list:
  - `:free` suffix or `:free/` infix (OpenRouter free tier)
  - qwen3.5:latest, qwen3:latest (local last-resort rungs)
  - everything else → strong by default
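
The gate and the weak-list can be sketched together in Go. This is an
illustration under assumptions, not the shipped code: the signatures
are guesses, and LH_FORCE_FULL_ENRICHMENT is passed in as a plain
bool rather than read from wherever the real gate finds it.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	ModeLakehouse = "codereview_lakehouse"
	ModeIsolation = "codereview_isolation"
)

// IsWeakModel reports whether a model is on the empirical weak-list.
// Anything not listed is treated as strong.
func IsWeakModel(model string) bool {
	if strings.HasSuffix(model, ":free") || strings.Contains(model, ":free/") {
		return true
	}
	switch model {
	case "qwen3.5:latest", "qwen3:latest":
		return true
	}
	return false
}

// Decision is the resolved mode plus the mode it was downgraded from,
// which stays empty when no downgrade happened (assumed layout).
type Decision struct {
	Mode           string
	DowngradedFrom string
}

// MaybeDowngrade flips codereview_lakehouse to codereview_isolation
// for strong models unless one of the bypasses is set.
func MaybeDowngrade(mode, model string, forcedMode, forceFullEnrichment bool) Decision {
	if mode == ModeLakehouse && !forcedMode && !forceFullEnrichment && !IsWeakModel(model) {
		return Decision{Mode: ModeIsolation, DowngradedFrom: mode}
	}
	return Decision{Mode: mode}
}

func main() {
	// "vendor/model:free" is a hypothetical free-tier model name.
	fmt.Printf("%+v\n", MaybeDowngrade(ModeLakehouse, "grok-4.1-fast", false, false))
	fmt.Printf("%+v\n", MaybeDowngrade(ModeLakehouse, "vendor/model:free", false, false))
	fmt.Printf("%+v\n", MaybeDowngrade(ModeLakehouse, "grok-4.1-fast", true, false))
}
```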

Tests:
  - 3 unit tests in internal/matrix/downgrade_test.go: IsWeakModel
    coverage, MaybeDowngrade truth table (5 rows), forced-mode
    precedence (forced beats every other bypass)
  - scripts/downgrade_smoke.sh: 6 assertions through gateway covering
    all 5 truth-table rows + empty-mode 400

14-smoke regression sweep all green (D1-D6, G1, G1P, G2,
storaged_cap, pathway, matrix, relevance, downgrade).

SPEC §3.4 progress: 4 of 5 components shipped (corpus builders,
multi-corpus retrieve+merge, relevance filter, downgrade gate).
Last component is learning-loop integration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 19:17:55 -05:00
root
9588bd82ae matrix: relevance filter — SPEC §3.4 component 3 of 5
Faithful port of mcp-server/relevance.ts (Rust observer's adjacency-
pollution filter). Same scoring (5 positive signals plus an import
penalty), same default threshold 0.3.
Adds POST /v1/matrix/relevance endpoint via matrixd.

Scoring signals (additive; the penalty can push the total negative):
  path_match     +1.0  chunk source/doc_id encodes focus.path
  filename_match +0.6  chunk text mentions focus's filename
  defined_match  +0.6  chunk text mentions focus.defined_symbols
  token_overlap  +0.4  jaccard of non-stopword tokens
  prefix_match   +0.3  chunk source shares first-2-segment prefix
  import_penalty -0.5  mentions ONLY imported symbols, no defined ones
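
A sketch of how the weights above might combine. How each signal is
detected is out of scope here, so the signals arrive precomputed; the
Signals struct and the reading of token_overlap as 0.4 times the
Jaccard similarity are assumptions.

```go
package main

import "fmt"

// Signals holds the per-chunk signal values (hypothetical layout).
type Signals struct {
	PathMatch     bool
	FilenameMatch bool
	DefinedMatch  bool
	TokenOverlap  float64 // Jaccard similarity in [0, 1]
	PrefixMatch   bool
	ImportOnly    bool // mentions only imported symbols, no defined ones
}

// Jaccard returns |A ∩ B| / |A ∪ B| over two token lists, treating
// each as a set. Two empty lists score 0.
func Jaccard(a, b []string) float64 {
	setA := make(map[string]bool, len(a))
	for _, t := range a {
		setA[t] = true
	}
	setB := make(map[string]bool, len(b))
	for _, t := range b {
		setB[t] = true
	}
	inter := 0
	for t := range setA {
		if setB[t] {
			inter++
		}
	}
	union := len(setA) + len(setB) - inter
	if union == 0 {
		return 0
	}
	return float64(inter) / float64(union)
}

// Score sums the weighted signals; the import penalty can make the
// total negative.
func Score(s Signals) float64 {
	total := 0.4 * s.TokenOverlap
	if s.PathMatch {
		total += 1.0
	}
	if s.FilenameMatch {
		total += 0.6
	}
	if s.DefinedMatch {
		total += 0.6
	}
	if s.PrefixMatch {
		total += 0.3
	}
	if s.ImportOnly {
		total -= 0.5
	}
	return total
}

func main() {
	// Overlap values are chosen for illustration only.
	fmt.Printf("%+.2f\n", Score(Signals{DefinedMatch: true, TokenOverlap: 0.2}))
	fmt.Printf("%+.2f\n", Score(Signals{ImportOnly: true, TokenOverlap: 0.1}))
}
```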

What this does and doesn't do:
  - DOES filter code-aware corpora (eventually lakehouse_arch_v1,
    lakehouse_symbols_v1, scrum_findings_v1) — drops chunks about
    code the focus file IMPORTS rather than DEFINES, the
    "adjacency pollution" pattern that makes a reviewer LLM
    hallucinate imported-crate internals as belonging to the focus
  - DOES NOT meaningfully filter staffing data — the candidates
    reality test 2026-04-29 had "exact skill match buried at #3"
    which is a different problem (semantic-only ranking dominated
    by secondary text). Staffing needs structured filtering
    (status gates, location gates) that lives outside this
    package — future work, not in SPEC §3.4 yet

Headline smoke assertion: focus = crates/queryd/src/db.go which
defines Connector and imports catalogd::Registry. The filter
scores:
  Connector chunk: +0.68  (defined_match fires, kept)
  Registry chunk: -0.46  (import_only penalty fires, dropped)
  unrelated junk:  0.00  (no signals, dropped)

That's a 1.14-point gap between what we ARE and what we IMPORT —
the entire purpose of the filter.
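
Threshold splitting on those three scores could look like the sketch
below. The ScoredChunk type and this FilterChunks signature are
hypothetical, and treating the threshold as inclusive is an
assumption.

```go
package main

import "fmt"

// ScoredChunk pairs a chunk ID with its relevance score (assumed layout).
type ScoredChunk struct {
	ID    string
	Score float64
}

// FilterChunks splits chunks into kept (score >= threshold) and
// dropped (score < threshold), preserving input order.
func FilterChunks(chunks []ScoredChunk, threshold float64) (kept, dropped []ScoredChunk) {
	for _, c := range chunks {
		if c.Score >= threshold {
			kept = append(kept, c)
		} else {
			dropped = append(dropped, c)
		}
	}
	return kept, dropped
}

func main() {
	chunks := []ScoredChunk{
		{"connector", 0.68},
		{"registry", -0.46},
		{"junk", 0.00},
	}
	kept, dropped := FilterChunks(chunks, 0.3) // default threshold
	fmt.Println("kept:   ", kept)
	fmt.Println("dropped:", dropped)
}
```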

Tests:
  - 9 unit tests in internal/matrix/relevance_test.go covering
    Tokenize, Jaccard, ExtractDefinedSymbols (Rust + TS),
    ExtractImportedSymbols, FilePrefix, ScoreRelevance per-signal,
    FilterChunks threshold splitting, and the headline
    AdjacencyPollutionScenario
  - scripts/relevance_smoke.sh integration smoke (3 assertions PASS):
    adjacency-pollution scenario, empty-chunks 400, threshold honored

13-smoke regression sweep all green (D1-D6, G1, G1P, G2,
storaged_cap, pathway, matrix, relevance).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 19:13:22 -05:00
root
c1d96b7b60 matrixd: multi-corpus retrieve+merge — SPEC §3.4 component 2 of 5
Lands the matrix indexer's first piece per docs/SPEC.md §3.4:
multi-corpus retrieve+merge with corpus attribution per result.
Future components (relevance filter, downgrade gate, learning-loop
integration) layer on top of this surface.

Architecture:
  - internal/matrix/retrieve.go — Retriever takes (query, corpora,
    k, per_corpus_k), parallel-fans across vectord indexes, merges
    by distance ascending, preserves corpus origin per hit
  - cmd/matrixd — HTTP service on :3217, fronts /v1/matrix/*
  - gateway proxy + [matrixd] config + lakehouse.toml entry
  - Either query_text (matrix calls embedd) or query_vector
    (caller pre-embedded) — vector takes precedence if both set

Error policy: fail-loud on any corpus error. Silent partial returns
would lie about coverage, defeating the matrix's whole purpose.
Bubbles vectord errors as 502 (upstream), validation as 400.
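
The fan-out, merge, and fail-loud policy can be sketched as below.
This is not internal/matrix/retrieve.go: SearchFunc and fakeSearch
stand in for the vectord client, and all names here are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

// Hit is one result tagged with its corpus of origin (assumed layout).
type Hit struct {
	ID       string
	Corpus   string
	Distance float64
}

// SearchFunc is a hypothetical stand-in for querying one index.
type SearchFunc func(corpus string, k int) ([]Hit, error)

// RetrieveMerge queries every corpus in parallel, fails on the first
// corpus error instead of returning partial results, and merges the
// hits by ascending distance, truncated to k.
func RetrieveMerge(search SearchFunc, corpora []string, k, perCorpusK int) ([]Hit, error) {
	if len(corpora) == 0 {
		return nil, errors.New("no corpora given")
	}
	perCorpus := make([][]Hit, len(corpora))
	errs := make([]error, len(corpora))
	var wg sync.WaitGroup
	for i, c := range corpora {
		wg.Add(1)
		go func(i int, c string) {
			defer wg.Done()
			hits, err := search(c, perCorpusK)
			errs[i] = err
			for _, h := range hits {
				h.Corpus = c // attribute each hit to its corpus
				perCorpus[i] = append(perCorpus[i], h)
			}
		}(i, c)
	}
	wg.Wait()
	var merged []Hit
	for i, err := range errs {
		if err != nil {
			return nil, fmt.Errorf("corpus %q: %w", corpora[i], err)
		}
		merged = append(merged, perCorpus[i]...)
	}
	sort.SliceStable(merged, func(i, j int) bool {
		return merged[i].Distance < merged[j].Distance
	})
	if len(merged) > k {
		merged = merged[:k]
	}
	return merged, nil
}

// fakeSearch serves two in-memory corpora; unknown names are errors.
func fakeSearch(corpus string, k int) ([]Hit, error) {
	data := map[string][]Hit{
		"a": {{ID: "a-near", Distance: 0.1}, {ID: "a-far", Distance: 0.9}},
		"b": {{ID: "b-near", Distance: 0.05}, {ID: "b-far", Distance: 0.8}},
	}
	hits, ok := data[corpus]
	if !ok {
		return nil, errors.New("index not found")
	}
	if len(hits) > k {
		hits = hits[:k]
	}
	return hits, nil
}

func main() {
	hits, err := RetrieveMerge(fakeSearch, []string{"a", "b"}, 3, 2)
	fmt.Println(hits, err)
	_, err = RetrieveMerge(fakeSearch, []string{"a", "missing"}, 3, 2)
	fmt.Println(err)
}
```

Each goroutine writes only to its own slot in the two slices, so no
mutex is needed; an HTTP layer would then map the two error kinds to
400 and 502.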

Smoke (scripts/matrix_smoke.sh, 6 assertions PASS first try):
  - /matrix/corpora lists indexes
  - Multi-corpus search returns hits from BOTH corpora
  - Top hit is the globally-closest across all corpora
    (b-near beats a-near at distance 0.05 vs 0.1 — proves merge)
  - Metadata round-trips through the merge
  - Distances ascending in result list
  - Negative paths: empty corpora → 400, missing corpus → 502,
    no query → 400

12-smoke regression sweep all green (D1-D6, G1, G1P, G2,
storaged_cap, pathway, matrix).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 18:39:17 -05:00