root b199093d1f B: matrix metadata filter — post-retrieval structured gate
Addresses the reality-test gap surfaced by the candidates and
multi-corpus e2e runs (0d1553c, a97881d): semantic-only retrieval
can't gate by status / state / availability. SearchRequest now
takes an optional MetadataFilter map; results whose metadata
doesn't match every key are dropped before top-K truncation.

Filter value semantics:
  string|number|bool → exact equality (JSON-canonical, so 1 ≡ 1.0)
  []any              → OR within key (any element matching wins)
  AND across keys: every filter key must match.

Missing key in metadata = drop. Malformed metadata = drop. Filter
absent or empty = pass through (zero overhead).

The response now reports MetadataFilterDropped so callers can see
how aggressive the filter was without re-querying.

Caveat (also captured in code comment): this is POST-retrieval, not
PRE-filtering via SQL. Aggressive filters can shrink the result set
below K; caller should bump PerCorpusK to compensate. A queryd-
backed pre-filter is a future commit; this lands the user-visible
fix today.

Tests:
  - 7 unit tests (internal/matrix/filter_test.go) covering: nil/
    empty filter pass-through, missing-metadata always-fails,
    single-value exact match (incl. numeric 5 ≡ 5.0), AND across
    keys, OR within list, bool match, malformed JSON metadata
  - matrix_smoke.sh: new assertion #7 — filter
    label∈{"a near","b near"} drops the 4 mid/far entries from the
    6-entry pool, keeping exactly 2 (one per corpus, both with the
    matching label). Dropped count surfaces in the response.

15-smoke regression all green. vet clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 20:08:56 -05:00
..