Closes SPEC §3.4. The matrix indexer is now a learning meta-index per
feedback_meta_index_vision.md — every successful (query → answer)
pair recorded via /matrix/playbooks/record boosts that answer for
future similar queries.

This is the architectural piece that lifts vectord from "static
hybrid search" to the meta-index J originally framed in Phase 19 of
the Rust system.

What's new:
  - internal/matrix/playbook.go — PlaybookEntry, PlaybookHit,
    ApplyPlaybookBoost. Pure-function boost math:
      distance' = distance * (1 - 0.5 * score)
    Score 0 = no boost (factor 1.0); score 1 = halve distance
    (factor 0.5). Capped at 0.5 deliberately so a single
    high-confidence playbook can't dominate the base ranking forever
    (runaway-feedback-loop guard).
  - Retriever.Record(entry, corpus) — embeds query_text, ensures
    playbook corpus exists (idempotent), upserts via deterministic
    sha256-derived ID (last score wins on re-record of same triple).
  - Retriever.Search extended with UsePlaybook + PlaybookCorpus +
    PlaybookTopK + PlaybookMaxDistance. Reuses the query vector —
    no extra embed call. Missing-corpus 404 = no-op (cold-start
    state before any Record call), not an error.
  - POST /v1/matrix/playbooks/record (matrixd) — caller submits
    {query_text, answer_id, answer_corpus, score, tags?}; gets
    {playbook_id} back.
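
The boost math in the first item can be illustrated with a minimal Go sketch. BoostFactor is named in the unit-test list, but the signature used here, the clamping of out-of-range scores, and the maxBoost constant are assumptions made for illustration and are not taken from internal/matrix/playbook.go.

```go
package main

import "fmt"

// maxBoost is the largest fraction of the distance a playbook hit may
// remove. The 0.5 value is from the formula in this message; the
// constant name is hypothetical.
const maxBoost = 0.5

// BoostFactor returns the multiplier applied to a result's distance for
// a given playbook score. Clamping the score to [0, 1] is an assumption
// of this sketch.
func BoostFactor(score float64) float64 {
	if score < 0 {
		score = 0
	}
	if score > 1 {
		score = 1
	}
	return 1 - maxBoost*score
}

func main() {
	baseline := 0.6566 // widget-c baseline distance from the smoke run
	for _, s := range []float64{0, 0.5, 1} {
		f := BoostFactor(s)
		fmt.Printf("score=%.1f factor=%.2f distance=%.4f\n", s, f, baseline*f)
	}
}
```

Because the factor never drops below 0.5, even a score of 1.0 can at most halve a distance, which is the runaway-feedback-loop guard described above.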
Storage: a vectord index named "playbook_memory" (configurable per
request) with embed(query_text) as the vector and the
PlaybookEntry JSON as metadata. Just another corpus — observable
from /vectors/index, persistable through G1P, etc.
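
One way to get the "last score wins on re-record" behaviour is to derive the playbook ID purely from the recorded triple, so that a re-record upserts over the earlier entry. The sketch below assumes the triple is (query_text, answer_id, answer_corpus); the NUL separator, the "pb-" prefix, and the truncation to 32 hex characters are hypothetical choices, not the format Retriever.Record actually uses.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// PlaybookID derives a stable ID from the recorded triple. Score and
// tags are deliberately left out of the hash so that re-recording the
// same triple with a new score maps to the same ID (an upsert).
func PlaybookID(queryText, answerID, answerCorpus string) string {
	h := sha256.New()
	for _, part := range []string{queryText, answerID, answerCorpus} {
		h.Write([]byte(part))
		h.Write([]byte{0}) // separator so ("ab","c") and ("a","bc") hash differently
	}
	return "pb-" + hex.EncodeToString(h.Sum(nil))[:32]
}

func main() {
	q := "alpha staffing query test full prompt"
	a := PlaybookID(q, "widget-c", "widgets")
	b := PlaybookID(q, "widget-c", "widgets")
	c := PlaybookID(q, "widget-c", "gadgets") // "gadgets" is a made-up corpus
	fmt.Println(a == b, a == c)               // prints "true false"
}
```

Keeping the corpus in the hash means the same answer ID recorded against two corpora yields two separate playbook entries.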
Match key for boost: (AnswerID, AnswerCorpus). A result that shares
an ID with a playbook entry but lives in a different corpus is not
boosted, as verified by
TestApplyPlaybookBoost_CorpusAttributionRespected.
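
The composite match key can be sketched as follows. The Result and PlaybookHit shapes, the function signature, and the sample distances are assumptions for illustration (scores are assumed to be in [0, 1] already); only the key (AnswerID, AnswerCorpus), the boost formula, and the "highest score wins on duplicate matches" rule come from this message.

```go
package main

import (
	"fmt"
	"sort"
)

// Result is a hypothetical search hit.
type Result struct {
	ID       string
	Corpus   string
	Distance float64
}

// PlaybookHit is a hypothetical playbook match for the current query.
type PlaybookHit struct {
	AnswerID     string
	AnswerCorpus string
	Score        float64
}

type matchKey struct{ id, corpus string }

// ApplyPlaybookBoost returns a boosted, re-sorted copy of results and
// the number of results that were boosted. The input slice is not modified.
func ApplyPlaybookBoost(results []Result, hits []PlaybookHit) ([]Result, int) {
	// Keep only the highest score per (id, corpus) key.
	best := make(map[matchKey]float64, len(hits))
	for _, h := range hits {
		k := matchKey{h.AnswerID, h.AnswerCorpus}
		if s, ok := best[k]; !ok || h.Score > s {
			best[k] = h.Score
		}
	}
	out := make([]Result, len(results))
	copy(out, results)
	boosted := 0
	for i := range out {
		if s, ok := best[matchKey{out[i].ID, out[i].Corpus}]; ok && s > 0 {
			out[i].Distance *= 1 - 0.5*s
			boosted++
		}
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Distance < out[j].Distance })
	return out, boosted
}

func main() {
	results := []Result{
		{"widget-a", "widgets", 0.10},
		{"widget-b", "widgets", 0.50},
		{"widget-c", "widgets", 0.60},
	}
	hits := []PlaybookHit{
		{"widget-c", "widgets", 1.0},
		{"widget-b", "gadgets", 1.0}, // same ID as a result, other corpus: no match
	}
	out, n := ApplyPlaybookBoost(results, hits)
	for _, r := range out {
		fmt.Printf("%s/%s %.2f\n", r.Corpus, r.ID, r.Distance)
	}
	fmt.Println("boosted:", n)
}
```

With these made-up distances, widget-c moves from third to second place and the boosted count is 1, while widget-b is left alone because its playbook hit names a different corpus.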
End-to-end smoke (scripts/playbook_smoke.sh, all assertions PASS):
  - Baseline search: widget-c at distance 0.6566 (rank 3)
  - Record playbook: query → widget-c, score=1.0
  - Re-search with use_playbook=true:
      widget-c distance: 0.3283 (rank 2)
      ratio: 0.5 EXACTLY (matches boost math precisely)
      playbook_boosted: 1
  - widget-c jumped from #3 to #2 — learning loop visible

Tests:
  - 8 unit tests in internal/matrix/playbook_test.go covering
    Validate, BoostFactor (5 cases), the no-boost identity, the
    boost-moves-result-up scenario, highest-score wins on duplicate
    matches, cross-corpus attribution, JSON round-trip, and
    rejection of empty metadata
  - scripts/playbook_smoke.sh integration test (3 assertions PASS)

15-smoke regression sweep all green (D1-D6, G1, G1P, G2,
storaged_cap, pathway, matrix, relevance, downgrade, playbook).
SPEC §3.4 NOW COMPLETE: 5 of 5 components shipped. The matrix
indexer port is done as a substrate; the remaining work is operational
(rating signal sources, telemetry, eventual structured filtering for
staffing data) and none of it falls under §3.4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
176 lines
7.1 KiB
Bash
Executable File
#!/usr/bin/env bash
# Playbook smoke — learning-loop integration end-to-end.
# All assertions go through gateway :3110.
#
# Validates the full boost cycle:
# 1. Build a test corpus with 3 items
# 2. Query → get baseline ranking
# 3. Record a playbook: query → bottom-ranked answer with score=1.0
# 4. Re-query with use_playbook=true
# 5. Assert: the recorded answer's distance ≈ 0.5 × baseline (boost
# math: distance' = distance × (1 - 0.5×score))
# 6. Assert: PlaybookBoosted >= 1 in the response
#
# Requires Ollama on :11434 with nomic-embed-text loaded — Record
# embeds the query_text. Skips (exit 0) when Ollama is absent.

set -euo pipefail
cd "$(dirname "$0")/.."

export PATH="$PATH:/usr/local/go/bin"

if ! curl -sS --max-time 3 http://localhost:11434/api/tags >/dev/null 2>&1; then
    echo "[playbook-smoke] Ollama not reachable on :11434 — skipping"
    exit 0
fi

echo "[playbook-smoke] building stack..."
go build -o bin/ ./cmd/embedd ./cmd/vectord ./cmd/matrixd ./cmd/gateway

pkill -f "bin/(embedd|vectord|matrixd|gateway)" 2>/dev/null || true
sleep 0.3

PIDS=()
TMP="$(mktemp -d)"
CFG="$TMP/playbook.toml"

cleanup() {
    echo "[playbook-smoke] cleanup"
    for p in "${PIDS[@]}"; do [ -n "$p" ] && kill "$p" 2>/dev/null || true; done
    rm -rf "$TMP"
}
trap cleanup EXIT INT TERM

cat > "$CFG" <<EOF
[gateway]
bind = "127.0.0.1:3110"
storaged_url = "http://127.0.0.1:3211"
catalogd_url = "http://127.0.0.1:3212"
ingestd_url = "http://127.0.0.1:3213"
queryd_url = "http://127.0.0.1:3214"
vectord_url = "http://127.0.0.1:3215"
embedd_url = "http://127.0.0.1:3216"
pathwayd_url = "http://127.0.0.1:3217"
matrixd_url = "http://127.0.0.1:3218"

[vectord]
bind = "127.0.0.1:3215"
storaged_url = ""

[matrixd]
bind = "127.0.0.1:3218"
embedd_url = "http://127.0.0.1:3216"
vectord_url = "http://127.0.0.1:3215"
EOF

poll_health() {
    local port="$1" deadline=$(($(date +%s) + 5))
    while [ "$(date +%s)" -lt "$deadline" ]; do
        if curl -sS --max-time 1 "http://127.0.0.1:$port/health" >/dev/null 2>&1; then return 0; fi
        sleep 0.05
    done
    return 1
}

echo "[playbook-smoke] launching embedd → vectord → matrixd → gateway..."
./bin/embedd -config "$CFG" > /tmp/embedd.log 2>&1 & PIDS+=($!)
poll_health 3216 || { echo "embedd failed"; tail /tmp/embedd.log; exit 1; }
./bin/vectord -config "$CFG" > /tmp/vectord.log 2>&1 & PIDS+=($!)
poll_health 3215 || { echo "vectord failed"; tail /tmp/vectord.log; exit 1; }
./bin/matrixd -config "$CFG" > /tmp/matrixd.log 2>&1 & PIDS+=($!)
poll_health 3218 || { echo "matrixd failed"; tail /tmp/matrixd.log; exit 1; }
./bin/gateway -config "$CFG" > /tmp/gateway.log 2>&1 & PIDS+=($!)
poll_health 3110 || { echo "gateway failed"; tail /tmp/gateway.log; exit 1; }

FAILED=0

# Embed three corpus items + the query, all via /v1/embed.
echo "[playbook-smoke] embedding 3 corpus items + query..."
EMBEDS="$(curl -sS -X POST http://127.0.0.1:3110/v1/embed \
    -H 'Content-Type: application/json' \
    -d '{"texts":["alpha staffing query test","bravo distinct content","charlie unrelated topic","alpha staffing query test full prompt"]}')"
V_A="$(echo "$EMBEDS" | jq -c '.vectors[0]')"
V_B="$(echo "$EMBEDS" | jq -c '.vectors[1]')"
V_C="$(echo "$EMBEDS" | jq -c '.vectors[2]')"
V_Q="$(echo "$EMBEDS" | jq -c '.vectors[3]')"

# Build corpus
echo "[playbook-smoke] create corpus widgets + add 3 items..."
curl -sS -o /dev/null -X POST http://127.0.0.1:3110/v1/vectors/index \
    -H 'Content-Type: application/json' \
    -d '{"name":"widgets","dimension":768,"distance":"cosine"}'
curl -sS -o /dev/null -X POST http://127.0.0.1:3110/v1/vectors/index/widgets/add \
    -H 'Content-Type: application/json' \
    -d "$(jq -n --argjson va "$V_A" --argjson vb "$V_B" --argjson vc "$V_C" \
        '{items:[
            {id:"widget-a", vector:$va, metadata:{label:"a"}},
            {id:"widget-b", vector:$vb, metadata:{label:"b"}},
            {id:"widget-c", vector:$vc, metadata:{label:"c"}}
        ]}')"

# Baseline matrix search (no playbook) — using query_vector to skip
# embedd round-trip and keep the test deterministic on the geometry
# we know.
echo "[playbook-smoke] baseline search (no playbook):"
BASELINE="$(curl -sS -X POST http://127.0.0.1:3110/v1/matrix/search \
    -H 'Content-Type: application/json' \
    -d "$(jq -n --argjson v "$V_Q" '{query_vector:$v, corpora:["widgets"], k:3}')")"
BASE_ORDER="$(echo "$BASELINE" | jq -r '[.results[].id] | join(",")')"
BASE_C_DIST="$(echo "$BASELINE" | jq -r '[.results[] | select(.id=="widget-c")] | .[0].distance // -1')"
echo " baseline order: $BASE_ORDER widget-c distance=$BASE_C_DIST"

# Record a playbook entry for the query → widget-c (use the same
# query_text that the playbook will be re-queried by, exact match).
QUERY_TEXT="alpha staffing query test full prompt"
echo "[playbook-smoke] record playbook: ($QUERY_TEXT) → widget-c score=1.0"
RECORD_RESP="$(curl -sS -X POST http://127.0.0.1:3110/v1/matrix/playbooks/record \
    -H 'Content-Type: application/json' \
    -d "$(jq -n --arg q "$QUERY_TEXT" \
        '{query_text:$q, answer_id:"widget-c", answer_corpus:"widgets", score:1.0, tags:["smoke"]}')")"
PB_ID="$(echo "$RECORD_RESP" | jq -r '.playbook_id // empty')"
if [ -z "$PB_ID" ]; then
    echo " ✗ no playbook_id in response: $RECORD_RESP"; FAILED=1
else
    echo " ✓ playbook_id=$PB_ID"
fi

# Re-search with use_playbook=true. Use query_text so matrixd embeds
# it again (proves end-to-end). The newly-recorded playbook entry has
# the SAME query_text → cosine distance ~0 → boost applies to widget-c.
echo "[playbook-smoke] boosted search (use_playbook=true):"
BOOSTED="$(curl -sS -X POST http://127.0.0.1:3110/v1/matrix/search \
    -H 'Content-Type: application/json' \
    -d "$(jq -n --arg q "$QUERY_TEXT" \
        '{query_text:$q, corpora:["widgets"], k:3, use_playbook:true, playbook_max_distance:0.5}')")"
BOOST_ORDER="$(echo "$BOOSTED" | jq -r '[.results[].id] | join(",")')"
BOOST_C_DIST="$(echo "$BOOSTED" | jq -r '[.results[] | select(.id=="widget-c")] | .[0].distance // -1')"
PB_BOOSTED="$(echo "$BOOSTED" | jq -r '.playbook_boosted // 0')"
echo " boosted order: $BOOST_ORDER widget-c distance=$BOOST_C_DIST playbook_boosted=$PB_BOOSTED"

# ── Assertion 1: PlaybookBoosted >= 1 ────────────────────────────
if [ "$PB_BOOSTED" -ge 1 ]; then
    echo " ✓ playbook_boosted=$PB_BOOSTED ≥ 1"
else
    echo " ✗ playbook_boosted=$PB_BOOSTED (expected ≥ 1)"; FAILED=1
fi

# ── Assertion 2: widget-c distance halved (score=1.0 → 0.5× factor)
# Allow some tolerance because the query and recorded query may not
# be byte-identical depending on Ollama's tokenization stability.
RATIO="$(awk -v b="$BASE_C_DIST" -v c="$BOOST_C_DIST" 'BEGIN{ if (b<=0) print -1; else print c/b }')"
echo " widget-c distance ratio (boosted/baseline) = $RATIO (expect ≈ 0.5)"
WITHIN="$(awk -v r="$RATIO" 'BEGIN{ print (r>=0.40 && r<=0.60) ? "true" : "false" }')"
if [ "$WITHIN" = "true" ]; then
    echo " ✓ ratio in [0.40, 0.60] — boost applied correctly"
else
    echo " ✗ ratio out of band: $RATIO"; FAILED=1
fi

if [ "$FAILED" -eq 0 ]; then
    echo "[playbook-smoke] Playbook acceptance gate: PASSED"
    exit 0
else
    echo "[playbook-smoke] Playbook acceptance gate: FAILED"
    exit 1
fi