golangLAKEHOUSE/scripts/playbook_smoke.sh
root 06e71520c4 matrix: playbook memory + boost — SPEC §3.4 component 5 of 5 (LEARNING LOOP)
Closes SPEC §3.4. The matrix indexer is now a learning meta-index per
feedback_meta_index_vision.md — every successful (query → answer)
pair recorded via /matrix/playbooks/record boosts that answer for
future similar queries.

This is the architectural piece that lifts vectord from "static
hybrid search" to the meta-index J originally framed in Phase 19 of
the Rust system.

What's new:
  - internal/matrix/playbook.go — PlaybookEntry, PlaybookHit,
    ApplyPlaybookBoost. Pure-function boost math:
      distance' = distance * (1 - 0.5 * score)
    Score 0 = no boost (factor 1.0); score 1 = halve distance
    (factor 0.5). The reduction is capped at 0.5 deliberately so a
    single high-confidence playbook can't dominate the base ranking
    forever (runaway-feedback-loop guard).
  - Retriever.Record(entry, corpus) — embeds query_text, ensures
    playbook corpus exists (idempotent), upserts via deterministic
    sha256-derived ID (last score wins on re-record of same triple).
  - Retriever.Search extended with UsePlaybook + PlaybookCorpus +
    PlaybookTopK + PlaybookMaxDistance. Reuses the query vector —
    no extra embed call. Missing-corpus 404 = no-op (cold-start
    state before any Record call), not an error.
  - POST /v1/matrix/playbooks/record (matrixd) — caller submits
    {query_text, answer_id, answer_corpus, score, tags?}; gets
    {playbook_id} back.
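
The boost math above can be sketched in shell as follows. This is a
minimal illustration, not the Go code in internal/matrix/playbook.go:
boost_distance is a hypothetical helper name, and the clamping of
score to [0, 1] is an assumption (the commit only describes scores
0 and 1).

```bash
#!/usr/bin/env bash
# Sketch of: distance' = distance * (1 - 0.5 * score)
# boost_distance is a hypothetical helper, not part of the repo.
boost_distance() {
  local distance="$1" score="$2"
  awk -v d="$distance" -v s="$score" 'BEGIN {
    # Assumed clamp: out-of-range scores are not described in the source.
    if (s < 0) s = 0
    if (s > 1) s = 1
    printf "%.4f\n", d * (1 - 0.5 * s)
  }'
}

boost_distance 0.6566 1.0   # score 1 halves the distance
boost_distance 0.6566 0     # score 0 leaves it unchanged
```

With the smoke-test baseline of 0.6566 and score 1.0 this yields
0.3283, the boosted distance reported further down.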

Storage: a vectord index named "playbook_memory" (configurable per
request) with embed(query_text) as the vector and the
PlaybookEntry JSON as metadata. Just another corpus — observable
from /vectors/index, persistable through G1P, etc.
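
One way a deterministic sha256-derived ID could be built is sketched
below. The exact derivation is not shown in this commit message, so
everything here is an assumption: the triple is taken to be
(query_text, answer_id, answer_corpus), the field separator is
arbitrary, and playbook_id is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: stable ID from (query_text, answer_id, answer_corpus).
# The real key fields and encoding may differ.
playbook_id() {
  local query_text="$1" answer_id="$2" answer_corpus="$3"
  # Unit separator (octal 037) between fields, so ("ab","c") and
  # ("a","bc") do not hash to the same value.
  printf '%s\037%s\037%s' "$query_text" "$answer_id" "$answer_corpus" \
    | sha256sum | cut -d' ' -f1
}

playbook_id "alpha staffing query test full prompt" widget-c widgets
```

Because score is not part of the hashed input, re-recording the same
triple with a new score maps to the same ID, which is what makes
"last score wins" an upsert rather than a duplicate.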

Match key for boost: (AnswerID, AnswerCorpus). Cross-corpus ID
collisions don't false-match — verified by
TestApplyPlaybookBoost_CorpusAttributionRespected.
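
The composite match key can be illustrated with a small shell sketch.
This is not the Go implementation: apply_playbook_boost here is a
hypothetical shell function, the line-based "corpus id value" format
is invented for the example, and the "gadgets" corpus is made up to
show a cross-corpus ID collision.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of matching playbook hits on (corpus, id).
# $1:    playbook hits, one "corpus id score" per line
# stdin: search results, one "corpus id distance" per line
apply_playbook_boost() {
  PLAYBOOK_HITS="$1" awk '
    BEGIN {
      n = split(ENVIRON["PLAYBOOK_HITS"], lines, "\n")
      for (i = 1; i <= n; i++) {
        if (split(lines[i], f, " ") != 3) continue
        key = f[1] SUBSEP f[2]
        score = f[3] + 0
        # Highest score wins when several hits point at the same answer.
        if (!(key in best) || score > best[key]) best[key] = score
      }
    }
    {
      key = $1 SUBSEP $2
      s = (key in best) ? best[key] : 0
      printf "%s %s %.4f\n", $1, $2, $3 * (1 - 0.5 * s)
    }
  '
}

# Same ID in two corpora; only the (widgets, widget-c) pair was recorded.
printf '%s\n' 'widgets widget-c 0.6566' 'gadgets widget-c 0.6566' \
  | apply_playbook_boost 'widgets widget-c 1.0'
```

Only the widgets row is halved; the gadgets row keeps its distance
because the key includes the corpus, not just the ID.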

End-to-end smoke (scripts/playbook_smoke.sh, all assertions PASS):
  - Baseline search: widget-c at distance 0.6566 (rank 3)
  - Record playbook: query → widget-c, score=1.0
  - Re-search with use_playbook=true:
      widget-c distance: 0.3283 (rank 2)
      ratio: 0.5 EXACTLY (matches boost math precisely)
      playbook_boosted: 1
  - widget-c jumped from #3 to #2 — learning loop visible

Tests:
  - 8 unit tests in internal/matrix/playbook_test.go covering
    Validate, BoostFactor (5 cases), the no-boost identity, the
    boost-moves-result-up scenario, highest-score wins on duplicate
    matches, cross-corpus attribution, JSON round-trip, and
    rejection of empty metadata
  - scripts/playbook_smoke.sh integration test (3 assertions PASS)

15-smoke regression sweep all green (D1-D6, G1, G1P, G2,
storaged_cap, pathway, matrix, relevance, downgrade, playbook).

SPEC §3.4 NOW COMPLETE: 5 of 5 components shipped. The port of the
matrix indexer is complete as a substrate; the remaining work is
operational (rating signal sources, telemetry, eventual structured
filtering for staffing data), and none of it falls under §3.4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 19:34:24 -05:00

#!/usr/bin/env bash
# Playbook smoke — learning-loop integration end-to-end.
# All assertions go through gateway :3110.
#
# Validates the full boost cycle:
# 1. Build a test corpus with 3 items
# 2. Query → get baseline ranking
# 3. Record a playbook: query → bottom-ranked answer with score=1.0
# 4. Re-query with use_playbook=true
# 5. Assert: the recorded answer's distance ≈ 0.5 × baseline (boost
# math: distance' = distance × (1 - 0.5×score))
# 6. Assert: PlaybookBoosted >= 1 in the response
#
# Requires Ollama on :11434 with nomic-embed-text loaded — Record
# embeds the query_text. Skips (exit 0) when Ollama is absent.
set -euo pipefail
cd "$(dirname "$0")/.."
export PATH="$PATH:/usr/local/go/bin"
if ! curl -sS --max-time 3 http://localhost:11434/api/tags >/dev/null 2>&1; then
  echo "[playbook-smoke] Ollama not reachable on :11434 — skipping"
  exit 0
fi
echo "[playbook-smoke] building stack..."
go build -o bin/ ./cmd/embedd ./cmd/vectord ./cmd/matrixd ./cmd/gateway
pkill -f "bin/(embedd|vectord|matrixd|gateway)" 2>/dev/null || true
sleep 0.3
PIDS=()
TMP="$(mktemp -d)"
CFG="$TMP/playbook.toml"
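cleanup() {
  echo "[playbook-smoke] cleanup"
  # ":-" keeps the expansion safe under `set -u` while PIDS is still
  # empty (bash < 4.4 treats an empty array as unbound).
  for p in "${PIDS[@]:-}"; do [ -n "$p" ] && kill "$p" 2>/dev/null || true; done
  rm -rf "$TMP"
}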
trap cleanup EXIT INT TERM
cat > "$CFG" <<EOF
[gateway]
bind = "127.0.0.1:3110"
storaged_url = "http://127.0.0.1:3211"
catalogd_url = "http://127.0.0.1:3212"
ingestd_url = "http://127.0.0.1:3213"
queryd_url = "http://127.0.0.1:3214"
vectord_url = "http://127.0.0.1:3215"
embedd_url = "http://127.0.0.1:3216"
pathwayd_url = "http://127.0.0.1:3217"
matrixd_url = "http://127.0.0.1:3218"
[vectord]
bind = "127.0.0.1:3215"
storaged_url = ""
[matrixd]
bind = "127.0.0.1:3218"
embedd_url = "http://127.0.0.1:3216"
vectord_url = "http://127.0.0.1:3215"
EOF
poll_health() {
  local port="$1" deadline=$(($(date +%s) + 5))
  while [ "$(date +%s)" -lt "$deadline" ]; do
    if curl -sS --max-time 1 "http://127.0.0.1:$port/health" >/dev/null 2>&1; then return 0; fi
    sleep 0.05
  done
  return 1
}
echo "[playbook-smoke] launching embedd → vectord → matrixd → gateway..."
./bin/embedd -config "$CFG" > /tmp/embedd.log 2>&1 & PIDS+=($!)
poll_health 3216 || { echo "embedd failed"; tail /tmp/embedd.log; exit 1; }
./bin/vectord -config "$CFG" > /tmp/vectord.log 2>&1 & PIDS+=($!)
poll_health 3215 || { echo "vectord failed"; tail /tmp/vectord.log; exit 1; }
./bin/matrixd -config "$CFG" > /tmp/matrixd.log 2>&1 & PIDS+=($!)
poll_health 3218 || { echo "matrixd failed"; tail /tmp/matrixd.log; exit 1; }
./bin/gateway -config "$CFG" > /tmp/gateway.log 2>&1 & PIDS+=($!)
poll_health 3110 || { echo "gateway failed"; tail /tmp/gateway.log; exit 1; }
FAILED=0
# Embed three corpus items + the query, all via /v1/embed.
echo "[playbook-smoke] embedding 3 corpus items + query..."
EMBEDS="$(curl -sS -X POST http://127.0.0.1:3110/v1/embed \
  -H 'Content-Type: application/json' \
  -d '{"texts":["alpha staffing query test","bravo distinct content","charlie unrelated topic","alpha staffing query test full prompt"]}')"
V_A="$(echo "$EMBEDS" | jq -c '.vectors[0]')"
V_B="$(echo "$EMBEDS" | jq -c '.vectors[1]')"
V_C="$(echo "$EMBEDS" | jq -c '.vectors[2]')"
V_Q="$(echo "$EMBEDS" | jq -c '.vectors[3]')"
# Build corpus
echo "[playbook-smoke] create corpus widgets + add 3 items..."
curl -sS -o /dev/null -X POST http://127.0.0.1:3110/v1/vectors/index \
  -H 'Content-Type: application/json' \
  -d '{"name":"widgets","dimension":768,"distance":"cosine"}'
curl -sS -o /dev/null -X POST http://127.0.0.1:3110/v1/vectors/index/widgets/add \
  -H 'Content-Type: application/json' \
  -d "$(jq -n --argjson va "$V_A" --argjson vb "$V_B" --argjson vc "$V_C" \
    '{items:[
      {id:"widget-a", vector:$va, metadata:{label:"a"}},
      {id:"widget-b", vector:$vb, metadata:{label:"b"}},
      {id:"widget-c", vector:$vc, metadata:{label:"c"}}
    ]}')"
# Baseline matrix search (no playbook) — using query_vector to skip
# embedd round-trip and keep the test deterministic on the geometry
# we know.
echo "[playbook-smoke] baseline search (no playbook):"
BASELINE="$(curl -sS -X POST http://127.0.0.1:3110/v1/matrix/search \
  -H 'Content-Type: application/json' \
  -d "$(jq -n --argjson v "$V_Q" '{query_vector:$v, corpora:["widgets"], k:3}')")"
BASE_ORDER="$(echo "$BASELINE" | jq -r '[.results[].id] | join(",")')"
BASE_C_DIST="$(echo "$BASELINE" | jq -r '[.results[] | select(.id=="widget-c")] | .[0].distance // -1')"
echo " baseline order: $BASE_ORDER widget-c distance=$BASE_C_DIST"
# Record a playbook entry for the query → widget-c (use the same
# query_text that the playbook will be re-queried by, exact match).
QUERY_TEXT="alpha staffing query test full prompt"
echo "[playbook-smoke] record playbook: ($QUERY_TEXT) → widget-c score=1.0"
RECORD_RESP="$(curl -sS -X POST http://127.0.0.1:3110/v1/matrix/playbooks/record \
  -H 'Content-Type: application/json' \
  -d "$(jq -n --arg q "$QUERY_TEXT" \
    '{query_text:$q, answer_id:"widget-c", answer_corpus:"widgets", score:1.0, tags:["smoke"]}')")"
PB_ID="$(echo "$RECORD_RESP" | jq -r '.playbook_id // empty')"
if [ -z "$PB_ID" ]; then
  echo " ✗ no playbook_id in response: $RECORD_RESP"; FAILED=1
else
  echo " ✓ playbook_id=$PB_ID"
fi
# Re-search with use_playbook=true. Use query_text so matrixd embeds
# it again (proves end-to-end). The newly-recorded playbook entry has
# the SAME query_text → cosine distance ~0 → boost applies to widget-c.
echo "[playbook-smoke] boosted search (use_playbook=true):"
BOOSTED="$(curl -sS -X POST http://127.0.0.1:3110/v1/matrix/search \
  -H 'Content-Type: application/json' \
  -d "$(jq -n --arg q "$QUERY_TEXT" \
    '{query_text:$q, corpora:["widgets"], k:3, use_playbook:true, playbook_max_distance:0.5}')")"
BOOST_ORDER="$(echo "$BOOSTED" | jq -r '[.results[].id] | join(",")')"
BOOST_C_DIST="$(echo "$BOOSTED" | jq -r '[.results[] | select(.id=="widget-c")] | .[0].distance // -1')"
PB_BOOSTED="$(echo "$BOOSTED" | jq -r '.playbook_boosted // 0')"
echo " boosted order: $BOOST_ORDER widget-c distance=$BOOST_C_DIST playbook_boosted=$PB_BOOSTED"
# ── Assertion 1: PlaybookBoosted >= 1 ────────────────────────────
if [ "$PB_BOOSTED" -ge 1 ]; then
  echo " ✓ playbook_boosted=$PB_BOOSTED ≥ 1"
else
  echo " ✗ playbook_boosted=$PB_BOOSTED (expected ≥ 1)"; FAILED=1
fi
# ── Assertion 2: widget-c distance halved (score=1.0 → 0.5× factor)
# Allow some tolerance: the baseline search used the precomputed
# query_vector, while the boosted search re-embeds query_text, and
# Ollama may not return a bit-identical vector for the same text.
RATIO="$(awk -v b="$BASE_C_DIST" -v c="$BOOST_C_DIST" 'BEGIN{ if (b<=0) print -1; else print c/b }')"
echo " widget-c distance ratio (boosted/baseline) = $RATIO (expect ≈ 0.5)"
WITHIN="$(awk -v r="$RATIO" 'BEGIN{ print (r>=0.40 && r<=0.60) ? "true" : "false" }')"
if [ "$WITHIN" = "true" ]; then
  echo " ✓ ratio in [0.40, 0.60] — boost applied correctly"
else
  echo " ✗ ratio out of band: $RATIO"; FAILED=1
fi
if [ "$FAILED" -eq 0 ]; then
  echo "[playbook-smoke] Playbook acceptance gate: PASSED"
  exit 0
else
  echo "[playbook-smoke] Playbook acceptance gate: FAILED"
  exit 1
fi