From 3fb3a60da41dbc59e7b19260462291711d0d1174 Mon Sep 17 00:00:00 2001
From: root
Date: Tue, 21 Apr 2026 00:03:06 -0500
Subject: [PATCH] =?UTF-8?q?Spec=20ch6=20rewrite=20=E2=80=94=203=20learning?=
 =?UTF-8?q?=20paths=20=E2=86=92=207=20+=20honest=20gap=20list?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

J flagged the spec out of alignment with what's built. Ch6 now reflects
the full current architecture:

- Path 1 (playbook boost) — formula kept; geo+role prefilter refinement
  called out with measured 14× citation lift
- Path 2 (pattern discovery) — unchanged
- Path 3 (autotune agent) — unchanged
- Path 4 (KB + pathway recommender) — Phase 22, file layout documented
- Path 5 (cloud rescue on failure) — Phase 22 item B, verified stress_01
  example cited
- Path 6 (staffer competence-weighted retrieval) — Phase 23,
  competence_score formula included, cross-staffer auto-discovered
  worker labels (Rachel D. Lewis 18× endorsements)
- Path 7 (observer outcome ingest) — Phase 24, :3800 HTTP listener +
  ops.jsonl append journal

Input normalizer + unified /memory/query surface documented as the
"seamless with whatever input" answer, with the 319ms natural-language
latency number.

Honest gaps kept visible in the spec itself, not hidden:

- Zep validity windows (most load-bearing remaining)
- Mem0 UPDATE/DELETE/NOOP ops
- Letta working-memory hot cache

Live at https://devop.live/lakehouse/spec#ch6 after service restart.
Verified post-deploy: geo+role prefilter, 14× delta, validity windows
gap all present in served HTML.
---
 mcp-server/spec.html | 48 ++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 44 insertions(+), 4 deletions(-)

diff --git a/mcp-server/spec.html b/mcp-server/spec.html
index 1b38c86..9b81834 100644
--- a/mcp-server/spec.html
+++ b/mcp-server/spec.html
@@ -251,19 +251,59 @@ table.plain tr:hover td{background:#0d1117}
Chapter 6

How it gets better over time

-
Compounding learning in three paths — all three happen automatically, no operator intervention required.
+
Compounding learning across seven paths. The first three are automatic background loops. Paths 4-7 landed 2026-04-21 and turn the system into a reinforcement-learning pipeline: outcomes → knowledge base → pathway recommendations → cloud rescue → competence-weighted retrieval → observer analysis. All seven happen without operator intervention.
-

Path 1 — Playbook boost (Phase 19)

-

Every sealed fill is seeded to playbook_memory via /vectors/playbook_memory/seed. The next hybrid query for a semantically similar role+geo surfaces the past endorsed workers with a boost. Math:

+

Path 1 — Playbook boost with geo + role prefilter (Phase 19 + refinement)

+

Every sealed fill is seeded to playbook_memory. The boost fires inside /vectors/hybrid when use_playbook_memory: true. Math, tightened 2026-04-21 after a diagnostic pass found globally-ranked playbooks were missing the SQL-filtered candidate pool entirely:

per_worker = cosine(query_emb, playbook_emb) × 0.5 × e^(-age/30) × 0.5^failures / n_workers
 boost[(city, state, name)] = min(Σ per_worker, 0.25)
-

Caps, decay, and negative signal mean one popular worker can't dominate, old playbooks fade, and no-shows stop boosting. Verified live: 3 identical seeds → +0.250 boost capped, 3 citations.
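The math above can be sketched in TypeScript (the production code is Rust in crates/vectord/src/playbook_memory.rs; the type and function names here are illustrative, not the actual API):

```typescript
type PlaybookMatch = {
  cosine: number;    // cosine(query_emb, playbook_emb)
  ageDays: number;   // playbook age in days
  failures: number;  // recorded no-shows for this worker
  nWorkers: number;  // workers on the playbook
};

// per_worker = cosine × 0.5 × e^(-age/30) × 0.5^failures / n_workers
function perWorker(m: PlaybookMatch): number {
  return (m.cosine * 0.5 * Math.exp(-m.ageDays / 30) * 0.5 ** m.failures) / m.nWorkers;
}

// boost[(city, state, name)] = min(Σ per_worker, 0.25)
function workerBoost(matches: PlaybookMatch[]): number {
  const sum = matches.reduce((acc, m) => acc + perWorker(m), 0);
  return Math.min(sum, 0.25);
}
```

Three identical fresh seeds (cosine 1.0, age 0, no failures, one worker each) sum to 1.5 and hit the cap, reproducing the +0.250 verified above; a 30-day-old playbook with one failure contributes under 0.1 on its own.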

+

Multi-strategy retrieval (new): before cosine, compute_boost_for_filtered_with_role(target_geo, target_role) prefilters to same-city playbooks, then gives exact (role, city, state) matches similarity=1.0 and fills up to half the top-k. Cosine fills the rest. Mirrors 2026 Mem0/Zep guidance on parallel-strategy rerank.
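A hedged sketch of the prefilter-plus-fill strategy, assuming a flat playbook list. compute_boost_for_filtered_with_role lives on the Rust side, so the types and the exact fill policy below are assumptions, not the shipped logic:

```typescript
type Playbook = { role: string; city: string; state: string; emb: number[] };
type Scored = { pb: Playbook; sim: number };

function selectPlaybooks(
  all: Playbook[],
  target: { role: string; city: string; state: string },
  queryEmb: number[],
  k: number,
  cosine: (a: number[], b: number[]) => number,
): Scored[] {
  // Strategy 1: prefilter to same-city playbooks so the candidate pool
  // always intersects the SQL-filtered workers.
  const sameCity = all.filter(p => p.city === target.city && p.state === target.state);
  // Strategy 2: exact (role, city, state) matches get similarity 1.0 and
  // may occupy at most half of the top-k slots.
  const exact: Scored[] = sameCity
    .filter(p => p.role === target.role)
    .slice(0, Math.floor(k / 2))
    .map(pb => ({ pb, sim: 1.0 }));
  // Cosine ranking over the remaining same-city playbooks fills the rest.
  const rest: Scored[] = sameCity
    .filter(p => p.role !== target.role)
    .map(pb => ({ pb, sim: cosine(queryEmb, pb.emb) }))
    .sort((a, b) => b.sim - a.sim)
    .slice(0, k - exact.length);
  return [...exact, ...rest];
}
```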

+

Measured lift: before geo-filter, Nashville Welder query returned boosts=170 matched=0 (zero intersection with candidate pool). After: boosts=36 matched=11. On the Riverfront Steel scenario, total playbook citations went from 2 → 28 per run — a 14× delta on identical inputs. The diagnostic log playbook_boost: boosts=N sources=N parsed=N matched=N target_geo=? target_role=? runs on every hybrid call so the class of silent-miss bug stays visible.

Path 2 — Pattern discovery (meta-index)

/vectors/playbook_memory/patterns goes beyond "who was endorsed" to answer "what did past similar fills have in common?" Aggregates recurring certifications, skills, archetype, reliability distribution across the top-K semantically similar playbooks. Surfaces signal the operator didn't explicitly query for.

Path 3 — Autotune agent

The vectord::agent background task runs continuously. Watches the HNSW trial journal, proposes configs, executes trials, promotes Pareto winners — without human intervention. Operator sees "the index got faster overnight" and doesn't know why. The journal knows why.

+ +

Path 4 — Knowledge Base + pathway recommender (Phase 22)

+

Meta-layer over playbook_memory. Files under data/_kb/:

+ +

Cycle: scenario ends → kb.indexRun() appends outcome → kb.recommendFor(nextSpec) finds k-NN signatures, feeds outcome history to an overview model, writes structured JSON advice → next scenario reads it via kb.loadRecommendation(spec) and injects pathway_notes into the executor's context alongside prior T3 lessons.
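The cycle above could be sketched as follows. kb.ts itself is not shown in this patch, so the JSONL layout, the Euclidean distance, and the function signatures are assumptions standing in for indexRun/recommendFor (and the overview-model pass is omitted):

```typescript
import * as fs from "node:fs";

type RunOutcome = {
  sig: number[];                  // scenario signature vector
  spec: Record<string, string>;   // role/geo spec of the run
  filled: number;
  requested: number;
};

// Append each finished run to an append-only JSONL index.
function indexRun(path: string, outcome: RunOutcome): void {
  fs.appendFileSync(path, JSON.stringify(outcome) + "\n");
}

// Recommend for a new spec by k-NN over stored signatures.
function recommendFor(path: string, sig: number[], k: number): RunOutcome[] {
  if (!fs.existsSync(path)) return [];
  const runs: RunOutcome[] = fs.readFileSync(path, "utf8")
    .split("\n").filter(Boolean).map(l => JSON.parse(l));
  const dist = (a: number[], b: number[]) => Math.hypot(...a.map((x, i) => x - b[i]));
  return runs.sort((a, b) => dist(a.sig, sig) - dist(b.sig, sig)).slice(0, k);
}
```

The outcome history returned here is what would be fed to the overview model to produce the structured advice the next scenario reads.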

+ +

Path 5 — Cloud rescue on failure (Phase 22 item B)

+

When an event fails (drift abort, JSON parse, pool exhaustion) and cloud T3 is enabled, requestCloudRemediation() feeds the full failure trace — SQL filters attempted, row counts, reviewer drift notes, gap signals, contract terms — to gpt-oss:120b on Ollama Cloud. Cloud returns structured {retry, new_city, new_state, new_role, new_count, rationale}. Event retries once with the pivot. Verified on stress_01: Gary IN (zero workers indexed) misplacement → cloud proposed South Bend IN → retry filled 1/1.
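A minimal sketch of the retry-once pivot, with run and requestCloudRemediation stubbed out. The structured remediation shape is taken from the text above; the control flow and names around it are assumed:

```typescript
type Remediation = {
  retry: boolean;
  new_city?: string;
  new_state?: string;
  new_role?: string;
  new_count?: number;
  rationale: string;
};
type EventSpec = { city: string; state: string; role: string; count: number };

// On failure, ask the cloud model for a structured remediation and re-run
// the event exactly once with the proposed pivot applied.
async function runWithCloudRescue(
  spec: EventSpec,
  run: (s: EventSpec) => Promise<{ ok: boolean }>,
  requestCloudRemediation: (s: EventSpec) => Promise<Remediation>,
): Promise<{ ok: boolean; rescued: boolean }> {
  const first = await run(spec);
  if (first.ok) return { ok: true, rescued: false };
  const fix = await requestCloudRemediation(spec);
  if (!fix.retry) return { ok: false, rescued: false };
  const pivoted: EventSpec = {
    city: fix.new_city ?? spec.city,
    state: fix.new_state ?? spec.state,
    role: fix.new_role ?? spec.role,
    count: fix.new_count ?? spec.count,
  };
  const second = await run(pivoted);
  return { ok: second.ok, rescued: second.ok };
}
```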

+ +

Path 6 — Staffer competence-weighted retrieval (Phase 23)

+

Answers "who handled this" as a first-class matrix-index dimension. Each scenario carries staffer: {id, name, tenure_months, role, tool_level}. After every run, recomputeStafferStats(staffer_id) aggregates their fill_rate, turn efficiency, citation density, rescue rate into a single competence_score (0.45·fill + 0.20·turn_eff + 0.20·cites + 0.15·rescue).

+

findNeighbors returns weighted_score = cosine × max_staffer_competence — top-performer playbooks rank above juniors' on similar scenarios. Auto-discovery emerges: running 4 staffers × 3 contracts × 3 rounds surfaced Rachel D. Lewis (Welder Nashville) with 18 endorsements across all 4 staffers, Angela U. Ward (Machine Op Indianapolis) with 19 — reliable-performer labels the system built without human tagging.
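The scoring can be written straight from the weights above. The two helpers below are illustrative only, not the recomputeStafferStats/findNeighbors internals:

```typescript
type StafferStats = {
  fillRate: number;     // filled / requested
  turnEff: number;      // turn efficiency, normalized 0..1
  citeDensity: number;  // playbook citations per run, normalized 0..1
  rescueRate: number;   // successful rescues / failures, 0..1
};

// competence_score = 0.45·fill + 0.20·turn_eff + 0.20·cites + 0.15·rescue
function competenceScore(s: StafferStats): number {
  return 0.45 * s.fillRate + 0.20 * s.turnEff + 0.20 * s.citeDensity + 0.15 * s.rescueRate;
}

// weighted_score = cosine × max competence among staffers on the playbook
function weightedScore(cosine: number, stafferScores: number[]): number {
  return cosine * Math.max(...stafferScores);
}
```

Because the weights sum to 1.0, competence_score stays in [0, 1] when each input is in [0, 1], so it scales cosine without ever inflating it.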

+ +

Path 7 — Observer outcome ingest (Phase 24)

+

Observer runs as lakehouse-observer.service, now with an HTTP listener on :3800. Scenarios POST per-event outcomes to /event with full provenance (staffer_id, sig_hash, event_kind, role, city, state, rescue flags). Observer's ERROR_ANALYZER and PLAYBOOK_BUILDER loops consume them alongside MCP-wrapped ops. Persistence switched from the old /ingest/file REPLACE path to an append-only data/_observer/ops.jsonl journal so the trace survives across restarts.
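An append-only journal sketch under assumed shapes (the real observer.ts handler is not shown in this patch; helper names and the 204/400 responses are assumptions). The point it illustrates is the persistence choice: append, never REPLACE, so the trace survives restarts:

```typescript
import * as fs from "node:fs";
import * as http from "node:http";

// Validate the POSTed outcome, then append it to the JSONL journal.
function appendOutcome(journal: string, body: string): void {
  JSON.parse(body); // reject malformed payloads before touching the journal
  fs.appendFileSync(journal, body.trim() + "\n");
}

function startListener(port: number, journal: string): http.Server {
  return http
    .createServer((req, res) => {
      if (req.method === "POST" && req.url === "/event") {
        let body = "";
        req.on("data", chunk => (body += chunk));
        req.on("end", () => {
          try {
            appendOutcome(journal, body);
            res.writeHead(204);
          } catch {
            res.writeHead(400);
          }
          res.end();
        });
      } else {
        res.writeHead(404);
        res.end();
      }
    })
    .listen(port);
}
```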

+ +

Input normalizer + unified memory query

+

Two surfaces added 2026-04-21 to make the memory stack respond coherently to any input shape:

+
Input normalizer — accepts whatever input shape the caller sends and coerces it into the canonical query spec: the "seamless with whatever input" answer.
Unified /memory/query — one query surface over the whole memory stack; natural-language queries answered in 319ms measured.
+
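A normalizer sketch, assuming the kinds of input shapes a caller might send (bare string, partial object, legacy {q, top_k} fields). Field names and fallbacks are hypothetical; the real logic lives in tests/multi-agent/normalize.ts:

```typescript
type MemoryQuery = { text: string; role?: string; city?: string; state?: string; k: number };

// Coerce any caller-supplied shape into one canonical query so every
// memory surface downstream sees the same spec.
function normalizeQuery(input: unknown): MemoryQuery {
  if (typeof input === "string") return { text: input, k: 10 };
  const o = (input ?? {}) as Record<string, unknown>;
  return {
    text: String(o.text ?? o.q ?? ""),
    role: typeof o.role === "string" ? o.role : undefined,
    city: typeof o.city === "string" ? o.city : undefined,
    state: typeof o.state === "string" ? o.state : undefined,
    k: Number(o.k ?? o.top_k ?? 10),
  };
}
```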

Honest gaps — what remains to be implemented

+

Three of the five 2026-era memory findings remain unwired. Flagged for near-term implementation, not hidden:

+
Zep validity windows — most load-bearing remaining gap.
Mem0 UPDATE/DELETE/NOOP memory ops.
Letta working-memory hot cache.
+

Validity windows are next — they preserve the trust signal (the boost only fires on playbooks that are still true given the current schema) rather than the latency signal (which the current scale doesn't need yet).

+ +
Code: crates/vectord/src/{playbook_memory.rs, service.rs} · tests/multi-agent/{kb.ts, memory_query.ts, normalize.ts, scenario.ts} · mcp-server/{observer.ts, index.ts} · data/_kb/ · data/_observer/