Chapter 3
Dual-agent recursive consensus loop
The system we use to execute staffing fills is a dual-agent recursive protocol. Two agents with distinct roles iterate against a shared log until one terminal state is reached: sealed or aborted. It is deterministic in structure, stochastic in content, and verifiable through the per-run log artifact.
Agents and protocol
task in
│
▼
┌───────────────────────────────────────────────────────────┐
│ EXECUTOR (mistral:latest) │
│ ──────────────────────────────────────────────────────── │
│ input: task spec + shared log + seen-candidates ledger │
│ output: one JSON action per turn │
│ · {kind:"plan",steps:[…]} │
│ · {kind:"tool_call",tool,args,rationale} │
│ · {kind:"propose_done",fills:[N of N]} │
└───────────┬───────────────────────────────┬───────────────┘
│ tool_call │ propose_done
▼ │
┌──────────────────────────┐ │
│ TOOL DISPATCH │ │
│ hybrid_search / sql │ │
│ (against live gateway) │ │
└──────────┬───────────────┘ │
│ result (trimmed, exclusions) │
▼ ▼
┌───────────────────────────────────────────────────────────┐
│ REVIEWER (qwen2.5:latest) │
│ ──────────────────────────────────────────────────────── │
│ input: task spec + shared log (including tool result) │
│ output: {kind:"critique",verdict:"continue|drift| │
│ approve_done",notes} │
└───────────┬───────────────────────────────────────────────┘
│
┌─────┴─────┐
▼ ▼ ▼
continue drift approve_done + propose_done ⟹ SEAL
(next turn) (cap ≈ 3 →
hard abort)
Code: tests/multi-agent/agent.ts (protocol + prompts) · tests/multi-agent/orchestrator.ts (run loop) · tests/multi-agent/scenario.ts (5-event warehouse week)
Why "dual" — role specialization
The executor is an optimist. Its job is to produce progress: pull candidates, verify SQL, propose consensus. It's instructed to be decisive.
The reviewer is a pessimist. Its job is to catch drift: proposals that don't match the task's geography, fill count, or role. It's authorized to stop the loop.
This adversarial separation is cheaper and more deterministic than asking a single model to self-critique. The reviewer has a hard rule: on the turn after a propose_done, it MUST emit either approve_done or drift — it cannot stall with continue.
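That hard rule is easy to state as a pure function. A sketch (the names here are illustrative; the real prompts and checks live in tests/multi-agent/agent.ts):

```typescript
// Sketch of the reviewer's turn gate: on the turn after a propose_done,
// "continue" is not a legal verdict. The reviewer must commit.
type Verdict = "continue" | "drift" | "approve_done";

function legalVerdicts(lastExecutorKind: string): Verdict[] {
  if (lastExecutorKind === "propose_done") {
    return ["approve_done", "drift"]; // must commit, cannot stall
  }
  return ["continue", "drift", "approve_done"];
}

function validateCritique(lastExecutorKind: string, verdict: Verdict): boolean {
  return legalVerdicts(lastExecutorKind).includes(verdict);
}
```

The gate is what makes the loop terminate at consensus time: a proposal forces a binary answer on the very next turn.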
Why "parallel" — orchestrator can fan out
Independent pairs run concurrently. tests/multi-agent/run_e2e_rated.ts runs two task-specific agent pairs via Promise.all. Ollama serializes inference at the model level, so "parallel" is concurrent orchestration — but the substrate (gateway, queryd, vectord) handles concurrent requests cleanly. Verified in the scenario harness: two contracts sealing simultaneously.
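The fan-out shape is a plain Promise.all over independent pair runs. A minimal sketch, with runPair standing in for a full executor/reviewer loop (the real version is tests/multi-agent/run_e2e_rated.ts):

```typescript
// Hypothetical shape of the fan-out: each agent pair runs independently;
// Promise.all awaits both seals (or aborts). Concurrency is at the
// orchestration layer; Ollama still serializes inference per model.
type RunResult = { taskId: string; outcome: "sealed" | "aborted" };

async function runPair(taskId: string): Promise<RunResult> {
  // stand-in for a full dual-agent loop against the live gateway
  return { taskId, outcome: "sealed" };
}

async function fanOut(taskIds: string[]): Promise<RunResult[]> {
  return Promise.all(taskIds.map((id) => runPair(id)));
}
```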
Why "recursive" — each seal feeds the next
Consensus does not end at the sealed playbook. Every sealed playbook is persisted to playbook_memory via POST /vectors/playbook_memory/seed. The next hybrid search for a semantically similar operation consults that memory via compute_boost_for(query_embedding, top_k, base_weight) and re-ranks the candidate pool. The system builds on itself turn over turn, playbook over playbook.
Termination guarantees
// every run exits through exactly one of these:
sealed = executor.propose_done ∧ reviewer.approve_done ∧ fills.count == target
abort = consecutive_tool_errors ≥ MAX_TOOL_ERRORS (3) // executor can't form a valid call
abort = consecutive_drifts ≥ MAX_CONSECUTIVE_DRIFTS (3) // reviewer keeps flagging
abort = turn > MAX_TURNS (12) // no consensus reached in window
Every abort dumps the full log to tests/multi-agent/playbooks/<id>-FAILED.json for forensic review. No consensus is ever implicit.
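The exits condense to one predicate per turn. A sketch using the constants quoted above (the state shape is illustrative, not the orchestrator's actual type):

```typescript
// Constants from the source: MAX_TOOL_ERRORS, MAX_CONSECUTIVE_DRIFTS, MAX_TURNS.
const MAX_TOOL_ERRORS = 3;
const MAX_CONSECUTIVE_DRIFTS = 3;
const MAX_TURNS = 12;

type RunState = {
  turn: number;
  consecutiveToolErrors: number;
  consecutiveDrifts: number;
  proposed: boolean; // executor emitted propose_done
  approved: boolean; // reviewer emitted approve_done
  fills: number;
  target: number;
};

// Returns the terminal state, or null to keep looping.
function terminalState(s: RunState): "sealed" | "aborted" | null {
  if (s.proposed && s.approved && s.fills === s.target) return "sealed";
  if (s.consecutiveToolErrors >= MAX_TOOL_ERRORS) return "aborted";
  if (s.consecutiveDrifts >= MAX_CONSECUTIVE_DRIFTS) return "aborted";
  if (s.turn > MAX_TURNS) return "aborted";
  return null;
}
```

Because the counters only grow and the turn cap is absolute, terminalState is guaranteed to fire within MAX_TURNS turns.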
Chapter 4
Playbook memory — the compounding feedback loop
A CRM stores events. This system turns events into re-ranking signal. Every sealed playbook endorses specific (worker, city, state) tuples. Every failure penalizes them. Every similar future query inherits the signal through cosine similarity.
Seed shape
PlaybookEntry {
playbook_id, // pb-seed-<sha8>
operation, // "fill: Welder x2 in Toledo, OH"
approach, context, // short canonical — long strings dilute embedding
timestamp, // RFC3339
endorsed_names[], // validated against workers_500k for city+state
city, state, // parsed from operation
embedding // 768-d nomic-embed-text of text shape
}
Code: crates/vectord/src/playbook_memory.rs (PlaybookEntry, FailureRecord, PlaybookMemoryState)
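The city/state fields are parsed from the operation string. The exact Rust parsing rule is not shown in the source, so this regex is an assumption inferred from the "fill: <Role> xN in <City>, <ST>" format above:

```typescript
// Hypothetical geo parse for the seed shape: pull the trailing
// "in <City>, <ST>" out of an operation string like
// "fill: Welder x2 in Toledo, OH".
function parseGeo(operation: string): { city: string; state: string } | null {
  const m = operation.match(/\bin\s+([^,]+),\s*([A-Z]{2})\s*$/);
  return m ? { city: m[1].trim(), state: m[2] } : null;
}
```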
Boost math (positive + decay + negative)
// For each playbook pb among top-K most cosine-similar:
// given query embedding qv, constant base_weight, n_workers = |pb.endorsed_names|
similarity = cosine(qv, pb.embedding) // skip if ≤ 0.05
age_days = (now - pb.timestamp) / 86_400 seconds
decay = e^(-age_days / 30) // half-life = 30 days
// For each endorsed worker in pb:
key = (pb.city, pb.state, name)
fail_count = failures[key] // # times this worker was marked no-show for same geo
penalty = 0.5^min(fail_count, 20)
per_worker = similarity × base_weight × decay × penalty / n_workers
boost[key] = min(boost[key] + per_worker, MAX_BOOST_PER_WORKER)
// MAX_BOOST_PER_WORKER = 0.25 — cap stops one popular worker from always winning
Code: crates/vectord/src/playbook_memory.rs::compute_boost_for · constants: MAX_BOOST_PER_WORKER, DEFAULT_TOP_K_PLAYBOOKS, BOOST_HALF_LIFE_DAYS
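The same math in runnable form: a TypeScript stand-in for the Rust compute_boost_for, with illustrative names and no claim to match the implementation line for line.

```typescript
// Sketch of the boost math: positive signal from similar playbooks,
// exponential age decay, multiplicative no-show penalty, per-worker cap.
const MAX_BOOST_PER_WORKER = 0.25;

type Playbook = {
  embedding: number[];
  ageDays: number;
  city: string;
  state: string;
  endorsedNames: string[];
};

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

function boostFor(
  qv: number[],
  playbooks: Playbook[],
  baseWeight: number,
  failures: Map<string, number>, // keyed "city|state|name"
): Map<string, number> {
  const boost = new Map<string, number>();
  for (const pb of playbooks) {
    const sim = cosine(qv, pb.embedding);
    if (sim <= 0.05) continue;                 // similarity floor
    const decay = Math.exp(-pb.ageDays / 30);  // 30-day decay
    for (const name of pb.endorsedNames) {
      const key = `${pb.city}|${pb.state}|${name}`;
      const failCount = failures.get(key) ?? 0;
      const penalty = Math.pow(0.5, Math.min(failCount, 20));
      const perWorker = (sim * baseWeight * decay * penalty) / pb.endorsedNames.length;
      boost.set(key, Math.min((boost.get(key) ?? 0) + perWorker, MAX_BOOST_PER_WORKER));
    }
  }
  return boost;
}
```

Note how the cap interacts with compounding: repeated seeds keep adding per_worker, but the stored boost saturates at 0.25, exactly the behavior the Chicago Electrician evidence below reports.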
Application at query time
// In /vectors/hybrid handler (crates/vectord/src/service.rs):
1. SQL filter narrows workers_500k to geo/role/availability
2. Vector index returns top_k × 5 candidates by cosine to question
3. compute_boost_for(qv, k=200) returns boost map
4. For each candidate: parse (name, city, state) from chunk, look up boost, add to score
5. Re-sort sources by boosted score
6. Truncate to requested top_k, return with playbook_boost and playbook_citations
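Steps 4 and 5 amount to a keyed lookup plus a re-sort. A sketch with hypothetical names (the real handler is in crates/vectord/src/service.rs):

```typescript
// Add each candidate's (city, state, name) boost to its base score,
// then re-sort descending before truncating to top_k.
type Candidate = { name: string; city: string; state: string; score: number };

function applyBoosts(cands: Candidate[], boost: Map<string, number>): Candidate[] {
  return cands
    .map((c) => ({
      ...c,
      score: c.score + (boost.get(`${c.city}|${c.state}|${c.name}`) ?? 0),
    }))
    .sort((a, b) => b.score - a.score);
}
```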
Why k=200. Direct measurement showed cosine similarity clusters in the 0.55-0.67 band across all playbooks regardless of geo (nomic-embed-text has narrow discrimination on this kind of structured operation text). A k of 25 silently missed geo-matched playbooks. k=200 is the measured floor for reliably catching compounding. Brute-force over 200 × 768-d is sub-ms even on this hardware.
Evidence: Chicago Electrician compounding test 2026-04-20 — Carmen Green, Anna Patel, Fatima Wilson went from rank >5 / boost 0 / 0 citations (run 0, no seed) to rank 1/2/3 / boost +0.250 (capped) / 3 citations each (run 3, after 3 identical seeds). Each seed increments citations; total boost caps at 0.25/worker.
Write-through to SQL
successful_playbooks_live is a DataFusion-queryable Parquet surface maintained by POST /vectors/playbook_memory/persist_sql. Every /log from the recruiter UI triggers seed → persist_sql. The in-memory store and the SQL surface stay synchronized (full snapshot on each persist, safe because memory is source of truth).
Code: crates/vectord/src/playbook_memory.rs::persist_to_sql · catalog-registered under "successful_playbooks_live"
Pattern discovery (Path 2 — meta-index)
Beyond "who was endorsed." POST /vectors/playbook_memory/patterns takes a query, finds top-K similar past playbooks, pulls each endorsed worker's full workers_500k profile, and aggregates shared traits: recurring certifications, skill frequencies, modal archetype, reliability distribution. Returns a discovered_pattern string showing operator-actionable signal the user didn't explicitly query for.
Code: crates/vectord/src/playbook_memory.rs::discover_patterns · Surfaces: /vectors/playbook_memory/patterns endpoint, /intelligence/chat response, /intelligence/permit_contracts cards
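The core of the trait aggregation is a frequency filter. A simplified sketch (the real discover_patterns also aggregates skill frequencies, the modal archetype, and the reliability distribution):

```typescript
// Count how often each certification recurs across the endorsed workers'
// profiles; keep those at or above a frequency threshold, the role
// min_trait_frequency plays in the /patterns request.
function recurringTraits(profiles: { certs: string[] }[], minFreq: number): string[] {
  const counts = new Map<string, number>();
  for (const p of profiles) {
    for (const cert of new Set(p.certs)) {
      counts.set(cert, (counts.get(cert) ?? 0) + 1);
    }
  }
  return [...counts.entries()]
    .filter(([, n]) => n / profiles.length >= minFreq)
    .map(([cert]) => cert);
}
```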
Chapter 7
Verify or dispute — reproduce it yourself
Every claim below is a curl away from falsification.
Health. Should return lakehouse ok.
curl http://localhost:3100/health
Any SQL on multi-million-row Parquet. Sub-100ms typical.
curl -s -X POST http://localhost:3100/query/sql \
-H 'Content-Type: application/json' \
-d '{"sql":"SELECT role, COUNT(*) FROM workers_500k WHERE state = '\''IL'\'' GROUP BY role LIMIT 5"}'
Hybrid search with playbook boost. The whole Phase 19 feedback loop in one request.
curl -s -X POST http://localhost:3100/vectors/hybrid \
-H 'Content-Type: application/json' \
-d '{"index_name":"workers_500k_v1",
"sql_filter":"role = '\''Forklift Operator'\'' AND city = '\''Chicago'\'' AND CAST(availability AS DOUBLE) > 0.5",
"question":"reliable forklift operator",
"top_k":5,"use_playbook_memory":true,"playbook_memory_k":200}'
Playbook memory stats. Count + endorsed names + sample.
curl http://localhost:3100/vectors/playbook_memory/stats
Pattern discovery. What do past similar fills have in common?
curl -s -X POST http://localhost:3100/vectors/playbook_memory/patterns \
-H 'Content-Type: application/json' \
-d '{"query":"Forklift Operator in Chicago, IL","top_k_playbooks":25,"min_trait_frequency":0.3}'
Run the dual-agent scenario yourself. All 5 events, real fills, real artifacts.
cd /home/profit/lakehouse
bun run tests/multi-agent/scenario.ts
# Output: tests/multi-agent/playbooks/scenario-<timestamp>/report.md