Chapter 2
-
Architecture — 13 crates, one object store, one local AI runtime
-
Request flows top to bottom. Every node is independently swappable. Every line is a real HTTP or gRPC hop that you can trace with tcpdump.
+
Architecture — 15 crates, one object store, a 5-provider model fleet
+
Gateway is a drop-in OpenAI-compatible middleware. Any consumer that speaks the OpenAI Chat Completions shape — agent SDKs, IDE plugins, custom apps — points at localhost:3100/v1 and gets routing, audit, and the full memory substrate behind every call. The model side has 5 providers and 40+ frontier models reachable via one OpenCode key. The data side stays Rust-first.
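As a rough illustration of the drop-in claim, the sketch below builds an OpenAI-style Chat Completions request aimed at the gateway. It assumes the gateway serves the standard OpenAI route `/chat/completions` under the `localhost:3100/v1` base; the diagram only lists `/v1/chat`, so the exact path is an assumption. `buildChatRequest` and the bare model name are hypothetical and not taken from the gateway source.

```typescript
// Hypothetical helper: builds an OpenAI-style Chat Completions request
// for the gateway. Nothing here is copied from the gateway crate.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface GatewayRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

const GATEWAY_BASE = "http://localhost:3100/v1"; // base URL quoted in the text

function buildChatRequest(
  model: string,
  messages: ChatMessage[],
  apiKey?: string
): GatewayRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (apiKey) headers["Authorization"] = `Bearer ${apiKey}`;
  return {
    // Assumed route: the standard OpenAI path under the /v1 base.
    url: `${GATEWAY_BASE}/chat/completions`,
    init: { method: "POST", headers, body: JSON.stringify({ model, messages }) },
  };
}

// How the gateway names models on the wire is an assumption.
const req = buildChatRequest("kimi-k2:1t", [{ role: "user", content: "ping" }]);
// To send it: const res = await fetch(req.url, req.init);
```

An existing OpenAI SDK consumer would normally not need a helper like this at all; changing its base URL to the gateway address should be enough if the compatibility claim holds.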
-
HTTP :3100 + gRPC :3101
- │
- ┌───────▼───────┐
- │ gateway │ Rust · Axum · routing, CORS, auth, tools
- └───────┬───────┘
- ┌────────────┬───────────┼───────────┬────────────┐
- │ │ │ │ │
- ┌────▼───┐ ┌────▼───┐ ┌────▼───┐ ┌────▼───┐ ┌────▼───┐
- │catalog │ │ query │ │ vector │ │ ingest │ │aibridge│
- │ d │ │ d │ │ d │ │ d │ │ │
- └────┬───┘ └────┬───┘ └────┬───┘ └────┬───┘ └────┬───┘
- │ │ │ │ │
- └────────────┴───────────┼───────────┴────────────┘
- ▼
- ┌─────────────────┐
- │ object storage │ Parquet files (local / S3)
- └─────────────────┘
- ▲
- │
- ┌───────┴────────┐
- │ Python sidecar │ FastAPI → Ollama
- │ (aibridge) │ local models only
- └────────────────┘
+
OpenAI SDK consumers MCP clients Browser UI (Bun :3700)
+ │ │ │
+ └──────────────────────────┼──────────────────────────┘
+ ▼
+ ┌──────────────────────────────┐
+ │ gateway :3100 /v1/* │ Rust · Axum
+ │ OpenAI-compat drop-in │ smart provider routing
+ │ /v1/chat /v1/mode /iterate │ cost telemetry, Langfuse
+ └──────────┬───────────────────┘
+ ┌─────────┬───────────────┼───────────────┬──────────┐
+ │ │ │ │ │
+ ┌────▼───┐ ┌───▼────┐ ┌─────▼──────┐ ┌─────▼─────┐ ┌──▼──────┐
+ │catalog │ │ query │ │ vector │ │ ingest │ │aibridge │
+ │ d │ │ d │ │ d │ │ d │ │ │
+ │idempot │ │DataFus │ │HNSW · Lance│ │CSV PDF SQL│ │provider │
+ │schema │ │delta │ │playbook+ │ │auto-PII │ │adapters │
+ │fingerp │ │MemTabl │ │pathway mem │ │schema fp │ │5 active │
+ └────┬───┘ └───┬────┘ └─────┬──────┘ └─────┬─────┘ └──┬──────┘
+ └─────────┴────────────────┼────────────────┴─────────┘
+ ▼
+ ┌──────────────────┐
+ │ object storage │ Parquet · MinIO · S3-compat
+ └──────────────────┘
+ ▲
+ │
+ ┌───────────────┴────────────────┐
+ │ validator · journald │ schema/PII/policy gates
+ │ (Phase 43) · (audit log) │ + append-only mutations
+ └────────────────────────────────┘
+
+Provider fleet (config/providers.toml):
+ ollama localhost:3200 local Ollama → qwen3.5, gemma2
+ ollama_cloud ollama.com gpt-oss:120b, qwen3-coder:480b,
+ deepseek-v3.1:671b, kimi-k2:1t,
+ mistral-large-3:675b, qwen3.5:397b
+ openrouter openrouter.ai/api/v1 343 models — paid + free rescue
+ opencode opencode.ai/zen/v1 40 models · ONE sk-* key reaches
+ Claude Opus 4.7, GPT-5.5-pro,
+ Gemini 3.1-pro, Kimi K2.6, GLM 5.1,
+ DeepSeek, Qwen, MiniMax, free tier
+ kimi api.kimi.com/coding/v1 direct Kimi For Coding (TOS-clean)
-
Per-crate responsibility
+
Per-crate responsibility (15 crates)
| Crate | Role | Path |
- | shared | Types, errors, Arrow helpers, PII detection, secrets provider | crates/shared/ |
- | storaged | object_store I/O, BucketRegistry (multi-bucket), AppendLog, ErrorJournal | crates/storaged/ |
- | catalogd | Metadata authority — manifests, views, tombstones, profiles, schema fingerprints | crates/catalogd/ |
- | queryd | DataFusion SQL engine, MemTable cache, delta merge-on-read, compaction | crates/queryd/ |
- | ingestd | CSV/JSON/PDF(+OCR)/Postgres/MySQL ingest, cron schedules, auto-PII | crates/ingestd/ |
- | vectord | Embeddings as Parquet, HNSW, trial system, autotune agent, playbook_memory | crates/vectord/ |
+ | shared | Types, errors, Arrow helpers, PII detection, secrets provider, model_matrix | crates/shared/ |
+ | storaged | object_store I/O, BucketRegistry, AppendLog, ErrorJournal, federation_service | crates/storaged/ |
+ | catalogd | Manifests, views (incl. PII-safe view layer), tombstones, profiles, schema fingerprints, register-idempotency (ADR-020) | crates/catalogd/ |
+ | queryd | DataFusion SQL, MemTable cache, delta merge-on-read, compaction, truth gate (ADR-021) | crates/queryd/ |
+ | ingestd | CSV/JSON/PDF(+OCR)/Postgres/MySQL ingest, cron schedules, auto-PII flagging | crates/ingestd/ |
+ | vectord | Embeddings as Parquet, HNSW, trial system, autotune, playbook_memory + pathway_memory (ADR-021 semantic-correctness layer) | crates/vectord/ |
| vectord-lance | Firewall crate — Lance 4.0 + Arrow 57 isolated from main Arrow 55 | crates/vectord-lance/ |
- | journald | Append-only mutation event log for time-travel & audit | crates/journald/ |
- | aibridge | Rust↔Python sidecar, Ollama HTTP client, VRAM introspection | crates/aibridge/ |
- | gateway | Axum HTTP :3100 + gRPC :3101, middleware, tools registry | crates/gateway/ |
- | ui | Dioxus WASM internal developer UI | crates/ui/ |
- | mcp-server | Bun TypeScript recruiter-facing app (this server) | mcp-server/ |
+ | journald | Append-only mutation event log for time-travel + audit | crates/journald/ |
+ | truth | File-backed rule store; evaluate(task_class, ctx) → Vec<RuleOutcome> (ADR-021) | crates/truth/ |
+ | aibridge | Rust↔Python sidecar, Ollama client, ProviderAdapter trait, /v1/chat router | crates/aibridge/ |
+ | gateway | Axum HTTP :3100 + gRPC :3101, OpenAI-compat /v1/*, mode runner, validator, iterate loop, cost telemetry, Langfuse + observer fan-out | crates/gateway/ |
+ | validator | Phase 43 — schema / completeness / consistency / policy gates over LLM outputs (FillValidator, EmailValidator, ParquetWorkerLookup) | crates/validator/ |
+ | ui | Dioxus WASM internal developer UI (separate from this Bun-served public UI) | crates/ui/ |
+ | mcp-server | Bun TypeScript public-facing app + MCP tool surface — what you're reading right now | mcp-server/ |
+ | auditor | External claim-vs-diff verifier on PRs · Kimi K2.6 ↔ Haiku 4.5 cross-lineage alternation, Opus 4.7 auto-promote on diffs >100k chars | auditor/ |
-
Source: git.agentview.dev/profit/lakehouse · ADRs: docs/DECISIONS.md (currently 20 records)
+
Source: git.agentview.dev/profit/lakehouse · branch scrum/auto-apply-19814 · tag distillation-v1.0.0 at commit e7636f2 (frozen substrate) · ADRs: docs/DECISIONS.md (currently 21 records)
Chapter 3
-
Dual-agent recursive consensus loop
-
The system we use to execute staffing fills is a dual-agent recursive protocol. Two agents with distinct roles iterate against a shared log until one of three terminal states is reached. It is deterministic in structure, stochastic in content, and verifiable through the per-run log artifact.
-
Agents and protocol
-
-
task in
- │
- ▼
- ┌───────────────────────────────────────────────────────────┐
- │ EXECUTOR (mistral:latest) │
- │ ──────────────────────────────────────────────────────── │
- │ input: task spec + shared log + seen-candidates ledger │
- │ output: one JSON action per turn │
- │ · {kind:"plan",steps:[…]} │
- │ · {kind:"tool_call",tool,args,rationale} │
- │ · {kind:"propose_done",fills:[N of N]} │
- └───────────┬───────────────────────────────┬───────────────┘
- │ tool_call │ propose_done
- ▼ │
- ┌──────────────────────────┐ │
- │ TOOL DISPATCH │ │
- │ hybrid_search / sql │ │
- │ (against live gateway) │ │
- └──────────┬───────────────┘ │
- │ result (trimmed, exclusions) │
- ▼ ▼
- ┌───────────────────────────────────────────────────────────┐
- │ REVIEWER (qwen2.5:latest) │
- │ ──────────────────────────────────────────────────────── │
- │ input: task spec + shared log (including tool result) │
- │ output: {kind:"critique",verdict:"continue|drift| │
- │ approve_done",notes} │
- └───────────┬───────────────────────────────────────────────┘
- │
- ┌─────┴─────┐
- ▼ ▼ ▼
- continue drift approve_done + propose_done ⟹ SEAL
- (next turn) (cap ≈ 3 →
- hard abort)
-
-
-
Code: tests/multi-agent/agent.ts (protocol + prompts) · tests/multi-agent/orchestrator.ts (run loop) · tests/multi-agent/scenario.ts (5-event warehouse week)
+
The model fleet — 9-rung ladder, N=3 consensus, cross-lineage audit
+
No single model owns the answer. Every consequential call is structured: the right tier picks up first, fallback rungs catch what fails, parallel runs vote, and an independent auditor of a different model lineage checks the result against the diff. The protocol is deterministic; the inference is stochastic; every step writes a receipt.
-
Why "dual" — role specialization
-
-
The executor is an optimist. Its job is to produce progress: pull candidates, verify SQL, propose consensus. It's instructed to be decisive.
-
-
The reviewer is a pessimist. Its job is to catch drift: proposals that don't match the task's geography, fill count, or role. It's authorized to stop the loop.
-
- This adversarial separation is cheaper and more deterministic than asking a single model to self-critique. The reviewer has a hard rule: on the turn after a
propose_done, it MUST emit either
approve_done or
drift — it cannot stall with
continue.
+
The 9-rung cloud-first ladder
+
+
request in
+ │
+ ▼
+ ┌───────────────────────────────────────────────────────────────────┐
+ │ attempt 1 ollama_cloud / kimi-k2:1t 1T params · flagship │
+ │ attempt 2 ollama_cloud / qwen3-coder:480b coding specialist │
+ │ attempt 3 ollama_cloud / deepseek-v3.1:671b reasoning │
+ │ attempt 4 ollama_cloud / mistral-large-3:675b deep analysis │
+ │ attempt 5 ollama_cloud / gpt-oss:120b reliable workhorse │
+ │ attempt 6 ollama_cloud / qwen3.5:397b dense final thinker │
+ │ attempt 7 openrouter / openai/gpt-oss-120b:free rescue tier │
+ │ attempt 8 openrouter / google/gemma-3-27b-it:free fastest rescue │
+ │ attempt 9 ollama / qwen3.5:latest last-resort local │
+ └───────────────┬───────────────────────────────────────────────────┘
+ │ isAcceptable() = chars ≥ 3800 ∧ not malformed JSON
+ ▼
+ sealed result OR next-rung learning preamble
+
Every rung after the first sees a learning preamble carrying the previous rung's rejection reason. The ladder is the standard scrum/auditor path; for individual /v1/chat calls the caller picks the model directly (or lets the smart-routing default fire).
+
Code: tests/real-world/scrum_master_pipeline.ts const LADDER · config/routing.toml · crates/gateway/src/v1/mode.rs (mode runner)
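The escalation described above can be sketched as a loop over rungs. This is a minimal sketch, not the code in scrum_master_pipeline.ts: `callModel` is a hypothetical stand-in for the provider call, the preamble wording is invented, and it assumes the whole response body is expected to parse as JSON (the diagram only says "not malformed JSON").

```typescript
// Sketch of a fallback ladder with a learning preamble.
interface Rung {
  provider: string;
  model: string;
}

// Hypothetical provider call; the real pipeline's signature may differ.
type CallModel = (rung: Rung, prompt: string) => Promise<string>;

const MIN_CHARS = 3800; // threshold quoted in the diagram

// Mirrors the stated rule: long enough and not malformed JSON.
function isAcceptable(output: string): { ok: boolean; reason: string } {
  if (output.length < MIN_CHARS) {
    return { ok: false, reason: `too short (${output.length} chars)` };
  }
  try {
    JSON.parse(output);
  } catch {
    return { ok: false, reason: "malformed JSON" };
  }
  return { ok: true, reason: "" };
}

async function runLadder(ladder: Rung[], task: string, callModel: CallModel) {
  let preamble = ""; // empty for the first rung
  for (let i = 0; i < ladder.length; i++) {
    const rung = ladder[i];
    const output = await callModel(rung, preamble + task);
    const verdict = isAcceptable(output);
    if (verdict.ok) return { sealed: true, attempt: i + 1, output };
    // Carry the rejection reason forward to the next rung.
    preamble = `Previous attempt (${rung.model}) was rejected: ${verdict.reason}.\n`;
  }
  return { sealed: false, attempt: ladder.length, output: "" };
}
```

Returning the attempt number alongside the output keeps the receipt idea visible: a caller can log which rung sealed the result.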
-
Why "parallel" — orchestrator can fan out
-
- Independent pairs run concurrently. tests/multi-agent/run_e2e_rated.ts runs two task-specific agent pairs via Promise.all. Ollama serializes inference at the model level, so "parallel" is concurrent orchestration — but the substrate (gateway, queryd, vectord) handles concurrent requests cleanly. Verified in the scenario harness: two contracts sealing simultaneously.
-
-
-
Why "recursive" — each seal feeds the next
-
- Consensus does not end at the sealed playbook. Every sealed playbook is persisted to playbook_memory via POST /vectors/playbook_memory/seed. The next hybrid search for a semantically similar operation consults that memory via compute_boost_for(query_embedding, top_k, base_weight) and re-ranks the candidate pool. The system builds on itself turn over turn, playbook over playbook.
-
-
-
Termination guarantees
+
N=3 consensus + tie-breaker (auditor inference)
- // three paths out, every run has one of these:
- sealed = executor.propose_done ∧ reviewer.approve_done ∧ fills.count == target
- abort = consecutive_tool_errors ≥ MAX_TOOL_ERRORS (3) // executor can't form a valid call
- abort = consecutive_drifts ≥ MAX_CONSECUTIVE_DRIFTS (3) // reviewer keeps flagging
- abort = turn > MAX_TURNS (12) // no consensus reached in window
+ // auditor/checks/inference.ts — every claim audit runs this:
+ 1. Fire the primary reviewer N=3 times in PARALLEL (Promise.all) — wall-clock = single call
+ 2. Aggregate votes per claim_idx · majority wins
+ 3. On 1-1-1 split → tie-breaker model with different architecture (qwen3-coder:480b vs primary gpt-oss/kimi)
+ 4. Every disagreement (even when majority resolves) → data/_kb/audit_discrepancies.jsonl
+
+ // Closes the cloud-non-determinism gap: temp=0 isn't actually deterministic in practice
+ // across hours; consensus + cross-architecture tie-break stabilizes verdicts.
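The four steps above can be sketched as follows. This is an illustration under stated assumptions, not the contents of auditor/checks/inference.ts: verdicts are modeled as plain strings, and `primary`, `tieBreaker`, and `logDiscrepancy` are hypothetical callbacks standing in for the model calls and the JSONL writer.

```typescript
type Verdict = string;
type Reviewer = (claimIdx: number) => Promise<Verdict>;

// Returns the verdict held by a strict majority of votes, or null if none.
function majority(votes: Verdict[]): Verdict | null {
  const counts = new Map<Verdict, number>();
  votes.forEach((v) => counts.set(v, (counts.get(v) ?? 0) + 1));
  let winner: Verdict | null = null;
  counts.forEach((n, verdict) => {
    if (n * 2 > votes.length) winner = verdict;
  });
  return winner;
}

async function auditClaim(
  claimIdx: number,
  primary: Reviewer,
  tieBreaker: Reviewer,
  logDiscrepancy: (entry: { claimIdx: number; votes: Verdict[] }) => void
) {
  // Step 1: three runs of the primary reviewer, in parallel.
  const votes = await Promise.all([
    primary(claimIdx),
    primary(claimIdx),
    primary(claimIdx),
  ]);
  // Step 4: any disagreement is logged, even if a majority resolves it.
  if (!votes.every((v) => v === votes[0])) logDiscrepancy({ claimIdx, votes });
  // Step 2: majority wins.
  const winner = majority(votes);
  if (winner !== null) return { verdict: winner, tieBroken: false };
  // Step 3: a 1-1-1 split falls through to the tie-breaker model.
  return { verdict: await tieBreaker(claimIdx), tieBroken: true };
}
```

Logging before resolving the vote means a 2-1 result still leaves a trace, which is what makes drift in the primary reviewer observable over time.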
-
Every abort dumps the full log to tests/multi-agent/playbooks/<id>-FAILED.json for forensic review. No consensus is ever implicit.
+
+
Auditor cross-lineage — Kimi ↔ Haiku ↔ Opus
+
Every push to PR #11 triggers auditor/audit.ts within ~90s. To keep one model lineage's blind spots from becoming the system's blind spots, audits alternate between Kimi K2.6 (Moonshot) and Haiku 4.5 (Anthropic) by SHA. Diffs over 100k chars auto-promote to Claude Opus 4.7. A per-PR cap of 3 audits, reset automatically on each new head SHA, prevents infinite-loop spend. Haiku 4.5 shows a 100% grounding-verified rate across its latest 10 findings; pairing different lineages and forcing per-finding grounding is what suppresses confabulation.
+
Code: auditor/audit.ts · auditor/checks/inference.ts (N=3) · auditor/checks/kimi_architect.ts · Verdicts: data/_auditor/kimi_verdicts/ — read any 11-<sha>.json to inspect a real audit
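The selection and budget rules can be sketched as below. The text does not say how "alternate by SHA" is computed, so the parity of the first hex digit is purely an assumption for illustration. The model labels are shorthand rather than API model IDs, and `pickAuditor` and `AuditBudget` are hypothetical names.

```typescript
type AuditorModel = "kimi-k2.6" | "haiku-4.5" | "opus-4.7"; // labels, not API IDs

const PROMOTE_DIFF_CHARS = 100000; // "diffs over 100k chars"
const MAX_AUDITS_PER_HEAD = 3; // per-PR cap, reset on each new head SHA

function pickAuditor(headSha: string, diffChars: number): AuditorModel {
  if (diffChars > PROMOTE_DIFF_CHARS) return "opus-4.7";
  // Assumption: alternation is modeled as parity of the first hex digit.
  const firstNibble = parseInt(headSha.charAt(0), 16);
  return firstNibble % 2 === 0 ? "kimi-k2.6" : "haiku-4.5";
}

class AuditBudget {
  private state = new Map<number, { headSha: string; used: number }>();

  // Returns true and consumes one audit if the PR is still under its cap.
  tryConsume(pr: number, headSha: string): boolean {
    const cur = this.state.get(pr);
    // A new head SHA starts the count again from zero.
    const used = cur && cur.headSha === headSha ? cur.used : 0;
    if (used >= MAX_AUDITS_PER_HEAD) return false;
    this.state.set(pr, { headSha, used: used + 1 });
    return true;
  }
}
```

Keeping the size check ahead of the alternation means a large diff always reaches the strongest auditor, whichever lineage the SHA would otherwise select.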
+
+
Distillation v1.0.0 — the frozen substrate
+
The substrate the auditor and mode runner sit on is tagged at distillation-v1.0.0 / commit e7636f2. 145 unit tests pass · 22/22 acceptance invariants · 16/16 audit-full checks · bit-identical reproducibility verified. The distillation phase exports clean SFT / RAG / preference samples with a multi-layer contamination firewall; the auditor consumes the substrate. The frozen tag means: any future "the system regressed" question has a baseline to bisect against, byte-for-byte.
+
Tag: distillation-v1.0.0 · Commit: e7636f2 · Substrate code: scripts/distillation/ · auditor/schemas/distillation/ · Output: data/_kb/distilled_{facts,procedures,config_hints}.jsonl
Chapter 4
-
Playbook memory — the compounding feedback loop
-
A CRM stores events. This system turns events into re-ranking signal. Every sealed playbook endorses specific (worker, city, state) tuples. Every failure penalizes them. Every similar future query inherits the signal through cosine similarity.
+
Two memory layers — playbook (worker signal) + pathway (system signal)
+
A CRM stores events. This system turns events into re-ranking signal at two layers. Playbook memory compounds worker-level outcomes (who got endorsed, where, when) into per-query boost. Pathway memory compounds system-level outcomes (which model + corpus + framing actually solved similar problems) into per-task hot-swap. Both are queryable. Both are auditable. Both compound.
+
+
Layer 1 — playbook memory (worker + geo signal)
Seed shape
@@ -289,10 +289,82 @@ pre{background:#161b22;border:1px solid #171d27;border-radius:8px;padding:14px 1
Beyond "who was endorsed." POST /vectors/playbook_memory/patterns takes a query, finds top-K similar past playbooks, pulls each endorsed worker's full workers_500k profile, and aggregates shared traits: recurring certifications, skill frequencies, modal archetype, reliability distribution. Returns a discovered_pattern string showing operator-actionable signal the user didn't explicitly query for.
Code: crates/vectord/src/playbook_memory.rs::discover_patterns · Surfaces: /vectors/playbook_memory/patterns endpoint, /intelligence/chat response, /intelligence/permit_contracts cards
+
+
Layer 2 — pathway memory (system-level hot-swap, ADR-021)
+
+ Pathway memory remembers which approach worked, not just which worker. Every accepted scrum review writes a PathwayTrace with the full backtrack: file fingerprint, model used, signal class, KB chunks consulted, observer events, semantic flags, bug fingerprints. A new query that fingerprints to the same trace can hot-swap to the prior result without re-running the 9-rung escalation. The 5-factor hot-swap gate is strict: narrow fingerprint match AND audit consensus pass AND replay_count ≥ 3 (probation) AND success_rate ≥ 0.80 AND NOT retired AND vector cosine ≥ 0.90.
+
+
+ // Live pathway state (refresh page to recompute):
+ — traces · — successful replays · — reuse rate
+ // 88 / 11/11 / 100% as of 2026-04-27 — probation gate crossed
+
+
Code: crates/vectord/src/pathway_memory.rs · Endpoints: /vectors/pathway/insert · /query · /record_replay · /stats · /bug_fingerprints · Spec: docs/DECISIONS.md ADR-021 — Semantic-correctness matrix layer
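The gate can be read as a single conjunction over the conditions listed above. The sketch below is TypeScript for illustration only (the real implementation lives in the Rust crate), and the field names are hypothetical; only the thresholds come from the text.

```typescript
// Hypothetical view of one candidate pathway at query time.
interface PathwayCandidate {
  fingerprintMatch: boolean; // narrow fingerprint match
  auditConsensusPass: boolean;
  replayCount: number;
  successRate: number; // 0..1
  retired: boolean;
  cosine: number; // vector similarity to the new query
}

const MIN_REPLAYS = 3; // probation threshold
const MIN_SUCCESS_RATE = 0.8;
const MIN_COSINE = 0.9;

// Lists every condition that blocks a hot-swap; empty means the gate passes.
function hotSwapBlockers(c: PathwayCandidate): string[] {
  const blockers: string[] = [];
  if (!c.fingerprintMatch) blockers.push("fingerprint mismatch");
  if (!c.auditConsensusPass) blockers.push("audit consensus failed");
  if (c.replayCount < MIN_REPLAYS) blockers.push("still on probation");
  if (c.successRate < MIN_SUCCESS_RATE) blockers.push("success rate too low");
  if (c.retired) blockers.push("retired");
  if (c.cosine < MIN_COSINE) blockers.push("cosine too low");
  return blockers;
}

function canHotSwap(c: PathwayCandidate): boolean {
  return hotSwapBlockers(c).length === 0;
}
```

Returning the list of blockers rather than a bare boolean makes a refused hot-swap explainable in a trace, which fits the auditable framing of both memory layers.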
+
+
What both memory layers feed (besides search)
+
+ Both layers also feed the per-staffer hot-swap index (Chapter 5) and the Construction Activity Signal Engine (Chapter 6). One memory model, surfaced three different ways at the request boundary depending on who's asking.
+
Chapter 5
+
Per-staffer hot-swap — same corpus, different relevance gradient
+
Maria runs Chicago. Devon runs Indianapolis. Aisha runs Wisconsin/Michigan. They share one corpus, but the search results, the recurring-skill patterns, and the playbook context all reshape to whoever is acting. The same query, "forklift operators", returns 167 IL workers when Maria is acting, 89 IN workers for Devon, and 16 WI workers for Aisha. The MEMORY panel relabels itself with the active coordinator's name.
+
+
What scopes per staffer
+
+ // On every /intelligence/chat call:
+ if (b.staffer_id) {
+ const staffer = lookupStaffer(b.staffer_id);
+ // 1. Default state filter to staffer territory unless caller pinned one
+ if (!explicitState) filters.push(`state = '${staffer.territory.state}'`);
+ // 2. Default playbook-pattern geo to staffer's primary city/state
+ cityForPatterns = staffer.territory.cities[0];
+ stateForPatterns = staffer.territory.state;
+ // 3. Surface staffer.name back so the UI can relabel MEMORY → MARIA'S MEMORY
+ response.staffer = { id: staffer.id, name: staffer.name, territory: staffer.territory };
+ }
+
+
+ The corpus stays intact. The relevance gradient is per coordinator. As each accumulates fills, their slice of the playbook compounds independently. The architecture generalizes — every new metro adds territories, not code paths.
+
+
Code: mcp-server/index.ts STAFFERS roster + lookupStaffer() · /staffers endpoint · /intelligence/chat smart_search route · UI: staffer dropdown in mcp-server/search.html
+
+
+
+
Chapter 6
+
Construction Activity Signal Engine — the corpus is also a market signal
+
Every contractor in this corpus is also a forward indicator on the public equities they touch. Permits filed today predict construction starts ~45 days out, staffing ~30, revenue recognition months later. The associated-ticker network surfaces this signal before any 10-Q. The architecture is metro-agnostic — Chicago is Phase 1; NYC DOB, LA County, Houston BCD, Boston ISD ship as Socrata-shaped adapters.
+
+
Three flavors of attribution
+
+ // per contractor in /intelligence/profiler_index:
+ direct // contractor IS a public issuer → SEC tickers index match
+ parent // curated KNOWN_PARENT_MAP — Turner → HOC.DE via Hochtief AG
+ associated // co-permit network — Bob's Electric appears with TARGET CORPORATION
+ // 3+ times → inherits TGT as an associated indicator
+
+
+ The associated path is the moat. A staffing-permit dataset that maps contractor-to-public-issuer is not commercially available; we synthesize it from the Socrata co-occurrence graph. Every additional metro multiplies edges.
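The associated flavor can be sketched as a co-occurrence count. This is a minimal sketch assuming each permit exposes the list of companies named on it; the `Permit` shape, `associatedTickers`, and the exact counting rule are assumptions, with only the "3+ times" threshold taken from the text.

```typescript
interface Permit {
  parties: string[]; // every company named on one permit (assumed shape)
}

function associatedTickers(
  permits: Permit[],
  directTicker: Map<string, string>, // issuer name -> ticker (the "direct" flavor)
  minCoPermits = 3
): Map<string, string[]> {
  // "contractor\u0000ticker" -> number of permits the pair shares
  const pairCounts = new Map<string, number>();
  permits.forEach((permit) => {
    // Dedupe so one permit counts at most once per pair.
    const parties = permit.parties.filter((p, i, all) => all.indexOf(p) === i);
    parties.forEach((contractor) => {
      if (directTicker.has(contractor)) return; // already a direct match
      parties.forEach((other) => {
        const ticker = directTicker.get(other);
        if (ticker === undefined) return;
        const key = contractor + "\u0000" + ticker;
        pairCounts.set(key, (pairCounts.get(key) ?? 0) + 1);
      });
    });
  });
  const associated = new Map<string, string[]>();
  pairCounts.forEach((count, key) => {
    if (count < minCoPermits) return; // below the co-permit threshold
    const [contractor, ticker] = key.split("\u0000");
    associated.set(contractor, (associated.get(contractor) ?? []).concat(ticker));
  });
  return associated;
}
```

With three permits that each name both Bob's Electric and TARGET CORPORATION, and TARGET CORPORATION mapped to TGT, the result maps Bob's Electric to TGT; with only two such permits it stays empty.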
+
+
+
Building Activity Index (BAI)
+
+ // BAI = attribution-weighted average day-change across surfaced issuers:
+ BAI = Σ (day_change_pct × attribution_count) / Σ attribution_count
+
+ // Indexed build value = total $ of permits attributable to ANY public issuer
+ // Network depth = issuers / total attribution edges
+
+
+ Run BAI daily, save the series, and you've got a backtestable thesis in months. Today's surface is Chicago-only with ~9 issuers; the curve scales linearly with metros added — and the marginal cost of a new metro is one Socrata adapter.
+
+
Code: mcp-server/index.ts /intelligence/profiler_index + /intelligence/ticker_quotes · entity.ts lookupTickerLite() · fetchStooqQuote() · UI: /profiler · Data sources: SEC company_tickers.json (in-memory index) + Stooq CSV API + curated parent-link map
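A worked version of the BAI formula follows, assuming each surfaced issuer carries a day-change percentage and an attribution count. The `IssuerQuote` shape and the function name are hypothetical; the arithmetic is the weighted average stated above.

```typescript
interface IssuerQuote {
  ticker: string;
  dayChangePct: number; // e.g. 1.0 means +1.0% on the day
  attributionCount: number; // attribution edges pointing at this issuer
}

// BAI = sum(day_change_pct * attribution_count) / sum(attribution_count).
// Returns null when there are no attributions to weight by.
function buildingActivityIndex(issuers: IssuerQuote[]): number | null {
  let weighted = 0;
  let totalWeight = 0;
  issuers.forEach((q) => {
    weighted += q.dayChangePct * q.attributionCount;
    totalWeight += q.attributionCount;
  });
  return totalWeight === 0 ? null : weighted / totalWeight;
}

// Illustrative numbers only, not live quotes:
// (1.0 * 3 + (-2.0) * 1) / (3 + 1) = 0.25
const exampleBai = buildingActivityIndex([
  { ticker: "TGT", dayChangePct: 1.0, attributionCount: 3 },
  { ticker: "JPM", dayChangePct: -2.0, attributionCount: 1 },
]);
```

Weighting by attribution count means an issuer touched by many contractors moves the index more than one that appears on a single edge.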
+
+
+
+
Chapter 7
Key architectural choices — what was picked and why
Each choice is documented in docs/DECISIONS.md (Architecture Decision Records). If you dispute any of these, the ADR names the alternatives we rejected and the measurement that drove the call.
@@ -314,62 +386,95 @@ pre{background:#161b22;border:1px solid #171d27;border-radius:8px;padding:14px 1
ADR-020 · Idempotent register() with schema-fingerprint gate
Same (name, fingerprint) reuses manifest. Different fingerprint = 409 Conflict. Prevents silent duplicate manifests. Cleanup run collapsed 374 → 31 datasets.
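The register rule can be sketched as a small in-memory registry. This is illustrative TypeScript, not the catalogd implementation: `ManifestRegistry`, the outcome shape, and the manifest ID format are hypothetical. Only the behavior (same fingerprint reuses, different fingerprint is a 409 Conflict) comes from the ADR summary.

```typescript
type RegisterOutcome =
  | { kind: "created"; manifestId: string }
  | { kind: "reused"; manifestId: string }
  | { kind: "conflict"; httpStatus: 409; existingFingerprint: string };

class ManifestRegistry {
  private byName = new Map<string, { fingerprint: string; manifestId: string }>();
  private nextId = 1;

  register(name: string, fingerprint: string): RegisterOutcome {
    const existing = this.byName.get(name);
    if (!existing) {
      const manifestId = `manifest-${this.nextId++}`; // hypothetical ID format
      this.byName.set(name, { fingerprint, manifestId });
      return { kind: "created", manifestId };
    }
    // Same (name, fingerprint): hand back the existing manifest.
    if (existing.fingerprint === fingerprint) {
      return { kind: "reused", manifestId: existing.manifestId };
    }
    // Same name, different schema fingerprint: refuse instead of duplicating.
    return {
      kind: "conflict",
      httpStatus: 409,
      existingFingerprint: existing.fingerprint,
    };
  }
}
```

Because a repeated call with the same fingerprint is a no-op, an ingest job can retry registration safely without creating the duplicate manifests the cleanup run had to collapse.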
+
+
ADR-021 · Semantic-correctness matrix layer
Pathway memory carries semantic flags (UnitMismatch, TypeConfusion, OffByOne, StaleReference, DeadCode, BoundaryViolation, …) on every trace. New reviews see prior bug fingerprints as a preamble; recurrent classes get caught on first read. Compounds across files in the same crate.
+
Phase 19 design note · Statistical + semantic, not neural
Meta-index is cosine similarity + endorsement aggregation. No model training. Rebuildable from successful_playbooks alone. Neural re-ranker deferred to Phase 20+ only if statistical floor plateaus.
+
+
Distillation freeze · v1.0.0 at e7636f2
145 tests · 22/22 acceptance · 16/16 audit-full · bit-identical reproducibility. Multi-layer contamination firewall on SFT exports. Substrate the auditor + mode runner sit on; "the system regressed" questions bisect against this anchor.
+
-
Chapter 6
+
Chapter 8
Measured at scale, on this machine
-
Hardware: i9 + 128GB RAM + Nvidia A4000 16GB VRAM. Numbers below are from this running instance. Refresh the page and they'll recompute.
+
Hardware: i9 + 128GB RAM + Nvidia A4000 16GB VRAM + 2.5Gb symmetric network link. Numbers below are from this running instance. Refresh the page and they'll recompute.
-
Chapter 7
+
Chapter 9
Verify or dispute — reproduce it yourself
-
Every claim below is a curl away from falsification.
+
Every claim above is a curl away from falsification.
-
Health. Should return lakehouse ok.
-
curl http://localhost:3100/health
+
Gateway health. Returns provider matrix + worker count.
+
curl -s http://localhost:3100/v1/health | jq
Any SQL on multi-million-row Parquet. Sub-100ms typical.
curl -s -X POST http://localhost:3100/query/sql \
-H 'Content-Type: application/json' \
-d '{"sql":"SELECT role, COUNT(*) FROM workers_500k WHERE state = '\''IL'\'' GROUP BY role LIMIT 5"}'
-
Hybrid search with playbook boost. The whole Phase 19 feedback loop in one request.
+
Hybrid search with playbook boost. SQL filter + vector rerank + playbook memory in one call.
curl -s -X POST http://localhost:3100/vectors/hybrid \
-H 'Content-Type: application/json' \
-d '{"index_name":"workers_500k_v1",
"sql_filter":"role = '\''Forklift Operator'\'' AND city = '\''Chicago'\'' AND CAST(availability AS DOUBLE) > 0.5",
"question":"reliable forklift operator",
"top_k":5,"use_playbook_memory":true,"playbook_memory_k":200}'
-
Playbook memory stats. Count + endorsed names + sample.
-
curl http://localhost:3100/vectors/playbook_memory/stats
-
Pattern discovery. What do past similar fills have in common?
-
curl -s -X POST http://localhost:3100/vectors/playbook_memory/patterns \
+ Pathway memory stats. System-level hot-swap signal — should show 88 traces / 11 replays / 100% reuse rate (probation gate crossed).
+ curl -s http://localhost:3100/vectors/pathway/stats | jq
+ Per-staffer scoping. Same query, different rosters per coordinator.
+ for s in maria devon aisha; do
+ curl -s -X POST http://localhost:3700/intelligence/chat \
+ -H 'Content-Type: application/json' \
+ -d "{\"message\":\"forklift operators\",\"staffer_id\":\"$s\"}" \
+ | jq -r ".staffer.name + \": \" + (.sql_results | length | tostring) + \" workers, top: \" + (.sql_results[0].name + \" in \" + .sql_results[0].city + \", \" + .sql_results[0].state)"
+done
+# Maria: 167 workers, top: ... in Chicago, IL
+# Devon: 89 workers, top: ... in Fort Wayne, IN
+# Aisha: 16 workers, top: ... in Milwaukee, WI
+ Late-worker triage in one shot. Pulls profile + 5 backfills + drafts SMS. Should respond in under 300ms.
+ curl -s -X POST http://localhost:3700/intelligence/chat \
-H 'Content-Type: application/json' \
- -d '{"query":"Forklift Operator in Chicago, IL","top_k_playbooks":25,"min_trait_frequency":0.3}'
- Run the dual-agent scenario yourself. All 5 events, real fills, real artifacts.
+ -d '{"message":"Marcus running late site 4422"}' | jq
+
Construction Activity Signal Engine. Profiler index with attribution, cost, last filed.
+
curl -s -X POST http://localhost:3700/intelligence/profiler_index \
+ -H 'Content-Type: application/json' \
+ -d '{"limit":10}' \
+ | jq '.contractors[] | {name, permits, total_cost, direct: (.tickers.direct | map(.ticker)), associated: (.tickers.associated | map(.ticker + " ←via " + .partner_name))}'
+
Live ticker quotes. Batch Stooq pull for the basket.
+
curl -s -X POST http://localhost:3700/intelligence/ticker_quotes \
+ -H 'Content-Type: application/json' \
+ -d '{"tickers":["TGT","JPM","BALY","WBA","MCD"]}' | jq .quotes
+
Audit trail — read any verdict on PR #11. Independent claim-vs-diff verifier output.
+
ls /home/profit/lakehouse/data/_auditor/kimi_verdicts/
+# 11-c3c9c2174a91.json 11-ca7375ea2b17.json 11-2d9cb128bf42.json …
+jq '.findings[0:3]' /home/profit/lakehouse/data/_auditor/kimi_verdicts/11-c3c9c2174a91.json
+
Distillation acceptance gate. 22/22 invariants must pass for any commit that touches the substrate.
cd /home/profit/lakehouse
-bun run tests/multi-agent/scenario.ts
-# Output: tests/multi-agent/playbooks/scenario-<timestamp>/report.md
+bun test auditor/schemas/distillation/ tests/distillation/
+# Expect: 145 pass · 0 fail · 372 expect() calls
-
Chapter 8
+
Chapter 10
What we are not claiming
-
Every impressive-sounding number comes with a footnote. Here are the honest limits.
+
Every impressive-sounding number comes with a footnote. Here are the honest limits as of 2026-04-27.
-
workers_500k is synthetic.
Real client ATS export replaces this table. Schema is deliberately identical to a production ATS.
-
candidates table has 1,000 rows.
Intentionally small for demo. call_log references higher candidate_ids that don't cross-reference — this is a dataset alignment issue, not a pipeline issue.
-
Chicago permit data is real.
Pulled live from data.cityofchicago.org/resource/ydr8-5enu.json (Socrata API). Not synthetic. Not cached.
-
Playbook memory is seeded from demo runs.
The pipeline that seeds it is identical to what a live recruiter would trigger via /log. Same code path.
-
Local 7B models (mistral, qwen2.5) are imperfect.
They occasionally malform tool calls or drop fields. Multi-agent scenarios seal roughly 40-80% in one run. Larger models or constrained decoding would improve this. Not a substrate problem.
+
workers_500k is synthetic.
Real client ATS export replaces this table. Schema is deliberately identical to a production ATS so the swap is config, not code.
+
candidates table is light at 1,000 rows.
Intentionally small. Live PII-safe view layer is built; replacing the small table with a 100K+ ATS is a one-line config flip.
+
Chicago permit data is real.
Pulled live from data.cityofchicago.org/resource/ydr8-5enu.json (Socrata). Not synthetic. Not cached. Verifiable address-by-address.
+
Playbook memory is seeded from demo runs.
Same code path that seeds in production: every /log from the recruiter UI triggers seed → persist_sql. Demo seeds use the same shape as live operations.
+
Pathway memory probation gate is crossed.
88 traces, 11 replays, 11 successful, 100% reuse rate. Any pathway that fails to clear ≥0.80 success_rate after ≥3 replays gets retired automatically (sticky flag prevents oscillation).
+
SEC name-to-ticker fuzzy matcher has rare false positives.
For names with no clean SEC match, the matcher occasionally surfaces a small-cap that shares a keyword (FLG once attached to a PNC-adjacent contractor). The matcher is kept conservative: it requires at least 2 overlapping non-stopword tokens. For production trading use it can be tightened to require an explicit allow-list.
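The overlap rule can be sketched as below. This is an assumption-heavy illustration rather than the code behind lookupTickerLite(): the stopword list, tokenization, and function names are all hypothetical, and only the 2-token minimum comes from the text.

```typescript
// Hypothetical stopword list; the real matcher's list is not shown in the text.
const STOPWORDS = [
  "inc", "corp", "corporation", "co", "company", "llc", "ltd",
  "the", "of", "and", "group", "holdings",
];

// Lowercases, splits on non-alphanumerics, drops stopwords and duplicates.
function nameTokens(name: string): string[] {
  return name
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t, i, all) => t.length > 0 && STOPWORDS.indexOf(t) < 0 && all.indexOf(t) === i);
}

function tokenOverlap(a: string, b: string): number {
  const tokensB = nameTokens(b);
  return nameTokens(a).filter((t) => tokensB.indexOf(t) >= 0).length;
}

// A candidate match needs at least `minOverlap` shared non-stopword tokens.
function isPlausibleTickerMatch(contractor: string, issuer: string, minOverlap = 2): boolean {
  return tokenOverlap(contractor, issuer) >= minOverlap;
}
```

Under this rule "Turner Construction Company" matches "Turner Construction Co" on two tokens, while "PNC Builders LLC" and "PNC Financial Services Group" share only one and are rejected. Two generic shared keywords would still pass, which is the kind of residual false positive an allow-list would remove.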
+
12 public-data sources are still placeholders.
DOL Wage & Hour, EPA ECHO, MSHA, BBB, PACER, UCC liens, D&B, etc. — listed by name on every contractor profile with a one-line "would show:" sample. Not yet wired. Each ships as a Socrata-style adapter; engineering scope is concrete.
No rate/margin awareness yet.
Worker pay expectations vs contract bill rates are not modeled. Flagged as a Phase 20 item; no architectural blocker.
+
BAI is a thesis, not a backtested signal.
The Building Activity Index is computed live from current attribution + day-change. A backtestable thesis needs the daily series saved over months. The architectural support exists (data/_kb/audit_baselines.jsonl pattern); the series simply has not been running long enough.
+
Single-metro today.
Chicago via Socrata. NYC DOB, LA County, Houston BCD, Boston ISD, DC DCRA all use Socrata-equivalent APIs — adapters are config-only. Each new metro multiplies the network without multiplying the codebase.
@@ -394,8 +499,72 @@ function apiPost(path, body){
window.addEventListener('load',function(){
loadLiveSections();
+ loadPathwayLive();
+ loadSignalLive();
});
+// Pathway memory live counters in Chapter 4 — small inline spans.
+function loadPathwayLive(){
+ fetch(A+'/api/vectors/pathway/stats').then(function(r){return r.json()}).then(function(p){
+ if(!p) return;
+ var t=document.getElementById('pwm-traces');
+ var r=document.getElementById('pwm-replays');
+ var rate=document.getElementById('pwm-rate');
+ if(t) t.textContent = (p.total_pathways||0) + ' traces';
+ if(r) r.textContent = (p.successful_replays||0) + '/' + (p.total_replays||0);
+ if(rate) rate.textContent = Math.round((p.replay_success_rate||0)*100) + '%';
+ }).catch(function(){});
+}
+
+// Live tile under Chapter 1 — what the signal engine sees in this view.
+function loadSignalLive(){
+ apiPost('/intelligence/profiler_index',{limit:200}).then(function(d){
+ var host=document.getElementById('ch1-live');if(!host) return;
+ host.textContent='';
+ var rows=d.contractors||[];
+ if(!rows.length) return;
+ // Aggregate basket
+ var byTk={};
+ rows.forEach(function(r){
+ var ts=(r.tickers&&r.tickers.direct?r.tickers.direct:[]).concat(r.tickers&&r.tickers.associated?r.tickers.associated:[]);
+ ts.forEach(function(t){
+ if(!t||!t.ticker) return;
+ if(!byTk[t.ticker]) byTk[t.ticker]={kinds:[],count:0};
+ byTk[t.ticker].count++;
+ if(byTk[t.ticker].kinds.indexOf(t.via)<0) byTk[t.ticker].kinds.push(t.via);
+ });
+ });
+ var basket=Object.values(byTk);
+ var attribCost=rows.reduce(function(s,r){
+ var ts=(r.tickers&&r.tickers.direct?r.tickers.direct:[]).concat(r.tickers&&r.tickers.associated?r.tickers.associated:[]);
+ return s + (ts.length>0 ? (r.total_cost||0) : 0);
+ },0);
+ if(!basket.length) return;
+ var card=el('div','card accent-l');
+ var hdr=el('div',null,'LIVE — Construction Activity Signal Engine');
+ hdr.style.cssText='font-size:10px;color:#3fb950;text-transform:uppercase;letter-spacing:1.4px;font-weight:700;margin-bottom:8px';
+ card.appendChild(hdr);
+ var line=document.createElement('div');
+ line.style.cssText='display:flex;gap:24px;flex-wrap:wrap;font-size:13px';
+ function block(num,lab){
+ var b=document.createElement('div');
+ var n=document.createElement('div');n.style.cssText='font-size:18px;font-weight:700;color:#e6edf3;font-family:ui-monospace,monospace';n.textContent=num;
+ var l=document.createElement('div');l.style.cssText='font-size:10px;color:#545d68;text-transform:uppercase;letter-spacing:1.2px;font-weight:600';l.textContent=lab;
+ b.appendChild(n);b.appendChild(l);return b;
+ }
+ var bav = attribCost>=1e9?'$'+(attribCost/1e9).toFixed(2)+'B':attribCost>=1e6?'$'+(attribCost/1e6).toFixed(0)+'M':'$'+Math.round(attribCost/1e3)+'K';
+ line.appendChild(block(basket.length+'', 'Public issuers in scope'));
+ line.appendChild(block(bav, 'Attributed build value'));
+ line.appendChild(block(rows.length+'', 'Contractors indexed'));
+ line.appendChild(block(basket.reduce(function(s,b){return s+b.count},0)+'', 'Attribution edges'));
+ card.appendChild(line);
+ var note=el('div',null,'Computed live from /intelligence/profiler_index in '+(d.duration_ms||0)+'ms · click any of the chapter-9 curl lines to verify');
+ note.style.cssText='font-size:11px;color:#545d68;margin-top:10px;font-family:ui-monospace,monospace';
+ card.appendChild(note);
+ host.appendChild(card);
+ }).catch(function(){});
+}
+
function loadLiveSections(){
apiPost('/proof.json',{}).then(function(r){
var host1=document.getElementById('ch1-tests');host1.textContent='';