From 468798c9acef90ed74542abd06d75aa96eae46c2 Mon Sep 17 00:00:00 2001 From: root Date: Mon, 20 Apr 2026 17:55:42 -0500 Subject: [PATCH] =?UTF-8?q?/spec:=20technical=20specification=20=E2=80=94?= =?UTF-8?q?=2011-chapter=20README-equivalent?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit J's ask: explain the full architecture so someone reading a README can dispute it or recreate it. The repo isn't public yet; this page IS the spec until it is. Ch1 Repository layout — 13 crates + tests/multi-agent + docs + data, with owned responsibility and file path per crate. Ch2 Data ingest pipeline (8 steps) — sources (file/inbox/DB/cron), parse+normalize with ADR-010 conservative typing, PII auto-tag, dedup, Parquet write, catalog register with fingerprint gate, mark embeddings stale, queryable immediately. Ch3 Measurement & indexing — row count / fingerprint / owner / sensitivity / freshness / lineage per dataset. HNSW vs Lance tradeoff table with measured numbers (ADR-019). Autotune loop. Per-profile scoping (Phase 17). Ch4 Contract inference from external signal — Chicago permit feed → role mapping → worker count heuristic → timeline → hybrid search with boost → pattern discovery → rendered card. All pre-computed before staffer opens UI. Ch5 What a CRM can't do — 11-row comparison table of capabilities. Ch6 How it gets better over time — three paths: - Phase 19 playbook boost (full math) - Pattern discovery meta-index - Autotune agent Ch7 Scale story: 20 staffers, 300 contracts, midday +20/+1M surge - Async gateway + per-staffer profile isolation + client blacklists - 7-step surge handling flow (ingest, stale-mark, incremental refresh, degradation, hot-swap, autotune re-enter) - Known pain points: Ollama inference serial, RAM ceiling ~5M on HNSW (mitigated by Lance), VRAM 1-2 models sequential, playbook_memory unbounded. Ch8 Error surfaces & recovery — 10-row table covering ingest schema conflicts, bucket failures, ghost names, dual-agent drift, empty searches, Ollama down, gateway restart, schema fingerprint divergence. Every failure has a named surface and recovery path. Ch9 Per-staffer context — active profile, workspace, client blacklist, audit trail, daily summary. How 20 staffers don't see the same UI. Ch10 Day in the life — 07:00 housekeeping → 07:30 refresh → 08:00 staffer opens → 08:15 drill down → 08:30 Call click → 09:00 second staffer shares memory → 12:30 surge → 14:00 no-show → 15:00 new embeddings live → 17:00 retrospective → 22:00 overnight trials. Ch11 Known limits & non-goals — deferred (rate/margin, push, confidence calibration, neural re-ranker, pm compaction, call_log cross-ref) and explicitly out-of-scope (cloud, ACID, streaming, CRM replace, proprietary formats, hard multi-tenant). Also: nav updated on /dashboard, /console, /proof to link /spec. Every architectural claim in the spec cites either a code path, an ADR number, or a phase reference so someone skeptical can target the specific artifact. 
--- mcp-server/console.html | 1 + mcp-server/index.ts | 10 + mcp-server/proof.html | 1 + mcp-server/search.html | 3 +- mcp-server/spec.html | 414 ++++++++++++++++++++++++++++++++++++++++ 5 files changed, 428 insertions(+), 1 deletion(-) create mode 100644 mcp-server/spec.html diff --git a/mcp-server/console.html b/mcp-server/console.html index 5fc2eb8..2039283 100644 --- a/mcp-server/console.html +++ b/mcp-server/console.html @@ -96,6 +96,7 @@ details .body{padding-top:10px;font-size:12px;color:#8b949e} Dashboard Walkthrough Architecture + Spec
Reading live state…
diff --git a/mcp-server/index.ts b/mcp-server/index.ts index 65a73dc..f1c09aa 100644 --- a/mcp-server/index.ts +++ b/mcp-server/index.ts @@ -639,6 +639,16 @@ async function main() { }); } + // Spec — technical specification / README-equivalent document. + // Long-form architecture doc: folder layout, ingest pipeline, + // scale story, error surfaces, per-staffer context, a day in + // the life. Intended for a skeptical reader who needs to + // dispute or reproduce what the system claims to do. + if (url.pathname === "/spec") { + return new Response(Bun.file(import.meta.dir + "/spec.html"), { + headers: { ...cors, "Content-Type": "text/html" }, + }); + } // Proof JSON API (same data, no HTML) if (url.pathname === "/proof.json") { diff --git a/mcp-server/proof.html b/mcp-server/proof.html index c340d14..4b44168 100644 --- a/mcp-server/proof.html +++ b/mcp-server/proof.html @@ -82,6 +82,7 @@ pre{background:#161b22;border:1px solid #171d27;border-radius:8px;padding:14px 1 Dashboard Walkthrough Architecture + Spec
Running live tests…
diff --git a/mcp-server/search.html b/mcp-server/search.html index e228907..9578ed7 100644 --- a/mcp-server/search.html +++ b/mcp-server/search.html @@ -103,8 +103,9 @@ body{font-family:'Inter',-apple-system,system-ui,'Segoe UI',sans-serif;backgroun

Staffing Co-Pilot

Loading...
diff --git a/mcp-server/spec.html b/mcp-server/spec.html new file mode 100644 index 0000000..35a71b9 --- /dev/null +++ b/mcp-server/spec.html @@ -0,0 +1,414 @@ + + + +Lakehouse — Technical Specification + + + +
+

Lakehouse — Technical Specification

+ +
v1 · 2026-04-20
+
+ +
+ +
+ + +
+
Chapter 1
+

Repository layout

+
What lives where. Every folder below has a single, bounded responsibility. A maintainer reading this should know — in under ten minutes — which crate owns a failing behavior.
+ + + + + + + + + + + + + + + + + + + +
Path | Owns
crates/shared/ | Types, errors, Arrow helpers, schema fingerprints, PII detection, secrets provider. Every other crate depends on this.
crates/storaged/ | Raw object I/O. BucketRegistry (multi-bucket, rescue-aware), AppendLog (write-once batched append), ErrorJournal (bucket op failures). ADR-017 (federation), ADR-018 (append pattern).
crates/catalogd/ | Metadata authority. Dataset manifests, schema fingerprints (ADR-020), tombstones (soft delete), AI-safe views, model profiles (Phase 17). In-memory index persisted as Parquet on storage.
crates/queryd/ | SQL engine. DataFusion over Parquet + MemTable cache + delta merge-on-read + compaction. Registers every bucket as an object_store so SQL can join across them.
crates/ingestd/ | Data on-ramp. CSV / JSON / PDF (+OCR via Tesseract) / Postgres streaming / MySQL streaming / inbox watcher / cron schedules. Every ingest path auto-tags PII (emails, phones, SSNs, addresses), records lineage, and marks embeddings stale.
crates/vectord/ | The vector + learning surface. Embeddings stored as Parquet (ADR-008), HNSW index (Phase 15), trial system (autotune), promotion registry (Phase 16), playbook_memory (Phase 19). Core feedback loop lives here.
crates/vectord-lance/ | Firewall crate. Lance 4.0 + Arrow 57, isolated from the main Arrow-55 workspace. Provides secondary vector backend for large-scale, random-access, and append-heavy workloads (ADR-019).
crates/journald/ | Append-only mutation event log (ADR-012). Every insert/update/delete writes here — who, when, what, old/new value. Never mutated. Foundation for time-travel + compliance audit.
crates/aibridge/ | Rust ↔ Python sidecar. HTTP client over FastAPI wrapper around Ollama. VRAM introspection via nvidia-smi. All LLM calls (embed, generate, rerank) flow through here.
crates/gateway/ | Axum HTTP (:3100) + gRPC (:3101). Auth middleware, tools registry (Phase 12 — governed actions), CORS. Every external request enters here.
crates/ui/ | Dioxus WASM developer UI. Internal tool. Not exposed externally.
mcp-server/ | Bun/TypeScript recruiter-facing app. Serves devop.live/lakehouse. Routes: /search /match /log /log_failure /clients/:c/blacklist /intelligence/*. Proxies to the Rust gateway for heavy work.
tests/multi-agent/ | Dual-agent scenario harness. agent.ts (prompts + protocol), orchestrator.ts (single task), scenario.ts (5-event warehouse week), run_e2e_rated.ts (parallel pairs + rating), chain_of_custody.ts (layer-by-layer audit).
docs/ | PRD.md, PHASES.md, DECISIONS.md (20 ADRs). Every significant architectural choice has an ADR with the alternatives that were rejected and why.
data/ | Default local object store. Parquet files per dataset, append-log batches, HNSW trial journals, promotion registries, playbook_memory state.json, catalog manifests. Rebuildable from repo + this dir alone.
+
+ + +
+
Chapter 2
+

Data ingest pipeline

+
How staffing data gets into the system — whether from a CSV drop, an ATS export, a Postgres replica, or a PDF resume. Every path ends at the same place: a registered dataset with known schema, known lineage, known sensitivity.
+ +
1
Source arrives. Four shapes: (a) file upload via POST /ingest/file, (b) inbox watcher (drops in ./inbox/ → auto-ingested in under 15s), (c) Postgres or MySQL streaming connector (POST /ingest/db with DSN), (d) scheduled ingest via ingestd::schedule with cron.
+ +
2
Parse + normalize. CSV parser infers types per column; defaults to String on ambiguity (ADR-010 — better to ingest everything than reject on type mismatch). JSON parser flattens nested objects. PDF extractor uses lopdf first; falls back to Tesseract OCR for scanned/image PDFs. Output is always an Arrow RecordBatch.
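A minimal sketch of the ADR-010 conservative-typing rule, in TypeScript for illustration only; the real inference lives in crates/ingestd/src/csv.rs, and the Arrow-style type names below are assumptions.

// Conservative type inference: only claim a narrow type when every sample agrees.
type ColumnType = "Int64" | "Float64" | "Boolean" | "Utf8";

function inferColumnType(samples: string[]): ColumnType {
  const nonEmpty = samples.filter((s) => s.trim() !== "");
  if (nonEmpty.length === 0) return "Utf8";                          // nothing to go on: default to String
  if (nonEmpty.every((s) => /^-?\d+$/.test(s))) return "Int64";
  if (nonEmpty.every((s) => /^-?\d+(\.\d+)?$/.test(s))) return "Float64";
  if (nonEmpty.every((s) => /^(true|false)$/i.test(s))) return "Boolean";
  return "Utf8";                                                     // ambiguity defaults to String: ingest everything, never reject
}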
+ +
3
Auto-detect PII. shared::pii scans column values and names. Identifies emails, phone numbers, SSNs, salaries, street addresses, medical terms. Tags columns with sensitivity: PII | PHI | Financial | Internal | Public (Phase 10 catalog v2).
+ +
4
Deduplicate by content hash. Every uploaded file's SHA-256 is checked against the catalog's seen-hash log. Re-ingesting the same file is a no-op (ADR invariant #5).
+ +
5
Write Parquet to object storage. arrow_helpers::record_batch_to_parquet → storaged::ops::put → file lands under data/datasets/<name>.parquet (or bucket-scoped via BucketRegistry). Schema fingerprint computed.
+ +
6
Register in catalog. catalogd::Registry::register(name, fingerprint, objects) — idempotent on (name, fingerprint). Same name + same fingerprint = reuse manifest, bump updated_at. Same name + different fingerprint = 409 Conflict (ADR-020 — prevents silent schema drift). New name = create new manifest with owner, lineage, freshness SLA, column metadata, PII tags.
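The register decision in step 6 can be restated as a small truth table. This TypeScript sketch is illustrative, not the catalogd API; only the three outcomes (reuse, 409, create) come from the spec.

// ADR-020 idempotent register: keyed on (name, fingerprint).
type RegisterOutcome = "created" | "reused" | "conflict_409";

function registerDecision(
  existing: { fingerprint: string } | undefined,
  incomingFingerprint: string,
): RegisterOutcome {
  if (!existing) return "created";                                    // new name: create manifest
  if (existing.fingerprint === incomingFingerprint) return "reused";  // same schema: bump updated_at
  return "conflict_409";                                              // silent schema drift is refused
}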
+ +
7
Mark embeddings stale. If the dataset already has a vector index, the new rows mean that index is now behind. Registry::mark_embeddings_stale flips a flag; POST /vectors/refresh/<dataset> runs an incremental re-embed (only new rows, not the whole corpus).
+ +
8
Queryable immediately. queryd::context picks up the new manifest on next query. Hot-cache warms on first hit. Delta merge-on-read means updates land without rewriting the base Parquet.
+ +
Code: crates/ingestd/src/{service.rs, csv.rs, json.rs, pdf.rs, pg_stream.rs, my_stream.rs, schedule.rs}
+
+ + +
+
Chapter 3
+

Measurement & indexing

+
Once data is in, the system describes it rigorously and builds fast-access indexes over the parts that will be queried. Every measurement is deterministic, versioned, and visible via HTTP.
+ +

What gets measured per dataset

+
    +
  • Row count (from parquet footer, not a SELECT COUNT). O(1).
  • +
  • Schema fingerprint — SHA-256 over (column_name, type, nullability, sort) tuples. Drives ADR-020 idempotent register (a sketch follows this list).
  • +
  • Owner / sensitivity / freshness SLA — catalog v2 metadata. PII auto-detected; owner assigned on ingest.
  • +
  • Lineage — source_system → ingest_job → dataset. Who put this here, when, from what.
  • +
  • Last embedded at — when the vector index covering this dataset was last refreshed. Drives stale-detection.
  • +
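A hypothetical restatement of the fingerprint recipe above. The canonical implementation is in crates/shared; the field and function names here are assumptions.

// SHA-256 over canonicalized (column_name, type, nullability, sort) tuples.
import { createHash } from "node:crypto";

interface ColumnMeta { name: string; dtype: string; nullable: boolean; sort: number }

function schemaFingerprint(columns: ColumnMeta[]): string {
  const canonical = columns
    .map((c) => [c.name, c.dtype, String(c.nullable), String(c.sort)].join("\u0000"))
    .join("\n");
  return createHash("sha256").update(canonical).digest("hex");
}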
+ +

How vector indexes are built

+

Two backends, chosen per profile (ADR-019):

+ + + + + + + + + + + +
HNSW over Parquet (primary) | Lance (secondary)
Storage | Embeddings as Parquet columns (doc_id, chunk_text, vector) | Lance native dataset
Index | HNSW in RAM, serialized sidecar | IVF_PQ on disk
Build time (100K × 768d) | ~230s | ~16s (14× faster)
Search p50 (100K) | ~873μs | ~7.4ms at recall 1.0
Append | Rewrite required | Structural (0.08s for 100 rows)
Random fetch by doc_id | Full scan | ~311μs (112× faster)
RAM ceiling | ~5M vectors | Scales past RAM — disk-resident
+ +

Autotune

+

The vectord::agent background task runs continuously. Per index, it proposes HNSW configurations (ef_construction × ef_search), executes a trial against a stored eval set, journals the result as JSONL, and — if recall beats the min_recall gate (0.9) and latency wins the Pareto test — promotes the new config atomically via promotion_registry. No downtime. Rollback in milliseconds.
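An illustrative sketch of that promotion gate (the min_recall 0.9 floor plus a Pareto latency test). The real loop is in crates/vectord/src/{autotune.rs, agent.rs}; the exact comparison below is an assumption about what "wins the Pareto test" means.

// Candidate must clear the recall gate, then Pareto-beat the active config.
interface TrialResult { efConstruction: number; efSearch: number; recall: number; p50Micros: number }

function shouldPromote(candidate: TrialResult, active: TrialResult, minRecall = 0.9): boolean {
  if (candidate.recall < minRecall) return false;                 // recall gate
  const noWorse =
    candidate.recall >= active.recall && candidate.p50Micros <= active.p50Micros;
  const strictlyBetter =
    candidate.recall > active.recall || candidate.p50Micros < active.p50Micros;
  return noWorse && strictlyBetter;                               // Pareto improvement over the active config
}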

+ +

Per-profile / per-staffer indexing

+

Model profiles (Phase 17) are not routing strings — they are named scopes. Each profile has bound_datasets[], hnsw_config, vector_backend, and bucket. When a staffer activates a profile:

+
    +
  • EmbeddingCache warms for bound indexes only
  • +
  • HNSW is rebuilt with the profile's config (if different from current)
  • +
  • Search via POST /vectors/profile/<id>/search rejects out-of-scope queries with 403 + list of allowed bindings (call shape sketched below this list)
  • +
  • Ollama swaps to the profile's model via keep_alive=0; only one model in VRAM at a time
  • +
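A hypothetical call shape for the profile-scoped search above. The endpoint path and the 403 behavior are from this spec; the profile id, base URL, and request-body fields are assumptions.

// Profile-scoped search; out-of-scope datasets come back as 403 with the allowed bindings.
const res = await fetch("http://localhost:3100/vectors/profile/staffing-recruiter/search", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ query: "certified welder near Toledo", k: 5 }),
});
if (res.status === 403) {
  console.error("rejected; allowed bindings:", await res.json());
} else {
  console.log(await res.json());
}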
+
Code: crates/vectord/src/{hnsw.rs, autotune.rs, agent.rs, promotion.rs} · ADR-019
+
+ + +
+
Chapter 4
+

Contract inference from external signal

+
Most CRMs wait for a contract to land. This system watches upstream demand and pre-builds the ranking before the contract arrives.
+ +

The concrete example running on devop.live/lakehouse is Chicago Department of Buildings permit data (public Socrata API). Every permit is a signal that construction — and therefore staffing — is coming.

+ +

Flow

+
1
Fetch. /intelligence/market and /intelligence/permit_contracts hit data.cityofchicago.org/resource/ydr8-5enu.json live. No caching of permit data — every page load is fresh.
+
2
Map work_type → role. Industry dictionary: "Electrical Work" → "Electrician", "Masonry Work" → "Production Worker", "Mechanical Work" → "Maintenance Tech", etc.
+
3
Derive worker count. Heuristic: ~1 worker per $150K of permit cost, capped 2-8 per contract for staffing realism. Operator-configurable when real client history is available.
+
4
Derive timeline. Permit issued → construction starts ~45 days later → staffing window opens ~14 days before construction. Classifies each permit as overdue, urgent, soon, scheduled.
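Steps 3 and 4 restated as code, for a reader who wants to dispute the numbers. The $150K-per-worker ratio, the 2-8 cap, the 45-day and 14-day offsets, and the four urgency labels are from this spec; the day cutoffs separating urgent / soon / scheduled are assumptions.

const DAY_MS = 24 * 60 * 60 * 1000;

// ~1 worker per $150K of permit cost, clamped to 2-8 per contract.
function workerCount(permitCostUsd: number): number {
  return Math.min(8, Math.max(2, Math.round(permitCostUsd / 150_000)));
}

// Permit issued, construction ~45 days later, staffing window opens ~14 days before that.
function staffingUrgency(issueDate: Date, now = new Date()): "overdue" | "urgent" | "soon" | "scheduled" {
  const constructionStart = issueDate.getTime() + 45 * DAY_MS;
  const windowOpens = constructionStart - 14 * DAY_MS;
  const daysUntilWindow = (windowOpens - now.getTime()) / DAY_MS;
  if (daysUntilWindow < 0) return "overdue";
  if (daysUntilWindow <= 7) return "urgent";    // 7- and 21-day cutoffs are assumptions, not from the spec
  if (daysUntilWindow <= 21) return "soon";
  return "scheduled";
}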
+
5
Run hybrid search against the bench. For each derived contract, POST /vectors/hybrid with sql_filter on role+state+city+availability, use_playbook_memory: true, playbook_memory_k: 200. Returns top-5 candidates with boost + citations.
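Call shape for step 5. The /vectors/hybrid endpoint and the sql_filter, use_playbook_memory, and playbook_memory_k fields are from this spec; the base URL and the remaining field names are assumptions.

// One hybrid query per derived contract: SQL filter narrows the bench, vectors rank it,
// playbook_memory boosts past endorsed workers.
const hybrid = await fetch("http://localhost:3100/vectors/hybrid", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    dataset: "workers_500k",
    query: "journeyman electrician, commercial high-rise",
    sql_filter: "role = 'Electrician' AND state = 'IL' AND city = 'Chicago' AND available = true",
    use_playbook_memory: true,
    playbook_memory_k: 200,
    k: 5,
  }),
});
const ranked = await hybrid.json(); // top-5 candidates with boost + citations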
+
6
Query the meta-index. POST /vectors/playbook_memory/patterns aggregates traits across similar past playbooks — recurring certs, skills, archetype, reliability distribution. Surfaces signal the operator didn't query for.
+
7
Render on the dashboard. Each card shows permit + derived contract + top 3 candidates with memory chips + discovered pattern + urgency. All of this pre-computed before any staffer opens the UI.
+ +

Coverage forecast

+

/intelligence/staffing_forecast aggregates the last 30 days of permits into predicted role-level demand, joins against the IL bench supply, computes coverage %, and classifies each role as critical / tight / watch / ok. The dashboard's top panel renders this — staffers see supply gaps before they query.
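A sketch of the coverage classification, assuming coverage is bench supply divided by predicted demand; the critical / tight / watch / ok labels are from this spec, the percentage thresholds are not.

// Per-role coverage status for the forecast panel.
function coverageStatus(predictedDemand: number, benchSupply: number): "critical" | "tight" | "watch" | "ok" {
  const coverage = predictedDemand === 0 ? 1 : benchSupply / predictedDemand;
  if (coverage < 0.5) return "critical";
  if (coverage < 0.8) return "tight";
  if (coverage < 1.0) return "watch";
  return "ok";
}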

+
+ + +
+
Chapter 5
+

What a CRM can't do (and why)

+
A CRM stores. This system infers, predicts, re-ranks, and compounds. The capabilities below are load-bearing — missing any of them is the gap between "software that logs calls" and "software that makes the next call better."
+ +
+ + + + + + + + + + + + + + + +
Capability | CRM | This system
Store candidate records | Yes | Yes (workers_500k, candidates)
Search by structured field | Yes | Yes (DataFusion SQL, sub-100ms on 3M rows)
Search by semantic meaning | No | Yes (HNSW + nomic-embed-text)
Combine SQL filter + semantic rank | No | Yes (/vectors/hybrid)
Boost workers based on past success | No | Yes (Phase 19 playbook_memory)
Penalize workers based on past failure | No | Yes (/log_failure + 0.5^n penalty)
Surface traits across past fills | No | Yes (/vectors/playbook_memory/patterns)
Predict staffing demand from external data | No | Yes (Chicago permit feed + 30-day rolling forecast)
Count down to staffing deadline per contract | No | Yes (permit issue_date + heuristic timeline)
Explain why each candidate ranked | No | Yes (boost chip + narrative citations + memory pattern)
Improve ranking from operator actions | No | Yes (every Call/SMS/No-show click → re-rank signal)
+
+
+ + +
+
Chapter 6
+

How it gets better over time

+
Compounding learning in three paths — all three happen automatically, no operator intervention required.
+ +

Path 1 — Playbook boost (Phase 19)

+

Every sealed fill is seeded to playbook_memory via /vectors/playbook_memory/seed. The next hybrid query for a semantically similar role+geo surfaces the past endorsed workers with a boost. Math:

+
per_worker = cosine(query_emb, playbook_emb) × 0.5 × e^(-age/30) × 0.5^failures / n_workers
+boost[(city, state, name)] = min(Σ per_worker, 0.25)
+

Caps, decay, and negative signal mean one popular worker can't dominate, old playbooks fade, and no-shows stop boosting. Verified live: 3 identical seeds → +0.250 boost capped, 3 citations.
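The same math as executable TypeScript, including a check against the live verification above (three identical fresh seeds cap at +0.250). The formula terms are from the spec; the data shapes are illustrative.

// Phase 19 boost: cosine similarity, 0.5 scale, 30-day exponential decay,
// 0.5^failures penalty, split across the playbook's workers, capped at 0.25.
interface PlaybookHit { cosine: number; ageDays: number; failures: number; nWorkers: number }

function workerBoost(hits: PlaybookHit[]): number {
  const sum = hits.reduce(
    (acc, h) =>
      acc + (h.cosine * 0.5 * Math.exp(-h.ageDays / 30) * Math.pow(0.5, h.failures)) / h.nWorkers,
    0,
  );
  return Math.min(sum, 0.25); // cap so one popular worker can't dominate
}

// Sanity check: three identical fresh seeds, one worker each, no failures
// gives 3 × 0.5 = 1.5, capped at +0.250 as observed live.
const seed: PlaybookHit = { cosine: 1, ageDays: 0, failures: 0, nWorkers: 1 };
console.log(workerBoost([seed, seed, seed])); // 0.25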

+ +

Path 2 — Pattern discovery (meta-index)

+

/vectors/playbook_memory/patterns goes beyond "who was endorsed" to answer "what did past similar fills have in common?" Aggregates recurring certifications, skills, archetype, reliability distribution across the top-K semantically similar playbooks. Surfaces signal the operator didn't explicitly query for.

+ +

Path 3 — Autotune agent

+

The vectord::agent background task runs continuously. Watches the HNSW trial journal, proposes configs, executes trials, promotes Pareto winners — without human intervention. Operator sees "the index got faster overnight" and doesn't know why. The journal knows why.

+
+ + +
+
Chapter 7
+

Scale story — 20 staffers, 300 contracts, a surge

+
What happens when the demo-level load becomes the production-level load, and midday a client pushes 20 more contracts plus a 1M-row ATS delta. Honest: some of this is architectural headroom, not measured scale. The designed behaviors are below.
+ +

20 concurrent staffers

+

Axum is async. The gateway handles concurrent requests on Tokio with work-stealing. No per-request thread. Tested at 10 parallel queries in 82ms total on this hardware.

+

Per-staffer profile isolation. Each staffer activates their own profile (Phase 17) or workspace (Phase 8.5). Profile scopes their search to bound datasets. Workspace carries their in-progress contracts across sessions.

+

Per-client blacklists. Auto-applied when the caller passes client: "X" on /search. Staffer A filling for Acme never sees Acme's flagged workers. Staffer B filling for MidState sees them normally.

+ +

300 active contracts

+

SQL on job_orders is cheap. 300 rows is nothing — a scan is microseconds.

+

Workspace per contract. Each contract gets its own workspace with saved searches, shortlists, activity log. Zero-copy handoff between staffers (pointer swap, not data copy).

+

Forecast remains coherent. /intelligence/staffing_forecast aggregates 30-day permit data regardless of contract count. The bench supply query (GROUP BY role over workers_500k) is a single sub-second SQL.

+ +

Midday surge: +20 contracts, +1M profiles

+

The delta arrives at 12:30. Here's what happens in the following minutes:

+
1
+20 contracts via /ingest/db or /ingest/file. Parsed, schema-checked, Parquet-written, catalog-registered. No queries blocked — register holds a write lock across the manifest write only.
+
2
+1M worker profiles arrive as a delta to workers_500k. The append-log pattern (ADR-018) means the new rows write to a fresh batch file — the base Parquet is NOT rewritten. Queries against workers_500k immediately merge-on-read the new batches.
+
3
Embeddings marked stale. The vector index for workers_500k_v1 now has 1M rows it hasn't seen. mark_embeddings_stale flips the flag.
+
4
Incremental refresh fires. POST /vectors/refresh/workers_500k reads only the new rows (diff against existing embeddings), embeds them in batches of 64 via Ollama, writes delta embedding Parquet. Measured on threat_intel: 34 new rows in 970ms (6× faster than full re-embed).
+
5
Search degrades gracefully. During the refresh, searches against workers_500k_v1 still work — they serve from the old embeddings. Brute-force cosine over new-rows-without-embeddings is allowed but costs more. HNSW rebuild happens after all embeds complete.
+
6
Hot-swap promotion. When the new index is ready, promotion_registry atomically flips the active pointer. Next search hits the new config. Rollback stays available.
+
7
Autotune re-enters the loop. The agent queue picks up a DatasetAppended trigger and schedules a fresh HNSW trial cycle against the expanded index.
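The refresh in step 4 fires automatically, but it can also be invoked directly. The endpoint is from Chapter 2; the base URL is an assumption.

// Kick an incremental re-embed of the expanded dataset; the old index keeps serving
// until the hot-swap in step 6 flips the active pointer.
const refresh = await fetch("http://localhost:3100/vectors/refresh/workers_500k", { method: "POST" });
console.log(refresh.status, await refresh.text());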
+ +

Known pain points at this scale

+
    +
  • Ollama inference is serial. Embedding 1M rows at ~50 chunks/sec through nomic-embed-text = ~6 hours. Acceptable for overnight refresh, not for "immediate." Mitigated by incremental refresh (only deltas).
  • +
  • RAM ceiling on HNSW. Around 5M vectors × 768d, HNSW stops fitting in 128GB comfortably. Mitigation: per-profile vector_backend: lance flip — disk-resident IVF_PQ scales past the RAM line (ADR-019).
  • +
  • VRAM ceiling for model variety. A4000 16GB holds 1-2 loaded models. Multi-model recruiter surfaces are a sequential swap, not parallel (Ollama keep_alive=0). Phase 17 profile activation unloads the prior model on swap.
  • +
  • playbook_memory growth. Currently unbounded. 391 entries today; at this rate, ~5K in six months. The default k=200 is still sub-ms at 5K. Compaction policy (TTL + decay + merge) is deferred.
  • +
+
+ + +
+
Chapter 8
+

Error surfaces & recovery

+
Every failure mode has a named surface, a structured response, and a recovery path. No silent failures.
+ + + + + + + + + + + + + + + +
Failure mode | Surface / response | Recovery
Ingest receives file with schema mismatch vs existing dataset | 409 Conflict with both fingerprints named (ADR-020) | Re-ingest under a new name, or migrate the existing via Phase 14 schema evolution
Bucket unreachable on write | Hard 503, error journaled to primary://_errors/bucket_errors/ | GET /storage/errors lists failures; GET /storage/bucket-health shows per-bucket status
Bucket unreachable on read | Rescue bucket fallback, X-Lakehouse-Rescue-Used: true header on response | Response still succeeds; operator sees rescue flag
/log receives name that doesn't exist in workers_500k | Seed is SKIPPED; response includes rejected_ghost_names: [...] and a note | Operator sees exactly which names were rejected and why
Dual-agent executor malforms tool call | Result appended to log with error field; counter increments | After 3 consecutive: abort with full log dump at tests/multi-agent/playbooks/<id>-FAILED.json
Dual-agent drifts from target | Reviewer verdict = drift, counter increments | After 3 consecutive drifts: abort with full log
Hybrid search finds zero candidates | Returns empty sources[] + sql_matches: 0 | Gap signal captured by scenario runner; operator prompted to broaden filter
Ollama sidecar down | 502 Bad Gateway from aibridge; embed calls fail fast | Restart: systemctl restart lakehouse-sidecar; vector search falls back to pre-computed embeddings
Gateway restart mid-operation | In-memory state (playbook_memory, HNSW) reloaded from persisted state.json / trial journals | Zero data loss; catalog, storage, journals are all source-of-truth
Schema fingerprint diverges across manifests | catalog::dedupe reports DedupeReport with winner selection (non-null row_count first, then newest updated_at) | POST /catalog/dedupe collapses duplicates idempotently
+
+ + +
+
Chapter 9
+

Per-staffer context

+
Twenty staffers don't see the same UI state. Each one's session is shaped by their active profile, their workspaces, their assigned contracts, and their client's blacklists.
+ +

Active profile (Phase 17)

+

Scopes every search. A staffing-recruiter profile bound to workers_500k sees only that dataset. A security-analyst profile bound to threat_intel cannot see worker data. GET /vectors/profile/<id>/audit records every tool invocation by model identity.

+ +

Workspace (Phase 8.5)

+

Per-contract state. Each workspace has daily/weekly/monthly tiers, saved searches, shortlists, activity logs. Survives across sessions. Instant zero-copy handoff between staffers — pointer swap, not data copy. Persisted to object storage, rebuilt on startup.

+ +

Client blacklist

+

Per-client worker exclusion. Populated via POST /clients/:client/blacklist. Auto-applied when the caller passes client: "X" on /search. JSON-backed; would move to catalog table under real client load.
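Hypothetical request shapes for the two calls named above. The routes are from this spec; the body fields and base URL are assumptions.

// Flag a worker for one client…
await fetch("http://localhost:3100/clients/acme/blacklist", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ worker: "Dave Smith", reason: "no-show" }),
});

// …and any later search on behalf of that client auto-applies the exclusion.
await fetch("http://localhost:3100/search", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ query: "forklift operator, Chicago", client: "acme" }),
});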

+ +

Audit trail

+

Phase 12 tool registry logs every governed-action invocation (who called what, with what args, when, outcome). GET /tools/audit queryable. Phase 13 access control layers on top — role-based field masking, query audit log.

+ +

Daily summary per staffer

+

Workspace activity log + per-staffer filter on the event journal gives "what did Sarah do today" as a direct query. The foundation for shift-handoff reports.

+
+ + +
+
Chapter 10
+

A day in the life — from morning brief to EOD retrospective

+
Concrete operator timeline. Every step touches a real endpoint that exists today.
+ +
07:00
Overnight housekeeping. Scheduled ingest runs — the configured cron picks up the client's latest ATS CSV delta, runs it through the pipeline in Ch2, marks workers_500k embeddings stale. Autotune agent promotes any Pareto-winner HNSW configs from overnight trials.
+ +
07:30
Embedding refresh. Background job re-embeds the new rows. Old index keeps serving. Hot-swap promotes when done.
+ +
08:00
Sarah (staffer) opens devop.live/lakehouse. Page loads in ~3s. Forecast panel shows: "$275M construction coming, 4 tight roles this week." Live Contracts section shows 6 Chicago permits with proposed fills + boost chips + pattern signals.
+ +
08:15
Sarah drills into a $5M permit. Top candidate card: Carmen Green, Endorsed · 3 playbooks chip, boost +0.166, pattern line reads "leader archetype · 47% OSHA-10." Sarah hovers the chip — narrative tooltip: "filled Welder x2 in Toledo (2026-04-15), Welder x1 in Toledo (2026-04-18)."
+ +
08:30
Sarah calls Carmen. Clicks Call button → /log fires → playbook_memory.seed → persist_sql → successful_playbooks_live grows by one. Button flashes "Logged" for 1.4s. No modal, no form, no second click.
+ +
09:00
Kim (another staffer) opens the same UI. Her profile loads. Her workspaces show her own contracts. She searches "reliable forklift Chicago" — MEMORY chip shows the pattern discovered across Sarah's morning work AND prior fills. Carmen, already logged by Sarah, shows up with an updated citation count.
+ +
12:30
Client pushes 20 new contracts + 1M ATS delta. Ch7 scale flow fires. Ingest in seconds; embedding refresh kicks off as a background job. Searches continue against old embeddings.
+ +
14:00
Emergency: worker Dave no-showed. Sarah clicks the No-show button on Dave's card → /log_failure → mark_failed records a penalty. The next similar query dampens Dave's boost by a factor of 0.5 (the 0.5^failures term). Sarah continues the refill — the refill excludes Dave and the 2 others already booked for this shift.
+ +
15:00
New embeddings live. Hot-swap promotion. Searches now see all 1M new profiles. Re-running Sarah's noon query would now produce a different top-5.
+ +
17:00
End-of-day retrospective. Any staffer who ran tests/multi-agent/scenario.ts gets report.md auto-generated. Workspace activity logs aggregate per staffer. GET /vectors/playbook_memory/stats shows the day's new entries.
+ +
22:00
Overnight trial cycle. Autotune agent continues in the background. Trial journal grows. Tomorrow morning, the system is measurably better at something it got asked about today.
+ +

SMS + email drafts in the pipeline

+

After each sealed fill (via scenario.ts or manual /log flow with downstream hooks), generateArtifacts in the scenario runner produces: (a) one SMS per worker (TO: Name, message under 180 chars), (b) one client confirmation email. Drafts are saved to sms.md and emails.md under the scenario output dir. Ollama drafts them; the staffer reviews and sends. No auto-send; human-in-the-loop.

+
+ + +
+
Chapter 11
+

Known limits & non-goals

+
Honesty is a feature. Everything below is either deferred or explicitly out of scope.
+ +

Deferred — real architectural work, just not shipped yet

+
    +
  • Rate / margin awareness. Worker pay expectations vs contract bill rate not modeled. Requires adding pay_rate to workers, bill_rate to contracts, and a filter + warning path. Phase 20 item.
  • +
  • Push / background presence. The app requires being opened. No Slack/email/SMS push when a contract lands with a pre-ranked candidate list. Would make the "system is already thinking" claim more visible to phone-first shops.
  • +
  • Confidence calibration. Top-K is a rank, not a probability. No calibrated "85% likely to accept" score. Requires outcome-labeled training data.
  • +
  • Neural re-ranker. Phase 19 is statistical + semantic. A (query, candidate, outcome)-trained re-ranker is deferred to Phase 20+ per ADR, and is only built if the statistical floor plateaus below usable recall.
  • +
  • playbook_memory compaction. No TTL or merge policy. Entries accumulate. At expected rate this hits 10K in a year — still tractable but warrants a policy.
  • +
  • call_log cross-reference. Infrastructure is present; the current synthetic candidates table is too small to cross-reference against. Resolves when a real ATS dataset lands.
  • +
+ +

Non-goals — explicitly out of scope

+
    +
  • Cloud deployment. Local-first by design. Works offline after setup.
  • +
  • Full ACID transactions. Single-writer model is sufficient; Delta Lake-grade MVCC is deliberately not attempted.
  • +
  • Real-time streaming / CDC. Batch ingest is the model. Scheduled refresh, not transactional replication.
  • +
  • Replacing the CRM. This is the analytical + AI layer behind the CRM. Operational CRUD stays with the existing system.
  • +
  • Custom file formats. Parquet for datasets, sidecar indexes for vectors. No proprietary formats (ADR-008, ADR-018 reaffirm).
  • +
  • Hard multi-tenant isolation. Profiles and federation provide soft isolation. Adversarial multi-tenant is not a goal — this system assumes a single-trust operator.
  • +
+ +
+Overall bet. The substrate is conservative: Parquet + DataFusion + HNSW + Ollama + object storage. Every layer is replaceable, open, auditable. The intelligence layer (playbook_memory, patterns, autotune) is statistical, not neural — cheaper, explainable, rebuildable from the journal alone. If the statistical floor plateaus below what a real client needs, Phase 20+ adds neural re-rank on top. We don't make that call until measurement demands it. +
+
+ +
+
+ + + +