6 Commits

Author SHA1 Message Date
root
631b0329b1 demo: proof — full architecture-page rewrite for current state
J: "needs a rewrite." Old version was anchored on a dual-agent
mistral+qwen2.5 loop that hasn't been the model story for weeks,
called the system 13 crates (it's 15), referenced "Local 7B models"
in the honest-limits section, and had no mention of:
  - the 40-model OpenCode fleet via one sk-* key
  - the 9-rung cloud-first ladder
  - N=3 consensus + cross-architecture tie-breaker
  - auditor cross-lineage (Kimi K2.6 ↔ Haiku 4.5, Opus auto-promote)
  - distillation v1.0.0 frozen substrate (e7636f2)
  - pathway memory (88 traces, 11/11 replays, ADR-021)
  - per-staffer hot-swap index
  - Construction Activity Signal Engine + BAI + ticker network
  - the gateway as OpenAI-compat drop-in middleware

Rewrote into 10 chapters:

  1.  Receipts — live tests + new live tile showing the Signal Engine
      view for THIS load (issuer count, attributed build value,
      contractor count, attribution edges)
  2.  Architecture — corrected to 15 crates with current responsibilities;
      ASCII diagram showing OpenAI consumers + MCP + Browser all hitting
      gateway /v1/*; provider fleet table with all 5 (ollama, ollama_cloud,
      openrouter, opencode 40-model, kimi); validator + truth + auditor
      crates added
  3.  Model fleet — REPLACED the dual-agent mistral story. Now: the
      9-rung ladder (kimi-k2:1t through openrouter:free → ollama local),
      N=3 consensus + tie-breaker math, auditor Kimi↔Haiku alternation
      with Opus auto-promote on big diffs, distillation v1.0.0 freeze
      tag e7636f2 (145 tests · 22/22 · 16/16 · bit-identical)
  4.  Two memory layers — kept playbook content (Phase 19 boost math
      still load-bearing), added pathway memory (ADR-021) section with
      live counters in the page (88 / 11-11 / 100% reuse rate)
  5.  Per-staffer hot-swap — NEW. Pseudocode showing how staffer_id
      scopes state filter + playbook geo + UI relabel to MARIA'S MEMORY
  6.  Construction Activity Signal Engine — NEW. Three attribution
      flavors (direct, parent, associated), BAI math, cross-metro
      replication framing (NYC DOB next, then LA / Houston / Boston)
  7.  Architectural choices — added ADR-021 row + distillation freeze row
  8.  Measured at scale — kept (uses /proof.json scale data)
  9.  Verify or dispute — REFRESHED with current endpoints. Removed the
      stale "bun run tests/multi-agent/scenario.ts" recipe; added curl
      examples for /v1/health, pathway/stats, per-staffer scoping (3-loop
      bash script), late-worker triage, profiler_index, ticker_quotes,
      auditor verdicts, distillation acceptance gate
  10. What we are NOT claiming — REFRESHED. Removed "Local 7B models"
      caveat; added: 12 awaiting public-data sources are placeholders,
      SEC name-fuzzy has rare false positives, BAI is a thesis not a
      backtest yet, single-metro today

Live data probes added:
  loadPathwayLive   — fills pwm-traces / pwm-replays / pwm-rate spans
  loadSignalLive    — renders the LIVE Signal Engine tile under Ch1

Nav also gained a Profiler link to match search.html and console.html.

Verified end-to-end on devop.live/lakehouse/proof:
  10 chapters render, 5/5 live tests pass, pathway shows 88 traces +
  100% reuse rate, live signal tile shows 11 issuers + $347M attributed
  + 200 contractors + 14 attribution edges. Architecture diagram +
  crate table accurate as of HEAD.
2026-04-28 06:01:04 -05:00
root
a117ae8b38 Workspace UI — surface Phase 8.5 per-contract state + handoff
Phase 8.5 was fully built on the Rust side (WorkspaceManager with
create/handoff/search/shortlist/activity/get/list, persisted to
object storage, zero-copy handoff between agents). Nothing surfaced
it in the recruiter UI. This page closes that gap.

/workspaces — split-pane UI:

Left: scrollable list of all workspaces, sorted by updated_at.
  Each card shows name, tier pill (daily/weekly/monthly/pinned),
  current owner, count of shortlisted candidates + activity events.

Right: selected workspace detail with five sections:
  1. Header — name, tier, owner, created/updated dates, description,
     previous-owners audit trail (each handoff is preserved)
  2. Actions row — Hand off, Shortlist candidate, Save search, Log activity
  3. Shortlist — candidates flagged with dataset + record_id + notes
  4. Saved searches — named SQL queries the staffer wants to rerun
  5. Activity — chronological (newest first) log of what happened

Four modals for the add/edit actions (create, handoff, shortlist,
save-search, log-activity). All forms POST through the existing
/api/* passthrough to the gateway's /workspaces/* routes.

End-to-end verified live:
  1. Sarah creates 'Demo: Toledo Week 17' workspace
  2. Shortlists Helen Sanchez (W500K-4661) with notes about prior endorsements
  3. Logs activity: 'called — Helen confirmed Tuesday 7am shift'
  4. Hands off to Kim with reason 'end of shift'
  5. Kim opens the workspace: owner=kim, previous_owners=[{sarah→kim}],
     sees all 3 prior events + the shortlisted Helen
     — no data copy, pointer swap only (Phase 8.5 design)

Security: all dynamic content built via el(tag,cls,text) DOM helper.
Zero innerHTML on API-derived strings. Modal close-on-backdrop-click
is guarded to the backdrop element.

Nav updated across all 7 pages. Workspaces is the 7th tab.
Dashboard · Walkthrough · Architecture · Spec · Onboard · Alerts · Workspaces.
2026-04-20 18:36:51 -05:00
root
6287558493 Push/daemon presence: background digest + /alerts settings page
Converts the app from 'dashboard you visit' to 'system that finds you.'
Critical for the phone-first staffing shop that won't open a URL —
the system reaches out when something matters.

Daemon:
- Starts once per Bun process (guarded via globalThis sentinel)
- Default interval 15 min (configurable, min 1, max 1440)
- On each cycle, buildDigest() compares current state against prior
  snapshot persisted in mcp-server/data/notification_state.json
- Events detected:
  - risk_escalation: role moved to tight or critical (was ok/watch)
  - deadline_approaching: staffing window falls within warn window
    (default 7 days) AND deadline date differs from prior
  - memory_growth: playbook_memory entries grew by >= 5 since last run

Channels (all opt-out individually via config):
- console: always on, logged to journalctl -u lakehouse-agent
- file: always on, appends JSONL to mcp-server/data/notifications.jsonl
- webhook: optional, POSTs {text, digest} to configured URL
  (Slack incoming-webhook / Discord webhook / any custom endpoint)

Digest format (human-readable, fits in a Slack message):
  LAKEHOUSE DIGEST — 2026-04-20 23:24
  3 staffing deadlines within window:
    • Production Worker — 2d to 2026-04-23 · demand 724
    • Maintenance Tech — 4d to 2026-04-25 · demand 32
    • Electrician — 5d to 2026-04-26 · demand 34
  +779 new playbooks (total 779, 2204 endorsed names)
  snapshot: 0 critical · 0 tight · $275,599,326 pipeline

/alerts page:
- Current status table (daemon state, interval, webhook, last run)
- Config form: enable toggle, interval, deadline warn window, webhook
  URL + label (saved to data/notification_config.json)
- 'Fire a test digest now' button — force a cycle without waiting
- Recent digests panel shows the last 10 dispatches with full text

End-to-end verified live:
- Daemon armed successfully on startup
- First-run digest dispatched to console + file in <1s
- Events detected correctly: 3 deadlines within 7 days from real
  Chicago permit data; 779 playbook entries surfaced as memory growth
- Digest text format is Slack-pastable
- Dispatch records appear in /alerts recent list

TDZ caveat: startAlertsDaemon() invocation moved to end of module so
all const/let in the alerts block evaluate before daemon reads them.
Previously failed with 'Cannot access X before initialization' when
the call lived near the top of the file. Nav added to all 6 pages:
Dashboard · Walkthrough · Architecture · Spec · Onboard · Alerts.
2026-04-20 18:24:48 -05:00
root
23eb04a145 Onboarding wizard — ingest any staffing CSV in 3 steps
New /onboard page. Client-facing wizard for getting real data into
the system without engineering help.

Flow:
1. Drop a CSV (or click 'Use the sample as my data' — ships a 25-row
   realistic staffing roster under /samples/staffing_roster_sample.csv)
2. Browser parses client-side. Columns auto-typed (text/int/decimal/
   date). PII flagged by name hint AND content regex (emails, phones).
   First rows previewed. Read-only — nothing written yet.
3. Name the dataset (lowercase+underscores). Commit.
4. Post-commit: dataset is live. Shows 4 next steps the operator can
   take (SQL query, vector index, dashboard search, playbook training).

Backend:
- /onboard serves onboard.html
- /samples/*.csv serves CSV files from mcp-server/samples/ with
  filename validation (only [a-zA-Z0-9_-.]+.csv, prevents path traversal)
- /onboard/ingest forwards multipart/form-data to gateway /ingest/file
  preserving the boundary. The generic /api/* passthrough breaks
  multipart because it reads as text and forwards as JSON; this route
  uses arrayBuffer + original Content-Type.

Verified end-to-end: upload sample roster (25 rows, 12 columns) →
parse in browser → show columns + PII flags + preview → commit →
gateway writes Parquet, registers in catalog → immediately queryable:
  SELECT * FROM onboard_demo2 LIMIT 3
  → Sarah Johnson, Forklift Operator, Chicago, IL, 0.92
Round-trip <1 second.

Nav updated on all pages to link Onboard. Shipped with a sample CSV
so the full flow is demonstrable without real client data.

When a real client shows up, same path — they upload their CSV.
No engineering ticket, no code change, no schema pre-definition.

Security: sample filename regex prevents path traversal. CSV parse
is client-side pure JS (no DOM injection). Commit uses existing
/ingest/file validation (schema fingerprint, PII server-side,
content-hash dedup).
2026-04-20 18:13:56 -05:00
root
468798c9ac /spec: technical specification — 11-chapter README-equivalent
J's ask: explain the full architecture so someone reading a README
can dispute it or recreate it. The repo isn't public yet; this page
IS the spec until it is.

Ch1 Repository layout — 13 crates + tests/multi-agent + docs + data,
    with owned responsibility and file path per crate.

Ch2 Data ingest pipeline (8 steps) — sources (file/inbox/DB/cron),
    parse+normalize with ADR-010 conservative typing, PII auto-tag,
    dedup, Parquet write, catalog register with fingerprint gate,
    mark embeddings stale, queryable immediately.

Ch3 Measurement & indexing — row count / fingerprint / owner /
    sensitivity / freshness / lineage per dataset. HNSW vs Lance
    tradeoff table with measured numbers (ADR-019). Autotune loop.
    Per-profile scoping (Phase 17).

Ch4 Contract inference from external signal — Chicago permit feed
    → role mapping → worker count heuristic → timeline → hybrid
    search with boost → pattern discovery → rendered card. All
    pre-computed before staffer opens UI.

Ch5 What a CRM can't do — 11-row comparison table of capabilities.

Ch6 How it gets better over time — three paths:
    - Phase 19 playbook boost (full math)
    - Pattern discovery meta-index
    - Autotune agent

Ch7 Scale story: 20 staffers, 300 contracts, midday +20/+1M surge
    - Async gateway + per-staffer profile isolation + client blacklists
    - 7-step surge handling flow (ingest, stale-mark, incremental refresh,
      degradation, hot-swap, autotune re-enter)
    - Known pain points: Ollama inference serial, RAM ceiling ~5M on
      HNSW (mitigated by Lance), VRAM 1-2 models sequential,
      playbook_memory unbounded.

Ch8 Error surfaces & recovery — 10-row table covering ingest schema
    conflicts, bucket failures, ghost names, dual-agent drift,
    empty searches, Ollama down, gateway restart, schema fingerprint
    divergence. Every failure has a named surface and recovery path.

Ch9 Per-staffer context — active profile, workspace, client blacklist,
    audit trail, daily summary. How 20 staffers don't see the same UI.

Ch10 Day in the life — 07:00 housekeeping → 07:30 refresh → 08:00
     staffer opens → 08:15 drill down → 08:30 Call click → 09:00
     second staffer shares memory → 12:30 surge → 14:00 no-show →
     15:00 new embeddings live → 17:00 retrospective → 22:00
     overnight trials.

Ch11 Known limits & non-goals — deferred (rate/margin, push, confidence
     calibration, neural re-ranker, pm compaction, call_log cross-ref)
     and explicitly out-of-scope (cloud, ACID, streaming, CRM replace,
     proprietary formats, hard multi-tenant).

Also: nav updated on /dashboard, /console, /proof to link /spec.
Every architectural claim in the spec cites either a code path, an
ADR number, or a phase reference so someone skeptical can target
the specific artifact.
2026-04-20 17:56:18 -05:00
root
76bfa2c8d7 /proof: explain the dual-agent recursive architecture with citations
Previous page was numeric claims without explanations — 'sub-100ms SQL',
'500K vectors in 341ms' etc. Accurate but undefendable without math,
code paths, and ADR references. Expanded to 8 chapters:

Ch1 — Live receipts (unchanged: real gateway tests, pass/fail, timing)

Ch2 — Architecture. 13-crate diagram with per-crate responsibility
      table and file paths. gateway → catalogd/queryd/vectord/ingestd
      + aibridge → object_store. References ADRs 1-20.

Ch3 — Dual-agent recursive consensus loop (NEW)
      - Role specialization (executor=optimist, reviewer=pessimist)
      - Parallel orchestration via Promise.all
      - Recursive: sealed playbooks feed playbook_memory → next query
      - Termination math: sealed | tool-error abort | drift abort |
        turn-cap abort — every path dumps forensic log
      - File refs: tests/multi-agent/agent.ts, orchestrator.ts,
        scenario.ts, run_e2e_rated.ts

Ch4 — Playbook memory feedback loop (NEW)
      - PlaybookEntry shape with embedding
      - Full boost math: similarity * base_weight * decay * penalty
        / n_workers, capped at MAX_BOOST_PER_WORKER
      - Temporal decay (e^-age/30, 30d half-life)
      - Negative signal (0.5^failures)
      - Why k=200: narrow cosine discrimination in nomic-embed-text
      - Evidence: compounding test 0 → 0.250 cap in 3 seeds
      - persist_sql write-through
      - Pattern discovery (Path 2 meta-index)
      - File: crates/vectord/src/playbook_memory.rs

Ch5 — ADR citations for each key choice
      ADR-001, 008, 012, 015, 019, 020 + Phase 19 design note

Ch6 — Live scale data (unchanged: pulled from /proof.json)

Ch7 — Reproduction recipes: curl for health, sql, hybrid with boost,
      patterns, pm stats, and the full dual-agent scenario run

Ch8 — Honest limits (unchanged: synthetic workers_500k, 1K candidates
      misaligned to call_log, 7B model imperfection, no rate/margin)

Every architectural claim now cites either the code path
(crates/.../src/file.rs::fn_name) or the ADR (docs/DECISIONS.md).
Someone disputing the system has specific targets to attack.

Mechanism unchanged: /proof serves mcp-server/proof.html via
Bun.file. /proof.json still returns the live test data the page
consumes client-side.
2026-04-20 17:49:08 -05:00