| Path | Owns |
|---|---|
| crates/shared/ | Types, errors, Arrow helpers, schema fingerprints, PII detection, secrets provider. Every other crate depends on this. |
| crates/storaged/ | Raw object I/O. BucketRegistry (multi-bucket, rescue-aware), AppendLog (write-once batched append), ErrorJournal (bucket op failures). ADR-017 (federation), ADR-018 (append pattern). |
| crates/catalogd/ | Metadata authority. Dataset manifests, schema fingerprints (ADR-020), tombstones (soft delete), AI-safe views, model profiles (Phase 17). In-memory index persisted as Parquet on storage. |
| crates/queryd/ | SQL engine. DataFusion over Parquet + MemTable cache + delta merge-on-read + compaction. Registers every bucket as an object_store so SQL can join across them. |
| crates/ingestd/ | Data on-ramp. CSV / JSON / PDF (+OCR via Tesseract) / Postgres streaming / MySQL streaming / inbox watcher / cron schedules. Every ingest path auto-tags PII (emails, phones, SSNs, addresses), records lineage, and marks embeddings stale. |
| crates/vectord/ | The vector + learning surface. Embeddings stored as Parquet (ADR-008), HNSW index (Phase 15), trial system (autotune), promotion registry (Phase 16), playbook_memory (Phase 19). Core feedback loop lives here. |
| crates/vectord-lance/ | Firewall crate. Lance 4.0 + Arrow 57, isolated from the main Arrow-55 workspace. Provides secondary vector backend for large-scale, random-access, and append-heavy workloads (ADR-019). |
| crates/journald/ | Append-only mutation event log (ADR-012). Every insert/update/delete writes here — who, when, what, old/new value. Never mutated. Foundation for time-travel + compliance audit. |
| crates/aibridge/ | Rust ↔ Python sidecar. HTTP client over FastAPI wrapper around Ollama. VRAM introspection via nvidia-smi. All LLM calls (embed, generate, rerank) flow through here. |
| crates/gateway/ | Axum HTTP (:3100) + gRPC (:3101). Auth middleware, tools registry (Phase 12 — governed actions), CORS. Every external request enters here. |
| crates/ui/ | Dioxus WASM developer UI. Internal tool. Not exposed externally. |
| mcp-server/ | Bun/TypeScript recruiter-facing app. Serves devop.live/lakehouse. Routes: /search /match /log /log_failure /clients/:c/blacklist /intelligence/*. Proxies to the Rust gateway for heavy work. |
| tests/multi-agent/ | Dual-agent scenario harness. agent.ts (prompts + protocol), orchestrator.ts (single task), scenario.ts (5-event warehouse week), run_e2e_rated.ts (parallel pairs + rating), chain_of_custody.ts (layer-by-layer audit). |
| docs/ | PRD.md, PHASES.md, DECISIONS.md (20 ADRs). Every significant architectural choice has an ADR with the alternatives that were rejected and why. |
| data/ | Default local object store. Parquet files per dataset, append-log batches, HNSW trial journals, promotion registries, playbook_memory state.json, catalog manifests. Rebuildable from repo + this dir alone. |
1. Entry: (a) POST /ingest/file, (b) inbox watcher (drops in ./inbox/ → auto-ingested in under 15s), (c) Postgres or MySQL streaming connector (POST /ingest/db with DSN), (d) scheduled ingest via ingestd::schedule with cron.
2. Parse: String on ambiguity (ADR-010 — better to ingest everything than reject on type mismatch). JSON parser flattens nested objects. PDF extractor uses lopdf first; falls back to Tesseract OCR for scanned/image PDFs. Output is always an Arrow RecordBatch.
3. PII tagging: shared::pii scans column values and names. Identifies emails, phone numbers, SSNs, salaries, street addresses, medical terms. Tags columns with sensitivity: PII | PHI | Financial | Internal | Public (Phase 10 catalog v2).
4. Persist: arrow_helpers::record_batch_to_parquet → storaged::ops::put → file lands under data/datasets/<name>.parquet (or bucket-scoped via BucketRegistry). Schema fingerprint computed.
5. Register: catalogd::Registry::register(name, fingerprint, objects) — idempotent on (name, fingerprint). Same name + same fingerprint = reuse manifest, bump updated_at. Same name + different fingerprint = 409 Conflict (ADR-020 — prevents silent schema drift). New name = create new manifest with owner, lineage, freshness SLA, column metadata, PII tags.
6. Stale-mark: Registry::mark_embeddings_stale flips a flag; POST /vectors/refresh/<dataset> runs an incremental re-embed (only new rows, not the whole corpus).
7. Query visibility: queryd::context picks up the new manifest on next query. Hot cache warms on first hit. Delta merge-on-read means updates land without rewriting the base Parquet.

Two backends, chosen per profile (ADR-019):
| | HNSW over Parquet (primary) | Lance (secondary) |
|---|---|---|
| Storage | Embeddings as Parquet columns (doc_id, chunk_text, vector) | Lance native dataset |
| Index | HNSW in RAM, serialized sidecar | IVF_PQ on disk |
| Build time (100K × 768d) | ~230s | ~16s (14× faster) |
| Search p50 (100K) | ~873μs | ~7.4ms at recall 1.0 |
| Append | Rewrite required | Structural (0.08s for 100 rows) |
| Random fetch by doc_id | Full scan | ~311μs (112× faster) |
| RAM ceiling | ~5M vectors | Scales past RAM — disk-resident |
The vectord::agent background task runs continuously. Per index, it proposes HNSW configurations (ef_construction × ef_search), executes a trial against a stored eval set, journals the result as JSONL, and — if recall beats the min_recall gate (0.9) and latency wins the Pareto test — promotes the new config atomically via promotion_registry. No downtime. Rollback in milliseconds.
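The promotion decision above can be sketched as a pure function. This is a hedged illustration, not the vectord implementation: the TrialResult shape, field names, and the exact Pareto rule are assumptions; the min_recall gate of 0.9 comes from the spec.

```typescript
// Illustrative sketch of the autotune promotion gate (names assumed).
interface TrialResult {
  config: { efConstruction: number; efSearch: number };
  recall: number; // measured against the stored eval set
  p50Us: number;  // search latency p50, microseconds
}

const MIN_RECALL = 0.9; // the spec's min_recall gate

// Promote only if the candidate clears the recall gate and is not
// Pareto-dominated by the champion: no worse on both axes, strictly
// better on at least one.
function shouldPromote(current: TrialResult, candidate: TrialResult): boolean {
  if (candidate.recall < MIN_RECALL) return false;
  const noWorseOnBoth =
    candidate.recall >= current.recall && candidate.p50Us <= current.p50Us;
  const strictlyBetter =
    candidate.recall > current.recall || candidate.p50Us < current.p50Us;
  return noWorseOnBoth && strictlyBetter;
}
```

A dominated config (worse recall and worse latency) never replaces the champion, which is what makes the atomic flip via promotion_registry safe to automate.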
Model profiles (Phase 17) are not routing strings — they are named scopes. Each profile has bound_datasets[], hnsw_config, vector_backend, and bucket. When a staffer activates a profile:
- POST /vectors/profile/<id>/search rejects out-of-scope queries with 403 + a list of allowed bindings
- keep_alive=0; only one model in VRAM at a time

The concrete example running on devop.live/lakehouse is Chicago Department of Buildings permit data (public Socrata API). Every permit is a signal that construction — and therefore staffing — is coming.
- /intelligence/market and /intelligence/permit_contracts hit data.cityofchicago.org/resource/ydr8-5enu.json live. No caching of permit data — every page load is fresh.
- Each permit contract is bucketed by deadline: overdue, urgent, soon, scheduled.
- POST /vectors/hybrid with sql_filter on role+state+city+availability, use_playbook_memory: true, playbook_memory_k: 200. Returns top-5 candidates with boost + citations.
- POST /vectors/playbook_memory/patterns aggregates traits across similar past playbooks — recurring certs, skills, archetype, reliability distribution. Surfaces signal the operator didn't query for.
- /intelligence/staffing_forecast aggregates the last 30 days of permits into predicted role-level demand, joins against the IL bench supply, computes coverage %, and classifies each role as critical / tight / watch / ok. The dashboard's top panel renders this — staffers see supply gaps before they query.
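The forecast's coverage classification can be sketched as below. The four buckets come from the spec; the threshold percentages here are illustrative assumptions, not the queryd values.

```typescript
// Illustrative sketch of /intelligence/staffing_forecast classification.
type RoleStatus = "critical" | "tight" | "watch" | "ok";

// coverage = bench supply / predicted demand, as a percent.
function coveragePct(supply: number, demand: number): number {
  if (demand === 0) return 100; // no demand → fully covered
  return (supply / demand) * 100;
}

function classifyRole(coverage: number): RoleStatus {
  if (coverage < 50) return "critical"; // assumed threshold
  if (coverage < 80) return "tight";    // assumed threshold
  if (coverage < 100) return "watch";   // assumed threshold
  return "ok";
}
```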
| Capability | CRM | This system |
|---|---|---|
| Store candidate records | Yes | Yes (workers_500k, candidates) |
| Search by structured field | Yes | Yes (DataFusion SQL, sub-100ms on 3M rows) |
| Search by semantic meaning | No | Yes (HNSW + nomic-embed-text) |
| Combine SQL filter + semantic rank | No | Yes (/vectors/hybrid) |
| Boost workers based on past success | No | Yes (Phase 19 playbook_memory) |
| Penalize workers based on past failure | No | Yes (/log_failure + 0.5^n penalty) |
| Surface traits across past fills | No | Yes (/vectors/playbook_memory/patterns) |
| Predict staffing demand from external data | No | Yes (Chicago permit feed + 30-day rolling forecast) |
| Count down to staffing deadline per contract | No | Yes (permit issue_date + heuristic timeline) |
| Explain why each candidate ranked | No | Yes (boost chip + narrative citations + memory pattern) |
| Improve ranking from operator actions | No | Yes (every Call/SMS/No-show click → re-rank signal) |
Every sealed fill is seeded to playbook_memory via /vectors/playbook_memory/seed. The next hybrid query for a semantically similar role+geo surfaces the past endorsed workers with a boost. Math:
```
per_worker = cosine(query_emb, playbook_emb) × 0.5 × e^(-age/30) × 0.5^failures / n_workers
boost[(city, state, name)] = min(Σ per_worker, 0.25)
```
Caps, decay, and negative signal mean one popular worker can't dominate, old playbooks fade, and no-shows stop boosting. Verified live: 3 identical seeds → +0.250 boost capped, 3 citations.
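The boost math transcribes directly into code. The constants (0.5 similarity weight, 30-day decay, 0.5^failures penalty, 0.25 cap) are the spec's; the PlaybookHit field names are illustrative.

```typescript
// Transcription of the playbook_memory boost formula (field names assumed).
interface PlaybookHit {
  cosine: number;   // cosine(query_emb, playbook_emb)
  ageDays: number;  // playbook age in days
  failures: number; // logged no-shows for this worker
  nWorkers: number; // workers endorsed in the playbook
}

function perWorkerBoost(hit: PlaybookHit): number {
  return (
    (hit.cosine * 0.5 * Math.exp(-hit.ageDays / 30) * Math.pow(0.5, hit.failures)) /
    hit.nWorkers
  );
}

// boost[(city, state, name)] = min(Σ per_worker, 0.25)
function workerBoost(hits: PlaybookHit[]): number {
  const sum = hits.reduce((acc, h) => acc + perWorkerBoost(h), 0);
  return Math.min(sum, 0.25);
}
```

A fresh identical seed (cosine 1, age 0, no failures, single worker) contributes 0.5, so three of them sum to 1.5 and clamp to the 0.25 cap, matching the live verification.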
/vectors/playbook_memory/patterns goes beyond "who was endorsed" to answer "what did past similar fills have in common?" Aggregates recurring certifications, skills, archetype, reliability distribution across the top-K semantically similar playbooks. Surfaces signal the operator didn't explicitly query for.
The vectord::agent background task runs continuously. Watches the HNSW trial journal, proposes configs, executes trials, promotes Pareto winners — without human intervention. Operator sees "the index got faster overnight" and doesn't know why. The journal knows why.
Axum is async. The gateway handles concurrent requests on Tokio with work-stealing. No per-request thread. Tested at 10 parallel queries in 82ms total on this hardware.
Per-staffer profile isolation. Each staffer activates their own profile (Phase 17) or workspace (Phase 8.5). Profile scopes their search to bound datasets. Workspace carries their in-progress contracts across sessions.
Per-client blacklists. Auto-applied when the caller passes client: "X" on /search. Staffer A filling for Acme never sees Acme's flagged workers. Staffer B filling for MidState sees them normally.
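The auto-apply behavior can be sketched as a filter over search results. A hedged illustration: the Worker shape and function names are assumptions; the behavior (filter only when a client is passed) follows the spec.

```typescript
// Illustrative sketch of per-client blacklist application on /search.
interface Worker {
  name: string;
  city: string;
  state: string;
}

function applyBlacklist(
  results: Worker[],
  blacklists: Map<string, Set<string>>, // client → blocked worker names
  client?: string,
): Worker[] {
  if (!client) return results; // no client passed → no filtering
  const blocked = blacklists.get(client);
  if (!blocked) return results; // client has no blacklist
  return results.filter((w) => !blocked.has(w.name));
}
```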
SQL on job_orders is cheap. 300 rows is nothing — a scan is microseconds.
Workspace per contract. Each contract gets its own workspace with saved searches, shortlists, activity log. Zero-copy handoff between staffers (pointer swap, not data copy).
Forecast remains coherent. /intelligence/staffing_forecast aggregates 30-day permit data regardless of contract count. The bench supply query (GROUP BY role over workers_500k) is a single sub-second SQL query.
The delta arrives at 12:30. Here's what happens in the following minutes:
- mark_embeddings_stale flips the flag.
- POST /vectors/refresh/workers_500k reads only the new rows (diff against existing embeddings), embeds them in batches of 64 via Ollama, writes delta embedding Parquet. Measured on threat_intel: 34 new rows in 970ms (6× faster than full re-embed).
- promotion_registry atomically flips the active pointer. Next search hits the new config. Rollback stays available.
- A DatasetAppended trigger fires and schedules a fresh HNSW trial cycle against the expanded index.
- If the index outgrows RAM, the vector_backend: lance flip applies — disk-resident IVF_PQ scales past the RAM line (ADR-019).
- Only one model sits in VRAM (keep_alive=0). Phase 17 profile activation unloads the prior model on swap.

| Failure mode | Surface / response | Recovery |
|---|---|---|
| Ingest receives file with schema mismatch vs existing dataset | 409 Conflict with both fingerprints named (ADR-020) | Re-ingest under a new name, or migrate the existing via Phase 14 schema evolution |
| Bucket unreachable on write | Hard 503, error journaled to primary://_errors/bucket_errors/ | GET /storage/errors lists failures; GET /storage/bucket-health shows per-bucket status |
| Bucket unreachable on read | Rescue bucket fallback, X-Lakehouse-Rescue-Used: true header on response | Response still succeeds; operator sees rescue flag |
| /log receives name that doesn't exist in workers_500k | Seed is SKIPPED; response includes rejected_ghost_names: [...] and a note | Operator sees exactly which names were rejected and why |
| Dual-agent executor malforms tool call | Result appended to log with error field; counter increments | After 3 consecutive: abort with full log dump at tests/multi-agent/playbooks/<id>-FAILED.json |
| Dual-agent drifts from target | Reviewer verdict = drift, counter increments | After 3 consecutive drifts: abort with full log |
| Hybrid search finds zero candidates | Returns empty sources[] + sql_matches: 0 | Gap signal captured by scenario runner; operator prompted to broaden filter |
| Ollama sidecar down | 502 Bad Gateway from aibridge; embed calls fail fast | Restart: systemctl restart lakehouse-sidecar; vector search falls back to pre-computed embeddings |
| Gateway restart mid-operation | In-memory state (playbook_memory, HNSW) reloaded from persisted state.json / trial journals | Zero data loss; catalog, storage, journals are all source-of-truth |
| Schema fingerprint diverges across manifests | catalog::dedupe reports DedupeReport with winner selection (non-null row_count first, then newest updated_at) | POST /catalog/dedupe collapses duplicates idempotently |
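The rescue-read row in the table above can be sketched as a fallback wrapper. A hedged illustration only: the ObjectStore interface and names are assumptions; the behavior (primary read fails → rescue bucket serves the object and the response carries X-Lakehouse-Rescue-Used: true) follows the spec.

```typescript
// Illustrative sketch of rescue-bucket read fallback (names assumed).
interface ObjectStore {
  get(key: string): Promise<Uint8Array>;
}

interface RescueRead {
  bytes: Uint8Array;
  headers: Record<string, string>;
}

async function readWithRescue(
  primary: ObjectStore,
  rescue: ObjectStore,
  key: string,
): Promise<RescueRead> {
  try {
    return { bytes: await primary.get(key), headers: {} };
  } catch {
    // Primary unreachable: serve from the rescue bucket and flag the
    // response so the operator sees the degraded path.
    return {
      bytes: await rescue.get(key),
      headers: { "X-Lakehouse-Rescue-Used": "true" },
    };
  }
}
```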
Scopes every search. A staffing-recruiter profile bound to workers_500k sees only that dataset. A security-analyst profile bound to threat_intel cannot see worker data. GET /vectors/profile/<id>/audit records every tool invocation by model identity.
Per-contract state. Each workspace has daily/weekly/monthly tiers, saved searches, shortlists, activity logs. Survives across sessions. Instant zero-copy handoff between staffers — pointer swap, not data copy. Persisted to object storage, rebuilt on startup.
Per-client worker exclusion. Populated via POST /clients/:client/blacklist. Auto-applied when the caller passes client: "X" on /search. JSON-backed; would move to a catalog table under real client load.
Phase 12 tool registry logs every governed-action invocation (who called what, with what args, when, outcome). GET /tools/audit queryable. Phase 13 access control layers on top — role-based field masking, query audit log.
Workspace activity log + per-staffer filter on the event journal gives "what did Sarah do today" as a direct query. The foundation for shift-handoff reports.
- /log fires → playbook_memory.seed → persist_sql → successful_playbooks_live grows by one. Button flashes "Logged" for 1.4s. No modal, no form, no second click.
- /log_failure → mark_failed records a penalty. Next similar query dampens Dave's boost by 0.5. Sarah continues the refill — the refill excludes Dave and the 2 others already booked for this shift.
- tests/multi-agent/scenario.ts gets report.md auto-generated. Workspace activity logs aggregate per staffer. GET /vectors/playbook_memory/stats shows the day's new entries.

After each sealed fill (via scenario.ts or the manual /log flow with downstream hooks), generateArtifacts in the scenario runner produces: (a) one SMS per worker (TO: Name, message under 180 chars), (b) one client confirmation email. Drafts are saved to sms.md and emails.md under the scenario output dir. Ollama drafts them; the staffer reviews and sends. No auto-send; human-in-the-loop.
Planned for Phase 20: pay_rate to workers, bill_rate to contracts, and a filter + warning path.