76 Commits

Author SHA1 Message Date
root
2155959013 Worker profile modal: click any worker to see full details
Click any worker avatar/card → scrollable modal with:
- Rich profiles: reliability/availability bars with explanations,
  skill tags, cert badges, archetype with description, work history,
  Call/SMS action buttons
- Sparse profiles: trust path showing 'You are here' → progression
  to full profile through normal operations
- Modal scrolls independently, background locked
- Close via X button or click outside

Each archetype has a plain-English description:
  reliable: 'Consistently shows up, clients request them back'
  leader: 'Takes initiative, helps train others'
  erratic: 'Inconsistent attendance, needs monitoring'
  etc.

Work history shows recent placements and cert renewals.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 16:25:42 -05:00
root
45a95a9feb Urgent pipeline: step-by-step workflow walks staffer through emergency fills
Urgent contracts now show a 4-step action plan:
  Step 1 (red): Review pre-matched workers
  Step 2 (yellow): Call first choice — highest match score
  Step 3 (blue): Confirm or replace — backup is ready
  Step 4 (green): Send shift details to confirmed workers

First-choice worker highlighted with red border + label.
Backup workers shown with dimmed styling + 'BACKUP' label.
Urgent cards show ALL matched workers + backups (not just 3).

Non-urgent contracts split into 'In Progress' (still filling)
and 'Ready to Go' (fully staffed) sections.

The staffer doesn't stare at a red label wondering what to do.
They follow the steps: review, call, confirm, send. Done.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 16:08:18 -05:00
root
c0ff7434cb Technical deep-dive: architecture explained for non-technical audience
Added 'How This Actually Works' section below the proof page:

1. CRM vs Lakehouse side-by-side — what's different in plain English
2. Your Data Never Leaves — local AI, local storage, your hardware
3. How It Handles Scale — HNSW (RAM, 1ms) + Lance (disk, 5ms at 10M)
4. Hot-Swap Profiles — 4 AI models explained by what they DO
5. Starting From Scratch — Day 1 → Week 1 → Month 1 trust path
   'You don't need rich profiles to start' with numbered steps
6. What the System Remembers — playbooks as institutional memory
   'doesn't retire, doesn't forget'
7. Measured Not Promised — table of real numbers with plain English

Addresses the legacy company pushback: explains WHY the architecture
matters, HOW sparse data becomes rich data over time, and that
everything runs on hardware they own with zero cloud dependency.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 15:56:16 -05:00
root
bb46869227 Intelligence-first UI: insights, not data dumps
Complete rebuild around 'how did it know that?' moments:

1. NEEDS YOUR ATTENTION — urgent contracts with pre-matched workers.
   Each worker shows WHY they were matched: 'Reliable (85%) ·
   Certified: OSHA-10 · Same city as job site'

2. READY TO CONFIRM — fully matched contracts, just review and send

3. YOUR STRONGEST WORKERS — 95%+ reliability, 'they rarely
   no-show and clients request them back'

4. BENCH STRENGTH ALERT — states with thin reliable worker pools,
   'consider recruiting in these areas'

Every section has: a label (ACTION NEEDED/READY/INSIGHT/HEADS UP),
a headline in plain English, an explanation of HOW the system
knows this, and actionable workers with Call/SMS buttons.

This is what a CRM has never done: anticipate, explain, recommend.
The staffer doesn't search — they respond to intelligence.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 15:46:25 -05:00
root
2279d9f51d Fix: simulation now passes chunk_text — worker cards show full profiles
The simulation was only storing name/doc_id/score but dropping
chunk_text. Worker cards showed 'New — data builds with placements'
for every worker. Now includes the full profile text so cards render
skills (blue), certs (green), archetype (purple), and reliability/
availability meters.

Verified via Playwright: cards now show DeShawn Cook with 6S|Excel|SAP
skills, First Aid/CPR cert, flexible archetype, 72% reliability.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 15:41:30 -05:00
root
875cfadc3d Graceful sparse data: show what exists, hide what doesn't
Worker cards now handle sparse-to-rich data gracefully:
- Name only? Shows name + 'New — data builds with placements'
- Name + role? Shows name + role tag
- Name + role + skills + certs? Shows full tag row
- Has reliability data? Shows colored meter bars
- No metrics? No empty bars, no 0% — just what's there

Contract cards: urgency dot, progress bar, fill count.
Workers inside: avatar initials, name, role, location, skill/cert
tags (blue/green), archetype (purple), reliability/availability
bars — all ONLY when data exists.

GitHub-style dark theme. Call/SMS per worker. Search collapsed.
ADR-021 compliant: works with a name and earns everything else.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 15:36:53 -05:00
root
845acfdcda Rich worker cards: skills, certs, reliability bars — not just names
Each worker in a contract card now shows:
- Initials avatar (color-coded)
- Name + location on same line
- Skill tags (blue pills, top 3 relevant)
- Cert badges (green pills — OSHA, Forklift, Hazmat)
- Archetype tag (purple — reliable, leader, etc)
- Reliability bar with color (green >80%, yellow >50%, red <50%)
- Availability bar with color
- Individual Call/SMS buttons per worker

Contract headers show:
- Urgency dot (red/yellow/blue/green)
- Client name, role × headcount, location, start time
- Progress bar with fill count

GitHub-style dark theme. Every piece of info visible at a glance
without clicking anything. The staffer sees skills, certs, and
reliability for every matched worker the moment the page loads.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 15:27:27 -05:00
root
05785b4628 Dashboard: the staffer's actual workday, not a search box
Not a CRM search page. A staffing workstation:

Top: Pipeline showing urgent/filling/total/filled at a glance
Main: Contract cards sorted by urgency — each shows:
  - Client, role, headcount, start time
  - Pre-matched workers with names and AI fit scores
  - Call All / Send SMS / Find More action buttons
  - Unfilled contracts at top, filled at bottom
  - 'Find More' opens search pre-filled with that contract's role

Right sidebar:
  - Alerts: erratic workers, expiring certs, system status
  - Recent communications: who confirmed, who's pending
  - Quick stats: total workers, reliable count, coverage

The search is there but collapsed — it's a tool, not the focus.
When they open the page, their day is already organized.

This is what the CRM doesn't do: anticipate, pre-match, organize.
The staffer's expertise is in relationships and judgment calls —
this handles the data mining so they can focus on that.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 15:22:18 -05:00
root
7cb9999451 Rebuild search UI: zero dependencies, plain JS, DOM-only, works
Replaced complex dashboard with minimal search.html:
- No external JS/CSS files, no transpilation, no module imports
- Plain JS with .then() chains (no async/await compat issues)
- DOM-only rendering via createElement (no innerHTML with data)
- 20s AbortController timeout so fetch never hangs
- Detects /lakehouse/ proxy prefix automatically
- 7KB total, loads in 18ms

Calls lakehouse /vectors/hybrid directly — SQL filters always apply,
works even when HNSW isn't loaded (brute-force fallback).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 13:26:27 -05:00
root
e7e988dcc0 Fix dashboard: always use hybrid (no HNSW dependency), 15s timeout, error display
The search hung because pure AI mode calls HNSW which is RAM-only —
gone after every lakehouse restart. Now ALL AI/hybrid searches go
through the /search endpoint which uses brute-force when HNSW isn't
loaded. Added 15s AbortController timeout so fetch never hangs.
Added window.onerror handler to show JS errors on page.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 13:23:29 -05:00
root
5c93338f40 Fix: gateway defaulted to wrong vector index (10K instead of 50K)
All gateway endpoints pointed to ethereal_workers_v1 (10K, W- prefix)
instead of workers_500k_v1 (50K, W500K- prefix). Filters appeared
broken because the vector results came from the wrong dataset —
IDs matched numerically but belonged to different workers.

Now: every search, match, and hybrid call uses workers_500k_v1.
Verified: 'experienced welder' + state=OH + role=Welder returns
5 Welders in OH (Carmen Perry, Janet White, Rachel Miller, etc).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 13:16:11 -05:00
root
f9e2a0bbbe Fix: filters now ALWAYS work — auto-switches to hybrid when set
The bug: selecting a state filter in AI Search mode did nothing
because HNSW vector search has no concept of SQL WHERE clauses.
Results came back from any state.

The fix: when ANY filter is set (state, role, or reliability > 0.5),
the search automatically switches to hybrid mode which runs the SQL
filter first, then AI-ranks within the filtered set. Users don't
need to know about modes — filters just work.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 13:10:28 -05:00
root
6a2cc0fb8f Search UI: type what you need, see real workers — no more taking my word for it
Rebuilt the dashboard into a live search interface anyone can use:
- Big search box: type in plain English, hit Enter or click Search
- 3 modes: AI Search, CRM Keyword, Hybrid (best)
- Clickable examples: 'warehouse help', 'dependable machine operator', etc
- Filters: state, role, min reliability
- Results show: name, role, location, skills, certs, reliability, AI match score
- Hybrid results marked 'SQL verified against database'
- CRM mode shows 0 results with a prompt to try AI Search
- Mobile responsive

This is the answer to 'we just have to take your word for it.'
Type anything. See real workers. Compare CRM vs AI side by side.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 13:06:31 -05:00
root
48c7c1c5e6 Fix dashboard: detect /lakehouse/ nginx prefix for API calls
dashboard.ts now checks if running behind the nginx proxy (path
starts with /lakehouse) and prepends the prefix to all API calls.
Without this, the browser called /sql instead of /lakehouse/sql
and got 404s from the LLM Team Flask app.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 13:04:24 -05:00
root
7367e5f71d Proof page: LIVE side-by-side CRM vs AI — shows, doesn't tell
3 live demo searches run on page load against 500K real profiles:
  'warehouse help' — CRM: 0, AI: finds Forklift Ops + Loaders
  'someone good with machines who is dependable' — CRM: 0, AI: finds Machine Ops
  'safety trained worker for chemical plant' — CRM: 0, AI: finds OSHA+Hazmat workers

Each shows the actual CRM keyword count (LIKE match) next to the AI
vector results with real worker names, roles, and cities. Not
described — demonstrated. The numbers come from queries that run
when the page loads.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 12:55:11 -05:00
root
66a3460c92 Dashboard rebuilt: matches proof page design, mobile-ready
Clean dark theme matching /proof page. Priority badges on contracts
(urgent=red, high=yellow, medium=blue, low=green). Worker matches
shown inline. Day tabs show fill counts. Alerts with icons. Playbook
entries styled. All styles inline — no separate CSS file.

Mobile responsive: single column layout, scrollable tabs.
Links to /proof at bottom.

https://devop.live/lakehouse/ — the dashboard
https://devop.live/lakehouse/proof — the proof page

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 12:51:08 -05:00
root
5aaa3c5c08 Mobile responsive: proof page works on phones
Added @media(max-width:768px) breakpoints:
- 2-col grids → single column on mobile
- 3-col grids → single column
- 4-col model cards → 2-col
- Stats grid → 2-col
- Tables: horizontal scroll, smaller text
- Reduced padding and font sizes
- Hero title scales down

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 12:44:57 -05:00
root
c53d3f4d14 Proof page: speaks to the staffer, not the engineer
Rebuilt the page to address a staffing coordinator who's tired of
learning new tools. Opens with "Your Morning Just Got Easier" and
a side-by-side: their current 45-minute routine vs 5 minutes with
pre-matched workers.

Key messaging:
- "This isn't another CRM to learn"
- "We know what your day looks like" (checklist they'll recognize)
- Shows real matched workers WITH names, not abstract metrics
- "It understands what you mean" — warehouse help finds forklift ops
- "It already filtered the junk" — only workers worth calling
- "It runs on YOUR machine" — no cloud, no fees, no data leaving

Technical proof pushed below a divider for the skeptical team.
The staffer sees their contracts and their workers first.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 12:40:07 -05:00
root
dd344c9b38 Proof page: CRM vs AI side-by-side — shows what keywords can't do
Rebuilt /proof to highlight the actual differentiator:
- Section 01: "What a CRM Does" — SQL keyword search, every CRM has this
- Section 02: "What AI + Vectors Do" — semantic understanding.
  Side-by-side: CRM finds 0 results for "warehouse work" because no
  profile contains that exact text. AI finds 5 verified workers because
  it understands Forklift Operator + Loader = warehouse work.
- Section 03: 673K vectorized chunks, 98% recall, 10M at 5ms
- Section 04: Local GPU, 4 models, no cloud, no API fees

The point: this isn't another CRM search. It's an intelligence layer
that understands MEANING — and it runs entirely on your hardware.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 12:27:46 -05:00
root
8d9c04a323 Proof page: styled HTML at /proof for team verification
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 12:23:04 -05:00
root
cd1fda3e21 Fix: CORS + relative URL + Langfuse tracing wired into gateway
Three fixes:
1. CORS headers on all gateway responses (browser dashboard was
   blocked by same-origin policy)
2. Dashboard JS uses window.location.origin instead of hardcoded
   localhost:3700 (LAN browsers couldn't reach it)
3. Langfuse tracing wired into every gateway request — api() wrapper
   creates spans for each lakehouse call, logGeneration for LLM calls.
   Week simulation now produces 34 observations per run visible in
   Langfuse UI.

7 traces confirmed in Langfuse after restart. Every /sql, /search,
/vram, /simulation call is tracked with timing + inputs + outputs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 00:53:18 -05:00
root
4a2bfce6e0 Week simulation + live dashboard + self-orientation + verification
Week simulation engine: 5 business days, 4-8 contracts per day,
3 rotating staffers with handoffs between days. Runs hybrid search
per contract via the gateway. 28 contracts, 108/108 filled (100%),
5 emergencies, 4 handoffs, 3.2s total.

Dashboard at :3700/ — dark theme, shows:
  - Contract cards sorted by priority with match status
  - Day navigation across the work week
  - Week summary stats (fill rate, emergencies, handoffs)
  - Live alerts (erratic/silent workers)
  - Playbook entries
  - Real-time service health + VRAM

Self-orientation (/context) + verification (/verify) endpoints so
any agent can understand the system and fact-check claims without
human intermediary.

Accessible on LAN at http://192.168.1.177:3700

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 00:45:46 -05:00
root
a001a21902 MCP self-orientation: /context + /verify + architecture resources
Any agent (Claude Code via MCP stdio, or sub-agents via HTTP :3700)
can now self-orient without human explanation:

GET /context returns:
  - System purpose and name
  - All datasets with row counts
  - All vector indexes with backends
  - Available models and their strengths
  - Complete tool list with rules
  - Current VRAM state

POST /verify fact-checks any claim about a worker against the golden
data. Agent says "worker 1313 is a Forklift Operator in IL with
reliability 0.82" → endpoint returns verified=true/false with exact
discrepancies.

MCP resources (stdio path for Claude Code):
  - lakehouse://system — live system status
  - lakehouse://architecture — full PRD
  - lakehouse://instructions — agent operating manual
  - lakehouse://playbooks — successful operations database
  - lakehouse://datasets — dataset listing

This is the "command and control" layer J asked for: any agent
connecting to this system gets the context it needs to operate
independently. No human intermediary required.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 00:41:46 -05:00
root
67ab6e4bac Langfuse observability — every LLM call traced and scored
Langfuse v2.95.11 running on :3001 (Docker + Postgres).
Login: j@lakehouse.local / lakehouse2026

tracing.ts: startTrace → logGeneration/logRetrieval/logSpan → scoreTrace → flush.
Every hybrid search, SQL generation, RAG pipeline, and co-pilot
briefing gets a full trace: model, prompt, output, latency, tokens.

The observer can now score traces based on verification results —
Langfuse aggregates accuracy over time so we can see which models
and approaches actually work in production, not just in tests.

Services: lakehouse(:3100) + sidecar(:3200) + agent(:3700) +
observer + langfuse(:3001) + minio(:9000) + mariadb(:3306)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 00:38:21 -05:00
root
b532ae61f1 Agent gateway + observer — autonomous internal operation
Three new systemd services:
- lakehouse-agent (:3700) — REST gateway wrapping all lakehouse tools.
  Clean JSON in/out, no protocol complexity. 9 endpoints: /search,
  /sql, /match, /worker/:id, /ask, /log, /playbooks, /profile/:id, /vram
- lakehouse-observer — watches operations, logs to lakehouse, asks
  local model to diagnose failure patterns, consolidates successful
  patterns into playbooks every 5 cycles
- Stdio MCP transport preserved for Claude Code integration

AGENT_INSTRUCTIONS.md: complete operating manual for sub-agents.
Rules: never hallucinate, SQL first for structured questions, hybrid
for matching, log every success, check playbooks before complex tasks.

Observer loop:
  observed() wrapper timestamps + persists every gateway call →
  error analyzer reads failures + asks LLM for diagnosis →
  playbook consolidator groups successes by endpoint pattern

All three designed for zero human intervention — agents operate,
observer watches, playbooks accumulate, iteration happens internally.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 00:00:08 -05:00
root
e1d48d3c8f MCP server (Bun) + 100K worker generator + lakehouse integration
MCP server at mcp-server/index.ts — 9 tools exposing the full
lakehouse to any MCP-compatible model:
  search_workers (hybrid SQL+vector), query_sql, match_contract,
  get_worker, rag_question, log_success, get_playbooks,
  swap_profile, vram_status

The "successful playbooks" pattern: log_success writes outcomes
back to the lakehouse as a queryable dataset. Small models call
get_playbooks to learn what approaches worked for similar tasks —
no retraining needed, just data.

generate_workers.py scales to 100K+ with realistic distributions:
  - 20 roles weighted by staffing industry frequency
  - 44 real Midwest/South cities across 12 states
  - Per-role skill pools (warehouse/production/machine/maintenance)
  - 13 certification types with realistic probability
  - 8 behavioral archetypes with score distributions
  - SMS communication templates (20 patterns)

100K worker dataset ingested: 70MB CSV → Parquet in 1.1s. Verified:
11K forklift ops, 27K in IL, archetype distribution matches weights.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 23:54:33 -05:00