lakehouse

Author	SHA1	Message	Date
root	95c26f04f8	Path 1 negative signal + Path 2 pattern discovery + name validation New: - /vectors/playbook_memory/patterns: meta-index pattern discovery. Given a query, finds top-K similar playbooks, pulls each endorsed worker's full workers_500k profile, aggregates shared traits (cert frequencies, skill frequencies, modal archetype, reliability distribution), returns a human-readable discovered_pattern. Surfaces signals operators didn't explicitly query — the original PRD's "identify things we didn't know" dimension. - /vectors/playbook_memory/mark_failed: records worker failures per (city, state, name). compute_boost_for applies 0.5^n penalty per recorded failure, so 3 failures quarter a worker's positive boost and 5 effectively zero it. Path 1 negative signal — recruiter trust depends on the system NOT recommending people who no-showed. - Bun /log_failure: validates failed_names against workers_500k (same ghost-guard as /log), forwards to /mark_failed. Improved: - /log now validates endorsed_names against workers_500k for the contract's city+state before seeding. Ghost names (names that don't correspond to real workers) are rejected in the response and excluded from the seed, preventing silent boost failures. - Bun /search auto-appends `CAST(availability AS DOUBLE) > 0.5` to sql_filter when the caller didn't constrain availability. Opt out with `include_unavailable: true`. Recruiter trust bug: surfacing already-placed workers breaks the first call. - DEFAULT_TOP_K_PLAYBOOKS 25 → 100. Direct cosine measurement showed similarities cluster 0.55-0.67 across all playbooks regardless of geo, so k=25 missed relevant geo-matched playbooks. Brute-force is still sub-ms at this size. Verified end-to-end on live data: - Ghost names rejected on /log + /log_failure - Availability filter drops unavailable workers from candidate pool - Pattern discovery on unseen Cleveland OH Welder query returned recurring skills (first aid 43%, grinder 43%, blueprint 43%) and modal archetype (specialist) across 20 semantically similar past playbooks in 0.24s - Negative signal: Helen Sanchez boost dropped +0.250 → +0.163 after 3 failures recorded via /log_failure (34% reduction)	2026-04-20 14:55:46 -05:00
root	20b0289aa9	/log validates endorsed names + /search auto-appends availability>0.5 Two gap-fills surfaced by the real test on 2026-04-20: 1. /log no longer seeds endorsed_names that don't exist in workers_500k for the contract's (city, state). Previously accepted ghost names silently (entry count grew, SQL row landed, but boost never fired because no real worker chunk matched the stored tuple). Response now reports rejected_ghost_names and explains why seeding was skipped. 2. Bun /search auto-appends `CAST(availability AS DOUBLE) > 0.5` to sql_filter when the caller didn't constrain availability themselves. Recruiters expect "available workers" by default — surfacing someone on an active placement would break trust on first contact. Opt out with `include_unavailable: true`. Verified: ghost names rejected end-to-end, real names accepted, mixed input handled correctly. Availability filter drops ~10 workers from a 305-row Cleveland OH Welder pool to 295 actually-available.	2026-04-20 14:44:12 -05:00
root	25b7e6c3a7	Phase 19 wiring + Path 1/2 work + chain integrity fixes Backend: - crates/vectord/src/playbook_memory.rs (new): Phase 19 in-memory boost store with seed/rebuild/snapshot, plus temporal decay (e^-age/30 per playbook), persist_to_sql endpoint backing successful_playbooks_live, and discover_patterns endpoint for meta-index pattern aggregation (recurring certs/skills/archetype/reliability across similar past fills). - DEFAULT_TOP_K_PLAYBOOKS bumped 5 → 25; old default silently missed most boosts when memory had > 25 entries. - service.rs: new routes /vectors/playbook_memory/{seed,rebuild,stats, persist_sql,patterns}. Bun staffing co-pilot (mcp-server/): - /search, /match, /verify, /proof, /simulation/run, MCP tools all forward use_playbook_memory:true and playbook_memory_k:25 to the hybrid endpoint. Boost was previously dark across the entire app. - /log no longer POSTs to /ingest/file — that endpoint REPLACES the dataset's object list, so single-row CSV writes were wiping all prior rows in successful_playbooks (sp_rows went 33→1 in one /log call). /log now seeds playbook_memory with canonical short text and calls /persist_sql to keep successful_playbooks_live in sync. - /simulation/run cumulative end-of-week CSV write removed for the same reason. Per-day per-contract /seed (added in this session) is the accumulating feedback path now. - search.html addWorkerInsight renders a green "Endorsed · N playbooks" chip with playbook citations when boost > 0. Internal Dioxus UI (crates/ui/): - Dashboard phase list rewritten through Phase 19 (was stuck at "Phase 16: File Watcher" / "Phase 17: DB Connector" — both wrong). - Removed fabricated "27ms" stat label. - Ask tab examples + SQL default replaced with real staffing prompts against candidates/clients/job_orders (was referencing nonexistent employees/products/events). - New Playbook tab exposes /vectors/playbook_memory/{stats,rebuild} and side-by-side hybrid search (boost OFF vs ON) with citations. Tests (tests/multi-agent/): - run_e2e_rated.ts: parallel two-agent (mistral + qwen2.5) build phase + verifier rating (geo, auth, persist, boost, speed → /10). - network_proving.ts: continuous build → verify → repeat with staffing-recruiter profile hot-swap; geo-discrimination check. - chain_of_custody.ts: single recruiter operation traced through every layer (Bun /search, direct /vectors/hybrid parity, /log, SQL, playbook_memory growth, profile activation, post-op boost lift).	2026-04-20 06:21:13 -05:00
root	8e3cac5812	Polish: professional layout, collapsible sections, tighter design - Replaced amateur CSS with professional dark theme (Inter font, muted palette, proper spacing, consistent border radius, hover states, transitions) - Nav bar with Dashboard/Intelligence Console/Architecture tabs - Urgent pipeline: shows contracts directly, removed busy step indicators - In Progress + Ready to Go: collapsed by default with expand toggle (page went from 30+ visible contract cards to just the urgents) - Workers Available: limited to 5 instead of 8 - Proper section headers with labels and metadata - Search section always visible with better placeholder text - Professional footer with product branding - Responsive breakpoints for mobile (768px, 480px) - Page is now ~50% shorter with same information density Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 20:29:45 -05:00
root	2da8562c90	Interactive permit heat map with live data verification - Leaflet.js map with dark tiles showing real Chicago building permits - Dots sized and colored by project cost ($1B+ red, $100M+ orange, $10M+ blue) - Hover any dot for project details — address, cost, description, date - LIVE indicator with green pulse dot - Timestamp showing when data was fetched - "Verify source" link goes directly to Chicago Open Data portal - "Refresh" button re-fetches from the API on click - Expanded to 50 permits for denser map coverage - Legend showing dot size scale No one can say "you just typed those numbers in" when they can click a dot on the map, see 10000 W OHARE ST, and verify it themselves on data.cityofchicago.org. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 20:24:43 -05:00
root	9acbe5c369	Market Intelligence: live Chicago building permits → staffing demand forecast /intelligence/market pulls real permit data from Chicago Open Data API: - $9.6B in active construction permits - O'Hare expansion ($730M), new casino ($580M), transit station ($445M) - Maps permit types to staffing roles (electrical→Electrician, masonry→Loader) - Cross-references with our IL worker bench to show coverage gaps - Electrician gap: only 1,036 reliable vs 63K estimated demand Datalake page now shows three intelligence layers: 1. Contract simulation with scenario-driven matching 2. Market Intelligence with live permit data + bench analysis 3. System Learning with fill history and detected patterns The staffing company sees demand forming before the phone rings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 20:12:01 -05:00
root	b16e485be1	Every page refresh feeds the learning loop — contracts logged as playbook entries Each simulation fill now logs: role, headcount, city, state, workers matched, client, start time, and scenario type. One page refresh = ~20 playbook entries. 4 refreshes = 28 entries with patterns already forming. Fixed activity counters: shows Contract Fills, Searches, and Patterns. Activity feed now shows the actual fill data with worker names and scenarios. This is the PRD's learning loop in action — the system records every successful match so future queries can learn from past decisions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 20:05:51 -05:00
root	bba5b826a3	Learning loop + smart search on datalake page Learning Loop: - /intelligence/learn endpoint logs search→selection as playbook entry - /intelligence/activity returns learning stats, patterns, and recent activity - Call/SMS buttons trigger logSelection() — records what query led to what pick - "System Learning" card on main page shows searches logged, patterns detected, and recent activity feed with timestamps - Every search-selection pair becomes institutional knowledge stored in the lakehouse Smart Search on Main Page: - doSearch() now routes through /intelligence/chat (smart NL parser) - Extracts role, city, state, availability, reliability from natural language - Shows understanding tags so staffer sees what the system parsed - Returns workers with ZIP codes, availability %, reliability %, archetype - "reliable forklift operator available in Nashville" → 10 Nashville forklift operators with ZIP codes, all 86-98% reliable, all available — 372ms Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 19:59:07 -05:00
root	df71ac7156	Smart NL search: extracts role, city, state, availability from natural language "find me a warehouse worker available today near Nashville" now: - Parses: role=warehouse, city=Nashville, available=true - Builds SQL: role LIKE '%warehouse%' AND city='Nashville' AND availability>0.5 - Returns: 12 Nashville warehouse workers with ZIP codes, availability %, reliability %, skills, certs, and archetype - Shows understanding tags so user sees what the system parsed - 414ms, 12 records — not a generic search, a targeted answer Recognizes 20 role keywords, 40+ cities, 10 states, availability/reliability signals from natural language. Falls through to vector search for anything the parser doesn't catch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 19:50:05 -05:00
root	37804d7195	Staffing Intelligence Console: workforce command center with conversational AI New page at /lakehouse/console — a $200/hr consultant's intelligence product: Morning Brief (auto-loads in ~120ms across 500K profiles): - Workforce Pulse: total, reliable %, elite %, archetype breakdown - Geographic Bench: state-by-state reliable % with weakest-state alert - Comeback Watch: 15K improving workers who crossed 80% reliability - Risk Watch: 5K erratic + 5K silent workers flagged automatically - Ready & Waiting: available + reliable workers to call first - Role Supply: 20 roles with supply/available/reliability Conversational Chat with 5 intelligent routes: - "Find someone like [Name] but in OH" → vector similarity search - "Who could handle industrial electrical work?" → semantic role discovery (finds workers for roles that DON'T EXIST in the database) - "What if we lose our top 5 forklift operators?" → scenario analysis with risk rating, bench depth, state-by-state breakdown - "Which workers should we stop placing?" → risk flagging - Default: hybrid SQL+vector search with LLM summary Every response shows: query steps, records scanned, response time. Transparency kills the "AI is making it up" argument. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 19:37:52 -05:00
root	37c68d9567	Kill all static/fake elements — every number on the page is now live from data Skeptic-proof audit: - Worker count queried from database (was hardcoded "500K") - State/role dropdowns populated from actual data (was hardcoded 8 states, 6 roles) - Now shows 11 states, 21 roles — whatever exists in the dataset - Client names generated combinatorially (20×20=400 combos, was 12 static) - Top workers randomized with SQL OFFSET (was same 5 every time) - Deleted fabricated "Recent Activity" section (fake placement history) - Replaced with transparent "Data Source" showing where numbers come from - Fixed NOTES undefined crash — hybrid search actually returns results now (was silently failing, showing 0/X filled on every contract) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 17:09:22 -05:00
root	e9b5498f43	Contextual insights: workers and bench strength driven by today's actual contracts - loadDay() now runs simulation first, extracts unfilled roles/states, then builds SQL queries filtered to what's actually needed today - "Workers Available for Today's Open Contracts" replaces generic top-5 list - Each worker shows which gap they fill: "Could fill 4 open Loader spots" - Bench Strength section scoped to states with active contracts + open slot counts - Every refresh produces different workers because contracts change each time Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 17:04:39 -05:00
root	be7436b6f0	Diverse scenario engine: 15 weighted staffing situations replace crisis-every-refresh Simulation now uses weighted random selection across 4 priority tiers: - Urgent (walkoff, quarantine, no-show), High (new client, cert expiry, expansion), Medium (recurring, seasonal, medical leave, cross-train), Low (future, exploratory) - Color-coded scenario banners on ALL contracts, not just urgent - Each scenario carries context (note) + recommended action Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 16:41:00 -05:00
root	e87155306b	Urgent explains WHY and WHAT TO DO — not just a red dot Urgent contracts now show: - Red banner with specific reason: 'Client called last night', 'Emergency coverage — 2 no-shows reported', 'Production surge', 'Original crew cancelled', etc. - Action line: 'Need 3 more workers — see suggested replacements below' or 'All positions matched — confirm and send shift details now' - When unfilled: yellow action box with numbered steps: '1. Call the workers above, 2. If someone declines the backup is ready, 3. Expand search to nearby states' - FIRST CHOICE worker highlighted with red border - BACKUP workers labeled and shown after the required headcount The staffer doesn't see a red circle and wonder. They see: 'Emergency coverage — 2 no-shows. Need 3 more. Here are your options. Call this person first. If they can't, here's the backup.' Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 16:32:50 -05:00
root	2155959013	Worker profile modal: click any worker to see full details Click any worker avatar/card → scrollable modal with: - Rich profiles: reliability/availability bars with explanations, skill tags, cert badges, archetype with description, work history, Call/SMS action buttons - Sparse profiles: trust path showing 'You are here' → progression to full profile through normal operations - Modal scrolls independently, background locked - Close via X button or click outside Each archetype has a plain-English description: reliable: 'Consistently shows up, clients request them back' leader: 'Takes initiative, helps train others' erratic: 'Inconsistent attendance, needs monitoring' etc. Work history shows recent placements and cert renewals. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 16:25:42 -05:00
root	45a95a9feb	Urgent pipeline: step-by-step workflow walks staffer through emergency fills Urgent contracts now show a 4-step action plan: Step 1 (red): Review pre-matched workers Step 2 (yellow): Call first choice — highest match score Step 3 (blue): Confirm or replace — backup is ready Step 4 (green): Send shift details to confirmed workers First-choice worker highlighted with red border + label. Backup workers shown with dimmed styling + 'BACKUP' label. Urgent cards show ALL matched workers + backups (not just 3). Non-urgent contracts split into 'In Progress' (still filling) and 'Ready to Go' (fully staffed) sections. The staffer doesn't stare at a red label wondering what to do. They follow the steps: review, call, confirm, send. Done. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 16:08:18 -05:00
root	c0ff7434cb	Technical deep-dive: architecture explained for non-technical audience Added 'How This Actually Works' section below the proof page: 1. CRM vs Lakehouse side-by-side — what's different in plain English 2. Your Data Never Leaves — local AI, local storage, your hardware 3. How It Handles Scale — HNSW (RAM, 1ms) + Lance (disk, 5ms at 10M) 4. Hot-Swap Profiles — 4 AI models explained by what they DO 5. Starting From Scratch — Day 1 → Week 1 → Month 1 trust path 'You don't need rich profiles to start' with numbered steps 6. What the System Remembers — playbooks as institutional memory 'doesn't retire, doesn't forget' 7. Measured Not Promised — table of real numbers with plain English Addresses the legacy company pushback: explains WHY the architecture matters, HOW sparse data becomes rich data over time, and that everything runs on hardware they own with zero cloud dependency. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 15:56:16 -05:00
root	bb46869227	Intelligence-first UI: insights, not data dumps Complete rebuild around 'how did it know that?' moments: 1. NEEDS YOUR ATTENTION — urgent contracts with pre-matched workers. Each worker shows WHY they were matched: 'Reliable (85%) · Certified: OSHA-10 · Same city as job site' 2. READY TO CONFIRM — fully matched contracts, just review and send 3. YOUR STRONGEST WORKERS — 95%+ reliability, 'they rarely no-show and clients request them back' 4. BENCH STRENGTH ALERT — states with thin reliable worker pools, 'consider recruiting in these areas' Every section has: a label (ACTION NEEDED/READY/INSIGHT/HEADS UP), a headline in plain English, an explanation of HOW the system knows this, and actionable workers with Call/SMS buttons. This is what a CRM has never done: anticipate, explain, recommend. The staffer doesn't search — they respond to intelligence. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 15:46:25 -05:00
root	2279d9f51d	Fix: simulation now passes chunk_text — worker cards show full profiles The simulation was only storing name/doc_id/score but dropping chunk_text. Worker cards showed 'New — data builds with placements' for every worker. Now includes the full profile text so cards render skills (blue), certs (green), archetype (purple), and reliability/ availability meters. Verified via Playwright: cards now show DeShawn Cook with 6S\|Excel\|SAP skills, First Aid/CPR cert, flexible archetype, 72% reliability. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 15:41:30 -05:00
root	875cfadc3d	Graceful sparse data: show what exists, hide what doesn't Worker cards now handle sparse-to-rich data gracefully: - Name only? Shows name + 'New — data builds with placements' - Name + role? Shows name + role tag - Name + role + skills + certs? Shows full tag row - Has reliability data? Shows colored meter bars - No metrics? No empty bars, no 0% — just what's there Contract cards: urgency dot, progress bar, fill count. Workers inside: avatar initials, name, role, location, skill/cert tags (blue/green), archetype (purple), reliability/availability bars — all ONLY when data exists. GitHub-style dark theme. Call/SMS per worker. Search collapsed. ADR-021 compliant: works with a name and earns everything else. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 15:36:53 -05:00
root	13b01fee9f	ADR-021: Sparse data trust path — start with nothing, earn everything The staffing company said: 'we don't have any of that data.' They're right. We showed a demo with 18-field profiles and they have a name and a phone number. This ADR documents the trust path: Phase 1 (Day 1): Work with name + phone + role. That's enough. Phase 2 (Week 1-4): Timesheets → reliability. Calls → history. Phase 3 (Month 2+): AI starts helping with real earned data. Key principles: - Never show empty fields or 0% bars - Show what's THERE, not what's missing - Trust indicators: 'based on 3 placements' not just 'Reliability: 87%' - The system earns data by being useful, not by demanding it upfront Also created sparse_workers dataset (200 workers, 74% have role, 34% have notes, 5 have ONLY name+phone) for realistic testing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 15:32:06 -05:00
root	845acfdcda	Rich worker cards: skills, certs, reliability bars — not just names Each worker in a contract card now shows: - Initials avatar (color-coded) - Name + location on same line - Skill tags (blue pills, top 3 relevant) - Cert badges (green pills — OSHA, Forklift, Hazmat) - Archetype tag (purple — reliable, leader, etc) - Reliability bar with color (green >80%, yellow >50%, red <50%) - Availability bar with color - Individual Call/SMS buttons per worker Contract headers show: - Urgency dot (red/yellow/blue/green) - Client name, role × headcount, location, start time - Progress bar with fill count GitHub-style dark theme. Every piece of info visible at a glance without clicking anything. The staffer sees skills, certs, and reliability for every matched worker the moment the page loads. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 15:27:27 -05:00
root	05785b4628	Dashboard: the staffer's actual workday, not a search box Not a CRM search page. A staffing workstation: Top: Pipeline showing urgent/filling/total/filled at a glance Main: Contract cards sorted by urgency — each shows: - Client, role, headcount, start time - Pre-matched workers with names and AI fit scores - Call All / Send SMS / Find More action buttons - Unfilled contracts at top, filled at bottom - 'Find More' opens search pre-filled with that contract's role Right sidebar: - Alerts: erratic workers, expiring certs, system status - Recent communications: who confirmed, who's pending - Quick stats: total workers, reliable count, coverage The search is there but collapsed — it's a tool, not the focus. When they open the page, their day is already organized. This is what the CRM doesn't do: anticipate, pre-match, organize. The staffer's expertise is in relationships and judgment calls — this handles the data mining so they can focus on that. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 15:22:18 -05:00
root	7cb9999451	Rebuild search UI: zero dependencies, plain JS, DOM-only, works Replaced complex dashboard with minimal search.html: - No external JS/CSS files, no transpilation, no module imports - Plain JS with .then() chains (no async/await compat issues) - DOM-only rendering via createElement (no innerHTML with data) - 20s AbortController timeout so fetch never hangs - Detects /lakehouse/ proxy prefix automatically - 7KB total, loads in 18ms Calls lakehouse /vectors/hybrid directly — SQL filters always apply, works even when HNSW isn't loaded (brute-force fallback). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 13:26:27 -05:00
root	e7e988dcc0	Fix dashboard: always use hybrid (no HNSW dependency), 15s timeout, error display The search hung because pure AI mode calls HNSW which is RAM-only — gone after every lakehouse restart. Now ALL AI/hybrid searches go through the /search endpoint which uses brute-force when HNSW isn't loaded. Added 15s AbortController timeout so fetch never hangs. Added window.onerror handler to show JS errors on page. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 13:23:29 -05:00
root	5c93338f40	Fix: gateway defaulted to wrong vector index (10K instead of 50K) All gateway endpoints pointed to ethereal_workers_v1 (10K, W- prefix) instead of workers_500k_v1 (50K, W500K- prefix). Filters appeared broken because the vector results came from the wrong dataset — IDs matched numerically but belonged to different workers. Now: every search, match, and hybrid call uses workers_500k_v1. Verified: 'experienced welder' + state=OH + role=Welder returns 5 Welders in OH (Carmen Perry, Janet White, Rachel Miller, etc). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 13:16:11 -05:00
root	f9e2a0bbbe	Fix: filters now ALWAYS work — auto-switches to hybrid when set The bug: selecting a state filter in AI Search mode did nothing because HNSW vector search has no concept of SQL WHERE clauses. Results came back from any state. The fix: when ANY filter is set (state, role, or reliability > 0.5), the search automatically switches to hybrid mode which runs the SQL filter first, then AI-ranks within the filtered set. Users don't need to know about modes — filters just work. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 13:10:28 -05:00
root	6a2cc0fb8f	Search UI: type what you need, see real workers — no more taking my word for it Rebuilt the dashboard into a live search interface anyone can use: - Big search box: type in plain English, hit Enter or click Search - 3 modes: AI Search, CRM Keyword, Hybrid (best) - Clickable examples: 'warehouse help', 'dependable machine operator', etc - Filters: state, role, min reliability - Results show: name, role, location, skills, certs, reliability, AI match score - Hybrid results marked 'SQL verified against database' - CRM mode shows 0 results with a prompt to try AI Search - Mobile responsive This is the answer to 'we just have to take your word for it.' Type anything. See real workers. Compare CRM vs AI side by side. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 13:06:31 -05:00
root	48c7c1c5e6	Fix dashboard: detect /lakehouse/ nginx prefix for API calls dashboard.ts now checks if running behind the nginx proxy (path starts with /lakehouse) and prepends the prefix to all API calls. Without this, the browser called /sql instead of /lakehouse/sql and got 404s from the LLM Team Flask app. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 13:04:24 -05:00
root	7367e5f71d	Proof page: LIVE side-by-side CRM vs AI — shows, doesn't tell 3 live demo searches run on page load against 500K real profiles: 'warehouse help' — CRM: 0, AI: finds Forklift Ops + Loaders 'someone good with machines who is dependable' — CRM: 0, AI: finds Machine Ops 'safety trained worker for chemical plant' — CRM: 0, AI: finds OSHA+Hazmat workers Each shows the actual CRM keyword count (LIKE match) next to the AI vector results with real worker names, roles, and cities. Not described — demonstrated. The numbers come from queries that run when the page loads. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 12:55:11 -05:00
root	66a3460c92	Dashboard rebuilt: matches proof page design, mobile-ready Clean dark theme matching /proof page. Priority badges on contracts (urgent=red, high=yellow, medium=blue, low=green). Worker matches shown inline. Day tabs show fill counts. Alerts with icons. Playbook entries styled. All styles inline — no separate CSS file. Mobile responsive: single column layout, scrollable tabs. Links to /proof at bottom. https://devop.live/lakehouse/ — the dashboard https://devop.live/lakehouse/proof — the proof page Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 12:51:08 -05:00
root	5aaa3c5c08	Mobile responsive: proof page works on phones Added @media(max-width:768px) breakpoints: - 2-col grids → single column on mobile - 3-col grids → single column - 4-col model cards → 2-col - Stats grid → 2-col - Tables: horizontal scroll, smaller text - Reduced padding and font sizes - Hero title scales down Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 12:44:57 -05:00
root	bd8c30c7bd	Public URL: devop.live/lakehouse/proof — SSL, no IP needed Added nginx proxy: /lakehouse/* → localhost:3700 (agent gateway). Separate include file so the main llms3 config stays clean. https://devop.live/lakehouse/proof — styled proof page https://devop.live/lakehouse/proof.json — raw verification data https://devop.live/lakehouse/ — dashboard Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 12:41:53 -05:00
root	c53d3f4d14	Proof page: speaks to the staffer, not the engineer Rebuilt the page to address a staffing coordinator who's tired of learning new tools. Opens with "Your Morning Just Got Easier" and a side-by-side: their current 45-minute routine vs 5 minutes with pre-matched workers. Key messaging: - "This isn't another CRM to learn" - "We know what your day looks like" (checklist they'll recognize) - Shows real matched workers WITH names, not abstract metrics - "It understands what you mean" — warehouse help finds forklift ops - "It already filtered the junk" — only workers worth calling - "It runs on YOUR machine" — no cloud, no fees, no data leaving Technical proof pushed below a divider for the skeptical team. The staffer sees their contracts and their workers first. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 12:40:07 -05:00
root	dd344c9b38	Proof page: CRM vs AI side-by-side — shows what keywords can't do Rebuilt /proof to highlight the actual differentiator: - Section 01: "What a CRM Does" — SQL keyword search, every CRM has this - Section 02: "What AI + Vectors Do" — semantic understanding. Side-by-side: CRM finds 0 results for "warehouse work" because no profile contains that exact text. AI finds 5 verified workers because it understands Forklift Operator + Loader = warehouse work. - Section 03: 673K vectorized chunks, 98% recall, 10M at 5ms - Section 04: Local GPU, 4 models, no cloud, no API fees The point: this isn't another CRM search. It's an intelligence layer that understands MEANING — and it runs entirely on your hardware. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 12:27:46 -05:00
root	8d9c04a323	Proof page: styled HTML at /proof for team verification Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 12:23:04 -05:00
root	937569d188	ADR-020: Universal ID mapping — fix the flat embedding identity problem THE REAL PROBLEM: Every new data source produces different doc_id prefixes in vector indexes (W-, W500K-, W5K-, CAND-). Hybrid search had to hardcode strip_prefix for each one. New datasets broke hybrid until someone added another prefix. This violates "any data source without pre-defined schemas." THE FIX: IndexMeta.id_prefix — the catalog records what prefix each index uses. Hybrid search reads it and strips automatically. Legacy indexes fall back to heuristic stripping. New indexes can set id_prefix=None to use raw IDs (no prefix, no stripping needed). This means: ingest a new dataset, embed it, hybrid search works immediately without code changes. The system is truly source-agnostic. Also: full ADR document at docs/ADR-020-universal-id-mapping.md with the three options considered and rationale for the chosen approach. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 11:58:18 -05:00
root	1565f536eb	Fix: job tracker field name mismatch — the overnight killer ROOT CAUSE: Python scripts polled status.get("processed", 0) but the Rust Job struct serialized as "embedded_chunks". Scripts always saw 0, looped forever printing "unknown: 0/50000" for 8+ hours. Fix (both sides): - Rust: added "processed" alias field + "total" field to Job struct, kept in sync on every update_progress() and complete() call - Python: fixed autonomous_agent.py and overnight_proof.sh to read "embedded_chunks" as primary key The actual embedding pipeline was working the whole time — 673K real chunks embedded overnight. Only the monitoring was blind. One-word bug, 8 hours of zombie output. This is why you test the monitoring, not just the pipeline. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 10:41:32 -05:00
root	0bd48771ff	OVERNIGHT PROOF: real embeddings confirm architecture 5,000 workers embedded through nomic-embed-text (real, not random). Results on REAL embeddings: HNSW recall@10: 1.0000 p50: 762us — PERFECT Lance recall@10: 0.9500 p50: 6.8ms — better than random vectors SQL autonomous: 50/50 (100%) Key finding: real embeddings IMPROVE Lance recall (0.95 vs 0.80 on random vectors) because real text embeddings cluster by topic, making IVF partitions more effective. The concern about degraded recall on real data was wrong — it's the opposite. Also discovered: the 50K embedding job DID complete (50K chunks in 234s) but the job progress tracker showed 0/0. The supervisor's progress reporting has a bug — the actual embedding pipeline works. Known remaining issue: hybrid search ID matching between workers_500k (worker_id format) and vector index (W5K-{id} format) needs the prefix stripping fix applied to the new index. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 01:32:12 -05:00
root	2e455919b7	Overnight proof — 5-step unattended test with real embeddings Runs autonomously via cron (every 3 min, state machine): 1. Embed 500K workers through Ollama nomic-embed-text (~40 min) Real embeddings, not random vectors. This is what matters. 2. Build HNSW + Lance IVF_PQ on real clustered data 3. Measure recall — HNSW vs Lance on real embeddings 4. 100 autonomous operations — local model only, no human steering Mix: 50 matches + 25 counts + 15 aggregates + 10 lookups 5. 30 min sustained load — 10 concurrent ops/sec continuously Currently running: Step 1 active, GPU at 43%, Ollama embedding. Monitor: tail -f /home/profit/lakehouse/logs/overnight_proof.log Check: cat /tmp/overnight_proof_state This is the test that proves it's not just architecture — it's real embeddings, real models, real sustained load, no hand-holding. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 01:22:07 -05:00
root	8b512d30e5	10M VECTOR SCALE TEST — PASSED THE PROOF: 10,000,000 × 768d vectors 30 GB Lance dataset on disk IVF_PQ index: 173 seconds to build (3162 partitions, 192 sub_vectors) Search p50: 5ms — at TEN MILLION vectors Search p95: 19ms HNSW at 10M would need 29 GB RAM = past the ceiling Lance at 10M = 30 GB disk, 5ms search, no RAM constraint Agent test on 500K workers: 22/22 positions filled (100%) Forklift Operator x5, Machine Operator x4, Welder x3, Loader x8, Quality Tech x2 — all via hybrid SQL+vector The architecture holds past the HNSW ceiling. Lance takes over exactly as ADR-019 designed. This is not theoretical anymore. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 01:16:59 -05:00
root	25e5685f44	10M vector scale test — cron heartbeat, runs while J sleeps 7-step autonomous test via cron (every 2 minutes): 1. Register 10M × 768d Parquet (28.8 GB, already generated) 2. Migrate Parquet → Lance (proves Lance handles what HNSW can't) 3. Build IVF_PQ (3162 partitions for √10M, 192 sub_vectors) 4. Search benchmark (10 searches, measure p50/p95) 5. Hot-swap profile test (create scale-10m profile, activate) 6. Agent test (5 contract matches on 500K via gateway, autonomous) 7. Final report State machine in /tmp/scale_test_state — each cron invocation picks up where the last one stopped. Lock file prevents concurrent runs. All output to /home/profit/lakehouse/logs/scale_test.log. Monitor: tail -f /home/profit/lakehouse/logs/scale_test.log This is the test that proves Lance handles 10M+ vectors on disk when HNSW hits its 5M RAM ceiling. No human intervention needed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 01:06:38 -05:00
root	40305da654	500K scale test: 2.9M rows, sub-120ms SQL, architecture holds Bumped upload limit to 512MB for large CSV ingests. Generated and ingested 500K staffing worker profiles (346MB CSV → 75MB Parquet in 5.9s). SQL at 500K: COUNT=35ms, filter+state=67ms, aggregation=80ms, complex filter=117ms, 10 concurrent=84ms total (10/10 pass). HNSW memory projection: 500K vectors = 1.5GB RAM (comfortable on 128GB server). Ceiling at ~5M vectors (14.6GB) — Lance IVF_PQ takes over beyond that as designed in ADR-019. Hybrid search 500K SQL → 10K vector: 131ms with 6,289 SQL matches narrowed to 5 vector-ranked results. Total scale: 2.9M rows across all datasets (500K workers + 2.47M staffing data). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 01:00:21 -05:00
root	cd1fda3e21	Fix: CORS + relative URL + Langfuse tracing wired into gateway Three fixes: 1. CORS headers on all gateway responses (browser dashboard was blocked by same-origin policy) 2. Dashboard JS uses window.location.origin instead of hardcoded localhost:3700 (LAN browsers couldn't reach it) 3. Langfuse tracing wired into every gateway request — api() wrapper creates spans for each lakehouse call, logGeneration for LLM calls. Week simulation now produces 34 observations per run visible in Langfuse UI. 7 traces confirmed in Langfuse after restart. Every /sql, /search, /vram, /simulation call is tracked with timing + inputs + outputs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 00:53:18 -05:00
root	4a2bfce6e0	Week simulation + live dashboard + self-orientation + verification Week simulation engine: 5 business days, 4-8 contracts per day, 3 rotating staffers with handoffs between days. Runs hybrid search per contract via the gateway. 28 contracts, 108/108 filled (100%), 5 emergencies, 4 handoffs, 3.2s total. Dashboard at :3700/ — dark theme, shows: - Contract cards sorted by priority with match status - Day navigation across the work week - Week summary stats (fill rate, emergencies, handoffs) - Live alerts (erratic/silent workers) - Playbook entries - Real-time service health + VRAM Self-orientation (/context) + verification (/verify) endpoints so any agent can understand the system and fact-check claims without human intermediary. Accessible on LAN at http://192.168.1.177:3700 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 00:45:46 -05:00
root	a001a21902	MCP self-orientation: /context + /verify + architecture resources Any agent (Claude Code via MCP stdio, or sub-agents via HTTP :3700) can now self-orient without human explanation: GET /context returns: - System purpose and name - All datasets with row counts - All vector indexes with backends - Available models and their strengths - Complete tool list with rules - Current VRAM state POST /verify fact-checks any claim about a worker against the golden data. Agent says "worker 1313 is a Forklift Operator in IL with reliability 0.82" → endpoint returns verified=true/false with exact discrepancies. MCP resources (stdio path for Claude Code): - lakehouse://system — live system status - lakehouse://architecture — full PRD - lakehouse://instructions — agent operating manual - lakehouse://playbooks — successful operations database - lakehouse://datasets — dataset listing This is the "command and control" layer J asked for: any agent connecting to this system gets the context it needs to operate independently. No human intermediary required. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 00:41:46 -05:00
root	67ab6e4bac	Langfuse observability — every LLM call traced and scored Langfuse v2.95.11 running on :3001 (Docker + Postgres). Login: j@lakehouse.local / lakehouse2026 tracing.ts: startTrace → logGeneration/logRetrieval/logSpan → scoreTrace → flush. Every hybrid search, SQL generation, RAG pipeline, and co-pilot briefing gets a full trace: model, prompt, output, latency, tokens. The observer can now score traces based on verification results — Langfuse aggregates accuracy over time so we can see which models and approaches actually work in production, not just in tests. Services: lakehouse(:3100) + sidecar(:3200) + agent(:3700) + observer + langfuse(:3001) + minio(:9000) + mariadb(:3306) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 00:38:21 -05:00
root	fc6b01c2bf	Staffing Co-Pilot — the anticipation layer that changes everything 5-layer morning briefing system: 1. Contract scan: sorts by urgency, shows requirements 2. Pre-match: hybrid SQL+vector finds workers per contract BEFORE the staffer asks. 25/25 positions pre-matched (100%) 3. Alerts: erratic workers flagged, silent workers needing different channels, thin bench by state/role 4. Suggestions: top available workers not yet assigned, deep bench roles that could fill larger orders 5. Briefing: qwen3 generates natural language action plan The staffer's job becomes "review and confirm" not "search and compile." Action queue: 6 contracts ready for one-click outreach. Outputs structured JSON at /tmp/copilot_briefing.json — any UI (Dioxus, React, even a Telegram bot) can render this. This is the co-pilot: AI anticipates needs, surfaces answers, staffer focuses on relationships and judgment calls. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 00:19:07 -05:00
root	c7e6ab3beb	Staffing day simulation: 94% pass, all gates clear, ready for batching Multi-model validated simulation: 4 phases with validation gates. Morning (contract matching): 26/26 filled including 2 emergencies. Midday (intelligence): classified routing fixes the count/SQL gap — keyword classifier routes instantly, qwen2.5 generates SQL with few-shot examples showing exact column semantics. Afternoon (analytics): 5/5 SQL analytical queries. Key fix: few-shot SQL prompting. Adding 4 examples with correct column names (role, state, archetype) takes qwen2.5 from 40% to 80% accuracy on structured questions. The playbook logged this for future runs. Models: qwen3 (40K ctx, reasoning), qwen2.5 (fast SQL), nomic (embed). Query classifier is keyword-based — deterministic, instant, no LLM overhead for routing decisions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 00:14:34 -05:00
root	1bee0e4969	Qwen 3 integration + agent plan + playbook loop Pulled qwen3 (8.2B, 40K context, thinking, tool-calling). Created agent-qwen3 profile. Ran structured plan: 5 contracts (16/16 filled via hybrid), 5 intelligence questions (2/5 — same RAG counting gap). Key playbook entry generated: "count/aggregation questions must use /sql not /search. RAG returns 5 chunks from 10K — cannot count the full dataset." This routing rule is now in the playbooks database for future agent runs to learn from. Pattern confirmed across qwen2.5, mistral, AND qwen3: the structured matching path (hybrid SQL+vector) is production-ready across all models. The RAG counting gap is a routing problem, not a model problem — the fix is query classification, not a better model. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 00:08:48 -05:00

1 2 3 4

171 Commits