1 Commits

Author SHA1 Message Date
root
e1d48d3c8f MCP server (Bun) + 100K worker generator + lakehouse integration
MCP server at mcp-server/index.ts — 9 tools exposing the full
lakehouse to any MCP-compatible model:
  search_workers (hybrid SQL+vector), query_sql, match_contract,
  get_worker, rag_question, log_success, get_playbooks,
  swap_profile, vram_status

The "successful playbooks" pattern: log_success writes outcomes
back to the lakehouse as a queryable dataset. Small models call
get_playbooks to learn what approaches worked for similar tasks —
no retraining needed, just data.
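
Sketch of the playbook loop (illustrative only, not the code in
mcp-server/index.ts): the tool names log_success and get_playbooks come
from this commit, but the PlaybookStore class, the argument names, and
the word-overlap ranking are assumptions made for the example. The real
tools read and write the lakehouse rather than an in-memory list.

```python
from dataclasses import dataclass, field


@dataclass
class PlaybookStore:
    """In-memory stand-in for the lakehouse playbook dataset (hypothetical)."""

    rows: list[dict] = field(default_factory=list)

    def log_success(self, task: str, approach: str) -> None:
        # Append one successful outcome as a plain, queryable row.
        self.rows.append({"task": task, "approach": approach})

    def get_playbooks(self, task: str, limit: int = 3) -> list[dict]:
        # Rank stored rows by word overlap with the new task (toy similarity).
        words = set(task.lower().split())
        scored = [
            (len(words & set(row["task"].lower().split())), row)
            for row in self.rows
        ]
        # Drop rows that share no words with the task, best matches first.
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [row for _, row in scored[:limit]]


store = PlaybookStore()
store.log_success(
    "find forklift operators in IL",
    "SQL filter on role and state, then vector rerank",
)
store.log_success(
    "answer pay rate question",
    "rag_question over contract documents",
)
hits = store.get_playbooks("find forklift operators in OH")
```

The point of the pattern is that a later call sees earlier successes as
ordinary rows, so the model's behavior can change without any weight
update.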

generate_workers.py scales to 100K+ with realistic distributions:
  - 20 roles weighted by staffing industry frequency
  - 44 real Midwest/South cities across 12 states
  - Per-role skill pools (warehouse/production/machine/maintenance)
  - 13 certification types with realistic per-role probabilities
  - 8 behavioral archetypes with score distributions
  - SMS communication templates (20 patterns)
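
Sketch of weighted generation (assumptions throughout: the role names,
weights, and skill pools below are made up for illustration, and the
sketch_workers function is hypothetical; generate_workers.py itself may
be structured differently). It shows one way to draw roles by weight and
then pick skills from a per-role pool.

```python
import random

# Hypothetical subset of roles with invented weights (the script uses 20 roles).
ROLE_WEIGHTS = {
    "forklift operator": 11,
    "picker/packer": 20,
    "machine operator": 9,
    "maintenance tech": 5,
}

# Hypothetical per-role skill pools.
SKILL_POOLS = {
    "forklift operator": ["pallet jack", "reach truck", "rf scanner"],
    "picker/packer": ["rf scanner", "labeling", "shrink wrap"],
    "machine operator": ["cnc", "press brake", "quality checks"],
    "maintenance tech": ["plc", "hydraulics", "welding"],
}


def sketch_workers(n: int, seed: int = 0) -> list[dict]:
    # A seeded generator keeps the output reproducible across runs.
    rng = random.Random(seed)
    roles = list(ROLE_WEIGHTS)
    weights = list(ROLE_WEIGHTS.values())
    workers = []
    for worker_id in range(n):
        # Draw the role in proportion to its weight.
        role = rng.choices(roles, weights=weights, k=1)[0]
        # Pick two distinct skills from that role's pool.
        skills = rng.sample(SKILL_POOLS[role], k=2)
        workers.append({"id": worker_id, "role": role, "skills": skills})
    return workers


workers = sketch_workers(1000, seed=7)
```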

100K worker dataset ingested: 70MB CSV → Parquet in 1.1s. Verified:
11K forklift operators, 27K workers in IL, and the archetype
distribution matches the configured weights.
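
Sketch of a distribution check (not the verification actually run for
this commit): the archetype names, weights, and the max_share_error
helper are hypothetical. It compares observed shares against the
configured weights and reports the largest gap.

```python
import random
from collections import Counter


def max_share_error(labels: list[str], weights: dict[str, float]) -> float:
    """Largest absolute gap between observed and expected share."""
    total_weight = sum(weights.values())
    counts = Counter(labels)
    n = len(labels)
    return max(
        abs(counts[name] / n - weight / total_weight)
        for name, weight in weights.items()
    )


# Hypothetical archetypes and weights (the generator uses 8 archetypes).
ARCHETYPE_WEIGHTS = {"reliable": 40, "average": 35, "flaky": 15, "rising": 10}

rng = random.Random(42)
sample = rng.choices(
    list(ARCHETYPE_WEIGHTS),
    weights=list(ARCHETYPE_WEIGHTS.values()),
    k=100_000,
)
error = max_share_error(sample, ARCHETYPE_WEIGHTS)
```

At 100K rows the sampling noise on each share is small, so a tolerance
of about one percentage point is a reasonable threshold for this kind of
check.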

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 23:54:33 -05:00