Phase 40 PRD (docs/CONTROL_PLANE_PRD.md:91) claimed:
"Gitea MCP reconnect — the MCP server binary still installed at
/home/profit/.bun/install/cache/gitea-mcp@0.0.10/ gets wired into
mcp-server/index.ts tool registry."
The PHASES.md checkbox marked this done, but an audit found:
- gitea-mcp binary exists in bun cache (verified)
- Zero references to gitea/list_prs/open_pr in mcp-server/index.ts
- No entry for "gitea" in .mcp.json
The PRD's architectural description ("wired into mcp-server/index.ts
tool registry") is conceptually wrong — gitea-mcp is a peer MCP server
that the MCP host (Claude Code) connects to directly, not a library
to import. Correct wiring: register it in .mcp.json so Claude Code
spawns both the lakehouse's MCP server and gitea-mcp as separate
child processes, each exposing its own tools.
This commit adds the "gitea" entry to .mcp.json pointing at bunx
gitea-mcp with GITEA_HOST set to git.agentview.dev.
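Sketch of the added entry (the mcpServers/command/args/env key names
follow the common .mcp.json shape; the exact spelling in this repo's
file may differ, and the token value is the placeholder the operator
replaces):

```json
{
  "mcpServers": {
    "gitea": {
      "command": "bunx",
      "args": ["gitea-mcp"],
      "env": {
        "GITEA_HOST": "git.agentview.dev",
        "GITEA_ACCESS_TOKEN": "SET_ME_..."
      }
    }
  }
}
```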
OPERATOR STEP (one-time): before restarting Claude Code, generate a
personal access token at https://git.agentview.dev/user/settings/
applications and replace the SET_ME_... placeholder in
GITEA_ACCESS_TOKEN. The token needs at least the `read:repository`,
`write:issue`, and `read:user` scopes for list_prs/open_pr/
comment_on_issue.
Still open from Phase 40 (not in this commit, bigger scope):
- crates/aibridge/src/providers/gemini.rs (claimed, missing)
- crates/aibridge/src/providers/claude.rs (claimed, missing)
These are ~100-200 lines each (full HTTP adapter, auth, and request-
shape mapping). Flagged as follow-up commits.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Any agent (Claude Code via MCP stdio, or sub-agents via HTTP :3700)
can now self-orient without human explanation:
GET /context returns:
- System purpose and name
- All datasets with row counts
- All vector indexes with backends
- Available models and their strengths
- Complete tool list with rules
- Current VRAM state
POST /verify fact-checks any claim about a worker against the golden
data. Agent says "worker 1313 is a Forklift Operator in IL with
reliability 0.82" → endpoint returns verified=true/false with exact
discrepancies.
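A minimal sketch of the comparison /verify performs (field names,
request shape, and the golden values below are assumptions for
illustration, not the endpoint's actual contract):

```typescript
// Compare a claimed worker record against the golden row and report
// exact field-level discrepancies, mirroring what POST /verify returns.
type WorkerClaim = Record<string, string | number>;

interface VerifyResult {
  verified: boolean;
  discrepancies: { field: string; claimed: unknown; actual: unknown }[];
}

function verifyClaim(claim: WorkerClaim, golden: WorkerClaim): VerifyResult {
  const discrepancies = Object.entries(claim)
    .filter(([field, claimed]) => golden[field] !== claimed)
    .map(([field, claimed]) => ({ field, claimed, actual: golden[field] }));
  return { verified: discrepancies.length === 0, discrepancies };
}

// The claim from the example above, checked against an invented golden row.
const golden = { role: "Forklift Operator", state: "IL", reliability: 0.78 };
const result = verifyClaim(
  { role: "Forklift Operator", state: "IL", reliability: 0.82 },
  golden,
);
console.log(result.verified);               // false: reliability mismatch
console.log(result.discrepancies[0].field); // "reliability"
```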
MCP resources (stdio path for Claude Code):
- lakehouse://system — live system status
- lakehouse://architecture — full PRD
- lakehouse://instructions — agent operating manual
- lakehouse://playbooks — successful operations database
- lakehouse://datasets — dataset listing
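One way to picture the stdio side: a registry keyed by lakehouse://
URI, resolved when the MCP host reads a resource. A sketch only — the
resource bodies are stand-ins and the real mcp-server/index.ts wires
these through the MCP SDK, not a bare Map:

```typescript
// Map each lakehouse:// URI to a provider returning the resource body.
const resources = new Map<string, () => string>([
  ["lakehouse://system", () => "live system status"],
  ["lakehouse://architecture", () => "full PRD text"],
  ["lakehouse://instructions", () => "agent operating manual"],
  ["lakehouse://playbooks", () => "successful operations database"],
  ["lakehouse://datasets", () => "dataset listing"],
]);

// Resolve a resource read; unknown URIs fail loudly.
function readResource(uri: string): string {
  const provider = resources.get(uri);
  if (!provider) throw new Error(`unknown resource: ${uri}`);
  return provider();
}
```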
This is the "command and control" layer J asked for: any agent
connecting to this system gets the context it needs to operate
independently. No human intermediary required.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
MCP server at mcp-server/index.ts — 9 tools exposing the full
lakehouse to any MCP-compatible model:
search_workers (hybrid SQL+vector), query_sql, match_contract,
get_worker, rag_question, log_success, get_playbooks,
swap_profile, vram_status
The "successful playbooks" pattern: log_success writes outcomes
back to the lakehouse as a queryable dataset. Small models call
get_playbooks to learn what approaches worked for similar tasks —
no retraining needed, just data.
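The pattern in miniature (in-memory here; the real tools persist to a
lakehouse dataset, and the record fields are assumptions):

```typescript
// A logged outcome that future agents can query.
interface Playbook {
  task: string;      // what kind of task this was
  approach: string;  // what was tried
  outcome: string;   // what happened
  loggedAt: number;
}

const playbooks: Playbook[] = []; // stand-in for the queryable dataset

// log_success: append an outcome so it becomes data, not lore.
function logSuccess(task: string, approach: string, outcome: string): void {
  playbooks.push({ task, approach, outcome, loggedAt: Date.now() });
}

// get_playbooks: return entries whose task matches the query substring.
function getPlaybooks(taskQuery: string): Playbook[] {
  const q = taskQuery.toLowerCase();
  return playbooks.filter((p) => p.task.toLowerCase().includes(q));
}

logSuccess("staff forklift contract IL", "hybrid search then rerank", "12 placed");
console.log(getPlaybooks("forklift").length); // 1
```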
generate_workers.py scales to 100K+ with realistic distributions:
- 20 roles weighted by staffing industry frequency
- 44 real Midwest/South cities across 12 states
- Per-role skill pools (warehouse/production/machine/maintenance)
- 13 certification types with realistic probability
- 8 behavioral archetypes with score distributions
- SMS communication templates (20 patterns)
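The core technique behind these distributions is weighted sampling.
The actual generator is generate_workers.py; this TypeScript sketch
shows the idea with invented weights:

```typescript
// Pick one item with probability proportional to its weight.
function weightedChoice<T>(items: [T, number][], rand = Math.random): T {
  const total = items.reduce((sum, [, w]) => sum + w, 0);
  let r = rand() * total;
  for (const [item, w] of items) {
    r -= w;
    if (r <= 0) return item;
  }
  return items[items.length - 1][0]; // float-rounding fallback
}

// Illustrative weights only; the real per-role frequencies live in the script.
const roles: [string, number][] = [
  ["Forklift Operator", 0.11],
  ["Warehouse Associate", 0.25],
  ["Production Worker", 0.18],
];
console.log(weightedChoice(roles)); // one of the three role names
```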
100K worker dataset ingested: 70MB CSV → Parquet in 1.1s. Verified:
11K forklift ops, 27K in IL, archetype distribution matches weights.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>