Phase 40 scope: Langfuse + Gitea MCP recovery as named deliverables

J flagged that a prior version of this stack had Langfuse traces
piping into the observer + Gitea MCP for repo ops — lost. Adding
these as explicit Phase 40 deliverables alongside routing engine
+ Gemini/Claude adapters.

Findings during scope-check:
- Langfuse container is already running (Up 2 days, langfuse:2,
  localhost:3001 healthcheck passes)
- mcp-server/tracing.ts + package.json already have SDK wired
- Credentials pk-lf-staffing / sk-lf-staffing-secret (from env)
- Gitea MCP binary still installed at gitea-mcp@0.0.10

So recovery here is mostly re-connecting existing infra:
1. Add Rust-side Langfuse client for /v1/chat tracing (gateway
   currently bypasses tracing, mcp-server already has it)
2. Wire Langfuse → observer :3800 pipe
3. Register Gitea MCP in mcp-server/index.ts tool list
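Step 2 mostly reduces to mapping each completed Langfuse trace onto an observer event. A minimal TypeScript sketch — only `source: "langfuse"` and the `:3800/event` destination come from this plan; every other field name here is an assumption, not the real schema:

```typescript
// Hypothetical trace shape — illustrative only, not Langfuse's actual API types.
type LangfuseTrace = {
  id: string;
  model?: string;
  totalCost?: number;
  latencyMs?: number;
};

// Map a completed trace onto the observer's event payload, tagging the source
// so the KB can attribute cost/latency deltas per model.
function toObserverEvent(t: LangfuseTrace) {
  return {
    source: "langfuse" as const,
    traceId: t.id,
    model: t.model ?? "unknown",
    cost: t.totalCost ?? 0,
    latencyMs: t.latencyMs ?? 0,
  };
}
```

The bridge would poll Langfuse's trace API on an interval and POST each mapped event to `:3800/event`; polling cadence and auth are left open here.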

Each lands as part of Phase 40 when the routing engine ships.
profit 2026-04-22 03:01:28 -05:00
parent 42a11d35cd
commit 6316433062


@@ -72,11 +72,11 @@ Ship each phase before starting the next. Each ends with green tests + docs upda
---
## Phase 40 — Routing & Policy Engine
## Phase 40 — Routing & Policy Engine + Observability Recovery
**Goal:** Replace hardcoded T1-T5 routing with a rules engine. Add Gemini + Claude adapters. Cost gating enforced at router level.
**Goal:** Replace hardcoded T1-T5 routing with a rules engine. Add Gemini + Claude adapters. Cost gating enforced at router level. **Reinstate Langfuse + Gitea MCP** — recovery of the observability + repo-ops stack J built previously (see `project_lost_stack` memory).
**Ships:**
**Ships — routing:**
- `crates/aibridge/src/routing.rs` — rules engine (match on: task type, token budget, previous attempt failures, profile ID)
- `config/routing.toml` — rules in TOML (human-editable, hot-reloadable)
- `crates/aibridge/src/providers/gemini.rs` — `generativelanguage.googleapis.com` adapter
@@ -84,15 +84,24 @@ Ship each phase before starting the next. Each ends with green tests + docs upda
- Fallback chain support: if primary returns 5xx or times out, try next in chain
- Cost gate: per-request budget + daily budget per-provider
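The fallback chain and the cost gate can share one loop inside the single `dispatch()` choke point. A sketch under assumed names (TypeScript for brevity — the real aibridge types live in Rust, and none of these signatures are from the actual codebase):

```typescript
// Illustrative types — the real provider abstraction is a Rust trait.
type Provider = {
  name: string;
  call: (prompt: string) => Promise<string>; // rejects on 5xx / timeout
};

type Budget = { spentToday: number; dailyLimit: number };

// Walk the chain: skip providers over their daily budget (cost gate), fall
// through to the next provider on failure, and only surface an error once the
// whole chain is exhausted.
async function dispatch(
  chain: Provider[],
  budgets: Map<string, Budget>,
  prompt: string,
): Promise<{ provider: string; text: string }> {
  let lastErr: unknown = new Error("fallback chain exhausted");
  for (const p of chain) {
    const b = budgets.get(p.name);
    if (b && b.spentToday >= b.dailyLimit) continue; // over budget → skip
    try {
      return { provider: p.name, text: await p.call(prompt) };
    } catch (e) {
      lastErr = e; // primary failed → try next in chain
    }
  }
  throw lastErr;
}
```

In the real router the all-providers-gated case would surface as the 429 + retry-at response the gate below requires, rather than a generic error.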
**Ships — observability (was lost, now restored):**
- **Langfuse** self-hosted via Docker Compose. Single source of truth for every LLM call trace: prompt / response / tokens / cost / latency / provider / fallback chain / profile used. UI at `localhost:3000`. Keys in `/etc/lakehouse/secrets.toml`.
- `crates/aibridge/src/langfuse.rs` — thin fire-and-forget trace emitter. Every `/v1/chat` call spawns a background task that POSTs to `langfuse/api/public/ingestion`. Non-blocking: trace failures never affect response.
- **Langfuse → observer pipe** — `mcp-server/langfuse_bridge.ts` or similar. Polls Langfuse's trace API at an interval, forwards completed traces to observer `:3800/event` with `source: "langfuse"`. KB now sees cost/latency deltas per model, not just outcome deltas.
- **Gitea MCP reconnect** — the MCP server binary, still installed at `/home/profit/.bun/install/cache/gitea-mcp@0.0.10/`, gets wired into the `mcp-server/index.ts` tool registry. Agents can open PRs, comment on issues, list commits via named tools. Closes Phase 28's repo-ops gap.
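The non-blocking contract in the `langfuse.rs` bullet is the fire-and-forget pattern — sketched here in TypeScript for brevity (the Rust side would use something like `tokio::spawn`); the ingestion body shape is an assumption beyond the endpoint itself:

```typescript
// Fire-and-forget trace emission: the caller never awaits the POST, and any
// ingestion failure is logged and swallowed, so tracing can never break /v1/chat.
function emitTrace(
  post: (body: unknown) => Promise<void>, // e.g. an HTTP POST to the ingestion API
  trace: { id: string; input: string; output: string },
): void {
  void post({ batch: [{ type: "trace-create", body: trace }] }).catch((e) => {
    console.error("langfuse trace dropped:", e); // never rethrown to the caller
  });
}
```

The key property is that `emitTrace` returns synchronously: the response path continues immediately, and an unreachable Langfuse only costs a log line.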
**Gate:**
- Rule like "local models for simple JSON emitters, cloud for reasoning" fires correctly by task type
- Primary fails → fallback provider hits, response still matches `/v1/chat` shape
- Daily budget hit → subsequent requests return 429 with clear retry-at header
- `/v1/usage` reports per-provider breakdown
- **Every `/v1/chat` call appears in Langfuse UI** with correct prompt, response, latency, token count within 2 seconds of the request completing
- **Langfuse → observer pipe** delivers trace deltas to KB: `GET :3800/stats?source=langfuse` shows non-zero count after a few scenarios run
- **Gitea MCP tools callable** — `list_prs`, `open_pr`, `comment_on_issue` exposed in `mcp-server/index.ts`, verifiable via a quick agent scenario
**Non-goals:** Retrieval Profile split (Phase 41), Truth Layer (Phase 42).
**Non-goals:** Retrieval Profile split (Phase 41), Truth Layer (Phase 42). Langfuse self-hosted UI customization / SSO.
**Risk:** Medium. Multi-provider auth + cost tracking is cross-cutting. Mitigation: every provider call wrapped in a single `dispatch()` function, all observability flows through there.
**Risk:** Medium. Multi-provider auth + cost tracking is cross-cutting; Langfuse adds 4-5 Docker containers (PostgreSQL, ClickHouse, Redis, web, worker). Mitigation: every provider call wrapped in a single `dispatch()` function so observability flows through one point; Langfuse Docker Compose is their supported deployment path, well-tested.
---