Adds GET /v1/health that returns a JSON snapshot of subsystem state
so operators (and load balancers, and the lakehouse-auditor
service) can verify the gateway is fully booted before routing
traffic. Phase 42-45 closures are now production-deployable; this
endpoint is the canary that proves it.
Returns 200 always — fields are observed-state, not pass/fail
gates. Monitoring tools evaluate the booleans + counts against
their own thresholds.
Shape:
{
"status": "ok",
"workers_loaded": bool,
"providers_configured": {
"ollama_cloud": bool, "openrouter": bool, "kimi": bool,
"opencode": bool, "gemini": bool, "claude": bool,
},
"langfuse_configured": bool,
"usage_total_requests": N,
"usage_by_provider": ["ollama_cloud", "openrouter", ...]
}
Verified live:
curl http://localhost:3100/v1/health
→ 4 providers configured (kimi, ollama_cloud, opencode, openrouter)
→ 2 not configured (claude, gemini — keys not wired)
→ langfuse_configured: true
→ workers_loaded: true (500K-row workers_500k.parquet snapshot)
Caveat: workers_loaded is a placeholder true — WorkerLookup trait
doesn't have a len() method yet, so we can't honestly report row
count from the runtime probe. The boot log line "loaded workers
parquet snapshot rows=N" is the source of truth on count. Future
follow-up: add `fn len(&self) -> usize` to WorkerLookup so /v1/health
can report the exact figure.
Pre-production checklist context: J flagged production switchover
incoming — synthetic profiles will be replaced with real Chicago
data soon. /v1/health gives the operator a single curl to verify
the gateway sees the new data after the parquet swap (boot log +
this endpoint).
Hot-swap reload (POST /v1/admin/reload-workers) deferred to a
follow-up — requires V1State.validate_workers to wrap in RwLock
or ArcSwap so write traffic doesn't block the steady-state
read path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>