Langfuse v2.95.11 running on :3001 (Docker + Postgres).
Login: j@lakehouse.local / lakehouse2026
tracing.ts: startTrace → logGeneration/logRetrieval/logSpan → scoreTrace → flush.
Every hybrid search, SQL generation, RAG pipeline, and co-pilot
briefing gets a full trace: model, prompt, output, latency, tokens.
The observer can now score traces based on verification results —
Langfuse aggregates accuracy over time so we can see which models
and approaches actually work in production, not just in tests.
Services: lakehouse(:3100) + sidecar(:3200) + agent(:3700) +
observer + langfuse(:3001) + minio(:9000) + mariadb(:3306)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>