v1: fire observer /event from /v1/chat alongside Langfuse trace
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
Observer at :3800 already collects scrum + scenario events into a ring
buffer that pathway-memory + KB consolidation read from. /v1/chat now
posts a lightweight {endpoint, source:"v1.chat", input_summary,
output_summary, success, duration_ms} event there too — fire-and-forget
tokio::spawn, observer-down doesn't block the chat response.
Now any tool routed through our gateway (Pi CLI, Archon, OpenAI SDK
clients, langchain-js) shows up in the same ring buffer the scrum loop
reads, ready for the same KB-consolidation analysis. This path is
immediate and independent of the existing langfuse-bridge that polls
Langfuse.
Verified: GET /stats shows {by_source: {v1.chat: N}} grows by 1 per
chat call, both for direct curl and for Pi CLI invocations.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
540a9a27ee
commit
d1d97a045b
@@ -438,6 +438,46 @@ async fn chat(
    });
    }

    // Phase 40 part 2 — fire-and-forget /event to observer at :3800.
    // Same ring-buffer that scrum + scenario events land in, so any
    // tool-routed-through-our-gateway (Pi, Archon, openai SDK clients)
    // shows up alongside scrum_master events for KB consolidation +
    // pathway-memory + bug-fingerprint compounding. Best-effort:
    // observer being down doesn't block the chat response.
    {
        let provider = used_provider.clone();
        let model = resp.model.clone();
        let prompt_tokens = resp.usage.prompt_tokens;
        let completion_tokens = resp.usage.completion_tokens;
        let success = true;
        tokio::spawn(async move {
            let body = serde_json::json!({
                "endpoint": "/v1/chat",
                "source": "v1.chat",
                "event_kind": "chat_completion",
                "input_summary": format!(
                    "{} {} prompt={}t",
                    provider, model, prompt_tokens
                ),
                "output_summary": format!(
                    "completion={}t {}ms",
                    completion_tokens, latency_ms
                ),
                "success": success,
                "duration_ms": latency_ms,
            });
            let client = reqwest::Client::builder()
                .timeout(std::time::Duration::from_secs(2))
                .build()
                .unwrap_or_else(|_| reqwest::Client::new());
            let _ = client
                .post("http://localhost:3800/event")
                .json(&body)
                .send()
                .await;
        });
    }

    // Phase 40: per-provider usage tracking
    {
        let mut u = state.usage.write().await;