lakehouse

root 021c1b557f agent.ts: route generateCloud through /v1/chat (Phase 44 migration)

Phase 44 PRD (docs/CONTROL_PLANE_PRD.md:204) explicitly lists
`tests/multi-agent/agent.ts::generate()` as a migration target:
every internal LLM caller must flow through /v1/chat so usage
accounting + audit trail see all traffic.

generateCloud() was bypassing the gateway entirely — direct POST to
OLLAMA_CLOUD_URL/api/generate with the bearer key. This meant:
  - /v1/usage missed every agent.ts cloud call
  - No gateway-side caching, rate-limiting, or cost gating
  - Callers needed OLLAMA_CLOUD_KEY in env (leak risk; gateway
    already owns the key)

Migration:
  - Endpoint: OLLAMA_CLOUD_URL/api/generate → GATEWAY/v1/chat
  - Body shape: {prompt,options.num_predict,options.temperature} →
    OpenAI-compatible {messages[],temperature,max_tokens}
  - provider: "ollama_cloud" explicit in the request
  - Response extraction: data.response → data.choices[0].message.content
  - OLLAMA_CLOUD_KEY no longer required in agent.ts env
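
A minimal sketch of the body-shape and response-extraction change listed
above, not the actual agent.ts code. The type and function names
(`LegacyGenerateBody`, `toChatRequest`, `extractContent`) and the `model`
field are assumptions made for illustration. Mapping `options.num_predict`
to `max_tokens` is inferred from the order of the fields in the list, and
wrapping the prompt in a single `user` message is likewise assumed.

```typescript
// Hypothetical shape of the old /api/generate body (fields named in the commit).
interface LegacyGenerateBody {
  model: string; // assumed field; the commit message does not mention it
  prompt: string;
  options?: { num_predict?: number; temperature?: number };
}

// OpenAI-compatible request shape sent to GATEWAY/v1/chat.
interface ChatRequest {
  provider: string;
  model: string;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
  temperature?: number;
  max_tokens?: number;
}

// Only the part of the response that the extraction step reads.
interface ChatResponse {
  choices: { message: { role: string; content: string } }[];
}

// Translate the legacy body into the chat body, pinning the provider explicitly.
function toChatRequest(legacy: LegacyGenerateBody): ChatRequest {
  const req: ChatRequest = {
    provider: "ollama_cloud",
    model: legacy.model,
    messages: [{ role: "user", content: legacy.prompt }],
  };
  // Copy the optional knobs only when the caller set them.
  if (legacy.options?.temperature !== undefined) {
    req.temperature = legacy.options.temperature;
  }
  if (legacy.options?.num_predict !== undefined) {
    req.max_tokens = legacy.options.num_predict;
  }
  return req;
}

// Old extraction was data.response; new extraction is data.choices[0].message.content.
function extractContent(data: ChatResponse): string {
  const content = data.choices[0]?.message?.content;
  if (typeof content !== "string") {
    throw new Error("chat response has no choices[0].message.content");
  }
  return content;
}
```

Keeping both steps as pure functions means the translation can be checked
without a running gateway; the caller only has to POST the result to
/v1/chat and pass the parsed JSON to `extractContent`.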

Phase 44 gate verified: `grep -E 'localhost:3200/generate|/api/generate'`
now only hits (a) the ollama_cloud.rs adapter itself (legitimate: it is
the gateway-side direct caller) and (b) this comment explaining the
migration history. Zero non-adapter code paths to /api/generate.
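
The same gate could be expressed as a small check, sketched here under
assumptions: `findDirectCallers` and the allow-list argument are
hypothetical names, and the file map stands in for whatever the real gate
reads from disk. The pattern mirrors the grep quoted above.

```typescript
// Same alternation as the grep in the gate; no `g` flag, so test() is stateless.
const DIRECT_CALL = /localhost:3200\/generate|\/api\/generate/;

// Return the paths that still mention a direct endpoint and are not allow-listed.
// `files` maps path -> file contents; `allowed` holds path suffixes that may match.
function findDirectCallers(
  files: Record<string, string>,
  allowed: string[],
): string[] {
  return Object.entries(files)
    .filter(([path, text]) =>
      DIRECT_CALL.test(text) && !allowed.some((suffix) => path.endsWith(suffix)))
    .map(([path]) => path)
    .sort();
}
```

An empty result corresponds to the "zero non-adapter code paths" claim,
with the adapter and the file carrying the migration comment placed on the
allow-list.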

generate() (local Ollama) still goes direct to :3200 — that's the
t1_hot path. Phase 44 PRD focuses on cloud callers; hot-path local
generation deliberately stays direct for latency.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-24 13:27:54 -05:00

auditor

Audit pipeline PR #9: determinism + fact extraction + verifier gate + KB stats + context injection (PR #9)

2026-04-23 05:29:38 +00:00

bot

Control-plane pivot: Phase 38-44 plan + bot scaffold

2026-04-22 02:43:31 -05:00

config

Phase 40: Routing Engine + Policy

2026-04-23 02:36:45 -05:00

crates

truth: split staffing + devops into dedicated modules (Phase 42 PRD)