root 021c1b557f agent.ts: route generateCloud through /v1/chat (Phase 44 migration)
Phase 44 PRD (docs/CONTROL_PLANE_PRD.md:204) explicitly lists
`tests/multi-agent/agent.ts::generate()` as a migration target:
every internal LLM caller must flow through /v1/chat so usage
accounting + audit trail see all traffic.

generateCloud() bypassed the gateway entirely: it POSTed directly to
OLLAMA_CLOUD_URL/api/generate with the bearer key. As a result:
  - /v1/usage missed every agent.ts cloud call
  - No gateway-side caching, rate-limiting, or cost gating
  - Callers needed OLLAMA_CLOUD_KEY in env (leak risk; gateway
    already owns the key)

Migration:
  - Endpoint: OLLAMA_CLOUD_URL/api/generate → GATEWAY/v1/chat
  - Body shape: {prompt,options.num_predict,options.temperature} →
    OpenAI-compatible {messages[],temperature,max_tokens}
  - provider: "ollama_cloud" explicit in the request
  - Response extraction: data.response → data.choices[0].message.content
  - OLLAMA_CLOUD_KEY no longer required in agent.ts env
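
For illustration, the body and response mapping listed above could be sketched roughly as follows. This is not the code from agent.ts: the type names, the `toChatRequest` and `extractContent` helpers, the `model` field on the gateway request, and the single `user` message role are assumptions made for the example. Only the field renames and the `provider` value come from this commit message.

```typescript
// Sketch only. Names and shapes not listed in the commit message are assumptions.
interface LegacyGenerateBody {
  model: string;
  prompt: string;
  options: { num_predict: number; temperature: number };
}

interface ChatRequest {
  provider: "ollama_cloud"; // explicit provider, per the migration notes
  model: string; // assumed to be forwarded unchanged
  messages: { role: "user"; content: string }[];
  temperature: number;
  max_tokens: number;
}

interface ChatResponse {
  choices: { message: { content: string } }[];
}

// Translate the old /api/generate body into the OpenAI-compatible shape.
function toChatRequest(legacy: LegacyGenerateBody): ChatRequest {
  return {
    provider: "ollama_cloud",
    model: legacy.model,
    // Assumption: the whole prompt becomes one user message.
    messages: [{ role: "user", content: legacy.prompt }],
    temperature: legacy.options.temperature,
    max_tokens: legacy.options.num_predict,
  };
}

// Replaces the old `data.response` read with choices[0].message.content.
function extractContent(data: ChatResponse): string {
  const first = data.choices[0];
  if (!first) {
    throw new Error("gateway response contained no choices");
  }
  return first.message.content;
}
```

Keeping the translation in pure functions like these would let the mapping be unit-tested without a running gateway; the actual fetch call and its headers are left out because this message does not describe them.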

Phase 44 gate verified: `grep -E 'localhost:3200/generate|/api/generate'`
now only hits (a) the ollama_cloud.rs adapter itself (legitimate: it is
the gateway-side direct caller) and (b) this comment explaining the
migration history. Zero non-adapter code paths call /api/generate.
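
The same gate could also be expressed as a small check over file contents, should it ever be automated. The sketch below is hypothetical: `findDirectCallers`, its in-memory `files` map, and the suffix-based allow list are assumptions for illustration, and only the search pattern is taken from the grep above.

```typescript
// Hypothetical gate check. The pattern mirrors the grep in this commit message.
const DIRECT_CALL = /localhost:3200\/generate|\/api\/generate/;

// Returns the paths that mention a direct generate endpoint and are not
// on the allow list. `files` maps a path to that file's text.
function findDirectCallers(
  files: Record<string, string>,
  allowedSuffixes: string[],
): string[] {
  return Object.keys(files)
    .filter((path) => DIRECT_CALL.test(files[path]))
    .filter((path) => !allowedSuffixes.some((suffix) => path.endsWith(suffix)))
    .sort();
}
```

With the adapter and the file holding the migration-history comment on the allow list, an empty result would correspond to the "zero non-adapter code paths" condition stated above.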

generate() (local Ollama) still goes direct to :3200 — that's the
t1_hot path. Phase 44 PRD focuses on cloud callers; hot-path local
generation deliberately stays direct for latency.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:27:54 -05:00