Ollama Pro plan went live today (39-model fleet on the same
OLLAMA_CLOUD_KEY) and OpenCode Zen was already wired in the gateway
but not consumed. Routing every gpt-oss call site to faster /
stronger replacements:
| Site | gpt-oss → replacement | Why |
|---|---|---|
| ollama_cloud default | gpt-oss:120b → deepseek-v3.2 | newest DeepSeek revision; live-probed `pong` |
| openrouter default | openai/gpt-oss-120b:free → x-ai/grok-4.1-fast | already the scrum LADDER's PRIMARY |
| modes.toml staffing_inference | openai/gpt-oss-120b:free → kimi-k2.6 | coding-specialized, on Ollama Pro |
| modes.toml doc_drift_check | gpt-oss:120b → gemini-3-flash-preview | speed leader for factual checks |
| scrum_master_pipeline tree-split MAP+REDUCE | gpt-oss:120b → gemini-3-flash-preview | latency-dominated path (5-20× per file) |
| bot/propose.ts CLOUD_MODEL | gpt-oss:120b → deepseek-v3.2 | same Ollama key, faster |
| mcp-server/observer.ts overseer label fallback | gpt-oss:120b → claude-opus-4-7 | matches new overseer model |
| crates/gateway/src/execution_loop overseer escalation | ollama_cloud/gpt-oss:120b → opencode/claude-opus-4-7 | frontier reasoning matters here — fires only after local self-correct fails twice; Zen pay-per-token cost is bounded |
Verification:
- `cargo check -p gateway --tests` — clean
- Live probes through localhost:3100/v1/chat:
- `opencode/claude-opus-4-7` → "pong"
- `gemini-3-flash-preview` (ollama_cloud) → "pong"
- `kimi-k2.6` (ollama_cloud) → "pong"
- `deepseek-v3.2` (ollama_cloud) → "Pong! 🏓"
Notes:
- kimi-k2:1t still upstream-broken (HTTP 500 on Ollama Pro probe today,
matches yesterday's memory). Replacement table never picks it.
- The Rust changes need a `systemctl restart lakehouse.service` to
take effect on the running gateway. TS callers reload on next run.
- aibridge/src/context.rs still has gpt-oss:{20b,120b} in its window-
size lookup table; harmless and kept for callers that pass it
explicitly as an override.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…
…
…
…
Description
Rust-first object storage system
Languages
TypeScript
38.4%
Rust
35.8%
HTML
13.9%
Python
7.8%
Shell
2.1%
Other
2%