Wires opencode.ai as a /v1/chat provider. One sk-* key reaches 40
models across Anthropic, OpenAI, Google, Moonshot, DeepSeek, Zhipu,
Alibaba, Minimax — billed against either the user's Zen balance
(pay-per-token premium models) or Go subscription (flat-rate
Kimi/GLM/DeepSeek/etc.). The unified /zen/v1 endpoint routes both;
upstream picks the billing tier based on model id.
Notable adapter quirks:
- Strip "opencode/" prefix on outbound (mirrors the openrouter/kimi
  pattern). Caller can use {provider:"opencode", model:"X"} or
  {model:"opencode/X"}.
- Drop temperature for claude-*, gpt-5*, o1/o3/o4 models. Anthropic
  and OpenAI's reasoning lineage reject temperature with 400
  "deprecated for this model". OCChatBody now serializes temperature
  as Option<f64> with skip_serializing_if so omitting it produces
  clean JSON.
- max_tokens.filter(|&n| n > 0) catches Some(0); defensive after
  the same trap bit kimi.rs (empty env -> Number("") -> 0 -> 503).
- 600s default upstream timeout; reasoning models on big audit
  prompts legitimately take 3-5 min. Override with
  OPENCODE_TIMEOUT_SECS.
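The request-shaping quirks above can be sketched in plain Rust. This is a minimal illustration under stated assumptions, not the adapter's actual code: the `Outbound` struct and the `shape`, `rejects_temperature`, and `timeout_from` helpers are hypothetical names, the real OCChatBody is not reproduced here, and the family check is assumed to be a simple prefix match.

```rust
use std::time::Duration;

/// Hypothetical stand-in for the fields of the outbound body that the
/// quirks touch; the real OCChatBody is not shown in the commit.
#[derive(Debug, PartialEq)]
pub struct Outbound {
    pub model: String,
    pub temperature: Option<f64>,
    pub max_tokens: Option<u32>,
}

/// Strip a leading "opencode/" so upstream sees the bare model id.
pub fn strip_prefix(model: &str) -> &str {
    model.strip_prefix("opencode/").unwrap_or(model)
}

/// True for the families the commit message lists as rejecting
/// `temperature` (assumed here to be matched by prefix).
pub fn rejects_temperature(model: &str) -> bool {
    model.starts_with("claude-")
        || model.starts_with("gpt-5")
        || ["o1", "o3", "o4"].iter().any(|p| model.starts_with(*p))
}

/// Apply all three request quirks: prefix strip, temperature drop,
/// and the Some(0) guard on max_tokens.
pub fn shape(model: &str, temperature: Option<f64>, max_tokens: Option<u32>) -> Outbound {
    let model = strip_prefix(model);
    Outbound {
        model: model.to_string(),
        // None serializes to "field absent" when paired with
        // skip_serializing_if in the real body type.
        temperature: temperature.filter(|_| !rejects_temperature(model)),
        max_tokens: max_tokens.filter(|&n| n > 0),
    }
}

/// Parse the raw OPENCODE_TIMEOUT_SECS value; empty, non-numeric, or
/// zero input falls back to the 600s default.
pub fn timeout_from(raw: Option<&str>) -> Duration {
    let secs = raw
        .and_then(|s| s.trim().parse::<u64>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(600);
    Duration::from_secs(secs)
}
```

A caller would pass `std::env::var("OPENCODE_TIMEOUT_SECS").ok().as_deref()` to `timeout_from`, so an empty variable degrades to the default instead of producing a zero-second timeout.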
Key handling:
- /etc/lakehouse/opencode.env (0600 root) loaded via systemd
  EnvironmentFile. Same pattern as kimi.env.
- OPENCODE_API_KEY env first, file scrape as fallback.
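The env-first, file-fallback order might look roughly like the sketch below. The function names (`scrape_key`, `pick_key`, `resolve_key`) are hypothetical, and the file is assumed to hold plain `KEY=value` lines as systemd EnvironmentFile expects; the adapter's real scrape logic may differ.

```rust
use std::{env, fs};

/// Pull OPENCODE_API_KEY out of EnvironmentFile-style text.
/// Comment lines and other variables are ignored.
pub fn scrape_key(contents: &str) -> Option<String> {
    contents.lines().find_map(|line| {
        let value = line.trim().strip_prefix("OPENCODE_API_KEY=")?;
        // Tolerate optional surrounding double quotes.
        let value = value.trim().trim_matches('"');
        (!value.is_empty()).then(|| value.to_string())
    })
}

/// A non-empty env value wins; otherwise fall back to the file text.
pub fn pick_key(env_value: Option<String>, file_contents: Option<String>) -> Option<String> {
    env_value
        .map(|v| v.trim().to_string())
        .filter(|v| !v.is_empty())
        .or_else(|| file_contents.as_deref().and_then(scrape_key))
}

/// Wire the pure logic to the process environment and the key file.
pub fn resolve_key() -> Option<String> {
    pick_key(
        env::var("OPENCODE_API_KEY").ok(),
        fs::read_to_string("/etc/lakehouse/opencode.env").ok(),
    )
}
```

Keeping `pick_key` free of I/O makes the precedence rule testable without touching the real environment or the root-owned file.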
Verified end-to-end:
  opencode/claude-opus-4-7    -> "I'm Claude, made by Anthropic."
  opencode/kimi-k2.6          -> PONG-K26-GO
  opencode/deepseek-v4-pro    -> PONG-DS-V4
  opencode/glm-5.1            -> PONG-GLM
  opencode/minimax-m2.5-free  -> PONG-FREE
Pricing reference (per audit @ ~14k in / 6k out):
  claude-opus-4-7    ~$0.22  (Zen)
  claude-haiku-4-5   ~$0.04  (Zen)
  gpt-5.5-pro        ~$1.50  (Zen)
  gemini-3-flash     ~$0.03  (Zen)
  kimi-k2.6 / glm / deepseek / qwen / minimax / mimo: covered by Go
  subscription ($10/mo, $60/mo cap).
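The per-audit figures follow the usual input-plus-output token arithmetic, sketched below. The `audit_cost` helper is a hypothetical name, and the rates in the usage note are placeholders chosen only so the arithmetic lands near the ~$0.22 line; they are not taken from this commit or from an opencode price list.

```rust
/// Dollar cost of one call, given token counts and per-million-token
/// rates for input and output.
pub fn audit_cost(tokens_in: u64, tokens_out: u64, usd_per_m_in: f64, usd_per_m_out: f64) -> f64 {
    (tokens_in as f64 * usd_per_m_in + tokens_out as f64 * usd_per_m_out) / 1_000_000.0
}
```

With assumed rates of $5 per million input tokens and $25 per million output tokens, `audit_cost(14_000, 6_000, 5.0, 25.0)` works out to 0.07 + 0.15 = 0.22 dollars.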
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
98 lines
4.0 KiB
TOML
# Phase 39: Provider Registry
#
# Per-provider base_url, auth scheme, and default model. The gateway's
# /v1/chat dispatcher reads this file at boot to populate its provider
# table. Secrets (API keys) come from /etc/lakehouse/secrets.toml or
# environment variables. NEVER inline a key here.
#
# Adding a new provider:
#   1. New [[provider]] block with name, base_url, auth, default_model
#   2. Matching adapter at crates/aibridge/src/providers/<name>.rs
#      implementing the ProviderAdapter trait (chat + embed + unload)
#   3. Route arm in crates/gateway/src/v1/mod.rs matching on `name`
#   4. Model-prefix routing hint in resolve_provider() if the provider
#      uses an "<name>/..." model prefix (e.g. "openrouter/...")

[[provider]]
name = "ollama"
base_url = "http://localhost:3200"
auth = "none"
default_model = "qwen3.5:latest"
# Hot-path local inference. No bearer needed: Python sidecar on
# localhost handles the Ollama API. Model names are bare
# (e.g. "qwen3.5:latest", not "ollama/qwen3.5:latest").

[[provider]]
name = "ollama_cloud"
base_url = "https://ollama.com"
auth = "bearer"
auth_env = "OLLAMA_CLOUD_KEY"
default_model = "gpt-oss:120b"
# Cloud-tier Ollama. Key resolved from OLLAMA_CLOUD_KEY env at gateway
# boot. Model-prefix routing: "cloud/<model>" auto-routes here
# (see gateway::v1::resolve_provider).

[[provider]]
name = "openrouter"
base_url = "https://openrouter.ai/api/v1"
auth = "bearer"
auth_env = "OPENROUTER_API_KEY"
auth_fallback_files = ["/home/profit/.env", "/root/llm_team_config.json"]
default_model = "openai/gpt-oss-120b:free"
# Multi-provider gateway. Covers Anthropic, Google, OpenAI, MiniMax,
# Qwen, Gemma, etc. Key resolved via crates/gateway/src/v1/openrouter.rs
# resolve_openrouter_key(): env first, then fallback files.
# Model-prefix routing: "openrouter/<vendor>/<model>" auto-routes here,
# prefix stripped before upstream call.

[[provider]]
name = "opencode"
base_url = "https://opencode.ai/zen/v1"
# Unified endpoint: covers BOTH Zen (pay-per-token Anthropic/OpenAI/
# Gemini frontier) AND Go (flat-sub Kimi/GLM/DeepSeek/Qwen/Minimax).
# Upstream bills per-model: Zen models hit Zen balance, Go models hit
# Go subscription cap. /zen/go/v1 is the Go-only sub-path (rejects
# Zen models), kept for reference but not used by this provider.
auth = "bearer"
auth_env = "OPENCODE_API_KEY"
default_model = "claude-opus-4-7"
# OpenCode (Zen + GO unified endpoint). One sk-* key reaches Claude
# Opus 4.7, GPT-5.5-pro, Gemini 3.1-pro, Kimi K2.6, DeepSeek, GLM,
# Qwen, plus 4 free-tier models. OpenAI-compatible Chat Completions
# at /v1/chat/completions. Model-prefix routing: "opencode/<name>"
# auto-routes here, prefix stripped before upstream call.
# Key file: /etc/lakehouse/opencode.env (loaded via systemd EnvironmentFile).
# Model catalog: curl -H "Authorization: Bearer ..." https://opencode.ai/zen/v1/models
# Note: /zen/go/v1 is the GO-only sub-path (Kimi/GLM/DeepSeek tier);
# /zen/v1 covers everything including Anthropic (which /zen/go/v1 rejects).

[[provider]]
name = "kimi"
base_url = "https://api.kimi.com/coding/v1"
auth = "bearer"
auth_env = "KIMI_API_KEY"
default_model = "kimi-for-coding"
# Direct Kimi For Coding provider. `api.kimi.com` is a SEPARATE account
# system from `api.moonshot.ai` and `api.moonshot.cn`; keys are NOT
# interchangeable. Used when Ollama Cloud's `kimi-k2:1t` is upstream-
# broken and OpenRouter's `moonshotai/kimi-k2.6` is rate-limited.
# Model id: `kimi-for-coding` (kimi-k2.6 underneath).
# Key file: /etc/lakehouse/kimi.env (loaded via systemd EnvironmentFile).
# Model-prefix routing: "kimi/<model>" auto-routes here, prefix stripped.

# Planned (Phase 40 long-horizon; adapters not yet shipped):
#
# [[provider]]
# name = "gemini"
# base_url = "https://generativelanguage.googleapis.com/v1beta"
# auth = "api_key_query"
# auth_env = "GEMINI_API_KEY"
# default_model = "gemini-2.0-flash"
#
# [[provider]]
# name = "claude"
# base_url = "https://api.anthropic.com/v1"
# auth = "x_api_key"
# auth_env = "ANTHROPIC_API_KEY"
# default_model = "claude-3-5-sonnet-latest"
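The model-prefix routing described in the provider comments above could be sketched as a small lookup. This is an illustration only, not the gateway's real `resolve_provider`: the signature and return shape are hypothetical, bare model names are assumed to fall through to the local `ollama` provider, and the `cloud/` prefix is assumed to be stripped like the others (the registry comments state stripping only for `openrouter/`, `opencode/`, and `kimi/`).

```rust
/// Map a caller-supplied model string to (provider name, model id to
/// send upstream). Prefixes mirror the registry comments.
pub fn resolve_provider(model: &str) -> (&'static str, &str) {
    const PREFIXES: [(&str, &str); 4] = [
        ("cloud/", "ollama_cloud"),
        ("openrouter/", "openrouter"),
        ("opencode/", "opencode"),
        ("kimi/", "kimi"),
    ];
    for (prefix, provider) in PREFIXES {
        if let Some(rest) = model.strip_prefix(prefix) {
            return (provider, rest);
        }
    }
    // Assumption: names without a known prefix go to local Ollama.
    ("ollama", model)
}
```

Only the first path segment is consumed, so `openrouter/moonshotai/kimi-k2.6` keeps its `moonshotai/` vendor segment for the upstream call.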