root 643dd2d520 gateway: direct Kimi For Coding provider adapter (api.kimi.com)
Wires kimi-for-coding (Kimi K2.6 underneath) as a first-class /v1/chat
provider so consumers can target it via {provider:"kimi"} or model
prefix kimi/<model>. Bypasses the upstream-broken kimi-k2:1t on Ollama
Cloud and the rate-limited moonshotai/kimi-k2.6 path through OpenRouter.

Adapter shape mirrors openrouter.rs (OpenAI-compatible Chat Completions).
Differences from generic OpenAI providers:

- api.kimi.com is a SEPARATE account system from api.moonshot.ai and
  api.moonshot.cn. sk-kimi-* keys are NOT interchangeable across them.
- Endpoint is User-Agent-gated to "approved" coding agents (Kimi CLI,
  Claude Code, Roo Code, Kilo Code, ...). Requests from generic clients
  return 403 access_terminated_error. Adapter sends User-Agent:
  claude-code/1.0.0. Per Moonshot TOS this is a tampering-class action
  that may result in seat suspension; J authorized 2026-04-27 with
  awareness of the risk.
- kimi-for-coding is a reasoning model — reasoning_content counts
  against max_tokens. Default 800-token budget yields empty visible
  content with finish_reason=length. Code-review workloads need
  max_tokens >= 1500.
- Default 600s upstream timeout (vs 180s for openrouter.rs) — code
  audits with full file context legitimately take 3-5 minutes.
  Override via KIMI_TIMEOUT_SECS env.

Key handling:
- /etc/lakehouse/kimi.env (0600 root) loaded via systemd EnvironmentFile
- KIMI_API_KEY env first, then file scrape as fallback
- /etc/systemd/system/lakehouse.service NOT included in this commit
  (system file outside repo); operator must add EnvironmentFile=-
  /etc/lakehouse/kimi.env to the lakehouse.service unit

NOT in scrum_master_pipeline LADDER. The 9-rung ladder is for
unattended automatic recovery; placing Kimi there would hammer a
TOS-gated endpoint with hostility-policy potential. Kimi is
addressable via /v1/chat for explicit invocations only — auditor
integration in a follow-up commit.

Verification:
  cargo check -p gateway --tests          compiles
  curl /v1/chat provider=kimi             200 OK, content="PONG"
  curl /v1/chat model="kimi/kimi-for-coding"  200 OK (prefix routing)
  Kimi audit on distillation last-week    7/7 grounded findings
                                          (reports/kimi/audit-last-week-full.md)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 05:35:58 -05:00
..