new cmd/chatd on :3220 routes /v1/chat to the right provider based
on model-name prefix or :cloud suffix. closes the architectural gap
named in lakehouse.toml [models]: tiers map to model IDs, but until
phase 4 there was no service that could actually CALL those models
from Go.

routing rules (registry.Resolve):
  ollama/<m>           → local Ollama (prefix stripped)
  ollama_cloud/<m>     → Ollama Cloud
  <m>:cloud            → Ollama Cloud (suffix variant, e.g. kimi-k2.6:cloud)
  openrouter/<v>/<m>   → OpenRouter (prefix stripped, OpenAI-compat)
  opencode/<m>         → OpenCode unified Zen+Go
  kimi/<m>             → Kimi For Coding (api.kimi.com/coding/v1)
  bare names           → local Ollama (default)
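
A minimal sketch of the dispatch these rules describe, not the actual registry.Resolve. The `route` type and `resolve` function are hypothetical names. Two behaviours are assumptions: stripping the prefix for ollama_cloud/, opencode/ and kimi/ (the rules only state stripping for ollama/ and openrouter/), and keeping the :cloud suffix in the upstream model ID.

```go
package main

import (
	"fmt"
	"strings"
)

// route is a hypothetical result type: which provider handles the
// request and which model ID is sent upstream.
type route struct {
	Provider string
	Model    string
}

// knownPrefixes lists the provider prefixes from the routing rules.
var knownPrefixes = []string{"ollama_cloud", "ollama", "openrouter", "opencode", "kimi"}

// resolve applies the rules in order: known prefix, then :cloud suffix,
// then the local Ollama default. Unknown prefixes are not rewritten.
func resolve(model string) route {
	for _, p := range knownPrefixes {
		if strings.HasPrefix(model, p+"/") {
			// Assumption: every known prefix is stripped before the upstream call.
			return route{Provider: p, Model: strings.TrimPrefix(model, p+"/")}
		}
	}
	if strings.HasSuffix(model, ":cloud") {
		// Assumption: the suffix is kept in the upstream model ID.
		return route{Provider: "ollama_cloud", Model: model}
	}
	return route{Provider: "ollama", Model: model}
}

func main() {
	for _, m := range []string{"ollama/m", "kimi-k2.6:cloud", "openrouter/v/m", "unknown/m", "m"} {
		r := resolve(m)
		fmt.Printf("%-18s -> %s (%s)\n", m, r.Provider, r.Model)
	}
}
```

Checking prefixes before the suffix means a name like ollama/<m>:cloud stays local in this sketch; the commit does not state which rule takes precedence in the real registry.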

provider implementations:
- internal/chat/types.go          Provider interface, Request/Response, errors
- internal/chat/registry.go       prefix + :cloud suffix dispatch
- internal/chat/ollama.go         local Ollama via /api/chat (think=false default)
- internal/chat/ollama_cloud.go   Ollama Cloud via /api/generate (Bearer auth)
- internal/chat/openai_compat.go  shared OpenAI Chat Completions for the
                                  OpenRouter/OpenCode/Kimi family
- internal/chat/builder.go        BuildRegistry from BuilderInput;
                                  ResolveKey reads env then .env file fallback

config:
- ChatdConfig in internal/shared/config.go with bind, ollama_url,
  per-provider key env names + .env fallback paths, timeout
- Gateway gains chatd_url + /v1/chat + /v1/chat/* routes
- lakehouse.toml [chatd] block with /etc/lakehouse/<provider>.env defaults

tests (19 in internal/chat):
- registry: prefix + :cloud + errors + telemetry + provider listing
- ollama: happy path + prefix strip + format=json + 500 mapping +
  flatten_messages
- openai_compat: happy path + format=json + 429 mapping + zero-choices
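
flatten_messages presumably exists because a single-prompt endpoint such as /api/generate cannot take a messages array. One plausible flattening, as a sketch only: the `message` type, the `flattenMessages` name, and the "role: content" paragraph format are all assumptions, not the format chatd actually uses.

```go
package main

import (
	"fmt"
	"strings"
)

// message is a hypothetical chat message shape (role + content).
type message struct {
	Role    string
	Content string
}

// flattenMessages joins a chat history into one prompt string, one
// "role: content" paragraph per message, skipping blank content.
func flattenMessages(msgs []message) string {
	var b strings.Builder
	for _, m := range msgs {
		if strings.TrimSpace(m.Content) == "" {
			continue
		}
		if b.Len() > 0 {
			b.WriteString("\n\n")
		}
		b.WriteString(m.Role)
		b.WriteString(": ")
		b.WriteString(m.Content)
	}
	return b.String()
}

func main() {
	fmt.Println(flattenMessages([]message{
		{Role: "system", Content: "Answer yes or no."},
		{Role: "user", Content: "Is 7 prime?"},
	}))
}
```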

think=false is the default in ollama + ollama_cloud: the local hot path
skips reasoning, so low-budget callers (the playbook_lift judge at
max_tokens=10) get direct answers instead of empty content +
done_reason=length. proven via chatd_smoke acceptance.
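
The default can be pictured as a request builder that always sends think explicitly. A sketch under assumptions: the struct and function names are hypothetical, and the JSON field names (think, format, stream, options.num_predict) follow Ollama's documented /api/chat request shape rather than chatd's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// ollamaChatRequest is a hypothetical body for POST /api/chat.
// Think has no omitempty, so false is serialized explicitly.
type ollamaChatRequest struct {
	Model    string         `json:"model"`
	Messages []chatMessage  `json:"messages"`
	Stream   bool           `json:"stream"`
	Think    bool           `json:"think"`
	Format   string         `json:"format,omitempty"`
	Options  map[string]int `json:"options,omitempty"`
}

// newChatRequest defaults think to false; a caller opts in by passing a
// non-nil pointer. maxTokens <= 0 leaves the upstream default in place.
func newChatRequest(model string, msgs []chatMessage, think *bool, maxTokens int) ollamaChatRequest {
	req := ollamaChatRequest{Model: model, Messages: msgs}
	if think != nil {
		req.Think = *think
	}
	if maxTokens > 0 {
		req.Options = map[string]int{"num_predict": maxTokens}
	}
	return req
}

func main() {
	req := newChatRequest("m", []chatMessage{{Role: "user", Content: "yes or no?"}}, nil, 10)
	body, err := json.Marshal(req)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

With a 10-token budget, a reasoning model can presumably spend the whole budget on thinking, which would explain the empty content + done_reason=length failure that the default avoids.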

acceptance gate: scripts/chatd_smoke.sh, 6/6 PASS:
  1. /v1/chat/providers lists exactly the registered providers (1 in dev mode)
  2. bare model → ollama default with content + token counts + latency
  3. explicit ollama/<m> → prefix stripped at upstream
  4. <m>:cloud without ollama_cloud registered → 404 (no silent fall-through)
  5. unknown/<m> → falls through to default → upstream 502 (no prefix rewrite)
  6. missing model field → 400
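
The status codes these checks assert could come from a mapping like the one below. A sketch only: the sentinel errors and `statusFor` are hypothetical names (the real errors live in internal/chat/types.go and may be shaped differently); only the 400/404/502 values come from the checks above.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Hypothetical sentinel errors, named for illustration only.
var (
	errMissingModel = errors.New("model field is required")
	errNoProvider   = errors.New("no provider registered for model")
	errUpstream     = errors.New("upstream provider failed")
)

// statusFor maps a chat error to the HTTP status the smoke checks expect.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, errMissingModel):
		return http.StatusBadRequest // check 6
	case errors.Is(err, errNoProvider):
		return http.StatusNotFound // check 4: no silent fall-through
	case errors.Is(err, errUpstream):
		return http.StatusBadGateway // check 5
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	// Wrapped errors still match because errors.Is unwraps %w chains.
	wrapped := fmt.Errorf("ollama: %w", errUpstream)
	fmt.Println(statusFor(wrapped)) // prints 502
}
```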

just verify: PASS (vet + 30 packages × short tests + 9 smokes).
chatd_smoke is a domain smoke (not in just verify; mirrors the matrix /
observer / pathway pattern).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
106 lines
3.2 KiB
Go
package chat

import (
	"bufio"
	"log/slog"
	"os"
	"strings"
	"time"
)

// BuilderInput drives provider construction. Each field maps to one
// provider; empty fields mean "skip" (the registry won't have that
// provider, so the :cloud suffix or openrouter/* prefixes will 404 cleanly).
type BuilderInput struct {
	OllamaURL      string        // local Ollama, no auth (typically http://localhost:11434)
	OllamaCloudKey string        // OLLAMA_CLOUD_KEY
	OpenRouterKey  string        // OPENROUTER_API_KEY
	OpenCodeKey    string        // OPENCODE_API_KEY
	KimiKey        string        // KIMI_API_KEY
	Timeout        time.Duration // default 180s
}

// BuildRegistry constructs a Registry from the input. Logs which
// providers were registered (for operator confidence at boot).
func BuildRegistry(in BuilderInput) *Registry {
	if in.Timeout == 0 {
		in.Timeout = 180 * time.Second
	}

	var providers []Provider
	registered := []string{}

	// Local Ollama always registered if URL given (no auth needed).
	if in.OllamaURL != "" {
		providers = append(providers, NewOllama(in.OllamaURL, in.Timeout))
		registered = append(registered, "ollama")
	}
	if in.OllamaCloudKey != "" {
		providers = append(providers, NewOllamaCloud(in.OllamaCloudKey, in.Timeout))
		registered = append(registered, "ollama_cloud")
	}
	if in.OpenRouterKey != "" {
		providers = append(providers, NewOpenRouter(in.OpenRouterKey, in.Timeout))
		registered = append(registered, "openrouter")
	}
	if in.OpenCodeKey != "" {
		providers = append(providers, NewOpenCode(in.OpenCodeKey, in.Timeout))
		registered = append(registered, "opencode")
	}
	if in.KimiKey != "" {
		providers = append(providers, NewKimi(in.KimiKey, in.Timeout))
		registered = append(registered, "kimi")
	}

	r := NewRegistry(providers...)
	slog.Info("chat registry built", "providers", registered)
	return r
}

// ResolveKey reads an API key with the priority chain:
//  1. Explicit env var (named by envVar)
//  2. .env file at envFilePath (e.g. /etc/lakehouse/openrouter.env)
//     with KEY=value lines; the first line whose key is envFileName wins.
//  3. "" if neither is set
//
// Mirrors the Rust adapter's resolve_*_key() pattern. An empty key
// means the provider stays unregistered: operators see one fewer
// entry in the boot log instead of a 503 at first request.
func ResolveKey(envVar, envFileName, envFilePath string) string {
	if envVar != "" {
		if v := strings.TrimSpace(os.Getenv(envVar)); v != "" {
			return v
		}
	}
	if envFilePath != "" {
		if v := readEnvFileVar(envFilePath, envFileName); v != "" {
			return v
		}
	}
	return ""
}

// readEnvFileVar reads a KEY=value style env file and returns the
// value of `name`. Returns "" on any error or missing key. Stops at
// first match. Blank lines and # comments are skipped; surrounding
// single or double quotes are stripped, with no escape handling.
// Same simple shape that systemd EnvironmentFile= reads.
func readEnvFileVar(path, name string) string {
	f, err := os.Open(path)
	if err != nil {
		return ""
	}
	defer f.Close()
	scanner := bufio.NewScanner(f)
	prefix := name + "="
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if strings.HasPrefix(line, prefix) {
			return strings.Trim(strings.TrimPrefix(line, prefix), `"'`)
		}
	}
	return ""
}