lakehouse/config/providers.toml
root 2f1b9c9768 phase-39+41: land promised artifacts — providers.toml, activation.rs, profiles/
Three PRD gaps closed in one coherent batch. All three were cosmetic or
scaffold-shaped before; they are now real files:

Phase 39 (PRD:57):
  + config/providers.toml — provider registry (name/base_url/auth/
    default_model) for ollama, ollama_cloud, openrouter. Commented
    stubs for gemini + claude pending adapter work. Secrets stay in
    /etc/lakehouse/secrets.toml or env, NEVER inline.
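
A minimal sketch of that "env or file, never inline" lookup, std-only. The
names `resolve_key` and `find_in_dotenv` are hypothetical, and the sketch
assumes fallback files hold dotenv-style `KEY=value` lines. The shipped
OpenRouter logic lives in resolve_openrouter_key() and may differ; JSON
fallback files are not handled here.

```rust
use std::env;
use std::fs;

/// Hypothetical helper: find `KEY=value` in dotenv-style text.
/// Skips blank lines and `#` comments, strips optional surrounding quotes.
fn find_in_dotenv(text: &str, key: &str) -> Option<String> {
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        if let Some((k, v)) = line.split_once('=') {
            if k.trim() == key {
                let v = v.trim().trim_matches('"').trim_matches('\'');
                if !v.is_empty() {
                    return Some(v.to_string());
                }
            }
        }
    }
    None
}

/// Hypothetical resolver: environment variable first, then each
/// fallback file in the order given. Returns None if nothing is found.
fn resolve_key(env_name: &str, fallback_files: &[&str]) -> Option<String> {
    if let Ok(v) = env::var(env_name) {
        if !v.trim().is_empty() {
            return Some(v.trim().to_string());
        }
    }
    for path in fallback_files {
        // Missing or unreadable files are skipped rather than treated as errors.
        if let Ok(text) = fs::read_to_string(path) {
            if let Some(v) = find_in_dotenv(&text, env_name) {
                return Some(v);
            }
        }
    }
    None
}
```

Returning `Option` keeps the "no key found" case explicit, so a caller can
refuse to start a bearer-auth provider instead of sending an empty token.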

Phase 41 (PRD:115):
  + crates/vectord/src/activation.rs — ActivationTracker with the
    PRD-named single-flight guard ("refuse new activation if one is
    pending/running"). Per-profile granularity — activating A doesn't
    block B. 5 tests cover the full state machine. Handler body stays
    in service.rs for now; tracker usage integration is a follow-up.
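
For orientation, a sketch of a per-profile single-flight guard of the kind
described above. It is not the code in activation.rs: the state names
(`Done`, `Failed`), the error type, and every method name here are
assumptions chosen for illustration.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical state set; only pending/running are named in the PRD quote.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ActivationState {
    Pending,
    Running,
    Done,
    Failed,
}

#[derive(Debug, PartialEq, Eq)]
enum ActivationError {
    AlreadyInFlight,
}

/// One state entry per profile name, so profiles never block each other.
#[derive(Default)]
struct ActivationTracker {
    states: Mutex<HashMap<String, ActivationState>>,
}

impl ActivationTracker {
    fn new() -> Self {
        Self::default()
    }

    /// Claim the slot for `profile`; refuse if one is pending or running.
    fn try_begin(&self, profile: &str) -> Result<(), ActivationError> {
        let mut states = self.states.lock().unwrap();
        let in_flight = matches!(
            states.get(profile),
            Some(ActivationState::Pending | ActivationState::Running)
        );
        if in_flight {
            return Err(ActivationError::AlreadyInFlight);
        }
        states.insert(profile.to_string(), ActivationState::Pending);
        Ok(())
    }

    /// Pending -> Running. Returns false if the profile was not pending.
    fn mark_running(&self, profile: &str) -> bool {
        let mut states = self.states.lock().unwrap();
        match states.get_mut(profile) {
            Some(s) if *s == ActivationState::Pending => {
                *s = ActivationState::Running;
                true
            }
            _ => false,
        }
    }

    /// Pending/Running -> Done or Failed, which frees the slot again.
    fn finish(&self, profile: &str, ok: bool) -> bool {
        let mut states = self.states.lock().unwrap();
        match states.get_mut(profile) {
            Some(s) if matches!(*s, ActivationState::Pending | ActivationState::Running) => {
                *s = if ok { ActivationState::Done } else { ActivationState::Failed };
                true
            }
            _ => false,
        }
    }

    fn state(&self, profile: &str) -> Option<ActivationState> {
        self.states.lock().unwrap().get(profile).copied()
    }
}
```

The check and the insert happen under one lock acquisition, which is what
makes the guard single-flight: two concurrent callers for the same profile
cannot both observe "not in flight".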

Phase 41 (PRD:113):
  + crates/shared/src/profiles/ with 4 submodules:
      * execution.rs — `pub use crate::types::ModelProfile as
        ExecutionProfile` (backward-compat rename per PRD)
      * retrieval.rs — top_k, rerank_top_k, freshness cutoff,
        playbook boost, sensitivity-gate enforcement
      * memory.rs — playbook boost ceiling, history cap, doc
        staleness, auto-retire-on-failure
      * observer.rs — failure cluster size, alert cooldown, ring
        size, langfuse forwarding
    All fields `#[serde(default)]` so existing ModelProfile files
    load unchanged.

Still open from the same phases:
  - Gemini + Claude provider adapters (Phase 40 — 100-200 LOC each)
  - Full activate_profile handler extraction into activation.rs
    (Phase 41 — module-structure refactor)
  - Catalogd CRUD endpoints for retrieval/memory/observer profiles
    (Phase 41 — exists at list level, no create/update/delete yet)
  - truth/ repo-root directory for file-backed rules (Phase 42 —
    TOML loader + schema)
  - crates/validator crate (Phase 43 — full greenfield)

Workspace warnings still at 0. 5 new tests, all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:32:40 -05:00


# Phase 39: Provider Registry
#
# Per-provider base_url, auth scheme, and default model. The gateway's
# /v1/chat dispatcher reads this file at boot to populate its provider
# table. Secrets (API keys) come from /etc/lakehouse/secrets.toml or
# environment variables — NEVER inline a key here.
#
# Adding a new provider:
#   1. New [[provider]] block with name, base_url, auth, default_model
#   2. Matching adapter at crates/aibridge/src/providers/<name>.rs
#      implementing the ProviderAdapter trait (chat + embed + unload)
#   3. Route arm in crates/gateway/src/v1/mod.rs matching on `name`
#   4. Model-prefix routing hint in resolve_provider() if the provider
#      uses an "<name>/..." model prefix (e.g. "openrouter/...")

[[provider]]
name = "ollama"
base_url = "http://localhost:3200"
auth = "none"
default_model = "qwen3.5:latest"
# Hot-path local inference. No bearer needed — Python sidecar on
# localhost handles the Ollama API. Model names are bare
# (e.g. "qwen3.5:latest", not "ollama/qwen3.5:latest").

[[provider]]
name = "ollama_cloud"
base_url = "https://ollama.com"
auth = "bearer"
auth_env = "OLLAMA_CLOUD_KEY"
default_model = "gpt-oss:120b"
# Cloud-tier Ollama. Key resolved from OLLAMA_CLOUD_KEY env at gateway
# boot. Model-prefix routing: "cloud/<model>" auto-routes here
# (see gateway::v1::resolve_provider).

[[provider]]
name = "openrouter"
base_url = "https://openrouter.ai/api/v1"
auth = "bearer"
auth_env = "OPENROUTER_API_KEY"
auth_fallback_files = ["/home/profit/.env", "/root/llm_team_config.json"]
default_model = "openai/gpt-oss-120b:free"
# Multi-provider gateway. Covers Anthropic, Google, OpenAI, MiniMax,
# Qwen, Gemma, etc. Key resolved via crates/gateway/src/v1/openrouter.rs
# resolve_openrouter_key() — env first, then fallback files.
# Model-prefix routing: "openrouter/<vendor>/<model>" auto-routes here,
# prefix stripped before upstream call.

# Planned (Phase 40 long-horizon; adapters not yet shipped):
#
# [[provider]]
# name = "gemini"
# base_url = "https://generativelanguage.googleapis.com/v1beta"
# auth = "api_key_query"
# auth_env = "GEMINI_API_KEY"
# default_model = "gemini-2.0-flash"
#
# [[provider]]
# name = "claude"
# base_url = "https://api.anthropic.com/v1"
# auth = "x_api_key"
# auth_env = "ANTHROPIC_API_KEY"
# default_model = "claude-3-5-sonnet-latest"
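
The model-prefix routing described in the provider comments above can be
sketched as follows. This is an illustration, not the body of
gateway::v1::resolve_provider: the signature and return shape are
assumptions, and stripping the "cloud/" prefix before the upstream call is
also an assumption (the comments only state that the "openrouter/" prefix
is stripped).

```rust
/// Hypothetical sketch: map a requested model string to
/// (provider name, model string to send upstream).
fn resolve_provider(model: &str) -> (&'static str, &str) {
    // "openrouter/<vendor>/<model>" routes to openrouter, prefix stripped.
    if let Some(rest) = model.strip_prefix("openrouter/") {
        return ("openrouter", rest);
    }
    // "cloud/<model>" routes to ollama_cloud.
    // Assumption: the prefix is stripped here as well.
    if let Some(rest) = model.strip_prefix("cloud/") {
        return ("ollama_cloud", rest);
    }
    // Bare names (e.g. "qwen3.5:latest") go to the local ollama provider.
    ("ollama", model)
}
```

In this sketch any unprefixed name falls through to the local provider, so
a vendor-style name such as "openai/gpt-oss-120b:free" only reaches
OpenRouter when it carries the "openrouter/" prefix.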