- sidecar: FastAPI app with /embed, /generate, /rerank hitting Ollama
- sidecar: Dockerfile, env var config (EMBED_MODEL, GEN_MODEL, RERANK_MODEL)
- aibridge: reqwest HTTP client with typed request/response structs
- aibridge: Axum proxy endpoints (POST /ai/embed, /ai/generate, /ai/rerank)
- gateway: wires AiClient with SIDECAR_URL env var
- e2e verified: nomic-embed-text returns 768d vectors, qwen2.5 generates text

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
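The env-var model selection above can be sketched in a few lines. This is a minimal sketch, not the sidecar's actual code: the `model_config` helper name is hypothetical, and only nomic-embed-text and qwen2.5 are named in the commit, so the rerank default is left empty rather than guessed.

```python
import os


def model_config() -> dict[str, str]:
    """Read the sidecar's model selection from the environment."""
    return {
        # Defaults taken from the models named in the e2e notes.
        "embed": os.environ.get("EMBED_MODEL", "nomic-embed-text"),
        "generate": os.environ.get("GEN_MODEL", "qwen2.5"),
        # No rerank model is named in the commit; no default assumed.
        "rerank": os.environ.get("RERANK_MODEL", ""),
    }


# Overriding via the environment, as the Dockerfile/env config would:
os.environ["GEN_MODEL"] = "qwen2.5:14b"
cfg = model_config()
```

Reading the environment at call time (rather than at import) keeps the config overridable in tests.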
"""Shared Ollama HTTP client."""
|
|
|
|
import os
|
|
|
|
import httpx
|
|
|
|
OLLAMA_URL = os.environ.get("OLLAMA_URL", "http://localhost:11434")
|
|
TIMEOUT = float(os.environ.get("OLLAMA_TIMEOUT", "120"))
|
|
|
|
|
|
def client() -> httpx.AsyncClient:
|
|
return httpx.AsyncClient(base_url=OLLAMA_URL, timeout=TIMEOUT)
|