- sidecar: FastAPI app with /embed, /generate, /rerank hitting Ollama
- sidecar: Dockerfile, env var config (EMBED_MODEL, GEN_MODEL, RERANK_MODEL)
- aibridge: reqwest HTTP client with typed request/response structs
- aibridge: Axum proxy endpoints (POST /ai/embed, /ai/generate, /ai/rerank)
- gateway: wires AiClient with SIDECAR_URL env var
- e2e verified: nomic-embed-text returns 768d vectors, qwen2.5 generates text

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
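The e2e check mentioned in the commit (nomic-embed-text returning 768-dimensional vectors) could be reproduced roughly as follows. This is a sketch only: the sidecar's actual code is not shown here, so the helper names `build_embed_request` and `check_dimensions` are hypothetical, and the request shape assumes Ollama's `/api/embed` endpoint with a `model` and `input` body.

```python
import json
import urllib.request


def build_embed_request(ollama_url, model, texts):
    """Build (but do not send) a POST request for Ollama's /api/embed.

    Hypothetical helper: assumes the {"model", "input"} body shape.
    """
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{ollama_url.rstrip('/')}/api/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check_dimensions(embeddings, expected_dim=768):
    """Raise ValueError if any vector has the wrong length; return the count."""
    for i, vec in enumerate(embeddings):
        if len(vec) != expected_dim:
            raise ValueError(
                f"embedding {i} has {len(vec)} dimensions, expected {expected_dim}"
            )
    return len(embeddings)
```

To run it against a live Ollama instance, pass the request to `urllib.request.urlopen`, decode the JSON response, and hand its `embeddings` field to `check_dimensions`.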
18 lines
335 B
Docker
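# Slim Python base image for the FastAPI sidecar.
FROM python:3.13-slim

WORKDIR /app

# Install from pyproject.toml before copying the source, so this layer stays cached when only code changes.
COPY pyproject.toml .
RUN pip install --no-cache-dir .

COPY sidecar/ sidecar/

# Defaults; override at runtime, e.g. with `docker run -e GEN_MODEL=...`.
# host.docker.internal resolves to the host on Docker Desktop; on Linux it typically needs
# --add-host=host.docker.internal:host-gateway.
ENV OLLAMA_URL=http://host.docker.internal:11434
ENV EMBED_MODEL=nomic-embed-text
ENV GEN_MODEL=qwen2.5
ENV RERANK_MODEL=qwen2.5

# EXPOSE only documents the port; publish it with `-p 3200:3200`.
EXPOSE 3200

CMD ["uvicorn", "sidecar.main:app", "--host", "0.0.0.0", "--port", "3200"]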
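The ENV defaults above are presumably read by the sidecar at startup. A minimal sketch of what that could look like is below; the `SidecarConfig` class and `load_config` function are hypothetical names for illustration, not the sidecar's actual code. Only the variable names and default values come from the Dockerfile.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Defaults mirror the ENV lines in the Dockerfile.
DEFAULTS = {
    "OLLAMA_URL": "http://host.docker.internal:11434",
    "EMBED_MODEL": "nomic-embed-text",
    "GEN_MODEL": "qwen2.5",
    "RERANK_MODEL": "qwen2.5",
}


@dataclass(frozen=True)
class SidecarConfig:  # hypothetical container for the four settings
    ollama_url: str
    embed_model: str
    gen_model: str
    rerank_model: str


def load_config(env: Mapping[str, str] = os.environ) -> SidecarConfig:
    """Read settings from env, falling back to the defaults when unset or empty."""

    def get(name: str) -> str:
        return env.get(name) or DEFAULTS[name]

    return SidecarConfig(
        ollama_url=get("OLLAMA_URL").rstrip("/"),
        embed_model=get("EMBED_MODEL"),
        gen_model=get("GEN_MODEL"),
        rerank_model=get("RERANK_MODEL"),
    )
```

Passing the environment as a parameter keeps the loader easy to test: `load_config({"GEN_MODEL": "llama3"})` overrides one model and leaves the rest at the Dockerfile defaults.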