Adds a Docker deploy target parallel to the systemd units that
landed in a59ef5b. A single image carries all 11 daemons;
docker-compose runs one container per daemon with the same
dependency graph as the systemd units. Useful when systemd isn't
available (Mac dev, remote VMs without root) or when isolation on a
private docker network is preferred.

Dockerfile (multi-stage):
  - Builder: golang:1.25-bookworm. DuckDB cgo needs gcc + glibc;
    alpine's musl doesn't link the official duckdb-go bindings cleanly.
  - Runtime: debian:bookworm-slim (same libc, much smaller surface).
    Adds ca-certificates (outbound HTTPS to Ollama Cloud/OpenRouter/
    OpenCode/Kimi), curl + jq (in-container healthchecks + smoke
    probes), and tini (PID 1 signal forwarding, so docker stop
    delivers SIGTERM to the daemon rather than to a wrapper).
  - Single image, multiple binaries. Ships all 11 cmd/* binaries
    plus 3 from scripts/ (staffing_workers, playbook_lift,
    multi_coord_stress) so deployed stacks can run reality tests
    against themselves.
  - Non-root runtime user (lakehouse, uid 999). Layout matches
    /usr/local/bin/lakehouse/<daemon> from REPLICATION.md.
  - ENTRYPOINT=tini; no default CMD, so operators / compose pick
    which daemon to run explicitly.

docker-compose.yml (11 services):
  - Same dependency graph as deploy/systemd/. depends_on with the
    service_healthy condition mirrors the Requires= relationships:
      catalogd → storaged
      ingestd  → storaged + catalogd
      queryd   → catalogd
      matrixd  → embedd + vectord
  - Gateway uses bare depends_on (no health condition), the Wants=
    equivalent, so restarting a single upstream doesn't cascade.
  - chatd has per-provider env_file entries (one each for
    ollama_cloud, openrouter, opencode, kimi). Missing files are
    silently OK, matching the systemd unit's EnvironmentFile=- list.
  - Persistent state lives on the lakehouse-state named volume; a
    commented-out driver_opts block shows how to bind it to a host
    path for off-volume backups.
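
The four service_healthy edges listed above imply a startup order. As a rough illustration (not part of this commit), the Go sketch below derives one valid order from exactly those edges with a depth-first topological sort. The deps map and the startOrder function are hypothetical names; services that do not appear in the listed edges are left out.

```go
package main

import (
	"fmt"
	"sort"
)

// deps mirrors the four service_healthy edges from the commit message:
// each key waits for every service in its slice to become healthy.
var deps = map[string][]string{
	"catalogd": {"storaged"},
	"ingestd":  {"storaged", "catalogd"},
	"queryd":   {"catalogd"},
	"matrixd":  {"embedd", "vectord"},
}

// startOrder (hypothetical) returns one valid startup order, dependencies
// first, or an error if the graph contains a cycle.
func startOrder(deps map[string][]string) ([]string, error) {
	// Collect every service name, including ones that only appear as upstreams.
	seen := map[string]bool{}
	var names []string
	add := func(n string) {
		if !seen[n] {
			seen[n] = true
			names = append(names, n)
		}
	}
	for svc, ups := range deps {
		add(svc)
		for _, up := range ups {
			add(up)
		}
	}
	sort.Strings(names) // map iteration is random; sort for stable output

	const (
		unvisited = iota
		visiting
		done
	)
	state := map[string]int{} // zero value means unvisited
	var order []string
	var visit func(n string) error
	visit = func(n string) error {
		switch state[n] {
		case done:
			return nil
		case visiting:
			return fmt.Errorf("dependency cycle at %s", n)
		}
		state[n] = visiting
		for _, up := range deps[n] {
			if err := visit(up); err != nil {
				return err
			}
		}
		state[n] = done
		order = append(order, n) // appended only after all upstreams
		return nil
	}
	for _, n := range names {
		if err := visit(n); err != nil {
			return nil, err
		}
	}
	return order, nil
}

func main() {
	order, err := startOrder(deps)
	if err != nil {
		panic(err)
	}
	fmt.Println(order)
}
```

Compose performs this ordering itself; the sketch is only meant to make the implied sequence (storaged before catalogd, catalogd before ingestd and queryd, embedd and vectord before matrixd) easy to check by eye.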

.dockerignore:
  - Excludes bin/, reports/, data/, git metadata, and .env files.
  - In particular excludes lakehouse.toml, secrets-go.toml, and
    auth.env so local dev configs don't accidentally get baked into
    a published image.

REPLICATION.md gains a Docker section between the systemd setup and
the logs section: a ten-line copy-paste from "git clone" to
"docker compose up -d", plus a docker-vs-systemd differences
table covering process supervision, logs, restart policy, file
ownership, host networking quirks, and backup targets.

Validation: docker compose config --quiet → exit 0 (with
placeholder env files in place).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
90 lines
3.6 KiB
Docker
# syntax=docker/dockerfile:1.6
#
# Multi-stage Dockerfile for Lakehouse-Go.
#
# Single image carries all 11 daemon binaries; docker-compose runs
# one container per daemon (matches the systemd unit topology in
# deploy/systemd/). Operators can also `docker run lakehouse-go
# /usr/local/bin/lakehouse/<daemon>` to invoke any one daemon
# directly.
#
# Builder uses golang:1.25-bookworm (DuckDB cgo needs gcc + glibc;
# alpine's musl doesn't link the official duckdb-go bindings cleanly).
# Runtime is debian:bookworm-slim: same libc, much smaller surface.
#
# Build:
#   docker build -t lakehouse-go:latest .
# Or with a tag:
#   docker build -t lakehouse-go:$(git rev-parse --short HEAD) .

# ── Stage 1: builder ────────────────────────────────────────────
FROM golang:1.25-bookworm AS builder

# build-essential pulls gcc + make + libc-dev; DuckDB cgo needs all three.
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
        ca-certificates \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /src

# Copy go.mod + go.sum first so module download is cacheable across
# source-only changes.
COPY go.mod go.sum ./
RUN go mod download

# Source.
COPY . .

# Build all 11 daemon binaries plus the scripts/ tools: staffing_workers
# (used by the multi_coord_stress harness), playbook_lift, and
# multi_coord_stress itself. They ship in the same image so operators
# can run reality tests against a deployed stack.
RUN go build -trimpath -o /out/ \
        ./cmd/storaged ./cmd/catalogd ./cmd/ingestd ./cmd/queryd \
        ./cmd/embedd ./cmd/vectord ./cmd/pathwayd ./cmd/observerd \
        ./cmd/matrixd ./cmd/gateway ./cmd/chatd \
        ./scripts/staffing_workers ./scripts/playbook_lift ./scripts/multi_coord_stress

# ── Stage 2: runtime ────────────────────────────────────────────
FROM debian:bookworm-slim

# CA certs for outbound HTTPS (Ollama Cloud, OpenRouter, OpenCode,
# Kimi). curl + jq for in-container health checks + smoke probes.
# tini handles PID 1 signal forwarding so docker stop sends SIGTERM
# to the actual daemon, not just to a wrapper.
RUN apt-get update && apt-get install -y --no-install-recommends \
        ca-certificates \
        curl \
        jq \
        tini \
    && rm -rf /var/lib/apt/lists/*

# Non-root runtime user; same name as the systemd User= directive
# in deploy/systemd/, so file ownership stays consistent across
# deployment modes (docker-compose vs systemd).
RUN groupadd --system --gid 999 lakehouse \
    && useradd --system --uid 999 --gid 999 \
        --no-create-home --shell /usr/sbin/nologin lakehouse

# Layout matches /usr/local/bin/lakehouse/<daemon> from REPLICATION.md
# so docs apply equally to systemd + docker deployments.
COPY --from=builder /out/* /usr/local/bin/lakehouse/

# /var/lib/lakehouse for pathway/observer JSONLs; /var/log/lakehouse
# in case operators want file logs in addition to docker logs.
RUN mkdir -p /var/lib/lakehouse/pathway /var/lib/lakehouse/observer /var/log/lakehouse \
    && chown -R lakehouse:lakehouse /var/lib/lakehouse /var/log/lakehouse

USER lakehouse
WORKDIR /var/lib/lakehouse

# No default CMD: operators (or docker-compose) MUST specify which
# daemon. Forces explicit topology rather than implicit "run
# everything in one container."
ENTRYPOINT ["/usr/bin/tini", "--"]

# Default healthcheck targets gateway's port. Per-service compose
# overrides point at each daemon's own port.
HEALTHCHECK --interval=10s --timeout=2s --start-period=5s --retries=3 \
    CMD curl -sSf http://127.0.0.1:3110/health || exit 1
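
The HEALTHCHECK at the end of the Dockerfile runs curl with -f, which exits non-zero on any HTTP status of 400 or above, so the probe only needs a /health route that returns 2xx once the daemon is usable. The Go sketch below shows what a compatible endpoint could look like. It is an assumption for illustration: healthHandler, the ready flag, and the JSON body are hypothetical and not taken from the daemons' actual code. The probe helper exercises the handler in-process so the example runs without opening a port.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// ready is flipped to true once the (hypothetical) daemon has finished startup.
var ready atomic.Bool

// healthHandler answers 200 when ready and 503 otherwise. curl -f treats the
// 503 as a failure, which is what marks the container unhealthy.
func healthHandler(w http.ResponseWriter, r *http.Request) {
	if !ready.Load() {
		http.Error(w, "starting", http.StatusServiceUnavailable)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	fmt.Fprint(w, `{"status":"ok"}`)
}

// probe performs one in-process GET /health and returns the status code,
// standing in for the curl call in the HEALTHCHECK.
func probe() int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/health", nil)
	healthHandler(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("before ready:", probe())
	ready.Store(true)
	fmt.Println("after ready:", probe())
}
```

In a real daemon the handler would be registered on the service's own mux and port; returning 503 during startup pairs naturally with the --start-period grace window in the HEALTHCHECK.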