# Lakehouse-Go: Replication Runbook

How to deploy Lakehouse-Go onto a fresh Linux host. Mirrors the layout the dev box uses; covers prereqs, secrets, systemd units, and validation.

## Prereqs

The host needs these external services reachable BEFORE the Lakehouse daemons can usefully start. None are managed by Lakehouse-Go's own units; they're operator infrastructure.

| Service | Purpose | Reachability |
|---|---|---|
| **Go 1.25+** | builds the binaries | `go version` returns ≥ 1.25 |
| **gcc** | DuckDB cgo (queryd) | `gcc --version` |
| **MinIO** (or AWS S3) | storaged backing store | `curl http://localhost:9000/minio/health/live` returns 200; bucket `lakehouse-go-primary` exists |
| **Ollama** | embedd + chatd LLM dispatch | `curl http://localhost:11434/api/tags` returns 200 with `nomic-embed-text-v2-moe` (or whatever `[embedd].default_model` names) loaded |
| **Langfuse** *(optional)* | trace + span observability | `curl http://localhost:3001/api/public/health` returns 200 |
| **PostgreSQL** *(optional)* | only if Langfuse is wanted | bundled with the Langfuse docker compose |

Bind ports the daemons use (G0 dev defaults; shifted by 10 from the Rust legacy on 3100/3201–3204 so both stacks coexist):

| Daemon | Port |
|---|---:|
| gateway | 3110 |
| storaged | 3211 |
| catalogd | 3212 |
| ingestd | 3213 |
| queryd | 3214 |
| vectord | 3215 |
| embedd | 3216 |
| pathwayd | 3217 |
| matrixd | 3218 |
| observerd | 3219 |
| chatd | 3220 |

## Bootstrap

### 1. User + directories

```bash
sudo useradd --system --no-create-home --shell /usr/sbin/nologin lakehouse
sudo mkdir -p /var/lib/lakehouse/{pathway,observer} /var/log/lakehouse \
  /usr/local/bin/lakehouse /etc/lakehouse
sudo chown -R lakehouse:lakehouse /var/lib/lakehouse /var/log/lakehouse
```

### 2. Build + install binaries

From a clone of the repo:

```bash
git clone https://git.agentview.dev/profit/golangLAKEHOUSE.git
cd golangLAKEHOUSE
just verify   # vet + tests + 9 core smokes (~31s)
go build -o bin/ ./cmd/...
# 11 binaries land in ./bin/
sudo cp bin/{gateway,storaged,catalogd,ingestd,queryd,vectord,embedd,pathwayd,observerd,matrixd,chatd} /usr/local/bin/lakehouse/
sudo chmod 755 /usr/local/bin/lakehouse/*
```

### 3. Config + secrets

```bash
# Main config: edit ports/URLs/model tier as needed
sudo cp lakehouse.toml /etc/lakehouse/lakehouse.toml

# S3 credentials: fill in real keys
sudo cp deploy/etc-lakehouse/secrets-go.toml.example /etc/lakehouse/secrets-go.toml
sudo chown root:lakehouse /etc/lakehouse/secrets-go.toml
sudo chmod 0640 /etc/lakehouse/secrets-go.toml
sudo $EDITOR /etc/lakehouse/secrets-go.toml   # set [s3.primary] keys

# Auth token: required ONLY if any daemon binds non-loopback
sudo cp deploy/etc-lakehouse/auth.env.example /etc/lakehouse/auth.env
sudo chown root:lakehouse /etc/lakehouse/auth.env
sudo chmod 0640 /etc/lakehouse/auth.env
# For non-loopback deploys, set:
#   AUTH_TOKEN=
sudo $EDITOR /etc/lakehouse/auth.env

# Optional: chatd cloud provider keys, one file per provider
# (each is its own EnvironmentFile so rotations don't restart all chatd)
for provider in ollama_cloud openrouter opencode kimi; do
  echo "${provider^^}_API_KEY=" | sudo tee "/etc/lakehouse/$provider.env" > /dev/null
  sudo chown root:lakehouse "/etc/lakehouse/$provider.env"
  sudo chmod 0640 "/etc/lakehouse/$provider.env"
done
sudo $EDITOR /etc/lakehouse/openrouter.env   # etc per provider you need
```

### 4. systemd units

```bash
sudo cp deploy/systemd/*.service deploy/systemd/*.target /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable lakehouse-go.target
sudo systemctl start lakehouse-go.target
```

### 5. Validation

```bash
# All 11 daemons should be active
systemctl status 'lakehouse-*.service' --no-pager | grep -E "Active|●"

# Health endpoints respond on each port
for port in 3110 3211 3212 3213 3214 3215 3216 3217 3218 3219 3220; do
  printf "%5d: " "$port"
  curl -sS --max-time 2 "http://127.0.0.1:$port/health" || echo "FAIL"
done

# Through the gateway: all chatd providers register (cloud keys present)
curl -sS http://127.0.0.1:3110/v1/chat/providers | jq

# End-to-end: ingest a tiny CSV → queryd SELECT → matrix.search
printf 'id,name,role\n1,Alice,Forklift Operator\n' > /tmp/probe.csv
curl -sS -F "file=@/tmp/probe.csv" "http://127.0.0.1:3110/v1/ingest?name=probe"
curl -sS -X POST http://127.0.0.1:3110/v1/sql \
  -H 'content-type: application/json' \
  -d '{"sql":"SELECT COUNT(*) FROM probe"}' | jq
```

## Auth posture

Per ADR-006:

- **Loopback-only deploy** (every daemon binds 127.0.0.1): no auth needed. Empty `AUTH_TOKEN` is fine. Network is the boundary.
- **Non-loopback deploy** (gateway exposed beyond loopback, daemons internal-private): set `AUTH_TOKEN` in `/etc/lakehouse/auth.env`. The mechanical gate at startup refuses to bind without one.
- **Multi-host deploy** (gateway + daemons on separate machines): set `AUTH_TOKEN` *and* `[auth].allowed_ips` in lakehouse.toml to the gateway's address. Both layers gate.
- **TLS**: terminate at nginx/Caddy in front of the gateway. The Go daemons speak HTTP; in-process TLS is explicitly out of scope per ADR-006 Decision 6.6.

## Token rotation

Per ADR-006 Decision 6.5 (dual-token window):

```bash
# 1. Generate new token
NEW=$(openssl rand -hex 32)

# 2. Add as secondary, keep old as primary
#    (sed only rewrites an existing line; add an empty
#    AUTH_SECONDARY_TOKEN= line to auth.env first if it is missing)
sudo sed -i "s|^AUTH_SECONDARY_TOKEN=.*|AUTH_SECONDARY_TOKEN=$NEW|" /etc/lakehouse/auth.env
sudo systemctl restart lakehouse-go.target

# 3. Update every caller to use NEW token

# 4. Promote: NEW becomes primary, secondary clears
sudo sed -i "s|^AUTH_TOKEN=.*|AUTH_TOKEN=$NEW|" /etc/lakehouse/auth.env
sudo sed -i "s|^AUTH_SECONDARY_TOKEN=.*|AUTH_SECONDARY_TOKEN=|" /etc/lakehouse/auth.env
sudo systemctl restart lakehouse-go.target
```

## Docker / docker-compose deploy (alternative to systemd)

The single-image `Dockerfile` carries all 11 daemons; `docker-compose.yml` runs one container per daemon with the same dependency graph as the systemd units. Useful when the host doesn't have systemd (Mac dev boxes, remote VMs without root) or when you want all of Lakehouse-Go isolated to a private docker network.

```bash
# Build the image (multi-stage; ~3 min on first build, ~30s with
# cached go module download).
docker build -t lakehouse-go:latest .

# Place config + secrets next to docker-compose.yml. The compose file
# bind-mounts these into every container at /etc/lakehouse/.
# lakehouse.toml is already in the repo root; edit it in place if needed.
cp deploy/etc-lakehouse/secrets-go.toml.example secrets-go.toml
chmod 0600 secrets-go.toml
cp deploy/etc-lakehouse/auth.env.example auth.env
chmod 0600 auth.env

# Per-provider chatd keys (each its own file so missing == provider
# unregistered, NOT chatd startup failure):
for p in ollama_cloud openrouter opencode kimi; do
  echo "${p^^}_API_KEY=" > "$p.env"
  chmod 0600 "$p.env"
done
# $EDITOR each file to fill in real values...

# Bring up the stack.
docker compose up -d
docker compose ps            # all 11 services Healthy
docker compose logs -f gateway

# Validate via the gateway like the systemd path.
curl -sS http://127.0.0.1:3110/v1/chat/providers | jq

# Tear down.
docker compose down
# State volume (pathway/observer JSONLs) survives `down`. To wipe:
docker compose down -v
```

### Key docker-vs-systemd differences

| Concern | systemd | docker-compose |
|---|---|---|
| Process supervision | systemd | tini + docker daemon |
| Logs | journald | `docker logs` (or routed to a sink via logging driver) |
| Restarts on failure | `Restart=on-failure` | `restart: unless-stopped` |
| File ownership | `User=lakehouse` (uid varies) | `user: 999:999` (uid is fixed in the image) |
| Reaches MinIO/Ollama | host network | host's address from inside the bridge network, typically `host.docker.internal` (Mac/Win) or `172.17.0.1` (Linux). Set `[s3].endpoint` + `[embedd].provider_url` accordingly. |
| Backup target | `/var/lib/lakehouse/` on host | the `lakehouse-state` named volume; bind to a host path via the commented-out `driver_opts` in compose if needed |

## Logs

systemd routes everything to journald with a per-daemon SyslogIdentifier:

```bash
journalctl -u lakehouse-gateway.service -f
journalctl -u 'lakehouse-*.service' --since '5 min ago'
```

## Stopping

```bash
sudo systemctl stop lakehouse-go.target   # cascades to all 11 daemons
```

## Backup / state preservation

| Path | What | Backup priority |
|---|---|---|
| `/var/lib/lakehouse/pathway/state.jsonl` | Mem0 trace store (append-only) | high |
| `/var/lib/lakehouse/observer/ops.jsonl` | observer ring's persistor backup | medium |
| MinIO `lakehouse-go-primary` bucket | parquets, vector LHV1 indexes, catalog manifests | high |
| `/etc/lakehouse/lakehouse.toml` | service config | medium |
| `/etc/lakehouse/secrets-go.toml` + `*.env` | secrets | high (in your secrets manager, not on disk) |

## Troubleshooting

**Daemon refuses to start with "refuse non-loopback bind without auth.token"**

ADR-006 6.1 mechanical gate. Set `AUTH_TOKEN` in `/etc/lakehouse/auth.env` or bind back to loopback.

**Daemon refuses to start with "refusing non-loopback bind ... see audit R-001"**

The previous loopback-bind gate. For dev: `LH__ALLOW_NONLOOPBACK=1` overrides. For prod: set `AUTH_TOKEN` AND keep the override (or move to loopback + reverse-proxy).

**catalogd 500 / NoSuchBucket**

storaged is pointing at a bucket that doesn't exist. Either create the bucket in MinIO or fix `[s3].bucket` in lakehouse.toml.

**embedd 502 on /v1/embed**

Ollama is not running OR `[embedd].default_model` is not loaded. Run `ollama list` to verify; `ollama pull nomic-embed-text-v2-moe` to load.

**chatd `/v1/chat/providers` shows `false` for cloud providers**

The provider's env file is missing or empty. Check `/etc/lakehouse/.env`.

**queryd unable to read parquet**

Check that `[queryd].secrets_path` points at the right secrets-go.toml AND that the file's owner+mode allow the lakehouse user to read it.

## Related docs

- `STATE_OF_PLAY.md`: verified-working snapshot
- `docs/DECISIONS.md`: all ADRs, especially ADR-003 (auth substrate) + ADR-006 (auth posture)
- `docs/SPEC.md` §1: component table
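
## Appendix: token rotation helper (sketch)

The `sed` commands under Token rotation only rewrite a line that already exists, so if `auth.env` has no `AUTH_SECONDARY_TOKEN=` line the staging step silently changes nothing. The following is a minimal sketch of a replace-or-append helper that avoids that failure mode. The function names (`set_env_key`, `get_env_key`, `rotate_stage`, `rotate_promote`) are hypothetical and not part of the Lakehouse-Go repo; only the `AUTH_TOKEN` and `AUTH_SECONDARY_TOKEN` keys come from this runbook. It assumes plain `KEY=value` lines with no quoting.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the dual-token rotation described in this runbook.
# Assumes an env file made of plain KEY=value lines.

# get_env_key FILE KEY: print the value of KEY (last occurrence wins).
get_env_key() {
  sed -n "s/^$2=//p" "$1" | tail -n 1
}

# set_env_key FILE KEY VALUE: replace KEY's line, or append it if absent.
# Writes back through `cat >` so the file keeps its owner and mode.
# Note: the KEY line moves to the end of the file.
set_env_key() {
  local file=$1 key=$2 value=$3 tmp
  [ -f "$file" ] || { echo "missing file: $file" >&2; return 1; }
  tmp=$(mktemp) || return 1
  # grep -v exits 1 when nothing is left over; that is not an error here.
  grep -v "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# rotate_stage FILE NEW: keep the old primary, accept NEW as secondary.
rotate_stage() {
  set_env_key "$1" AUTH_SECONDARY_TOKEN "$2"
}

# rotate_promote FILE: the staged secondary becomes primary, secondary clears.
rotate_promote() {
  local file=$1 new
  new=$(get_env_key "$file" AUTH_SECONDARY_TOKEN)
  [ -n "$new" ] || { echo "no secondary token staged" >&2; return 1; }
  set_env_key "$file" AUTH_TOKEN "$new"
  set_env_key "$file" AUTH_SECONDARY_TOKEN ""
}
```

To use it, source the file in a root shell, run `rotate_stage /etc/lakehouse/auth.env "$(openssl rand -hex 32)"`, restart `lakehouse-go.target`, update callers, then run `rotate_promote /etc/lakehouse/auth.env` and restart again. Reading the staged value back from the file in `rotate_promote` means the new token does not need to survive in a shell variable between the two steps.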