diff --git a/REPLICATION.md b/REPLICATION.md new file mode 100644 index 0000000..2b80d71 --- /dev/null +++ b/REPLICATION.md @@ -0,0 +1,201 @@ +# Lakehouse-Go — Replication Runbook + +How to deploy Lakehouse-Go onto a fresh Linux host. Mirrors the layout +the dev box uses; covers prereqs, secrets, systemd units, validation. + +## Prereqs + +The host needs these external services reachable BEFORE the Lakehouse +daemons can usefully start. None are managed by Lakehouse-Go's own +units; they're operator infrastructure. + +| Service | Purpose | Reachability | +|---|---|---| +| **Go 1.25+** | builds the binaries | `go version` returns ≥ 1.25 | +| **gcc** | DuckDB cgo (queryd) | `gcc --version` | +| **MinIO** (or AWS S3) | storaged backing store | `curl http://localhost:9000/minio/health/live` returns 200; bucket `lakehouse-go-primary` exists | +| **Ollama** | embedd + chatd LLM dispatch | `curl http://localhost:11434/api/tags` returns 200 with `nomic-embed-text-v2-moe` (or whatever `[embedd].default_model` names) loaded | +| **Langfuse** *(optional)* | trace + span observability | `curl http://localhost:3001/api/public/health` returns 200 | +| **PostgreSQL** *(optional)* | only if Langfuse is wanted | bundled with the Langfuse docker compose | + +Bind ports the daemons use (G0 dev defaults; shifted by 10 from the +Rust legacy on 3100/3201–3204 so both stacks coexist): + +| Daemon | Port | +|---|---:| +| gateway | 3110 | +| storaged | 3211 | +| catalogd | 3212 | +| ingestd | 3213 | +| queryd | 3214 | +| vectord | 3215 | +| embedd | 3216 | +| pathwayd | 3217 | +| matrixd | 3218 | +| observerd | 3219 | +| chatd | 3220 | + +## Bootstrap + +### 1. User + directories + +```bash +sudo useradd --system --no-create-home --shell /usr/sbin/nologin lakehouse +sudo mkdir -p /var/lib/lakehouse/{pathway,observer} /var/log/lakehouse \ + /usr/local/bin/lakehouse /etc/lakehouse +sudo chown -R lakehouse:lakehouse /var/lib/lakehouse /var/log/lakehouse +``` + +### 2. 
Build + install binaries + +From a clone of the repo: + +```bash +git clone https://git.agentview.dev/profit/golangLAKEHOUSE.git +cd golangLAKEHOUSE +just verify # vet + tests + 9 core smokes — ~31s +go build -o bin/ ./cmd/... # 11 binaries land in ./bin/ +sudo cp bin/{gateway,storaged,catalogd,ingestd,queryd,vectord,embedd,pathwayd,observerd,matrixd,chatd} /usr/local/bin/lakehouse/ +sudo chmod 755 /usr/local/bin/lakehouse/* +``` + +### 3. Config + secrets + +```bash +# Main config — edit ports/URLs/model tier as needed +sudo cp lakehouse.toml /etc/lakehouse/lakehouse.toml + +# S3 credentials — fill in real keys +sudo cp deploy/etc-lakehouse/secrets-go.toml.example /etc/lakehouse/secrets-go.toml +sudo chown root:lakehouse /etc/lakehouse/secrets-go.toml +sudo chmod 0640 /etc/lakehouse/secrets-go.toml +sudo $EDITOR /etc/lakehouse/secrets-go.toml # set [s3.primary] keys + +# Auth token — required ONLY if any daemon binds non-loopback +sudo cp deploy/etc-lakehouse/auth.env.example /etc/lakehouse/auth.env +sudo chown root:lakehouse /etc/lakehouse/auth.env +sudo chmod 0640 /etc/lakehouse/auth.env +# For non-loopback deploys, set: +# AUTH_TOKEN= +sudo $EDITOR /etc/lakehouse/auth.env + +# Optional: chatd cloud provider keys, one file per provider +# (each is its own EnvironmentFile so rotations don't restart all chatd) +for provider in ollama_cloud openrouter opencode kimi; do + echo "${provider^^}_API_KEY=" | sudo tee /etc/lakehouse/$provider.env > /dev/null + sudo chown root:lakehouse /etc/lakehouse/$provider.env + sudo chmod 0640 /etc/lakehouse/$provider.env +done +sudo $EDITOR /etc/lakehouse/openrouter.env # etc per provider you need +``` + +### 4. systemd units + +```bash +sudo cp deploy/systemd/*.service deploy/systemd/*.target /etc/systemd/system/ +sudo systemctl daemon-reload +sudo systemctl enable lakehouse-go.target lakehouse-{gateway,storaged,catalogd,ingestd,queryd,vectord,embedd,pathwayd,observerd,matrixd,chatd}.service +sudo systemctl start lakehouse-go.target # pulls in every enabled daemon via Wants= +``` + +### 5. 
Validation + +```bash +# All 11 daemons should be active +systemctl status 'lakehouse-*.service' --no-pager | grep -E "Active|●" + +# Health endpoints respond on each port +for port in 3110 3211 3212 3213 3214 3215 3216 3217 3218 3219 3220; do + printf "%5d: " "$port" + curl -sS -f --max-time 2 "http://127.0.0.1:$port/health" || echo "FAIL" +done + +# Through the gateway: all chatd providers register (cloud keys present) +curl -sS http://127.0.0.1:3110/v1/chat/providers | jq + +# End-to-end: ingest a tiny CSV → queryd SELECT +echo -e "id,name,role\n1,Alice,Forklift Operator" > /tmp/probe.csv +curl -sS -F "file=@/tmp/probe.csv" "http://127.0.0.1:3110/v1/ingest?name=probe" +curl -sS -X POST http://127.0.0.1:3110/v1/sql \ + -H 'content-type: application/json' \ + -d '{"sql":"SELECT COUNT(*) FROM probe"}' | jq +``` + +## Auth posture + +Per ADR-006: + +- **Loopback-only deploy** (every daemon binds 127.0.0.1): no auth needed. Empty `AUTH_TOKEN` is fine. Network is the boundary. +- **Non-loopback deploy** (gateway exposed beyond loopback, daemons internal-private): set `AUTH_TOKEN` in `/etc/lakehouse/auth.env`. The mechanical gate at startup refuses to bind without one. +- **Multi-host deploy** (gateway + daemons on separate machines): set `AUTH_TOKEN` *and* `[auth].allowed_ips` in lakehouse.toml to the gateway's address. Both layers gate. +- **TLS**: terminate at nginx/Caddy in front of the gateway. The Go daemons speak HTTP; in-process TLS is explicitly out of scope per ADR-006 Decision 6.6. + +## Token rotation + +Per ADR-006 Decision 6.5 — dual-token window: + +```bash +# 1. Generate new token +NEW=$(openssl rand -hex 32) + +# 2. Add as secondary, keep old as primary (the template ships this line commented out) +sudo sed -i -E "s|^#? *AUTH_SECONDARY_TOKEN=.*|AUTH_SECONDARY_TOKEN=$NEW|" /etc/lakehouse/auth.env +sudo systemctl restart 'lakehouse-*.service' # restarting the target alone leaves running daemons untouched + +# 3. Update every caller to use NEW token +# 4. 
Promote: NEW becomes primary, secondary clears +sudo sed -i "s|^AUTH_TOKEN=.*|AUTH_TOKEN=$NEW|" /etc/lakehouse/auth.env +sudo sed -i "s|^AUTH_SECONDARY_TOKEN=.*|AUTH_SECONDARY_TOKEN=|" /etc/lakehouse/auth.env +sudo systemctl restart 'lakehouse-*.service' +``` + +## Logs + +systemd routes everything to journald with per-daemon SyslogIdentifier: + +```bash +journalctl -u lakehouse-gateway.service -f +journalctl -u 'lakehouse-*.service' --since '5 min ago' +``` + +## Stopping + +```bash +sudo systemctl stop lakehouse-go.target 'lakehouse-*.service' # target plus all 11 daemons; stopping the target alone does not stop them +``` + +## Backup / state preservation + +| Path | What | Backup priority | +|---|---|---| +| `/var/lib/lakehouse/pathway/state.jsonl` | Mem0 trace store (append-only) | high | +| `/var/lib/lakehouse/observer/ops.jsonl` | observer ring's persistor backup | medium | +| MinIO `lakehouse-go-primary` bucket | parquets, vector LHV1 indexes, catalog manifests | high | +| `/etc/lakehouse/lakehouse.toml` | service config | medium | +| `/etc/lakehouse/secrets-go.toml` + `*.env` | secrets | high (in your secrets manager, not on disk) | + +## Troubleshooting + +**Daemon refuses to start with "refuse non-loopback bind without auth.token"** +ADR-006 6.1 mechanical gate. Set `AUTH_TOKEN` in `/etc/lakehouse/auth.env` or bind back to loopback. + +**Daemon refuses to start with "refusing non-loopback bind ... see audit R-001"** +The previous loopback-bind gate. For dev: `LH__ALLOW_NONLOOPBACK=1` overrides. For prod: set `AUTH_TOKEN` AND keep the override (or move to loopback + reverse-proxy). + +**catalogd 500 / NoSuchBucket** +storaged is pointing at a bucket that doesn't exist. Either create the bucket in MinIO or fix `[s3].bucket` in lakehouse.toml. + +**embedd 502 on /v1/embed** +Ollama not running OR `[embedd].default_model` not loaded. `ollama list` to verify; `ollama pull nomic-embed-text-v2-moe` to load. + +**chatd `/v1/chat/providers` shows `false` for cloud providers** +The provider's env file is missing or empty. 
Check `/etc/lakehouse/.env`. + +**queryd unable to read parquet** +Check `[queryd].secrets_path` points at the right secrets-go.toml AND the file's owner+mode allow the lakehouse user to read. + +## Related docs + +- `STATE_OF_PLAY.md` — verified-working snapshot +- `docs/DECISIONS.md` — all ADRs, especially ADR-003 (auth substrate) + ADR-006 (auth posture) +- `docs/SPEC.md` §1 — component table diff --git a/deploy/etc-lakehouse/auth.env.example b/deploy/etc-lakehouse/auth.env.example new file mode 100644 index 0000000..9c4028c --- /dev/null +++ b/deploy/etc-lakehouse/auth.env.example @@ -0,0 +1,23 @@ +# /etc/lakehouse/auth.env — inter-service auth token per ADR-006. +# +# Mode 0600, owned by the lakehouse user (or root if systemd reads it +# before dropping privileges via User=). Loaded by every daemon's +# systemd unit via EnvironmentFile=-/etc/lakehouse/auth.env (the `-` +# prefix means "missing file is OK" so loopback-only deploys can skip +# this entirely). +# +# When the daemon binds non-loopback (anything other than 127.0.0.0/8 +# or ::1), AUTH_TOKEN MUST be set — otherwise shared.Run refuses to +# start (R-001 + R-007 mechanical gate). Loopback-only deploys can +# leave this empty. +# +# Token rotation (ADR-006 Decision 6.5): +# 1. Generate new secret +# 2. Set AUTH_SECONDARY_TOKEN to new secret while AUTH_TOKEN stays +# on old (lakehouse.toml [auth].secondary_tokens reads this) +# 3. Update every caller to use new secret +# 4. Promote: AUTH_TOKEN=, clear AUTH_SECONDARY_TOKEN +# 5. Restart daemons (or SIGHUP once hot-reload lands) + +AUTH_TOKEN= +# AUTH_SECONDARY_TOKEN= # only set during rotation windows diff --git a/deploy/etc-lakehouse/secrets-go.toml.example b/deploy/etc-lakehouse/secrets-go.toml.example new file mode 100644 index 0000000..4bc6c9b --- /dev/null +++ b/deploy/etc-lakehouse/secrets-go.toml.example @@ -0,0 +1,25 @@ +# /etc/lakehouse/secrets-go.toml — per-bucket S3 credentials. 
+# +# Mode 0600, root-owned (storaged + queryd both need to read it; they +# run as the lakehouse user, so chgrp lakehouse + 0640 is fine if the +# group is restricted). +# +# Schema is one [s3.] block per bucket the +# storaged BucketRegistry should serve. G0 is single-bucket +# ("primary" — see cmd/storaged/main.go primaryBucket const). G2 +# multi-bucket federation will add more entries here. +# +# This file is DELIBERATELY NOT in version control. Operators copy +# this template, fill in real credentials, and place at +# /etc/lakehouse/secrets-go.toml. The committed lakehouse.toml [s3] +# block has bucket= + endpoint= + region= +# — only the credentials live here. + +[s3.primary] +access_key_id = "REPLACE_ME" +secret_access_key = "REPLACE_ME" + +# Future G2 example — multiple buckets: +# [s3.archive] +# access_key_id = "..." +# secret_access_key = "..." diff --git a/deploy/systemd/lakehouse-catalogd.service b/deploy/systemd/lakehouse-catalogd.service new file mode 100644 index 0000000..7deba9f --- /dev/null +++ b/deploy/systemd/lakehouse-catalogd.service @@ -0,0 +1,30 @@ +[Unit] +Description=Lakehouse-Go catalogd — Parquet manifest registry +Documentation=https://git.agentview.dev/profit/golangLAKEHOUSE +After=network-online.target lakehouse-storaged.service +Wants=network-online.target +Requires=lakehouse-storaged.service + +[Service] +Type=simple +User=lakehouse +Group=lakehouse +WorkingDirectory=/var/lib/lakehouse +ExecStart=/usr/local/bin/lakehouse/catalogd -config /etc/lakehouse/lakehouse.toml +Restart=on-failure +RestartSec=5 + +EnvironmentFile=-/etc/lakehouse/auth.env + +NoNewPrivileges=true +ProtectSystem=strict +ProtectHome=true +PrivateTmp=true +ReadWritePaths=/var/lib/lakehouse /var/log/lakehouse + +StandardOutput=journal +StandardError=journal +SyslogIdentifier=lakehouse-catalogd + +[Install] +WantedBy=lakehouse-go.target diff --git a/deploy/systemd/lakehouse-chatd.service b/deploy/systemd/lakehouse-chatd.service new file mode 100644 index 
0000000..142774e --- /dev/null +++ b/deploy/systemd/lakehouse-chatd.service @@ -0,0 +1,41 @@ +[Unit] +Description=Lakehouse-Go chatd — multi-provider LLM dispatcher +Documentation=https://git.agentview.dev/profit/golangLAKEHOUSE +After=network-online.target +Wants=network-online.target +# Operator prereq: Ollama on localhost:11434 for the bare/ollama/ +# providers; cloud providers (ollama_cloud, openrouter, opencode, +# kimi) read keys from /etc/lakehouse/.env per chatd +# config. Missing key files leave that provider unregistered (404 +# at first call, never 503). + +[Service] +Type=simple +User=lakehouse +Group=lakehouse +WorkingDirectory=/var/lib/lakehouse +ExecStart=/usr/local/bin/lakehouse/chatd -config /etc/lakehouse/lakehouse.toml +Restart=on-failure +RestartSec=5 + +EnvironmentFile=-/etc/lakehouse/auth.env +# chatd reads provider key files via paths in lakehouse.toml [chatd] +# (ollama_cloud_key_file etc.) — each is its own EnvironmentFile so +# operators can rotate one provider without restarting others. 
+EnvironmentFile=-/etc/lakehouse/ollama_cloud.env +EnvironmentFile=-/etc/lakehouse/openrouter.env +EnvironmentFile=-/etc/lakehouse/opencode.env +EnvironmentFile=-/etc/lakehouse/kimi.env + +NoNewPrivileges=true +ProtectSystem=strict +ProtectHome=true +PrivateTmp=true +ReadWritePaths=/var/lib/lakehouse /var/log/lakehouse + +StandardOutput=journal +StandardError=journal +SyslogIdentifier=lakehouse-chatd + +[Install] +WantedBy=lakehouse-go.target diff --git a/deploy/systemd/lakehouse-embedd.service b/deploy/systemd/lakehouse-embedd.service new file mode 100644 index 0000000..8d917c5 --- /dev/null +++ b/deploy/systemd/lakehouse-embedd.service @@ -0,0 +1,33 @@ +[Unit] +Description=Lakehouse-Go embedd — text→vector via Ollama +Documentation=https://git.agentview.dev/profit/golangLAKEHOUSE +After=network-online.target +Wants=network-online.target +# Operator prereq: Ollama running at the URL in lakehouse.toml +# [embedd].provider_url, with default_model loaded (e.g. +# `ollama pull nomic-embed-text-v2-moe`). Not a systemd unit we +# control; embedd surfaces unreachable-Ollama as 502 at request time. 
+ +[Service] +Type=simple +User=lakehouse +Group=lakehouse +WorkingDirectory=/var/lib/lakehouse +ExecStart=/usr/local/bin/lakehouse/embedd -config /etc/lakehouse/lakehouse.toml +Restart=on-failure +RestartSec=5 + +EnvironmentFile=-/etc/lakehouse/auth.env + +NoNewPrivileges=true +ProtectSystem=strict +ProtectHome=true +PrivateTmp=true +ReadWritePaths=/var/lib/lakehouse /var/log/lakehouse + +StandardOutput=journal +StandardError=journal +SyslogIdentifier=lakehouse-embedd + +[Install] +WantedBy=lakehouse-go.target diff --git a/deploy/systemd/lakehouse-gateway.service b/deploy/systemd/lakehouse-gateway.service new file mode 100644 index 0000000..63c17ea --- /dev/null +++ b/deploy/systemd/lakehouse-gateway.service @@ -0,0 +1,32 @@ +[Unit] +Description=Lakehouse-Go gateway — single OpenAI-compat-shaped edge proxy +Documentation=https://git.agentview.dev/profit/golangLAKEHOUSE +After=network-online.target lakehouse-storaged.service lakehouse-catalogd.service lakehouse-ingestd.service lakehouse-queryd.service lakehouse-vectord.service lakehouse-embedd.service lakehouse-pathwayd.service lakehouse-observerd.service lakehouse-matrixd.service lakehouse-chatd.service +Wants=network-online.target lakehouse-storaged.service lakehouse-catalogd.service lakehouse-ingestd.service lakehouse-queryd.service lakehouse-vectord.service lakehouse-embedd.service lakehouse-pathwayd.service lakehouse-observerd.service lakehouse-matrixd.service lakehouse-chatd.service +# gateway is the public-facing edge — if any upstream is down, the +# proxy returns 502 at request time. Wants= (not Requires=) so a +# single upstream restart doesn't cascade-restart the gateway. 
+ +[Service] +Type=simple +User=lakehouse +Group=lakehouse +WorkingDirectory=/var/lib/lakehouse +ExecStart=/usr/local/bin/lakehouse/gateway -config /etc/lakehouse/lakehouse.toml +Restart=on-failure +RestartSec=5 + +EnvironmentFile=-/etc/lakehouse/auth.env + +NoNewPrivileges=true +ProtectSystem=strict +ProtectHome=true +PrivateTmp=true +ReadWritePaths=/var/lib/lakehouse /var/log/lakehouse + +StandardOutput=journal +StandardError=journal +SyslogIdentifier=lakehouse-gateway + +[Install] +WantedBy=lakehouse-go.target diff --git a/deploy/systemd/lakehouse-go.target b/deploy/systemd/lakehouse-go.target new file mode 100644 index 0000000..f1aeba3 --- /dev/null +++ b/deploy/systemd/lakehouse-go.target @@ -0,0 +1,11 @@ +[Unit] +Description=Lakehouse-Go — all 11 daemons (gateway + 10 backing services) +Documentation=https://git.agentview.dev/profit/golangLAKEHOUSE +# A single target operators can `systemctl start` instead of starting +# each daemon individually. Per-daemon units list this target in their +# [Install].WantedBy=, so once each daemon unit is enabled, a target +# start cascades through Wants=. Enabling or stopping the target alone +# does not enable or stop the daemon units (no PartOf= is declared). 
+ +[Install] +WantedBy=multi-user.target diff --git a/deploy/systemd/lakehouse-ingestd.service b/deploy/systemd/lakehouse-ingestd.service new file mode 100644 index 0000000..d5323ae --- /dev/null +++ b/deploy/systemd/lakehouse-ingestd.service @@ -0,0 +1,30 @@ +[Unit] +Description=Lakehouse-Go ingestd — CSV → Parquet → catalogd registration +Documentation=https://git.agentview.dev/profit/golangLAKEHOUSE +After=network-online.target lakehouse-storaged.service lakehouse-catalogd.service +Wants=network-online.target +Requires=lakehouse-storaged.service lakehouse-catalogd.service + +[Service] +Type=simple +User=lakehouse +Group=lakehouse +WorkingDirectory=/var/lib/lakehouse +ExecStart=/usr/local/bin/lakehouse/ingestd -config /etc/lakehouse/lakehouse.toml +Restart=on-failure +RestartSec=5 + +EnvironmentFile=-/etc/lakehouse/auth.env + +NoNewPrivileges=true +ProtectSystem=strict +ProtectHome=true +PrivateTmp=true +ReadWritePaths=/var/lib/lakehouse /var/log/lakehouse + +StandardOutput=journal +StandardError=journal +SyslogIdentifier=lakehouse-ingestd + +[Install] +WantedBy=lakehouse-go.target diff --git a/deploy/systemd/lakehouse-matrixd.service b/deploy/systemd/lakehouse-matrixd.service new file mode 100644 index 0000000..9b868e3 --- /dev/null +++ b/deploy/systemd/lakehouse-matrixd.service @@ -0,0 +1,30 @@ +[Unit] +Description=Lakehouse-Go matrixd — multi-corpus retrieve+merge with Shape B playbook +Documentation=https://git.agentview.dev/profit/golangLAKEHOUSE +After=network-online.target lakehouse-embedd.service lakehouse-vectord.service +Wants=network-online.target +Requires=lakehouse-embedd.service lakehouse-vectord.service + +[Service] +Type=simple +User=lakehouse +Group=lakehouse +WorkingDirectory=/var/lib/lakehouse +ExecStart=/usr/local/bin/lakehouse/matrixd -config /etc/lakehouse/lakehouse.toml +Restart=on-failure +RestartSec=5 + +EnvironmentFile=-/etc/lakehouse/auth.env + +NoNewPrivileges=true +ProtectSystem=strict +ProtectHome=true +PrivateTmp=true 
+ReadWritePaths=/var/lib/lakehouse /var/log/lakehouse + +StandardOutput=journal +StandardError=journal +SyslogIdentifier=lakehouse-matrixd + +[Install] +WantedBy=lakehouse-go.target diff --git a/deploy/systemd/lakehouse-observerd.service b/deploy/systemd/lakehouse-observerd.service new file mode 100644 index 0000000..fe01fd6 --- /dev/null +++ b/deploy/systemd/lakehouse-observerd.service @@ -0,0 +1,34 @@ +[Unit] +Description=Lakehouse-Go observerd — witness ring + workflow runner + inbox +Documentation=https://git.agentview.dev/profit/golangLAKEHOUSE +After=network-online.target +Wants=network-online.target +# observerd CAN call matrixd (workflow modes that hit matrix.search) +# but doesn't strictly require it — modes that fail at startup are +# logged and the daemon keeps running. So no Requires= here. + +[Service] +Type=simple +User=lakehouse +Group=lakehouse +WorkingDirectory=/var/lib/lakehouse +ExecStart=/usr/local/bin/lakehouse/observerd -config /etc/lakehouse/lakehouse.toml +Restart=on-failure +RestartSec=5 + +EnvironmentFile=-/etc/lakehouse/auth.env + +NoNewPrivileges=true +ProtectSystem=strict +ProtectHome=true +PrivateTmp=true +# observerd's [observerd].persist_path defaults under +# /var/lib/lakehouse/observer/ for ops.jsonl persistence. 
+ReadWritePaths=/var/lib/lakehouse /var/log/lakehouse + +StandardOutput=journal +StandardError=journal +SyslogIdentifier=lakehouse-observerd + +[Install] +WantedBy=lakehouse-go.target diff --git a/deploy/systemd/lakehouse-pathwayd.service b/deploy/systemd/lakehouse-pathwayd.service new file mode 100644 index 0000000..ee9a0e4 --- /dev/null +++ b/deploy/systemd/lakehouse-pathwayd.service @@ -0,0 +1,31 @@ +[Unit] +Description=Lakehouse-Go pathwayd — Mem0-style versioned trace store +Documentation=https://git.agentview.dev/profit/golangLAKEHOUSE +After=network-online.target +Wants=network-online.target + +[Service] +Type=simple +User=lakehouse +Group=lakehouse +WorkingDirectory=/var/lib/lakehouse +ExecStart=/usr/local/bin/lakehouse/pathwayd -config /etc/lakehouse/lakehouse.toml +Restart=on-failure +RestartSec=5 + +EnvironmentFile=-/etc/lakehouse/auth.env + +NoNewPrivileges=true +ProtectSystem=strict +ProtectHome=true +PrivateTmp=true +# pathwayd's [pathwayd].persist_path defaults under /var/lib/lakehouse/pathway/ +# in production for JSONL append-only persistence. +ReadWritePaths=/var/lib/lakehouse /var/log/lakehouse + +StandardOutput=journal +StandardError=journal +SyslogIdentifier=lakehouse-pathwayd + +[Install] +WantedBy=lakehouse-go.target diff --git a/deploy/systemd/lakehouse-queryd.service b/deploy/systemd/lakehouse-queryd.service new file mode 100644 index 0000000..9e62822 --- /dev/null +++ b/deploy/systemd/lakehouse-queryd.service @@ -0,0 +1,33 @@ +[Unit] +Description=Lakehouse-Go queryd — DuckDB-over-S3 SQL surface +Documentation=https://git.agentview.dev/profit/golangLAKEHOUSE +After=network-online.target lakehouse-catalogd.service +Wants=network-online.target +Requires=lakehouse-catalogd.service + +[Service] +Type=simple +User=lakehouse +Group=lakehouse +WorkingDirectory=/var/lib/lakehouse +# queryd's DuckDB httpfs path needs S3 credentials; secrets_path +# in lakehouse.toml [queryd] points at /etc/lakehouse/secrets-go.toml +# which storaged also reads. 
+ExecStart=/usr/local/bin/lakehouse/queryd -config /etc/lakehouse/lakehouse.toml +Restart=on-failure +RestartSec=5 + +EnvironmentFile=-/etc/lakehouse/auth.env + +NoNewPrivileges=true +ProtectSystem=strict +ProtectHome=true +PrivateTmp=true +ReadWritePaths=/var/lib/lakehouse /var/log/lakehouse + +StandardOutput=journal +StandardError=journal +SyslogIdentifier=lakehouse-queryd + +[Install] +WantedBy=lakehouse-go.target diff --git a/deploy/systemd/lakehouse-storaged.service b/deploy/systemd/lakehouse-storaged.service new file mode 100644 index 0000000..2d283d4 --- /dev/null +++ b/deploy/systemd/lakehouse-storaged.service @@ -0,0 +1,39 @@ +[Unit] +Description=Lakehouse-Go storaged — S3-backed object store gateway +Documentation=https://git.agentview.dev/profit/golangLAKEHOUSE +After=network-online.target +Wants=network-online.target +# Operator prereq: MinIO (or AWS S3) reachable at the endpoint in +# /etc/lakehouse/lakehouse.toml [s3]. Not a systemd unit +# we control, so we just wait for network and let the bind+probe +# in storaged surface unreachable-bucket errors. + +[Service] +Type=simple +User=lakehouse +Group=lakehouse +WorkingDirectory=/var/lib/lakehouse +ExecStart=/usr/local/bin/lakehouse/storaged -config /etc/lakehouse/lakehouse.toml -secrets /etc/lakehouse/secrets-go.toml +Restart=on-failure +RestartSec=5 + +# Per ADR-006 Decision 6.2: auth token from env, not committed TOML. +# Empty AUTH_TOKEN is fine for loopback-only deploys (matches +# requireAuthOnNonLoopback gate at startup). +EnvironmentFile=-/etc/lakehouse/auth.env + +# Hardening — minimum needed for the daemon to read its config +# + write its log + open its bind port. +NoNewPrivileges=true +ProtectSystem=strict +ProtectHome=true +PrivateTmp=true +ReadWritePaths=/var/lib/lakehouse /var/log/lakehouse + +# Log routing — JSON to journald, structured per slogRequest middleware. 
+StandardOutput=journal +StandardError=journal +SyslogIdentifier=lakehouse-storaged + +[Install] +WantedBy=lakehouse-go.target diff --git a/deploy/systemd/lakehouse-vectord.service b/deploy/systemd/lakehouse-vectord.service new file mode 100644 index 0000000..2bda7de --- /dev/null +++ b/deploy/systemd/lakehouse-vectord.service @@ -0,0 +1,32 @@ +[Unit] +Description=Lakehouse-Go vectord — HNSW vector index registry +Documentation=https://git.agentview.dev/profit/golangLAKEHOUSE +After=network-online.target +Wants=network-online.target +# storaged is OPTIONAL for vectord — when [vectord].storaged_url is +# empty, indexes are in-memory-only and don't survive restart. When +# set, vectord persists LHV1 files via storaged. Operator's call. + +[Service] +Type=simple +User=lakehouse +Group=lakehouse +WorkingDirectory=/var/lib/lakehouse +ExecStart=/usr/local/bin/lakehouse/vectord -config /etc/lakehouse/lakehouse.toml +Restart=on-failure +RestartSec=5 + +EnvironmentFile=-/etc/lakehouse/auth.env + +NoNewPrivileges=true +ProtectSystem=strict +ProtectHome=true +PrivateTmp=true +ReadWritePaths=/var/lib/lakehouse /var/log/lakehouse + +StandardOutput=journal +StandardError=journal +SyslogIdentifier=lakehouse-vectord + +[Install] +WantedBy=lakehouse-go.target
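
Every unit in this patch installs into `lakehouse-go.target` through `WantedBy=`, so whether the target actually pulls in a daemon depends on that daemon's unit being enabled. A post-install check could be sketched roughly as below. This is only an illustration: the function names `lakehouse_units` and `check_lakehouse_units` are hypothetical and not part of the repo, and the daemon list is copied from the port table in REPLICATION.md.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with the repo.
# Daemon names copied from the port table in REPLICATION.md.
LAKEHOUSE_DAEMONS="gateway storaged catalogd ingestd queryd vectord embedd pathwayd matrixd observerd chatd"

# Print one systemd unit name per daemon, in table order.
lakehouse_units() {
    for d in $LAKEHOUSE_DAEMONS; do
        printf 'lakehouse-%s.service\n' "$d"
    done
}

# Report enablement and runtime state for every unit.
# Returns non-zero if any unit is not both enabled and active.
check_lakehouse_units() {
    rc=0
    for unit in $(lakehouse_units); do
        enabled=$(systemctl is-enabled "$unit" 2>/dev/null) || rc=1
        active=$(systemctl is-active "$unit" 2>/dev/null) || rc=1
        printf '%-30s enabled=%s active=%s\n' "$unit" "${enabled:-unknown}" "${active:-unknown}"
    done
    return "$rc"
}
```

After sourcing the file, `check_lakehouse_units` can be run once the target has been started; a non-zero exit status points at a unit that was copied into `/etc/systemd/system/` but never enabled or started.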