From 0d18ffa780fb30bf97c6e0808c96e766b1e91632 Mon Sep 17 00:00:00 2001
From: root
Date: Wed, 29 Apr 2026 06:05:59 -0500
Subject: [PATCH] =?UTF-8?q?ADR-003:=20inter-service=20auth=20posture=20?=
 =?UTF-8?q?=E2=80=94=20Bearer=20+=20IP=20allowlist?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Locks in the auth model that R-001 + R-007 will be retrofitted against.
Doc-only — wiring deferred to Sprint 1 when the first non-loopback
binding is needed.

Decision: Bearer token (from secrets-go.toml [auth] section) + IP
allowlist (CIDR list). Both layers required when auth is on; empty
token = G0 dev no-op. /health exempt.

Implementation shape (when it lands):
  - internal/shared/auth.go middleware: one chi r.Use line per binary
  - shared.Run gates: refuses non-loopback bind without configured token
  - subtle.ConstantTimeCompare for token equality (timing-safe)

Alternatives considered + rejected:
  mTLS — too heavy for single-machine inter-service traffic
  JWT — buys nothing over Bearer without external IdP
  IP-only — one stolen IP entry = full access; no defense depth
  OAuth2 — no external IdP commitment in G0-G3 timeline

What this doesn't do:
  - Doesn't implement (code lands Sprint 1)
  - Doesn't break G0 dev (empty token = middleware no-op)
  - Doesn't address gateway→end-user auth (different ADR shape)

Closes the design-decision blocker for R-001 and R-007. Wiring ticket:
Sprint 1 backlog story S1.2.

Also lifts ADR-002 (storaged per-prefix PUT cap) into the doc — it was
implemented in 423a381 but not yet recorded as an ADR.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 docs/DECISIONS.md | 123 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 122 insertions(+), 1 deletion(-)

diff --git a/docs/DECISIONS.md b/docs/DECISIONS.md
index 27fa542..5b5afdf 100644
--- a/docs/DECISIONS.md
+++ b/docs/DECISIONS.md
@@ -121,6 +121,127 @@ historical record.
 
 ---
 
-(Future ADRs from ADR-002 onward will be added as the Go
+## ADR-002: storaged per-prefix PUT cap (vectord _vectors/ → 4 GiB)
+**Date:** 2026-04-29
+**Decided by:** J
+**Status:** Implemented (commit `423a381`)
+
+`storaged` enforces a 256 MiB per-PUT body cap as DoS protection
+(`MaxBytesReader` + Content-Length check). Keys under `_vectors/`
+(vectord LHV1 persistence) get a raised cap of 4 GiB; everything
+else stays at 256 MiB.
+
+**Rationale:** the 500K staffing test surfaced that single-file LHV1
+above ~150K vectors at d=768 hits the 256 MiB cap. `manager.Uploader`
+already streams on the outbound side, so the cap is a safety gate
+not a memory bottleneck — raising it for the vector path doesn't
+introduce new memory pressure. Per-prefix preserves the safety
+gate for routine traffic while opening the documented production
+path. Splitting LHV1 across multiple keys was rejected because G1P
+specifically shipped the single-Put framed format to eliminate
+torn-write — multi-key would re-introduce that failure mode.
+
+**Follow-up:** if production workloads exceed 4 GiB single-file
+LHV1, refactor to operator-driven config (env/TOML) rather than
+bumping the constant. The function-level `maxPutBytesFor(key)` in
+`cmd/storaged/main.go` keeps that drop-in clean.
+
+---
+
+## ADR-003: Inter-service auth posture — Bearer token + IP allowlist
+**Date:** 2026-04-29
+**Decided by:** J + Claude
+**Status:** Decided — wiring deferred to Sprint 1
+
+**Decision:** When inter-service auth is needed (the moment any
+binary binds non-loopback or the deployment crosses a trust
+boundary), the auth model is **a Bearer token loaded from
+`secrets-go.toml` plus a configurable IP allowlist**. Both layers
+required: the token authenticates the caller; the allowlist
+narrows the network surface.
+
+**Status today (G0):** zero auth middleware. Every binary binds
+`127.0.0.1` by default; commit `6af0520` (R-001 partial fix) refuses
+non-loopback bind unless the per-service `LH__ALLOW_NONLOOPBACK=1`
+env override is set. The override-and-no-auth combination is the
+worst case — this ADR locks in what we'll require before any
+production override fires.
+
+### What gets implemented when auth lands
+
+1. **`secrets-go.toml` adds a `[auth]` section:**
+   ```toml
+   [auth]
+   token = "..." # 32+ random bytes, hex-encoded
+   allowed_ips = ["10.0.0.0/8", "127.0.0.1/32"] # CIDR list
+   ```
+
+2. **`internal/shared/auth.go`** ships a single chi middleware:
+   ```go
+   func RequireAuth(cfg AuthConfig) func(http.Handler) http.Handler
+   ```
+   - Empty `cfg.Token` → middleware is a no-op (G0 dev mode).
+   - Non-empty token → reject 401 unless request has
+     `Authorization: Bearer ` matching constant-time.
+   - Non-empty `allowed_ips` → reject 403 unless `r.RemoteAddr` (or
+     `X-Forwarded-For` first hop, configurable) is in CIDR set.
+   - `/health` exempt — load balancers + monitors need it open.
+
+3. **Every `cmd//main.go` adds one line:**
+   ```go
+   r.Use(shared.RequireAuth(cfg.Auth))
+   ```
+   Mounted before `register(r)` so it covers every route the binary
+   exposes after `/health`.
+
+4. **`shared.Run` startup gate:** if bind is non-loopback AND
+   `cfg.Auth.Token == ""`, refuse to start. The implicit
+   "localhost is the auth layer" guarantee becomes explicit when
+   crossing the loopback boundary.
+
+### Alternatives considered
+
+| Option | Why rejected |
+|---|---|
+| **mTLS** | Strongest but heaviest — every binary needs cert provisioning, rotation tooling, and cert-aware client wiring. Overkill for inter-service traffic that already passes through a single gateway. Reconsider when Lakehouse-Go runs across machines. |
+| **JWT with short TTL** | Buys nothing over Bearer here — there's no third-party identity provider, no claim hierarchy worth modelling. Pure token has the same security properties at half the wire complexity. |
+| **No auth, IP-allowlist only** | One stolen IP allowlist entry → full access. Token + IP is defense in depth; either alone is too weak. |
+| **OAuth2 via external IdP** | Rejected for G0–G3 timeline. No external IdP commitment. Revisit if Lakehouse-Go ever serves end-user requests directly (today everything fronts through the staffing co-pilot which has its own session model). |
+
+### Constant-time comparison + token hygiene
+
+Token comparison must use `crypto/subtle.ConstantTimeCompare` —
+naive `==` is vulnerable to timing attacks against an attacker who
+can issue many requests and measure round-trip. Token rotation is
+operator-driven via `secrets-go.toml` edit + restart; G0 doesn't
+need rotate-without-restart.
+
+### What this ADR does NOT do
+
+- **Does not implement the middleware.** Code lands in Sprint 1.
+- **Does not require token in G0 dev.** Empty token → no-op. Smokes
+  +proof harness keep working without setting tokens.
+- **Does not address gateway → end-user auth.** Gateway terminates
+  inter-service auth at its inbound; if end-users hit gateway from
+  a browser, that's a different ADR (likely cookie/session, fronted
+  by a reverse proxy that handles user auth).
+
+### How this closes audit findings
+
+- **R-001 (queryd /sql RCE-equivalent off-loopback):** the bind
+  gate prevents accidental exposure today; this ADR specifies the
+  guardrail when intentional exposure is needed.
+- **R-007 (zero auth middleware):** answered by the design above;
+  R-007 stays open until the middleware is implemented but is no
+  longer "design TBD."
+- **R-010 (no CORS posture):** orthogonal to inter-service auth,
+  but the `RequireAuth` middleware sits at the right layer to add
+  CORS handling later (browsers don't reach inter-service routes
+  in the current design, so CORS is also Sprint 1+ when end-user
+  requests start landing).
+
+---
+
+(Future ADRs from ADR-004 onward will be added as the Go
 implementation accrues design decisions — e.g. HNSW parameter
 choices, pathway-memory hash function, auditor model rotation, etc.)