# golangLAKEHOUSE — Sprint Backlog

Five sprints adapted from SCRUM.md's framework. Each sprint has a goal, user stories, and acceptance criteria. Risk IDs reference `risk-register.md`. Definition-of-done details live in `acceptance-gates.md`.

The audit is the work of *this* turn; these sprints are the next turns. Order matters — Sprint 0 unblocks the rest by making the substrate provably runnable on a clean box.

---

## Sprint 0 — Reproducibility Gate

**Goal:** make the repo provably runnable, with structural protection against silent regressions in the load-bearing-but-untested layers.

**Risks closed:** R-002, R-003, R-004, R-005, R-006, R-008, R-012.

### Stories

- **S0.1** — As an operator, I can run **one command** and know exactly which dependencies are missing or wrong-versioned.
  - Concrete: `just doctor` checks Go ≥1.25, gcc, MinIO at `:9000`, Ollama at `:11434` with `nomic-embed-text` loaded, and `secrets-go.toml` present and readable. Output is structured JSON under the `--json` flag. Non-zero exit on any missing dep.
- **S0.2** — As an operator, I can run a **minimal fixture test** without MinIO or Ollama.
  - Concrete: `just smoke-fixtures` runs against in-process fakes (`MockS3Storage` + `MockEmbedProvider`). Smokes split into two tiers: `*_smoke.sh` (real services, slow) vs `*_smoke_fixtures.sh` (fakes, runs anywhere).
- **S0.3** — As an operator, I can verify the whole substrate with one command, and I cannot push a regression past it.
  - Concrete: `just verify` runs `go vet` + `go test` + the 9-smoke chain. `.git/hooks/pre-push` calls `just verify` and aborts on non-zero exit. Failure output is structured.
- **S0.4** — As a reviewer, I can read coverage at a glance and see where wiring layers lack tests.
  - Concrete: `cmd/<binary>/main_test.go` exists for all 7 binaries (today: only `storaged`). Each tests routes registered, body-cap rejection, schema-validation rejection, and the happy path with a mocked dependency.
- **S0.5** — Load-bearing internal packages have unit-test coverage proportional to their blast radius.
  - Concrete: `internal/shared/{server,config}_test.go` exist (R-002). `internal/storeclient/client_test.go` exists (R-003). `internal/queryd/db_test.go` exists with adversarial `sqlEscape` + exhaustive `redactCreds` cases (R-008).
- **S0.6** — The empty `tests/` directory is either claimed or removed.
  - Concrete: pick one. If claimed for fixture-mode wiring (S0.2), document its purpose in the README. If not, delete it in the same commit as S0.1.

### Acceptance

- `just --list` shows `verify`, `smoke-fixtures`, and `doctor`, plus shortcuts for `fmt`/`vet`/`test`/`smoke`.
- `just verify` exits 0 on a clean clone with deps present.
- `just smoke-fixtures` exits 0 on a clean clone with **no MinIO and no Ollama**.
- Pre-push hook present at `.git/hooks/pre-push`, executable, calls `just verify`.
- `go test ./...` shows a non-empty test count for every package in `internal/` (no more `[no test files]` lines for shared/storeclient).
- Test count for `cmd/` binaries: 7/7 (today: 1/7).
- Failure output is structured: any `just doctor` failure prints JSON describing what's missing, with no claim of success.

### Estimate

- S0.1 doctor: ~1 hr
- S0.2 fixture-mode: ~3 hr (interface plumbing + fakes + new smokes)
- S0.3 verify + hook: ~30 min
- S0.4 cmd-level tests: ~3 hr (6 binaries × ~30 min)
- S0.5 internal tests: ~3 hr
- S0.6 tests/ dir: ~5 min

Total: ~1.5 days focused. Single bundled PR with one commit per story.

---

## Sprint 1 — Trust Boundary Gate

**Goal:** prevent agent trust collapse. Ensure the SQL surface is not RCE-equivalent under an accidental non-localhost bind. Decide the auth posture once and apply it uniformly.

**Risks closed:** R-001, R-007, R-009 (regression test only), R-010.

### Stories

- **S1.1** — As an operator, I cannot accidentally expose `POST /sql` to the network.
  - Concrete: `cmd/queryd/main.go` startup asserts the bind address starts with `127.` or `[::1]`. If env `LH_QUERYD_ALLOW_NONLOOPBACK=1` is set, log a warning and continue. Otherwise `os.Exit(1)`. The same gate is added to vectord, storaged, and ingestd until S1.2 lands.
- **S1.2** — As an operator, I have one configurable auth posture across all 7 binaries.
  - Concrete: ADR-003 picks Bearer-token + IP allow-list (or an alternative — decide in the ADR). `internal/shared/auth.go` provides middleware; each `cmd/<binary>/main.go` adds `r.Use(authMiddleware)` in one line. Token sourced from `secrets-go.toml`'s new `[auth].token` field. Empty token = local mode (no auth, only `127.` bind allowed).
- **S1.3** — As an operator, every public endpoint validates its input schema.
  - Concrete: each handler decoding a JSON body has explicit struct tags + missing-field detection. Unknown fields rejected (`json.Decoder.DisallowUnknownFields`). Empty required fields rejected with a structured 400. Today's coverage is partial; this story closes it uniformly.
- **S1.4** — As a reviewer, I have a regression test against SQL injection in dataset names.
  - Concrete: `internal/queryd/registrar_test.go` gains a test where catalogd returns a manifest with `name: 'foo"; DROP TABLE x; --'`. The test asserts that `quoteIdent` quoting prevents the DROP from executing — the view name becomes `"foo""; DROP TABLE x; --"`, a single quoted identifier (R-009 latent guard).

### Acceptance

- All 7 binaries fail loud on a non-loopback bind without the explicit override env.
- ADR-003 in `docs/DECISIONS.md` documents the auth model with rationale.
- Auth middleware is one `r.Use()` line per binary; adding it to a new binary takes one import.
- Every JSON-decoding handler uses `DisallowUnknownFields` + missing-required-field rejection.
- R-009 regression test passes; the assertion would fail if `quoteIdent` were removed.

### Estimate

~2 days focused. ADR-003 is the gating decision; once written, S1.1 + S1.2 are mechanical.
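S1.1's loopback gate is small enough to sketch in full. A minimal version follows; the helper name `allowBind` is hypothetical, and where the story checks the `127.`/`[::1]` prefix literally, this sketch uses `net.ParseIP(...).IsLoopback()`, which accepts the same addresses:

```go
package main

import (
	"log"
	"net"
	"os"
)

// allowBind reports whether binding to addr is acceptable under the S1.1
// policy: loopback is always fine; anything else requires the explicit
// override env var from the story text.
func allowBind(addr string) bool {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		return false // malformed bind address: refuse
	}
	if ip := net.ParseIP(host); ip != nil && ip.IsLoopback() {
		return true // 127.0.0.0/8 or ::1
	}
	return os.Getenv("LH_QUERYD_ALLOW_NONLOOPBACK") == "1"
}

func main() {
	addr := "127.0.0.1:8080" // would come from config in a real binary
	if !allowBind(addr) {
		log.Printf("refusing non-loopback bind %s (set LH_QUERYD_ALLOW_NONLOOPBACK=1 to override)", addr)
		os.Exit(1)
	}
	log.Printf("bind %s allowed", addr)
}
```

Note this version exits before any listener is opened, so a misconfigured bind never serves a single request.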
---

## Sprint 2 — Memory Correctness Gate

**Goal:** prove pathway / playbook memory cannot poison itself, with the test fixture covering Mem0 semantics on day one. This sprint is **design-bar work** for components that haven't been ported from Rust yet — the memory layer still won't exist when Sprint 1 ends.

**Risks closed:** all DESIGN-BAR rows in `claim-coverage-table.md` for Mem0 + persistence-at-scale.

### Stories

- **S2.1** — As an architect, I have an ADR fixing the pathway-memory data model in Go before code lands.
  - Concrete: ADR-004 documents trace shape, history-chain rules, retire semantics, and replay-count rules. It cites the Rust `pathway_memory` crate as reference but does NOT carry forward the 88-trace state, per ADR-001 (clean start ratified).
- **S2.2** — As a developer, the pathway-memory port lands with a deterministic fixture corpus and full test coverage on day one.
  - Concrete: `tests/fixtures/pathway/` has known-shape JSON entries covering ADD / UPDATE / REVISE / RETIRE / HISTORY / cycle-attempt / replay-duplicate / corrupted-row. New `internal/pathway/` package implements the data model. Test count: ≥7 functions in `pathway_test.go`, one per fixture row.
- **S2.3** — As a developer, retired traces are excluded from retrieval — and the test would fail without the exclusion.
  - Concrete: integration test does ADD → RETIRE → SEARCH → assert the returned set excludes the retired UID. Removing the retirement filter must turn this test red.
- **S2.4** — As an architect, vectord persistence works above 256 MiB single-key (the gap from the 500K staffing test).
  - Concrete: either bump storaged's `MaxBytesReader` for vector-content paths, or split LHV1 across N fixed-size keys with a manifest pointer, or add multipart upload to storaged. Decision in ADR-005. Smoke variant `g1p_scale_smoke.sh` ingests 200K vectors @ d=768 and asserts kill-restart preserves state at that size.

### Acceptance

- ADR-004 and ADR-005 in `docs/DECISIONS.md`.
- `internal/pathway/` package with ≥7 covering tests; `go test ./internal/pathway/` passes.
- Retire-exclusion regression test passes; it would fail if the filter logic were removed.
- `g1p_scale_smoke.sh` passes at 200K vectors.

### Estimate

~1 week. ADR-004 is the design anchor; the test fixtures derive from it.

---

## Sprint 3 — Agent Loop Reality Gate

**Goal:** prove the full agent loop works across an actual workflow. End-to-end deterministic: search → verify → observer review → playbook seal → second-run retrieval surfaces the prior playbook.

**Risks closed:** all DESIGN-BAR rows for observer + playbook seal + agent loop closure. The Rust system's `r.json()`-on-text/plain crash-loop bug (memory `54689d5`) gets a regression test.

### Stories

- **S3.1** — As an architect, ADR-002 fixes observer fail-safe semantics before the observer is ported.
  - Concrete: doc-only. Default verdict = `cycle`, `degraded: true` on internal error. Explicit `LH_OBSERVER_FAIL_OPEN=1` env to opt into fail-open in dev only. Reference the Rust mcp-server's `verdict: "accept"` on observer error as the anti-pattern being designed away.
- **S3.2** — As a developer, the observer port ships with tests covering the four states (accept / reject / cycle / degraded).
  - Concrete: `internal/observer/` package + `cmd/observerd` binary. Test fixture: hallucinated claim → reject; valid claim with SQL truth → accept; SQL truth unreachable → degraded + cycle (NEVER accept).
- **S3.3** — As a developer, playbook seal + second-run retrieval is a single end-to-end smoke.
  - Concrete: `agent_loop_smoke.sh` does ingest → search → verify → observer review → seal → second-run retrieval. Assertions: the second run surfaces the prior playbook UID; the report includes input hash, output hash, verdict, and memory-mutation receipt.
- **S3.4** — As a reviewer, the Rust health-endpoint content-type bug cannot recur.
  - Concrete: regression test that consumes `/health` from each of the 7 binaries via the gateway and asserts: the response is text/plain, the body matches the `ok` pattern, and it is never silently parsed as JSON.

### Acceptance

- ADR-002 in `docs/DECISIONS.md`.
- `internal/observer/` with ≥4 covering tests.
- `agent_loop_smoke.sh` passes deterministically; the tagged report includes input/output hashes + verdict + receipt.
- `health_contenttype_test.go` exists and would fail if any binary regresses to JSON.

### Estimate

~1 week. ADR-002 is short; the observer port is the bulk; agent-loop wiring is real engineering.

---

## Sprint 4 — Deployment Gate

**Goal:** turn deployment from tribal knowledge into executable validation. Fresh box → green smoke chain in one command.

**Risks closed:** R-006 (cloud-only Provider), all deployment-readiness gaps (no REPLICATION, no env template, no systemd, no doctor).

### Stories

- **S4.1** — As an operator on a fresh Debian box, `just doctor` tells me exactly what to install.
  - Concrete: structured JSON output describing each missing dep with the `apt install` / `curl ... | tar` command that fixes it. Cross-checked against `README.md`'s "Cold-start dependencies" — single source of truth.
- **S4.2** — As an operator, `REPLICATION.md` is executable, not narrative.
  - Concrete: every step in `REPLICATION.md` is either a copy-pasteable command block or a reference to a `just <recipe>` invocation. Validation steps from the upstream `REPLICATION.md` (health checks, embed probe, vector probe, agent test) become `just smoke-replication`.
- **S4.3** — As an operator, I have an env template for `secrets-go.toml`.
  - Concrete: `secrets-go.toml.example` in the repo with all required keys + comments documenting each. `just doctor` checks for unfilled placeholder values.
- **S4.4** — As an operator, systemd units in the repo wire each binary cleanly.
  - Concrete: `deploy/systemd/{gateway,storaged,catalogd,ingestd,queryd,vectord,embedd}.service` with `After=`, `Restart=on-failure`, `MemoryMax=`, and environment loading. `just install-systemd` symlinks them.
- **S4.5** — As an operator deploying to AWS S3 instead of MinIO, no code changes are required.
  - Concrete: `just smoke-aws-s3` variant that points the bucket config at real S3. Existing smokes pass against real S3 (validates the aws-sdk-go-v2 path).

### Acceptance

- `just doctor` on a fresh Debian 13 box reports actionable JSON with install commands.
- `just smoke-replication` succeeds on first run after `just doctor` shows green.
- `secrets-go.toml.example` present with documented keys.
- 7 systemd unit files in `deploy/systemd/`; `systemctl status lakehouse-go-*` shows green after install.
- `just smoke-aws-s3` succeeds against a real bucket (manual: requires AWS creds).

### Estimate

~3 days focused. S4.4 + S4.5 are most of the time.

---

## Cross-sprint dependencies

```
Sprint 0 ─────────────────────────────────────► (unblocks all)
  │
  ├─► Sprint 1 ───► Sprint 2 ───► Sprint 3 ───► Sprint 4
  │        │             │             │
  │        ▼             ▼             ▼
  └──── auth ADR ── memory ADR ── observer ADR
```

- Sprint 0 is the gate. None of the others should ship without `just verify` reliably catching regressions.
- Sprint 1 should land before Sprint 2 because R-001 (queryd `/sql`) is HIGH severity and the fix is mostly mechanical.
- Sprint 2 / 3 are real engineering; estimates are floors, not ceilings.
- Sprint 4 can land in parallel with Sprint 2/3 — its stories don't depend on the agent-loop port.
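
---

To make S4.4 concrete, one of the seven unit files might look like the sketch below. The install paths, `User=`, dependency unit names, and the `MemoryMax=` value are all assumptions to be settled in the story, not decided values:

```ini
# deploy/systemd/queryd.service — illustrative sketch only
[Unit]
Description=golangLAKEHOUSE queryd
# Order after the network and the object store; unit names assumed.
After=network-online.target minio.service
Wants=network-online.target

[Service]
Type=simple
User=lakehouse
# Environment loading per S4.4; the env-file path is an assumption.
EnvironmentFile=/etc/lakehouse/queryd.env
ExecStart=/usr/local/bin/queryd
Restart=on-failure
RestartSec=2
MemoryMax=2G

[Install]
WantedBy=multi-user.target
```

`just install-systemd` would then symlink each file into `/etc/systemd/system/` and run `systemctl daemon-reload`.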