Adapts docs/SCRUM.md framework (originally written for the matrix-agent-validated repo) to the Go rewrite. Five deliverables:

    golang-lakehouse-scrum-test.md  top-line + scoring + verdict
    risk-register.md                12 findings, R-001..R-012
    claim-coverage-table.md         claim/test/risk for Sprint 2
    sprint-backlog.md               5 sprints, ~2 weeks of work
    acceptance-gates.md             DoD as runnable commands

Every claim cites file:line, command output, or "missing evidence." Smoke chain ran clean (33s wall, all 9 PASS) and is captured in reports/scrum/_evidence/smoke_chain.log (gitignored — runtime artifact).

Scoring:

    Reproducibility       7/10  9 smokes deterministic, no just/CI gate
    Test Coverage         6/10  internal/ packages tested, 6/7 cmd/ aren't
    Trust Boundary        7/10  escapes ok, zero auth, /sql is RCE-eq off-loopback
    Memory Correctness    3/10  pathway/playbook/observer not yet ported
    Deployment Readiness  4/10  no REPLICATION, no env template, no systemd
    Maintainability       8/10  no god-files, 7 lean binaries, ADRs current

Top three risks:

    R-001 HIGH  queryd /sql + DuckDB + non-loopback bind = RCE-equivalent
    R-002 HIGH  internal/shared (server.go + config.go) zero tests
    R-003 HIGH  internal/storeclient zero tests, used by 2 services
    R-004 MED   9-smoke chain green but not gated (no justfile/hook)

The audit is the work; refactors come after. Sprint 0 owns coverage + CI gating; Sprint 1 owns trust-boundary decisions; Sprints 2-3 are mostly design-bar work for unbuilt agent components.

.gitignore exception: /reports/* + !/reports/scrum/ keeps reports/ a runtime-artifact directory while exposing reports/scrum/ as tracked documentation. Mirrors the pattern future audit passes will land in.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
210 lines
13 KiB
Markdown
# golangLAKEHOUSE — Sprint Backlog

Five sprints adapted from SCRUM.md's framework. Each sprint has a goal, user stories, and acceptance criteria. Risk IDs reference `risk-register.md`. Acceptance-of-done details live in `acceptance-gates.md`.

The audit is the work of *this* turn; these sprints are the next turns. Order matters — Sprint 0 unblocks the rest by making the substrate provably runnable on a clean box.

---
## Sprint 0 — Reproducibility Gate

**Goal:** make the repo provably runnable, with structural protection against silent regressions in the load-bearing-but-untested layers.

**Risks closed:** R-002, R-003, R-004, R-005, R-006, R-008, R-012.

### Stories

- **S0.1** — As an operator, I can run **one command** and know exactly which dependencies are missing or wrong-versioned.
  - Concrete: `just doctor` checks Go ≥1.25, gcc, MinIO at `:9000`, Ollama at `:11434` with `nomic-embed-text` loaded, `secrets-go.toml` present + readable. Output is structured JSON on `--json` flag. Non-zero exit on any missing dep.

- **S0.2** — As an operator, I can run a **minimal fixture test** without MinIO or Ollama.
  - Concrete: `just smoke-fixtures` runs against in-process fakes (`MockS3Storage` + `MockEmbedProvider`). Smokes split into two tiers: `*_smoke.sh` (real services, slow) vs `*_smoke_fixtures.sh` (fakes, runs anywhere).

- **S0.3** — As an operator, I can verify the whole substrate with one command, and I cannot push a regression past it.
  - Concrete: `just verify` runs `go vet` + `go test` + the 9-smoke chain. `.git/hooks/pre-push` calls `just verify` and aborts on non-zero exit. Failure output is structured.

- **S0.4** — As a reviewer, I can read coverage at a glance and see where wiring layers lack tests.
  - Concrete: `cmd/<bin>/main_test.go` exists for all 7 binaries (today: only `storaged`). Each tests routes registered, body-cap rejection, schema-validation rejection, happy-path with mocked dependency.

- **S0.5** — Load-bearing internal packages have unit-test coverage proportional to their blast radius.
  - Concrete: `internal/shared/{server,config}_test.go` exist (R-002). `internal/storeclient/client_test.go` exists (R-003). `internal/queryd/db_test.go` exists with adversarial `sqlEscape` + exhaustive `redactCreds` cases (R-008).

- **S0.6** — Empty `tests/` directory either claimed or removed.
  - Concrete: pick. If claimed for fixture-mode wiring (S0.2), document its purpose in README. If not, delete in the same commit as S0.1.
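The structured output that S0.1 asks for could look roughly like the sketch below. The `Check` and `Report` types and the `buildReport` helper are hypothetical names, not taken from the repo, and the probe results are stubbed; the sketch only illustrates the JSON shape and the rule that any failed check yields a non-zero exit code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// Check is a hypothetical record for one dependency probe.
type Check struct {
	Name   string `json:"name"`
	OK     bool   `json:"ok"`
	Detail string `json:"detail,omitempty"`
}

// Report is a hypothetical top-level document for `just doctor --json`.
type Report struct {
	OK     bool    `json:"ok"`
	Checks []Check `json:"checks"`
}

// buildReport marks the report OK only if every check passed and returns
// the exit code to use: 0 when all checks pass, 1 otherwise.
func buildReport(checks []Check) (Report, int) {
	r := Report{OK: true, Checks: checks}
	for _, c := range checks {
		if !c.OK {
			r.OK = false
		}
	}
	if r.OK {
		return r, 0
	}
	return r, 1
}

func main() {
	// Stubbed results; a real doctor would probe the toolchain and services.
	checks := []Check{
		{Name: "go", OK: true},
		{Name: "minio", OK: false, Detail: "no listener at :9000"},
	}
	report, code := buildReport(checks)
	out, err := json.MarshalIndent(report, "", "  ")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println(string(out))
	fmt.Println("exit code:", code) // a real CLI would call os.Exit(code)
}
```

Keeping the pass/fail decision in one pure function makes the "no claim of success on failure" rule testable without any real services.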
### Acceptance

- `just --list` shows `verify`, `smoke-fixtures`, `doctor`, plus shortcuts for `fmt`/`vet`/`test`/`smoke <day>`.
- `just verify` exits 0 on a clean clone with deps present.
- `just smoke-fixtures` exits 0 on a clean clone with **no MinIO and no Ollama**.
- Pre-push hook present at `.git/hooks/pre-push`, executable, calls `just verify`.
- `go test ./...` shows non-empty test count for every package in `internal/` (no more `[no test files]` lines for shared/storeclient).
- Test count for cmd/ binaries: 7/7 (today: 1/7).
- Failure output structured: any `just doctor` failure prints JSON describing what's missing, no claim of success.
### Estimate

- S0.1 doctor: ~1 hr
- S0.2 fixture-mode: ~3 hr (interface plumbing + fakes + new smokes)
- S0.3 verify + hook: ~30 min
- S0.4 cmd-level tests: ~3 hr (6 binaries × ~30 min)
- S0.5 internal tests: ~3 hr
- S0.6 tests/ dir: ~5 min

Total: ~1.5 days focused. Single bundled PR with one commit per story.

---
## Sprint 1 — Trust Boundary Gate

**Goal:** prevent agent trust collapse. Make the SQL surface not be RCE-equivalent on accidental non-localhost binding. Decide auth posture once and apply uniformly.

**Risks closed:** R-001, R-007, R-009 (regression test only), R-010.

### Stories

- **S1.1** — As an operator, I cannot accidentally expose `POST /sql` to the network.
  - Concrete: `cmd/queryd/main.go` startup asserts the bind address starts with `127.` or `[::1]`. On a non-loopback bind, if env `LH_QUERYD_ALLOW_NONLOOPBACK=1` is set, log a warning and continue; otherwise `os.Exit(1)`. Same gate added to vectord, storaged, ingestd until S1.2 lands.

- **S1.2** — As an operator, I have one configurable auth posture across all 7 binaries.
  - Concrete: ADR-003 picks Bearer-token + IP allow-list (or alternative — decide in the ADR). `internal/shared/auth.go` provides middleware; each `cmd/<bin>/main.go` adds `r.Use(authMiddleware)` in one line. Token sourced from `secrets-go.toml`'s new `[auth].token` field. Empty token = local-mode (no auth, only `127.` bind allowed).

- **S1.3** — As an operator, every public endpoint validates schema on input.
  - Concrete: each handler decoding a JSON body has explicit struct tags + missing-field detection. Unknown fields rejected (`json.Decoder.DisallowUnknownFields`). Empty-required-field rejected with structured 400. Today's coverage is partial; this story closes it uniformly.

- **S1.4** — As a reviewer, I have a regression test against SQL injection in dataset names.
  - Concrete: `internal/queryd/registrar_test.go` gains a test where catalogd returns a manifest with `name: 'foo"; DROP TABLE x; --'`. The test asserts `quoteIdent` quoting prevents the DROP from executing — view name is `"foo""; DROP TABLE x; --"` which is a single quoted identifier (R-009 latent guard).
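The bind gate in S1.1 can be sketched as below. This is an illustration under assumptions, not the repo's code: `checkBind` is a hypothetical helper, port `3214` is made up, and the sketch parses the host and calls `net.IP.IsLoopback` rather than using the string-prefix check the story describes, which is a slightly stricter variant.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// checkBind returns nil when addr is safe to listen on: either the host is
// a loopback IP, or the operator explicitly allowed non-loopback binds.
func checkBind(addr string, allowNonLoopback bool) error {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		return fmt.Errorf("invalid bind address %q: %w", addr, err)
	}
	ip := net.ParseIP(host)
	if ip != nil && ip.IsLoopback() {
		return nil
	}
	if allowNonLoopback {
		return nil // the caller is expected to log a warning here
	}
	return fmt.Errorf("refusing non-loopback bind %q", addr)
}

func main() {
	// The override env var is the one named in S1.1.
	allow := os.Getenv("LH_QUERYD_ALLOW_NONLOOPBACK") == "1"
	for _, addr := range []string{"127.0.0.1:3214", "[::1]:3214", "0.0.0.0:3214", ":3214"} {
		if err := checkBind(addr, allow); err != nil {
			fmt.Println("would exit 1:", err)
			continue
		}
		fmt.Println("ok:", addr)
	}
}
```

Note that an empty host such as `:3214` binds every interface, so the sketch treats it as non-loopback.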
### Acceptance

- All 7 binaries fail-loud on non-loopback bind without explicit override env.
- ADR-003 in `docs/DECISIONS.md` documents the auth model with rationale.
- Auth middleware is one `r.Use()` line per binary; adding it to a new binary takes one import.
- Every JSON-decoding handler uses `DisallowUnknownFields` + missing-required-field rejection.
- R-009 regression test passes; assertion would fail if `quoteIdent` is removed.
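The R-009 assertion can be illustrated with a stand-in for `quoteIdent`. The repo's actual implementation is not shown in this backlog, so the function below is an assumption: it applies standard SQL identifier quoting (wrap in double quotes, double any embedded double quote), which produces exactly the view name quoted in S1.4. The `CREATE VIEW` statement is likewise only illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent is a stand-in for the repo's helper: wrap the name in double
// quotes and double any embedded double quote.
func quoteIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

func main() {
	hostile := `foo"; DROP TABLE x; --`
	// The hostile name stays inside one quoted identifier.
	fmt.Println("CREATE VIEW " + quoteIdent(hostile) + " AS SELECT 1")
	// prints: CREATE VIEW "foo""; DROP TABLE x; --" AS SELECT 1
}
```

A regression test built on this idea fails as soon as the quoting step is dropped, because the raw name would then close the identifier early.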
### Estimate

~2 days focused. ADR-003 is the gating decision; once written, S1.1 + S1.2 are mechanical.

---
## Sprint 2 — Memory Correctness Gate

**Goal:** prove pathway / playbook memory cannot poison itself, with the test fixture covering Mem0 semantics on day one. This sprint is **design-bar work** for components that haven't been ported from Rust yet: the memory layer will still not exist when Sprint 1 ends.

**Risks closed:** all DESIGN-BAR rows in `claim-coverage-table.md` for Mem0 + persistence-at-scale.

### Stories

- **S2.1** — As an architect, I have an ADR fixing the pathway-memory data model in Go before code lands.
  - Concrete: ADR-004 documents trace shape, history-chain rules, retire semantics, replay-count rules. Cites the Rust `pathway_memory` crate as reference but does NOT carry forward the 88-trace state per ADR-001 (clean start ratified).

- **S2.2** — As a developer, the pathway-memory port lands with a deterministic fixture corpus and full test coverage on day one.
  - Concrete: `tests/fixtures/pathway/` has known-shape JSON entries covering ADD / UPDATE / REVISE / RETIRE / HISTORY / cycle-attempt / replay-duplicate / corrupted-row. New `internal/pathway/` package implements the data model. Test count: ≥7 functions in `pathway_test.go`, one per fixture row.

- **S2.3** — As a developer, retired traces are excluded from retrieval — and the test would fail without the exclusion.
  - Concrete: integration test does ADD → RETIRE → SEARCH → assert returned set excludes the retired UID. Removing the retirement filter must turn this test red.

- **S2.4** — As an architect, vectord persistence works above 256 MiB single-key (the gap from the 500K staffing test).
  - Concrete: either bump storaged's `MaxBytesReader` for vector-content paths, or split LHV1 across N fixed-size keys with a manifest pointer, or add multipart upload to storaged. Decision in ADR-005. Smoke variant `g1p_scale_smoke.sh` ingests 200K vectors @ d=768 + asserts kill-restart preserves state at that size.
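The ADD → RETIRE → SEARCH check in S2.3 can be sketched against a toy store. Everything here is hypothetical: the `Trace` and `Store` types are not the ADR-004 data model (which is not yet written), and a substring match stands in for real retrieval. The point is only that the retirement filter is a single branch a test can guard.

```go
package main

import (
	"fmt"
	"strings"
)

// Trace is a hypothetical pathway-memory entry.
type Trace struct {
	UID     string
	Text    string
	Retired bool
}

// Store is a hypothetical in-memory pathway store.
type Store struct {
	traces map[string]*Trace
	order  []string // insertion order, for deterministic results
}

func NewStore() *Store { return &Store{traces: map[string]*Trace{}} }

// Add inserts a trace, or replaces the text of an existing UID.
func (s *Store) Add(uid, text string) {
	if _, exists := s.traces[uid]; !exists {
		s.order = append(s.order, uid)
	}
	s.traces[uid] = &Trace{UID: uid, Text: text}
}

// Retire marks a trace retired and reports whether the UID existed.
func (s *Store) Retire(uid string) bool {
	t, ok := s.traces[uid]
	if ok {
		t.Retired = true
	}
	return ok
}

// Search returns UIDs of live traces whose text contains the query.
func (s *Store) Search(query string) []string {
	var hits []string
	for _, uid := range s.order {
		t := s.traces[uid]
		if t.Retired { // the filter the regression test guards
			continue
		}
		if strings.Contains(t.Text, query) {
			hits = append(hits, uid)
		}
	}
	return hits
}

func main() {
	s := NewStore()
	s.Add("t1", "staffing plan")
	s.Add("t2", "staffing forecast")
	s.Retire("t1")
	fmt.Println(s.Search("staffing")) // [t2]
}
```

Deleting the `if t.Retired` branch makes the search return both UIDs, which is the red state the story requires.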
### Acceptance

- ADR-004 and ADR-005 in `docs/DECISIONS.md`.
- `internal/pathway/` package with ≥7 covering tests; `go test ./internal/pathway/` passes.
- Retire-exclusion regression test passes; would fail if filter logic removed.
- `g1p_scale_smoke.sh` passes at 200K vectors.
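A quick size estimate shows why the 200K-vector smoke exercises the 256 MiB single-key limit from S2.4. The sketch assumes vectors are stored as raw float32 values (4 bytes per component) with no header or index overhead; the backlog does not state the on-disk encoding, so treat the numbers as a lower-bound illustration.

```go
package main

import "fmt"

const (
	mib         = 1 << 20
	keyLimit    = 256 * mib // single-key limit named in S2.4
	bytesPerF32 = 4         // assumption: raw float32 components
)

// payloadBytes is the raw size of n vectors of dimension d.
func payloadBytes(n, d int64) int64 { return n * d * bytesPerF32 }

// keysNeeded is how many fixed-size keys of limit bytes hold the payload.
func keysNeeded(payload, limit int64) int64 {
	return (payload + limit - 1) / limit
}

func main() {
	p := payloadBytes(200_000, 768)
	fmt.Println(p, "bytes")                      // 614400000 bytes
	fmt.Println(keysNeeded(p, keyLimit), "keys") // 3 keys
}
```

Under these assumptions the payload is roughly 2.3 times the limit, so the split-key option in S2.4 would need at least three keys plus the manifest pointer.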
### Estimate

~1 week. ADR-004 is the design anchor; the test fixtures derive from it.

---
## Sprint 3 — Agent Loop Reality Gate

**Goal:** prove the full agent loop works across an actual workflow. End-to-end deterministic: search → verify → observer review → playbook seal → second-run retrieval surfaces the prior playbook.

**Risks closed:** all DESIGN-BAR rows for observer + playbook seal + agent loop closure. The Rust system's `r.json()` on text/plain crash-loop bug (memory `54689d5`) gets a regression test.

### Stories

- **S3.1** — As an architect, ADR-002 fixes observer fail-safe semantics before observer is ported.
  - Concrete: doc-only. Default verdict = `cycle`, `degraded: true` on internal error. Explicit `LH_OBSERVER_FAIL_OPEN=1` env to opt into fail-open in dev only. Reference the Rust mcp-server's `verdict: "accept"` on observer error as the anti-pattern being designed away.

- **S3.2** — As a developer, the observer port ships with tests covering the four states (accept / reject / cycle / degraded).
  - Concrete: `internal/observer/` package + `cmd/observerd` binary. Test fixture: hallucinated claim → reject; valid claim with SQL truth → accept; SQL truth unreachable → degraded+cycle (NEVER accept).

- **S3.3** — As a developer, playbook seal + second-run retrieval is a single end-to-end smoke.
  - Concrete: `agent_loop_smoke.sh` does ingest → search → verify → observer review → seal → second-run retrieval. Assertions: second run surfaces prior playbook UID; report includes input hash, output hash, verdict, and memory-mutation receipt.

- **S3.4** — As a reviewer, the Rust health-endpoint content-type bug cannot recur.
  - Concrete: regression test that consumes `/health` from each of the 7 binaries via the gateway and asserts: response is text/plain, body matches `<service> ok` pattern, never silently parses as JSON.
### Acceptance

- ADR-002 in `docs/DECISIONS.md`.
- `internal/observer/` with ≥4 covering tests.
- `agent_loop_smoke.sh` passes deterministically; tagged report includes input/output hashes + verdict + receipt.
- `health_contenttype_test.go` exists, would fail if any binary regresses to JSON.
### Estimate

~1 week. ADR-002 is short; observer port is the bulk; agent-loop wiring is real engineering.

---
## Sprint 4 — Deployment Gate

**Goal:** turn deployment from tribal-knowledge into executable validation. Fresh box → green smoke chain in one command.

**Risks closed:** R-006 (cloud-only Provider), all deployment-readiness gaps (no REPLICATION, no env template, no systemd, no doctor).

### Stories

- **S4.1** — As an operator on a fresh Debian box, `just doctor` tells me exactly what to install.
  - Concrete: structured JSON output describing each missing dep with the `apt install` / `curl ... | tar` command to fix it. Cross-checked against `README.md` "Cold-start dependencies" — single source of truth.

- **S4.2** — As an operator, `REPLICATION.md` is executable, not narrative.
  - Concrete: every step in `REPLICATION.md` is either a copy-pasteable command block or a reference to a `just <target>` invocation. Validation steps from the upstream `REPLICATION.md` (health checks, embed probe, vector probe, agent test) become `just smoke-replication`.

- **S4.3** — As an operator, I have an env template for `secrets-go.toml`.
  - Concrete: `secrets-go.toml.example` in repo with all required keys + comments documenting each. `just doctor` checks for unfilled placeholder values.

- **S4.4** — As an operator, systemd units in repo wire each binary cleanly.
  - Concrete: `deploy/systemd/{gateway,storaged,catalogd,ingestd,queryd,vectord,embedd}.service` with `After=`, `Restart=on-failure`, `MemoryMax=`, environment loading. `just install-systemd` symlinks them.

- **S4.5** — As an operator deploying to AWS S3 instead of MinIO, no code changes are required.
  - Concrete: `just smoke-aws-s3` variant that points the bucket config at real S3. Existing smokes pass against real S3 (validates the aws-sdk-go-v2 path).
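The placeholder check in S4.3 might be approached as in the sketch below. It assumes the example file marks unfilled values with the string `CHANGE_ME`, which is a hypothetical convention, and it uses a plain line scan instead of a TOML parser, so it reports bare key names without their section. The `[s3]` section in the sample is also made up; only `[auth].token` appears in this backlog.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

const placeholder = "CHANGE_ME" // hypothetical marker for unfilled values

// unfilledKeys returns keys whose value still contains the placeholder.
// Comments, blank lines, and section headers are skipped.
func unfilledKeys(toml string) []string {
	var keys []string
	sc := bufio.NewScanner(strings.NewReader(toml))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") || strings.HasPrefix(line, "[") {
			continue
		}
		key, value, found := strings.Cut(line, "=")
		if found && strings.Contains(value, placeholder) {
			keys = append(keys, strings.TrimSpace(key))
		}
	}
	return keys
}

func main() {
	sample := "[auth]\ntoken = \"CHANGE_ME\"\n\n[s3]\nendpoint = \"http://127.0.0.1:9000\"\n"
	fmt.Println(unfilledKeys(sample)) // [token]
}
```

A doctor check built this way would list each unfilled key in its JSON output and exit non-zero if the list is non-empty.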
### Acceptance

- `just doctor` on fresh Debian 13 box reports actionable JSON with install commands.
- `just smoke-replication` succeeds on first run after `just doctor` shows green.
- `secrets-go.toml.example` present with documented keys.
- 7 systemd unit files in `deploy/systemd/`; `systemctl status lakehouse-go-*` shows green after install.
- `just smoke-aws-s3` succeeds against a real bucket (manual: requires AWS creds).
### Estimate

~3 days focused. S4.4 + S4.5 are most of the time.

---
## Cross-sprint dependencies

```
Sprint 0 ─────────────────────────────────────► (unblocks all)
   │
   ├─► Sprint 1 ───► Sprint 2 ───► Sprint 3 ───► Sprint 4
   │      │             │             │
   │      ▼             ▼             ▼
   └──── auth ADR ── memory ADR ── observer ADR
```
- Sprint 0 is the gate. None of the others should ship without `just verify` reliably catching regressions.
- Sprint 1 should land before Sprint 2 because R-001 (queryd /sql) is HIGH severity and the fix is mostly mechanical.
- Sprint 2 / 3 are real engineering; estimates are floors not ceilings.
- Sprint 4 can land in parallel with Sprint 2/3 — its stories don't depend on the agent-loop port.