Adapts the docs/SCRUM.md framework (originally written for the matrix-agent-validated repo) to the Go rewrite.

Five deliverables:
  golang-lakehouse-scrum-test.md   top-line + scoring + verdict
  risk-register.md                 12 findings, R-001..R-012
  claim-coverage-table.md          claim/test/risk for Sprint 2
  sprint-backlog.md                5 sprints, ~2 weeks of work
  acceptance-gates.md              DoD as runnable commands

Every claim cites file:line, command output, or "missing evidence." The smoke chain ran clean (33s wall, all 9 PASS) and is captured in reports/scrum/_evidence/smoke_chain.log (gitignored; runtime artifact).

Scoring:
  Reproducibility        7/10   9 smokes deterministic, no just/CI gate
  Test Coverage          6/10   internal/ packages tested, 6/7 cmd/ aren't
  Trust Boundary         7/10   escapes ok, zero auth, /sql is RCE-eq off-loopback
  Memory Correctness     3/10   pathway/playbook/observer not yet ported
  Deployment Readiness   4/10   no REPLICATION, no env template, no systemd
  Maintainability        8/10   no god-files, 7 lean binaries, ADRs current

Top risks:
  R-001 HIGH   queryd /sql + DuckDB + non-loopback bind = RCE-equivalent
  R-002 HIGH   internal/shared (server.go + config.go) zero tests
  R-003 HIGH   internal/storeclient zero tests, used by 2 services
  R-004 MED    9-smoke chain green but not gated (no justfile/hook)

The audit is the work; refactors come after. Sprint 0 owns coverage + CI gating; Sprint 1 owns trust-boundary decisions; Sprints 2-3 are mostly design-bar work for unbuilt agent components.

.gitignore exception: /reports/* + !/reports/scrum/ keeps reports/ a runtime-artifact directory while exposing reports/scrum/ as tracked documentation. Mirrors the pattern future audit passes will land in.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
golangLAKEHOUSE — Acceptance Gates
Definition-of-done for each sprint, expressed as concrete commands a reviewer can run. Every gate is a binary pass/fail; no judgment calls. Sprint backlog (sprint-backlog.md) describes the work; this doc describes the proof of completion.
Format convention
Each gate is:
GATE-<sprint>.<n>: <one-line claim>
$ <command to run>
expected: <observable result>
fails if: <regression condition>
A sprint is "done" when every gate for that sprint passes on a clean clone. CI / pre-push automation should embed these gates so completion is mechanical.
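The four-part convention above is regular enough to be read mechanically, which is what embedding the gates in CI would need. The following is a minimal sketch, assuming the plain-text layout shown in this document; the `Gate` type and `parseGates` function are hypothetical names for illustration, not part of the repo.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Gate mirrors the convention: claim, command(s), expected result(s), failure condition.
// Hypothetical type, introduced only for this sketch.
type Gate struct {
	ID       string   // e.g. "GATE-0.6" or "GATE-FINAL"
	Claim    string   // the one-line claim after the colon
	Commands []string // lines that began with "$ "
	Expected []string // lines that began with "expected: "
	FailsIf  string   // the "fails if: " line
}

// parseGates reads gate blocks from text and returns them in document order.
func parseGates(text string) []Gate {
	var gates []Gate
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch {
		case strings.HasPrefix(line, "GATE-"):
			id, claim, _ := strings.Cut(line, ": ")
			gates = append(gates, Gate{ID: id, Claim: claim})
		case len(gates) == 0:
			continue // ignore anything before the first gate header
		case strings.HasPrefix(line, "$ "):
			g := &gates[len(gates)-1]
			g.Commands = append(g.Commands, strings.TrimPrefix(line, "$ "))
		case strings.HasPrefix(line, "expected: "):
			g := &gates[len(gates)-1]
			g.Expected = append(g.Expected, strings.TrimPrefix(line, "expected: "))
		case strings.HasPrefix(line, "fails if: "):
			gates[len(gates)-1].FailsIf = strings.TrimPrefix(line, "fails if: ")
		}
	}
	return gates
}

func main() {
	sample := `GATE-0.6: every internal/ package has at least one test
$ go test ./internal/... 2>&1 | grep "no test files"
expected: empty (no packages without tests)
fails if: a package shows as "no test files"
`
	for _, g := range parseGates(sample) {
		fmt.Printf("%s: %d command(s), %d expectation(s)\n", g.ID, len(g.Commands), len(g.Expected))
	}
}
```

A runner built on this would still need a per-gate rule for comparing observed output with the `expected:` text, which this sketch deliberately leaves out.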
Sprint 0 — Reproducibility Gate
GATE-0.1: just runner is the canonical entry point
$ just --list
expected: includes `verify`, `smoke-fixtures`, `doctor`, `fmt`, `vet`, `test`, `smoke <day>`
fails if: `just` not found or any of the above targets missing
GATE-0.2: deps probe surfaces missing dependencies as structured JSON
$ just doctor --json
expected: exit 0 if all deps present; exit 1 with JSON listing missing deps if not
fails if: any false-positive (claims dep missing when present) or false-negative (claims OK when missing)
GATE-0.3: full chain runs without external services
$ just smoke-fixtures
expected: exit 0; uses MockS3Storage + MockEmbedProvider; no MinIO/Ollama dependency
fails if: smoke-fixtures invokes anything on localhost:9000 or localhost:11434
GATE-0.4: full chain runs against real services
$ just verify
expected: exit 0; runs go vet + go test + the 9 smokes; total wall ≤ 60s on this box
fails if: any individual smoke fails or wall > 90s without a flake annotation
GATE-0.5: pre-push hook blocks regressions
$ git push   # after committing a deliberate regression
expected: hook runs `just verify`, push aborts on non-zero exit
fails if: hook missing, hook does not exit non-zero on test failure, or push proceeds despite failure
GATE-0.6: every internal/ package has at least one test
$ go test ./internal/... 2>&1 | grep "no test files"
expected: empty (no packages without tests)
fails if: `internal/shared` or `internal/storeclient` show as "no test files"
GATE-0.7: every cmd/ binary has at least one test
$ go test ./cmd/... 2>&1 | grep "no test files"
expected: empty (no binaries without tests)
fails if: any cmd/<bin>/main_test.go absent
GATE-0.8: queryd db.go has unit coverage on sqlEscape + redactCreds
$ go test -run "TestSqlEscape|TestRedactCreds" ./internal/queryd/
expected: at least one passing test for each function
fails if: zero matching tests (today's state)
Sprint 1 — Trust Boundary Gate
GATE-1.1: queryd refuses to start on non-loopback bind without explicit override
$ LH_QUERYD_BIND=0.0.0.0:3214 ./bin/queryd
expected: exits 1 within 1s; stderr cites the assertion
fails if: binary starts and accepts connections on 0.0.0.0
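One way the startup assertion behind GATE-1.1 could look is sketched below. This is an illustration under stated assumptions, not queryd's actual code: `checkBind` is a hypothetical helper, `LH_ALLOW_PUBLIC_BIND` is an invented name for the "explicit override", and the `127.0.0.1:3214` default is assumed from the port used in the gate command. Only `LH_QUERYD_BIND` comes from this document.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// checkBind returns an error when addr would listen on a non-loopback
// interface and no explicit override was given.
func checkBind(addr string, allowPublic bool) error {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		return fmt.Errorf("invalid bind address %q: %w", addr, err)
	}
	if allowPublic || host == "localhost" {
		return nil
	}
	// An empty host (":3214") means all interfaces, so it falls through to the error.
	if ip := net.ParseIP(host); ip != nil && ip.IsLoopback() {
		return nil
	}
	return fmt.Errorf("refusing non-loopback bind %q without explicit override", addr)
}

func main() {
	addr := os.Getenv("LH_QUERYD_BIND")
	if addr == "" {
		addr = "127.0.0.1:3214" // assumed default, for illustration only
	}
	allow := os.Getenv("LH_ALLOW_PUBLIC_BIND") == "1" // hypothetical override variable
	if err := checkBind(addr, allow); err != nil {
		fmt.Fprintln(os.Stderr, "startup assertion failed:", err)
		os.Exit(1)
	}
	fmt.Println("bind ok:", addr)
}
```

Running the check before any listener is opened is what lets the gate demand an exit within 1s: nothing has been bound yet when the process refuses to start.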
GATE-1.2: same gate applies to storaged, ingestd, vectord
$ N=0; for b in storaged ingestd vectord; do N=$((N+1)); env "LH_${b^^}_BIND=0.0.0.0:990$N" "./bin/$b"; done
expected: each exits 1 with cited assertion
fails if: any binary binds non-loopback silently
GATE-1.3: ADR-003 documents the auth posture
$ test -f docs/DECISIONS.md && grep -q "ADR-003" docs/DECISIONS.md
expected: ADR-003 section exists with title + status + rationale
fails if: ADR-003 absent or marked Draft after sprint close
GATE-1.4: auth middleware applies uniformly when token configured
$ TOKEN=bad; curl -s -o /dev/null -w '%{http_code}\n' -H "Authorization: Bearer $TOKEN" http://127.0.0.1:3110/v1/sql
expected: 401
$ TOKEN=valid; curl -s -o /dev/null -w '%{http_code}\n' -H "Authorization: Bearer $TOKEN" http://127.0.0.1:3110/v1/sql
expected: 200 (or 4xx for malformed body, never 401)
fails if: any binary accepts requests without the configured token
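The behavior GATE-1.4 checks can be sketched as a small `net/http` middleware. This is a hedged illustration rather than the repo's implementation: `requireToken` and `statusFor` are hypothetical names, and treating an empty configured token as "auth disabled" is an assumption drawn from the phrase "when token configured".

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// requireToken wraps next so every request must carry the configured bearer token.
// An empty token leaves the handler unwrapped (assumption: auth is opt-in).
func requireToken(token string, next http.Handler) http.Handler {
	if token == "" {
		return next
	}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		h := r.Header.Get("Authorization")
		got := strings.TrimPrefix(h, "Bearer ")
		// Constant-time comparison avoids leaking the token through timing.
		if !strings.HasPrefix(h, "Bearer ") ||
			subtle.ConstantTimeCompare([]byte(got), []byte(token)) != 1 {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends one in-memory request and reports the HTTP status code.
func statusFor(configured, presented string) int {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	req := httptest.NewRequest(http.MethodPost, "/v1/sql", nil)
	if presented != "" {
		req.Header.Set("Authorization", "Bearer "+presented)
	}
	rec := httptest.NewRecorder()
	requireToken(configured, ok).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("bad token:    ", statusFor("s3cret", "bad"))
	fmt.Println("valid token:  ", statusFor("s3cret", "s3cret"))
	fmt.Println("missing token:", statusFor("s3cret", ""))
}
```

Wrapping the whole mux once, rather than individual handlers, is one way to get the "applies uniformly" property the gate asks for.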
GATE-1.5: every JSON handler rejects unknown fields
$ curl -X POST http://127.0.0.1:3110/v1/sql -d '{"sql":"SELECT 1","mystery_field":true}'
expected: 400 with body citing unknown field
fails if: 200 (silent drop) or 500 (unexpected)
GATE-1.6: SQL injection regression test passes
$ go test -run "TestRegistrar_QuotesAdversarialName" ./internal/queryd/
expected: pass
fails if: test absent or fails — meaning quoteIdent regression is undetected
Sprint 2 — Memory Correctness Gate
GATE-2.1: ADR-004 documents the pathway-memory data model
$ grep -q "ADR-004" docs/DECISIONS.md
expected: ADR-004 section exists with trace shape, history rules, retire semantics
fails if: absent
GATE-2.2: pathway package has full Mem0-shape coverage
$ go test ./internal/pathway/ -count=1
expected: all tests pass, including at least these 8: TestAdd, TestUpdate, TestRevise, TestRetire, TestHistory, TestCycleSafe, TestReplayCount, TestCorruptedRow
fails if: any of those test names absent
GATE-2.3: retired traces are excluded from retrieval
$ go test -run TestRetire_ExcludedFromSearch ./internal/pathway/
expected: pass
$ git revert HEAD --no-commit   # or delete the filter by hand
$ go test -run TestRetire_ExcludedFromSearch ./internal/pathway/
expected: fail (proves the test is load-bearing, not vacuous)
fails if: removing the filter doesn't make the test fail
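The "load-bearing, not vacuous" requirement in GATE-2.3 amounts to a small mutation check: the same assertion must pass against the filtered search and fail once the filter is removed. A minimal sketch follows, using a hypothetical `Trace` type and in-memory search functions in place of the real pathway package.

```go
package main

import "fmt"

// Trace is a hypothetical stand-in for a pathway-memory record.
type Trace struct {
	UID     string
	Retired bool
}

// search returns only live traces; the Retired check is the filter under test.
func search(traces []Trace) []Trace {
	var out []Trace
	for _, t := range traces {
		if t.Retired {
			continue
		}
		out = append(out, t)
	}
	return out
}

// searchWithoutFilter models the mutated code with the filter deleted.
func searchWithoutFilter(traces []Trace) []Trace {
	return append([]Trace(nil), traces...)
}

// retiredLeaks reports whether fn ever returns a retired trace.
// This is the assertion a TestRetire_ExcludedFromSearch-style test would make.
func retiredLeaks(fn func([]Trace) []Trace) bool {
	fixture := []Trace{{UID: "a"}, {UID: "b", Retired: true}, {UID: "c"}}
	for _, t := range fn(fixture) {
		if t.Retired {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println("with filter, retired leaks:   ", retiredLeaks(search))
	fmt.Println("without filter, retired leaks:", retiredLeaks(searchWithoutFilter))
}
```

The fixture has to contain at least one retired trace; without it the assertion would pass against both versions and prove nothing.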
GATE-2.4: vectord persistence works at scale (200K vectors @ d=768)
$ ./scripts/g1p_scale_smoke.sh
expected: exit 0; ingests 200K vectors, kills vectord, restarts, search returns dist≤1e-7
fails if: any operation hits storaged 256 MiB cap or returns > tolerance distance
GATE-2.5: ADR-005 ratifies the storaged-cap fix path
$ grep -q "ADR-005" docs/DECISIONS.md
expected: ADR-005 documents B (split LHV1) vs C (multipart in storaged) decision
fails if: absent
Sprint 3 — Agent Loop Reality Gate
GATE-3.1: ADR-002 defines observer fail-safe semantics
$ grep -q "ADR-002" docs/DECISIONS.md
expected: ADR-002 section: degraded-by-default on error, explicit env to opt into fail-open
fails if: absent
GATE-3.2: observer rejects hallucinated claim
$ go test -run TestObserver_HallucinatedClaim_Rejected ./internal/observer/
expected: pass
fails if: hallucinated-claim path returns accept
GATE-3.3: observer never auto-accepts on internal error
$ go test -run TestObserver_InternalError_DegradedCycle ./internal/observer/
expected: pass; response is {verdict: "cycle", degraded: true}
fails if: any error path can produce {verdict: "accept"}
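The rule in GATE-3.3 is easiest to enforce when every error path funnels through one function that can only produce the degraded result. The sketch below illustrates that shape; `decide`, the `check` callback, and the `"reject"` verdict string are assumptions for illustration, while `"cycle"`, `"accept"`, and the `degraded` flag come from the gate text.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Verdict matches the response shape cited in GATE-3.3.
type Verdict struct {
	Verdict  string `json:"verdict"`
	Degraded bool   `json:"degraded"`
}

var errBoom = errors.New("observer backend unavailable")

// failingCheck and panickingCheck simulate internal errors.
func failingCheck() (bool, error)   { return true, errBoom }
func panickingCheck() (bool, error) { panic("unexpected state") }

// decide runs check and maps any error or panic to a degraded "cycle" verdict,
// so an internal failure can never surface as "accept".
func decide(check func() (bool, error)) (v Verdict) {
	defer func() {
		if r := recover(); r != nil {
			v = Verdict{Verdict: "cycle", Degraded: true}
		}
	}()
	ok, err := check()
	if err != nil {
		// The boolean is ignored on error, even if it claims success.
		return Verdict{Verdict: "cycle", Degraded: true}
	}
	if ok {
		return Verdict{Verdict: "accept"}
	}
	return Verdict{Verdict: "reject"} // assumed name for the non-accept verdict
}

func main() {
	for _, check := range []func() (bool, error){failingCheck, panickingCheck} {
		out, _ := json.Marshal(decide(check))
		fmt.Println(string(out))
	}
}
```

Note that `failingCheck` returns `true` alongside its error on purpose: the dangerous bug is a caller that reads the boolean before checking the error.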
GATE-3.4: end-to-end agent loop deterministic
$ ./scripts/agent_loop_smoke.sh
expected: exit 0; report file at /tmp/agent_loop_<sha>.json contains input_hash, output_hash, verdict, memory_receipt
fails if: report missing any field or hashes don't match expected fixture
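A reviewer checking the GATE-3.4 report needs two things: a way to confirm all four fields are present, and a stable hash to compare against the fixture. The sketch below shows one possible approach. SHA-256 with hex encoding is an assumption (this document does not name the hash), and `hashHex` and `missingFields` are hypothetical helpers; the four field names are taken from the gate.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// requiredFields lists the report keys named in GATE-3.4.
var requiredFields = []string{"input_hash", "output_hash", "verdict", "memory_receipt"}

// hashHex returns the hex-encoded SHA-256 of b (assumed hash choice).
func hashHex(b []byte) string {
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

// missingFields returns the required keys absent from a JSON report, in gate order.
func missingFields(raw []byte) ([]string, error) {
	var m map[string]json.RawMessage
	if err := json.Unmarshal(raw, &m); err != nil {
		return nil, fmt.Errorf("report is not a JSON object: %w", err)
	}
	var missing []string
	for _, k := range requiredFields {
		if _, ok := m[k]; !ok {
			missing = append(missing, k)
		}
	}
	return missing, nil
}

func main() {
	input := []byte("fixture input") // placeholder bytes, not a real fixture
	report := fmt.Sprintf(`{"input_hash":%q,"verdict":"accept"}`, hashHex(input))
	missing, err := missingFields([]byte(report))
	fmt.Println("missing:", missing, "err:", err)
	fmt.Println("same input, same hash:", hashHex(input) == hashHex(input))
}
```

Determinism here depends on hashing the exact bytes the loop consumed and produced; hashing a re-serialized form would make the comparison sensitive to key order and whitespace.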
GATE-3.5: second-run retrieval surfaces prior playbook
$ go test -run TestSecondRun_SurfacesPriorPlaybook ./internal/agent/
expected: pass
fails if: second run does not return the UID seen in first run
GATE-3.6: health endpoint content-type regression test
$ go test ./internal/shared/ -run TestHealth_ContentType
expected: pass; consumer pattern that called .json() on text/plain returns 502 loudly
fails if: any /health consumer can silently null on type confusion
Sprint 4 — Deployment Gate
GATE-4.1: fresh-Debian doctor surfaces install commands
$ docker run --rm -v "$PWD":/repo debian:13 bash -c "cd /repo && just doctor --json"
expected: structured JSON with apt install / curl tarball commands per missing dep; exit 1
fails if: silent claim of OK or vague "missing dep" without fix command
GATE-4.2: REPLICATION.md is executable
$ awk '/^```bash$/,/^```$/' REPLICATION.md | grep -v '^```' | bash
expected: every code block runs (deps may be required; any failure must be one that just doctor already reported)
fails if: REPLICATION contains pseudo-commands or hardcoded paths that don't match repo
GATE-4.3: env template covers every required key
$ test -f secrets-go.toml.example && grep -q "access_key_id" secrets-go.toml.example
expected: example file with documented keys; just doctor warns on placeholder values
fails if: example absent or doesn't surface placeholder detection
GATE-4.4: systemd units present and correct
$ ls deploy/systemd/*.service | wc -l
expected: 7 files (one per binary)
$ systemd-analyze verify deploy/systemd/*.service
expected: exit 0
fails if: any unit fails verify or has missing fields (After, Restart, MemoryMax)
GATE-4.5: AWS S3 path works without code changes
$ AWS_PROFILE=test ./scripts/d2_smoke_aws.sh
expected: exit 0 against a real S3 bucket
fails if: any code path assumes MinIO-specific behavior
Cross-sprint compound gate
GATE-FINAL: full clean-clone reproducibility
$ rm -rf /tmp/golangLAKEHOUSE-test
$ git clone <url> /tmp/golangLAKEHOUSE-test
$ cd /tmp/golangLAKEHOUSE-test
$ just doctor   # on failure: follow the printed fix instructions, then rerun
$ just verify
expected: green within 60s wall of `just verify` (excluding doctor remediation)
fails if: any step requires undocumented manual intervention
This is the ultimate Sprint 0 test from SCRUM.md: "fresh clone can run just doctor; missing
env vars are reported clearly; no absolute path assumptions remain unless configured."
How a future audit verifies these gates
Re-run this audit's commands plus the new gates. Compare scores against the golang-lakehouse-scrum-test.md baseline (35/60). A net improvement is the proof that the sprints landed; a flat or declining score signals that the gates were box-checked rather than internalized.