Adapts the docs/SCRUM.md framework (originally written for the matrix-agent-validated repo) to the Go rewrite.

Five deliverables:

- golang-lakehouse-scrum-test.md: top-line + scoring + verdict
- risk-register.md: 12 findings, R-001..R-012
- claim-coverage-table.md: claim/test/risk for Sprint 2
- sprint-backlog.md: 5 sprints, ~2 weeks of work
- acceptance-gates.md: DoD as runnable commands

Every claim cites file:line, command output, or "missing evidence." The smoke chain ran clean (33s wall, all 9 PASS) and is captured in reports/scrum/_evidence/smoke_chain.log (gitignored; runtime artifact).

Scoring:

- Reproducibility 7/10: 9 smokes deterministic, no just/CI gate
- Test Coverage 6/10: internal/ packages tested, 6/7 cmd/ aren't
- Trust Boundary 7/10: escapes ok, zero auth, /sql is RCE-eq off-loopback
- Memory Correctness 3/10: pathway/playbook/observer not yet ported
- Deployment Readiness 4/10: no REPLICATION, no env template, no systemd
- Maintainability 8/10: no god-files, 7 lean binaries, ADRs current

Top risks:

- R-001 HIGH: queryd /sql + DuckDB + non-loopback bind = RCE-equivalent
- R-002 HIGH: internal/shared (server.go + config.go) zero tests
- R-003 HIGH: internal/storeclient zero tests, used by 2 services
- R-004 MED: 9-smoke chain green but not gated (no justfile/hook)

The audit is the work; refactors come after. Sprint 0 owns coverage + CI gating; Sprint 1 owns trust-boundary decisions; Sprints 2-3 are mostly design-bar work for unbuilt agent components.

.gitignore exception: `/reports/*` + `!/reports/scrum/` keeps reports/ a runtime-artifact directory while exposing reports/scrum/ as tracked documentation. Mirrors the pattern future audit passes will land in.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# golangLAKEHOUSE — Acceptance Gates

Definition-of-done for each sprint, expressed as concrete commands a reviewer can run. Every gate is a binary pass/fail; no judgment calls. Sprint backlog (`sprint-backlog.md`) describes the work; this doc describes the proof of completion.

---

## Format convention

Each gate is:

```
GATE-<sprint>.<n>: <one-line claim>
$ <command to run>
expected: <observable result>
fails if: <regression condition>
```
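Because the convention is line-oriented, a checker could read gate entries mechanically. The following is a minimal sketch of that idea, not part of the audited repo: the `Gate` type and `ParseGates` function are hypothetical names, and the parser assumes each entry uses exactly the four prefixes shown above (`GATE-`, `$ `, `expected:`, `fails if:`). Gates with several command/expected pairs keep them in order.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Gate is a hypothetical in-memory form of one gate entry.
type Gate struct {
	ID       string   // text before the first colon, e.g. "GATE-0.7"
	Claim    string   // one-line claim after the colon
	Commands []string // every "$ ..." line, in order
	Expected []string // every "expected: ..." line, in order
	FailsIf  string   // the "fails if: ..." line
}

// ParseGates reads entries written in the format convention above.
// Lines that match none of the four prefixes are ignored.
func ParseGates(text string) []Gate {
	var gates []Gate
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, "GATE-") {
			id, claim, _ := strings.Cut(line, ":")
			gates = append(gates, Gate{ID: id, Claim: strings.TrimSpace(claim)})
			continue
		}
		if len(gates) == 0 {
			continue // nothing to attach this line to yet
		}
		g := &gates[len(gates)-1]
		switch {
		case strings.HasPrefix(line, "$ "):
			g.Commands = append(g.Commands, strings.TrimPrefix(line, "$ "))
		case strings.HasPrefix(line, "expected:"):
			g.Expected = append(g.Expected, strings.TrimSpace(strings.TrimPrefix(line, "expected:")))
		case strings.HasPrefix(line, "fails if:"):
			g.FailsIf = strings.TrimSpace(strings.TrimPrefix(line, "fails if:"))
		}
	}
	return gates
}

func main() {
	sample := `GATE-0.7: every cmd/ binary has at least one test
$ go test ./cmd/... 2>&1 | grep "no test files"
expected: empty (no binaries without tests)
fails if: any cmd/<bin>/main_test.go absent`

	for _, g := range ParseGates(sample) {
		fmt.Printf("%s: %d command(s), fails if %q\n", g.ID, len(g.Commands), g.FailsIf)
	}
}
```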

A sprint is "done" when every gate for that sprint passes on a clean clone. CI / pre-push automation should embed these gates so completion is mechanical.

---

## Sprint 0 — Reproducibility Gate

```
GATE-0.1: just runner is the canonical entry point
$ just --list
expected: includes `verify`, `smoke-fixtures`, `doctor`, `fmt`, `vet`, `test`, `smoke <day>`
fails if: `just` not found or any of the above targets missing

GATE-0.2: deps probe surfaces missing dependencies as structured JSON
$ just doctor --json
expected: exit 0 if all deps present; exit 1 with JSON listing missing deps if not
fails if: any false-positive (claims dep missing when present) or false-negative (claims OK when missing)

GATE-0.3: full chain runs without external services
$ just smoke-fixtures
expected: exit 0; uses MockS3Storage + MockEmbedProvider; no MinIO/Ollama dependency
fails if: smoke-fixtures invokes anything on localhost:9000 or localhost:11434

GATE-0.4: full chain runs against real services
$ just verify
expected: exit 0; runs go vet + go test + the 9 smokes; total wall ≤ 60s on this box
fails if: any individual smoke fails or wall > 90s without a flake annotation

GATE-0.5: pre-push hook blocks regressions
$ git push (after introducing a regression)
expected: hook runs `just verify`, push aborts on non-zero exit
fails if: hook missing, hook does not exit non-zero on test failure, or push proceeds despite failure

GATE-0.6: every internal/ package has at least one test
$ go test ./internal/... 2>&1 | grep "no test files"
expected: empty (no packages without tests)
fails if: `internal/shared` or `internal/storeclient` show as "no test files"

GATE-0.7: every cmd/ binary has at least one test
$ go test ./cmd/... 2>&1 | grep "no test files"
expected: empty (no binaries without tests)
fails if: any cmd/<bin>/main_test.go absent

GATE-0.8: queryd db.go has unit coverage on sqlEscape + redactCreds
$ go test -run "TestSqlEscape|TestRedactCreds" ./internal/queryd/
expected: at least one passing test for each function
fails if: zero matching tests (today's state)
```
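GATE-0.6 and GATE-0.7 pass when `grep` prints nothing, and `grep` exits 1 in that case, so a CI wrapper cannot use the pipeline's exit status directly. One way to make the check mechanical is to scan the `go test` output in Go instead. The sketch below assumes the usual `?   <import path>   [no test files]` line shape; the function name `untestedPackages` and the `example.com/lakehouse` module path are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// untestedPackages returns the import paths that `go test` reported
// as having no test files. An empty result means the gate passes.
func untestedPackages(goTestOutput string) []string {
	var pkgs []string
	for _, line := range strings.Split(goTestOutput, "\n") {
		if !strings.Contains(line, "[no test files]") {
			continue
		}
		// Assumed shape: "?", import path, "[no", "test", "files]".
		fields := strings.Fields(line)
		if len(fields) >= 2 && fields[0] == "?" {
			pkgs = append(pkgs, fields[1])
		}
	}
	return pkgs
}

func main() {
	// Hypothetical output; the module path is a placeholder.
	output := "ok  \texample.com/lakehouse/internal/queryd\t0.012s\n" +
		"?   \texample.com/lakehouse/internal/shared\t[no test files]\n" +
		"?   \texample.com/lakehouse/internal/storeclient\t[no test files]\n"

	missing := untestedPackages(output)
	if len(missing) == 0 {
		fmt.Println("gate passes: every package has tests")
		return
	}
	fmt.Println("gate fails; packages without tests:")
	for _, p := range missing {
		fmt.Println("  " + p)
	}
}
```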

---

## Sprint 1 — Trust Boundary Gate

```
GATE-1.1: queryd refuses to start on non-loopback bind without explicit override
$ LH_QUERYD_BIND=0.0.0.0:3214 ./bin/queryd
expected: exits 1 within 1s; stderr cites the assertion
fails if: binary starts and accepts connections on 0.0.0.0

GATE-1.2: same gate applies to storaged, ingestd, vectord
$ for b in storaged ingestd vectord; do env "LH_${b^^}_BIND=0.0.0.0:99$N" ./bin/$b; done
expected: each exits 1 with cited assertion
fails if: any binary binds non-loopback silently

GATE-1.3: ADR-003 documents the auth posture
$ test -f docs/DECISIONS.md && grep -q "ADR-003" docs/DECISIONS.md
expected: ADR-003 section exists with title + status + rationale
fails if: ADR-003 absent or marked Draft after sprint close

GATE-1.4: auth middleware applies uniformly when token configured
$ TOKEN=bad; curl -H "Authorization: Bearer $TOKEN" http://127.0.0.1:3110/v1/sql
expected: 401
$ TOKEN=valid; curl -H "Authorization: Bearer $TOKEN" http://127.0.0.1:3110/v1/sql
expected: 200 (or 4xx for malformed body, never 401)
fails if: any binary accepts requests without the configured token

GATE-1.5: every JSON handler rejects unknown fields
$ curl -X POST http://127.0.0.1:3110/v1/sql -d '{"sql":"SELECT 1","mystery_field":true}'
expected: 400 with body citing unknown field
fails if: 200 (silent drop) or 500 (unexpected)

GATE-1.6: SQL injection regression test passes
$ go test -run "TestRegistrar_QuotesAdversarialName" ./internal/queryd/
expected: pass
fails if: test absent or fails, meaning the quoteIdent regression is undetected
```
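The startup assertion behind GATE-1.1 and GATE-1.2 can be sketched with the standard `net` package. This is an illustration under stated assumptions, not the repo's implementation: `checkBind` is a hypothetical helper, `LH_ALLOW_NONLOOPBACK` is an invented name for the "explicit override", and the `127.0.0.1:3214` default is assumed from the port used in GATE-1.1. Only `LH_QUERYD_BIND` comes from the gate text. Note that an empty host such as `:3214` listens on all interfaces, so the sketch refuses it as well.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// checkBind returns an error when addr would listen on a non-loopback
// interface and the caller has not explicitly allowed that.
func checkBind(addr string, allowNonLoopback bool) error {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		return fmt.Errorf("invalid bind address %q: %w", addr, err)
	}
	if allowNonLoopback || host == "localhost" {
		return nil
	}
	// An empty host (":3214") or an unparsable one yields nil here
	// and is refused along with 0.0.0.0 and other routable addresses.
	ip := net.ParseIP(host)
	if ip == nil || !ip.IsLoopback() {
		return fmt.Errorf("refusing non-loopback bind %q without explicit override", addr)
	}
	return nil
}

func main() {
	addr := os.Getenv("LH_QUERYD_BIND")
	if addr == "" {
		addr = "127.0.0.1:3214" // assumed default for this sketch
	}
	// Hypothetical override variable; the real name is not in the source.
	override := os.Getenv("LH_ALLOW_NONLOOPBACK") == "1"

	if err := checkBind(addr, override); err != nil {
		fmt.Fprintln(os.Stderr, "queryd:", err)
		os.Exit(1)
	}
	fmt.Println("bind ok:", addr)
}
```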

---

## Sprint 2 — Memory Correctness Gate

```
GATE-2.1: ADR-004 documents the pathway-memory data model
$ grep -q "ADR-004" docs/DECISIONS.md
expected: ADR-004 section exists with trace shape, history rules, retire semantics
fails if: absent

GATE-2.2: pathway package has full Mem0-shape coverage
$ go test ./internal/pathway/ -count=1
expected: at least these 8 tests pass: TestAdd, TestUpdate, TestRevise, TestRetire, TestHistory, TestCycleSafe, TestReplayCount, TestCorruptedRow
fails if: any of those test names absent

GATE-2.3: retired traces are excluded from retrieval
$ go test -run TestRetire_ExcludedFromSearch ./internal/pathway/
expected: pass
$ git revert HEAD --no-commit; (delete the filter); go test -run TestRetire_ExcludedFromSearch ./internal/pathway/
expected: fail (proves the test is load-bearing, not vacuous)
fails if: removing the filter doesn't make the test fail

GATE-2.4: vectord persistence works at scale (200K vectors @ d=768)
$ ./scripts/g1p_scale_smoke.sh
expected: exit 0; ingests 200K vectors, kills vectord, restarts, search returns dist≤1e-7
fails if: any operation hits storaged 256 MiB cap or returns > tolerance distance

GATE-2.5: ADR-005 ratifies the storaged-cap fix path
$ grep -q "ADR-005" docs/DECISIONS.md
expected: ADR-005 documents B (split LHV1) vs C (multipart in storaged) decision
fails if: absent
```
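GATE-2.3 asks for a test that fails when the retire filter is removed. The toy store below shows the shape of that filter so the mutation check is concrete; it is a sketch only, since the pathway package is not yet ported. `Trace`, `Store`, and their methods are hypothetical names, and substring matching stands in for whatever retrieval the real package will use. Deleting the `if t.Retired` branch makes the final check in `main` report a leak, which is the behavior the gate wants a real test to pin.

```go
package main

import (
	"fmt"
	"strings"
)

// Trace is a hypothetical pathway-memory record.
type Trace struct {
	UID     string
	Text    string
	Retired bool
}

// Store is a toy in-memory trace store.
type Store struct {
	traces []Trace
}

// Add appends a live (non-retired) trace.
func (s *Store) Add(uid, text string) {
	s.traces = append(s.traces, Trace{UID: uid, Text: text})
}

// Retire marks a trace as retired and reports whether the UID existed.
func (s *Store) Retire(uid string) bool {
	for i := range s.traces {
		if s.traces[i].UID == uid {
			s.traces[i].Retired = true
			return true
		}
	}
	return false
}

// Search returns live traces whose text contains query.
func (s *Store) Search(query string) []Trace {
	var hits []Trace
	for _, t := range s.traces {
		if t.Retired {
			continue // the filter a GATE-2.3 style test must pin
		}
		if strings.Contains(t.Text, query) {
			hits = append(hits, t)
		}
	}
	return hits
}

func main() {
	s := &Store{}
	s.Add("t1", "ingest parquet via storaged")
	s.Add("t2", "ingest csv via storaged")
	s.Retire("t1")

	for _, t := range s.Search("ingest") {
		if t.UID == "t1" {
			fmt.Println("FAIL: retired trace leaked into search results")
			return
		}
	}
	fmt.Println("ok: retired trace excluded from search")
}
```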

---

## Sprint 3 — Agent Loop Reality Gate

```
GATE-3.1: ADR-002 defines observer fail-safe semantics
$ grep -q "ADR-002" docs/DECISIONS.md
expected: ADR-002 section: degraded-by-default on error, explicit env to opt into fail-open
fails if: absent

GATE-3.2: observer rejects hallucinated claim
$ go test -run TestObserver_HallucinatedClaim_Rejected ./internal/observer/
expected: pass
fails if: hallucinated-claim path returns accept

GATE-3.3: observer never auto-accepts on internal error
$ go test -run TestObserver_InternalError_DegradedCycle ./internal/observer/
expected: pass; response is {verdict: "cycle", degraded: true}
fails if: any error path can produce {verdict: "accept"}

GATE-3.4: end-to-end agent loop deterministic
$ ./scripts/agent_loop_smoke.sh
expected: exit 0; report file at /tmp/agent_loop_<sha>.json contains input_hash, output_hash, verdict, memory_receipt
fails if: report missing any field or hashes don't match expected fixture

GATE-3.5: second-run retrieval surfaces prior playbook
$ go test -run TestSecondRun_SurfacesPriorPlaybook ./internal/agent/
expected: pass
fails if: second run does not return the UID seen in first run

GATE-3.6: health endpoint content-type regression test
$ go test ./internal/shared/ -run TestHealth_ContentType
expected: pass; consumer pattern that called .json() on text/plain returns 502 loudly
fails if: any /health consumer can silently null on type confusion
```
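The fail-safe rule in GATE-3.1 and GATE-3.3 amounts to one invariant: an internal error may never map to `accept`. A minimal sketch of that mapping follows, assuming the observer wraps some fallible check. The `Verdict` type, `judge`, `errBackend`, and the `"reject"` verdict string are hypothetical; only `accept`, `cycle`, and the `degraded` flag appear in the gate text. The opt-in fail-open mode mentioned in GATE-3.1 is deliberately left out.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Verdict mirrors the response shape quoted in GATE-3.3.
type Verdict struct {
	Verdict  string `json:"verdict"`
	Degraded bool   `json:"degraded"`
}

// errBackend stands in for any internal failure (hypothetical).
var errBackend = errors.New("observer backend unavailable")

// judge maps a fallible check onto a verdict. Every error path ends
// in a degraded "cycle", never in "accept".
func judge(check func(claim string) (bool, error), claim string) Verdict {
	supported, err := check(claim)
	if err != nil {
		return Verdict{Verdict: "cycle", Degraded: true}
	}
	if supported {
		return Verdict{Verdict: "accept", Degraded: false}
	}
	return Verdict{Verdict: "reject", Degraded: false} // assumed name for the reject outcome
}

func main() {
	failing := func(string) (bool, error) { return false, errBackend }

	out, err := json.Marshal(judge(failing, "the table has 200K rows"))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // prints {"verdict":"cycle","degraded":true}
}
```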

---

## Sprint 4 — Deployment Gate

```
GATE-4.1: fresh-Debian doctor surfaces install commands
$ docker run --rm -v $PWD:/repo debian:13 bash -c "cd /repo && just doctor"
expected: structured JSON with apt install / curl tarball commands per missing dep; exit 1
fails if: silent claim of OK or vague "missing dep" without fix command

GATE-4.2: REPLICATION.md is executable
$ awk '/^```bash$/,/^```$/' REPLICATION.md | grep -v '^```' | bash
expected: every code block runs (may require deps; failure must be expected from doctor)
fails if: REPLICATION contains pseudo-commands or hardcoded paths that don't match repo

GATE-4.3: env template covers every required key
$ test -f secrets-go.toml.example && grep -q "access_key_id" secrets-go.toml.example
expected: example file with documented keys; just doctor warns on placeholder values
fails if: example absent or doesn't surface placeholder detection

GATE-4.4: systemd units present and correct
$ ls deploy/systemd/*.service | wc -l
expected: 7 files (one per binary)
$ systemd-analyze verify deploy/systemd/*.service
expected: exit 0
fails if: any unit fails verify or has missing fields (After, Restart, MemoryMax)

GATE-4.5: AWS S3 path works without code changes
$ AWS_PROFILE=test ./scripts/d2_smoke_aws.sh
expected: exit 0 against a real S3 bucket
fails if: any code path assumes MinIO-specific behavior
```
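GATE-0.2 and GATE-4.1 both expect the doctor to emit structured JSON that pairs each missing dependency with a fix command and to exit 1 when anything is missing. The sketch below shows one possible report shape; it is an assumption throughout. `Dep`, `Report`, `Probe`, `exitCode`, and the JSON field names are hypothetical, and the fix strings are placeholders rather than real install commands. The lookup is injected so the example stays deterministic; `exec.LookPath` from `os/exec` has the same signature and could be passed in its place.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Dep names one required tool and the command that would install it.
type Dep struct {
	Name string `json:"name"`
	Fix  string `json:"fix"`
}

// Report is a hypothetical doctor output shape.
type Report struct {
	OK      bool  `json:"ok"`
	Missing []Dep `json:"missing"`
}

// errMissing is returned by the stub lookup for absent tools.
var errMissing = errors.New("executable not found")

// Probe checks every dep with lookPath and collects the missing ones.
func Probe(deps []Dep, lookPath func(string) (string, error)) Report {
	r := Report{OK: true, Missing: []Dep{}}
	for _, d := range deps {
		if _, err := lookPath(d.Name); err != nil {
			r.OK = false
			r.Missing = append(r.Missing, d)
		}
	}
	return r
}

// exitCode follows GATE-0.2: 0 when all deps are present, 1 otherwise.
func exitCode(r Report) int {
	if r.OK {
		return 0
	}
	return 1
}

func main() {
	deps := []Dep{
		{Name: "go", Fix: "<install command for go>"},
		{Name: "just", Fix: "<install command for just>"},
	}
	// Stub lookup that pretends only "go" is installed.
	onlyGo := func(name string) (string, error) {
		if name == "go" {
			return "/usr/local/go/bin/go", nil
		}
		return "", errMissing
	}

	r := Probe(deps, onlyGo)
	out, err := json.MarshalIndent(r, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
	fmt.Println("exit code:", exitCode(r))
}
```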

---

## Cross-sprint compound gate

```
GATE-FINAL: full clean-clone reproducibility
$ rm -rf /tmp/golangLAKEHOUSE-test
$ git clone <url> /tmp/golangLAKEHOUSE-test
$ cd /tmp/golangLAKEHOUSE-test
$ just doctor || (read fix instructions, run them, rerun)
$ just verify
expected: green within 60s wall of `just verify` (excluding doctor remediation)
fails if: any step requires undocumented manual intervention

This is the SCRUM.md Sprint 0 ultimate test: "fresh clone can run just doctor; missing
env vars are reported clearly; no absolute path assumptions remain unless configured."
```

---

## How a future audit verifies these gates

Re-run this audit's commands plus the new gates, then compare scores against the `golang-lakehouse-scrum-test.md` baseline (35/60). A net improvement is the proof that the sprints landed; a flat or declining score signals that the gates were treated as box-checking rather than internalized.
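
The comparison can be reduced to arithmetic over the six scoring dimensions recorded in the baseline (7, 6, 7, 3, 4, 8, summing to 35 out of 60). The sketch below is illustrative only: `total` and `regressions` are hypothetical helpers, and the `current` scores are invented for the example, not results of any audit.

```go
package main

import (
	"fmt"
	"sort"
)

// baseline holds the six dimension scores from the baseline audit.
var baseline = map[string]int{
	"Reproducibility":      7,
	"Test Coverage":        6,
	"Trust Boundary":       7,
	"Memory Correctness":   3,
	"Deployment Readiness": 4,
	"Maintainability":      8,
}

// total sums all dimension scores.
func total(scores map[string]int) int {
	sum := 0
	for _, v := range scores {
		sum += v
	}
	return sum
}

// regressions lists, in sorted order, the dimensions whose current
// score is below the baseline score.
func regressions(base, current map[string]int) []string {
	var worse []string
	for name, b := range base {
		if current[name] < b {
			worse = append(worse, name)
		}
	}
	sort.Strings(worse)
	return worse
}

func main() {
	// Invented scores for a later audit, for illustration only.
	current := map[string]int{
		"Reproducibility":      9,
		"Test Coverage":        8,
		"Trust Boundary":       7,
		"Memory Correctness":   3,
		"Deployment Readiness": 4,
		"Maintainability":      7,
	}

	fmt.Printf("baseline: %d/60\n", total(baseline)) // prints baseline: 35/60
	fmt.Printf("current:  %d/60\n", total(current))
	fmt.Println("regressed dimensions:", regressions(baseline, current))
}
```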