root 91edd43164 scrum audit: 5 reports under reports/scrum/ · score 35/60

Adapts docs/SCRUM.md framework (originally written for the
matrix-agent-validated repo) to the Go rewrite. Five deliverables:

  golang-lakehouse-scrum-test.md  top-line + scoring + verdict
  risk-register.md                12 findings, R-001..R-012
  claim-coverage-table.md         claim/test/risk for Sprint 2
  sprint-backlog.md               5 sprints, ~2 weeks of work
  acceptance-gates.md             DoD as runnable commands

Every claim cites file:line, command output, or "missing evidence."
Smoke chain ran clean (33s wall, all 9 PASS) and is captured in
reports/scrum/_evidence/smoke_chain.log (gitignored — runtime artifact).

Scoring:
  Reproducibility       7/10  9 smokes deterministic, no just/CI gate
  Test Coverage         6/10  internal/ packages tested, 6/7 cmd/ aren't
  Trust Boundary        7/10  escapes ok, zero auth, /sql is RCE-eq off-loopback
  Memory Correctness    3/10  pathway/playbook/observer not yet ported
  Deployment Readiness  4/10  no REPLICATION, no env template, no systemd
  Maintainability       8/10  no god-files, 7 lean binaries, ADRs current

Top four risks:
  R-001 HIGH  queryd /sql + DuckDB + non-loopback bind = RCE-equivalent
  R-002 HIGH  internal/shared (server.go + config.go) zero tests
  R-003 HIGH  internal/storeclient zero tests, used by 2 services
  R-004 MED   9-smoke chain green but not gated (no justfile/hook)

The audit is the work; refactors come after. Sprint 0 owns coverage
+ CI gating; Sprint 1 owns trust-boundary decisions; Sprints 2-3 are
mostly design-bar work for unbuilt agent components.

.gitignore exception: /reports/* + !/reports/scrum/ keeps reports/
a runtime-artifact directory while exposing reports/scrum/ as
tracked documentation. Future audit passes will land under the same
pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-29 04:51:47 -05:00

golangLAKEHOUSE — Acceptance Gates

Definition-of-done for each sprint, expressed as concrete commands a reviewer can run. Every gate is a binary pass/fail; no judgment calls. Sprint backlog (sprint-backlog.md) describes the work; this doc describes the proof of completion.

Format convention

Each gate is:

GATE-<sprint>.<n>: <one-line claim>
  $ <command to run>
  expected: <observable result>
  fails if: <regression condition>

A sprint is "done" when every gate for that sprint passes on a clean clone. CI / pre-push automation should embed these gates so completion is mechanical.
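
One way CI or a pre-push hook could embed these gates is to parse each entry into a record first. The following is a minimal sketch, assuming gates are written exactly as in the convention above; the `Gate` type and the `parseGates` and `cut` functions are hypothetical names, not part of the repo.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Gate is a hypothetical in-memory form of one gate entry.
type Gate struct {
	ID       string   // "<sprint>.<n>", e.g. "0.1"
	Claim    string   // one-line claim after the colon
	Commands []string // lines that start with "$ "
	Expected []string // one entry per "expected:" line
	FailsIf  string   // text after "fails if:"
}

// parseGates scans gate text line by line and groups lines under the
// most recent "GATE-" header. Lines before the first header are ignored.
func parseGates(text string) []Gate {
	var gates []Gate
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if rest, ok := cut(line, "GATE-"); ok {
			id, claim, _ := strings.Cut(rest, ":")
			gates = append(gates, Gate{ID: id, Claim: strings.TrimSpace(claim)})
			continue
		}
		if len(gates) == 0 {
			continue
		}
		g := &gates[len(gates)-1]
		if rest, ok := cut(line, "$ "); ok {
			g.Commands = append(g.Commands, rest)
		} else if rest, ok := cut(line, "expected:"); ok {
			g.Expected = append(g.Expected, strings.TrimSpace(rest))
		} else if rest, ok := cut(line, "fails if:"); ok {
			g.FailsIf = strings.TrimSpace(rest)
		}
	}
	return gates
}

// cut reports whether s starts with prefix and returns the remainder.
func cut(s, prefix string) (string, bool) {
	if strings.HasPrefix(s, prefix) {
		return strings.TrimPrefix(s, prefix), true
	}
	return "", false
}

func main() {
	sample := "GATE-0.1: just runner is the canonical entry point\n" +
		"  $ just --list\n" +
		"  expected: includes `verify`\n" +
		"  fails if: `just` not found\n"
	for _, g := range parseGates(sample) {
		fmt.Printf("%s: %d command(s), %d expectation(s)\n", g.ID, len(g.Commands), len(g.Expected))
	}
}
```

A runner built on this would still need a per-gate check of the `expected:` text, since several gates assert on output or HTTP status rather than exit code alone.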

Sprint 0 — Reproducibility Gate

GATE-0.1: just runner is the canonical entry point
  $ just --list
  expected: includes `verify`, `smoke-fixtures`, `doctor`, `fmt`, `vet`, `test`, `smoke <day>`
  fails if: `just` not found or any of the above targets missing

GATE-0.2: deps probe surfaces missing dependencies as structured JSON
  $ just doctor --json
  expected: exit 0 if all deps present; exit 1 with JSON listing missing deps if not
  fails if: any false-positive (claims dep missing when present) or false-negative (claims OK when missing)
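
The exit-code and JSON contract in GATE-0.2 can be sketched in a few lines of Go. This is an illustration only: the `report` shape, the `probe` function, and the default dependency list are assumptions, and the real `just doctor` output may look different.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"os/exec"
)

// report is a hypothetical shape for the doctor's JSON output.
type report struct {
	OK      bool     `json:"ok"`
	Missing []string `json:"missing"`
}

// probe asks present() about each dependency and collects the ones that
// are not found. Injecting present() keeps the logic testable without
// touching the real PATH.
func probe(deps []string, present func(string) bool) report {
	r := report{Missing: []string{}}
	for _, d := range deps {
		if !present(d) {
			r.Missing = append(r.Missing, d)
		}
	}
	r.OK = len(r.Missing) == 0
	return r
}

func main() {
	// Dependencies come from the command line; the default is illustrative.
	deps := os.Args[1:]
	if len(deps) == 0 {
		deps = []string{"go"}
	}
	onPath := func(name string) bool {
		_, err := exec.LookPath(name)
		return err == nil
	}
	r := probe(deps, onPath)
	out, _ := json.Marshal(r)
	fmt.Println(string(out))
	if !r.OK {
		os.Exit(1) // GATE-0.2: non-zero exit when anything is missing
	}
}
```

Running it as `go run doctor.go go just` (file name hypothetical) would list `just` under `missing` and exit 1 on a machine without it.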

GATE-0.3: full chain runs without external services
  $ just smoke-fixtures
  expected: exit 0; uses MockS3Storage + MockEmbedProvider; no MinIO/Ollama dependency
  fails if: smoke-fixtures invokes anything on localhost:9000 or localhost:11434

GATE-0.4: full chain runs against real services
  $ just verify
  expected: exit 0; runs go vet + go test + the 9 smokes; total wall ≤ 60s on this box
  fails if: any individual smoke fails or wall > 90s without a flake annotation

GATE-0.5: pre-push hook blocks regressions
  $ git push (after introducing a regression)
  expected: hook runs `just verify`, push aborts on non-zero exit
  fails if: hook missing, hook does not exit non-zero on test failure, or push proceeds despite failure

GATE-0.6: every internal/ package has at least one test
  $ go test ./internal/... 2>&1 | grep "no test files"
  expected: empty (no packages without tests)
  fails if: `internal/shared` or `internal/storeclient` show as "no test files"

GATE-0.7: every cmd/ binary has at least one test
  $ go test ./cmd/... 2>&1 | grep "no test files"
  expected: empty (no binaries without tests)
  fails if: any cmd/<bin>/main_test.go absent
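
Note that `grep` exits 1 when it finds nothing, so the passing state of GATE-0.6 and GATE-0.7 yields a non-zero pipeline status; automation has to check for empty output rather than exit code. An alternative that exits 0 on success is sketched below. It is a rough approximation, assuming "has a test" means "the directory contains at least one `_test.go` file"; `untestedPackages` is a hypothetical helper and ignores build tags.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path"
	"path/filepath"
	"sort"
	"strings"
)

// untestedPackages takes slash-separated file paths and returns, sorted,
// every directory that holds .go files but no _test.go file.
func untestedPackages(files []string) []string {
	hasGo := map[string]bool{}
	hasTest := map[string]bool{}
	for _, f := range files {
		if !strings.HasSuffix(f, ".go") {
			continue
		}
		dir := path.Dir(f)
		hasGo[dir] = true
		if strings.HasSuffix(f, "_test.go") {
			hasTest[dir] = true
		}
	}
	var out []string
	for dir := range hasGo {
		if !hasTest[dir] {
			out = append(out, dir)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	var files []string
	for _, root := range []string{"internal", "cmd"} {
		filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
			if err != nil {
				return nil // skip unreadable or absent paths in this sketch
			}
			if !d.IsDir() {
				files = append(files, filepath.ToSlash(p))
			}
			return nil
		})
	}
	missing := untestedPackages(files)
	for _, dir := range missing {
		fmt.Println("no test files:", dir)
	}
	if len(missing) > 0 {
		os.Exit(1)
	}
}
```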

GATE-0.8: queryd db.go has unit coverage on sqlEscape + redactCreds
  $ go test -run "TestSqlEscape|TestRedactCreds" ./internal/queryd/
  expected: at least one passing test for each function
  fails if: zero matching tests (today's state)

Sprint 1 — Trust Boundary Gate

GATE-1.1: queryd refuses to start on non-loopback bind without explicit override
  $ LH_QUERYD_BIND=0.0.0.0:3214 ./bin/queryd
  expected: exits 1 within 1s; stderr cites the assertion
  fails if: binary starts and accepts connections on 0.0.0.0
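
The startup assertion GATE-1.1 asks for could look roughly like the sketch below. It assumes the bind address arrives as `host:port` and that only literal loopback IPs count as safe; `checkBind`, the `LH_ALLOW_NON_LOOPBACK` override variable, and the default address are hypothetical, while `LH_QUERYD_BIND` is taken from the gate itself.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
)

var errNonLoopback = errors.New("refusing non-loopback bind without explicit override")

// checkBind returns nil when addr is a literal loopback IP with a port,
// or when the caller has explicitly opted in to a wider bind. Hostnames
// and the empty host (":3214", which means all interfaces) are rejected.
func checkBind(addr string, allowNonLoopback bool) error {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		return fmt.Errorf("invalid bind address %q: %w", addr, err)
	}
	if allowNonLoopback {
		return nil
	}
	if ip := net.ParseIP(host); ip != nil && ip.IsLoopback() {
		return nil
	}
	return fmt.Errorf("%w: %s", errNonLoopback, addr)
}

func main() {
	addr := os.Getenv("LH_QUERYD_BIND")
	if addr == "" {
		addr = "127.0.0.1:3214" // assumed default for this sketch
	}
	override := os.Getenv("LH_ALLOW_NON_LOOPBACK") == "1" // hypothetical override
	if err := checkBind(addr, override); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("bind ok:", addr)
}
```

Running the check before `net.Listen` keeps the failure within the gate's one-second budget, since no socket is opened on the rejected path.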

GATE-1.2: same gate applies to storaged, ingestd, vectord
  $ n=0; for b in storaged ingestd vectord; do n=$((n+1)); env "LH_${b^^}_BIND=0.0.0.0:990$n" ./bin/$b; done
  expected: each exits 1 with cited assertion
  fails if: any binary binds non-loopback silently

GATE-1.3: ADR-003 documents the auth posture
  $ test -f docs/DECISIONS.md && grep -q "ADR-003" docs/DECISIONS.md
  expected: ADR-003 section exists with title + status + rationale
  fails if: ADR-003 absent or marked Draft after sprint close

GATE-1.4: auth middleware applies uniformly when token configured
  $ TOKEN=bad; curl -s -o /dev/null -w '%{http_code}\n' -H "Authorization: Bearer $TOKEN" http://127.0.0.1:3110/v1/sql
  expected: 401
  $ TOKEN=valid; curl -s -o /dev/null -w '%{http_code}\n' -H "Authorization: Bearer $TOKEN" http://127.0.0.1:3110/v1/sql
  expected: 200 (or 4xx for malformed body, never 401)
  fails if: any binary accepts requests without the configured token
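
A middleware satisfying GATE-1.4 might be shaped like this. It is a sketch under assumptions: `requireBearer` and `statusFor` are hypothetical names, and "when token configured" is read as "an empty configured token disables the check".

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// requireBearer wraps next so that requests must carry the configured
// bearer token. With no token configured it returns next unchanged.
func requireBearer(token string, next http.Handler) http.Handler {
	if token == "" {
		return next
	}
	want := []byte(token)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		h := r.Header.Get("Authorization")
		got := strings.TrimPrefix(h, "Bearer ")
		// Constant-time comparison avoids leaking the token byte by byte.
		if !strings.HasPrefix(h, "Bearer ") || subtle.ConstantTimeCompare([]byte(got), want) != 1 {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor runs one in-memory request through the middleware and
// returns the resulting HTTP status code.
func statusFor(configured, header string) int {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	req := httptest.NewRequest(http.MethodPost, "/v1/sql", nil)
	if header != "" {
		req.Header.Set("Authorization", header)
	}
	rec := httptest.NewRecorder()
	requireBearer(configured, ok).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("bad token:  ", statusFor("valid", "Bearer bad"))
	fmt.Println("good token: ", statusFor("valid", "Bearer valid"))
	fmt.Println("no header:  ", statusFor("valid", ""))
}
```

Wrapping the whole mux once, rather than individual handlers, is one way to get the "applies uniformly" property the gate asks for.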

GATE-1.5: every JSON handler rejects unknown fields
  $ curl -X POST http://127.0.0.1:3110/v1/sql -d '{"sql":"SELECT 1","mystery_field":true}'
  expected: 400 with body citing unknown field
  fails if: 200 (silent drop) or 500 (unexpected)
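
In Go, the behaviour GATE-1.5 describes is available from the standard library through `json.Decoder.DisallowUnknownFields`. The sketch below assumes a request body with a single `sql` field; the `sqlRequest` type and `decodeStrict` function are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// sqlRequest is an assumed request shape for the /v1/sql handler.
type sqlRequest struct {
	SQL string `json:"sql"`
}

// decodeStrict decodes body into sqlRequest and returns an error if the
// body contains any field the struct does not declare.
func decodeStrict(body string) (sqlRequest, error) {
	var req sqlRequest
	dec := json.NewDecoder(strings.NewReader(body))
	dec.DisallowUnknownFields()
	err := dec.Decode(&req)
	return req, err
}

func main() {
	_, err := decodeStrict(`{"sql":"SELECT 1","mystery_field":true}`)
	fmt.Println("strict decode error:", err) // the error text names the unknown field

	req, err := decodeStrict(`{"sql":"SELECT 1"}`)
	fmt.Println(req.SQL, err)
}
```

A handler would map the decode error to a 400 and include the error text in the body, which is what lets the response cite the offending field.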

GATE-1.6: SQL injection regression test passes
  $ go test -run "TestRegistrar_QuotesAdversarialName" ./internal/queryd/
  expected: pass
  fails if: test absent or fails — meaning quoteIdent regression is undetected
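
GATE-1.6 names `quoteIdent` without showing it. A minimal sketch follows, assuming it applies the standard SQL rule of wrapping the identifier in double quotes and doubling any embedded double quote; the body here is an assumption, not the repo's code.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent returns name as a double-quoted SQL identifier. Embedded
// double quotes are doubled so the value cannot close the quoting early.
func quoteIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

func main() {
	adversarial := `events"; DROP TABLE users; --`
	fmt.Printf("CREATE VIEW %s AS SELECT 1\n", quoteIdent(adversarial))
}
```

A regression test in the spirit of `TestRegistrar_QuotesAdversarialName` would assert that the quoted form of such a name still parses as a single identifier.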

Sprint 2 — Memory Correctness Gate

GATE-2.1: ADR-004 documents the pathway-memory data model
  $ grep -q "ADR-004" docs/DECISIONS.md
  expected: ADR-004 section exists with trace shape, history rules, retire semantics
  fails if: absent

GATE-2.2: pathway package has full Mem0-shape coverage
  $ go test ./internal/pathway/ -count=1
  expected: all 8 listed tests pass: TestAdd, TestUpdate, TestRevise, TestRetire, TestHistory, TestCycleSafe, TestReplayCount, TestCorruptedRow
  fails if: any of those test names absent

GATE-2.3: retired traces are excluded from retrieval
  $ go test -run TestRetire_ExcludedFromSearch ./internal/pathway/
  expected: pass
  $ # mutation check: delete the retired-trace filter in the working tree (do not commit), then rerun
  $ go test -run TestRetire_ExcludedFromSearch ./internal/pathway/
  expected: fail (proves the test is load-bearing, not vacuous)
  fails if: removing the filter doesn't make the test fail
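
To make the mutation check in GATE-2.3 concrete, the filter under test can be as small as one `continue`. The sketch below uses a hypothetical `Trace` type and a substring match standing in for real vector retrieval; the pathway package is not yet ported, so none of these names come from the repo.

```go
package main

import (
	"fmt"
	"strings"
)

// Trace is a hypothetical pathway-memory record.
type Trace struct {
	UID     string
	Text    string
	Retired bool
}

// search returns the non-retired traces whose text contains query.
func search(traces []Trace, query string) []Trace {
	var hits []Trace
	for _, t := range traces {
		if t.Retired {
			continue // the filter the mutation check deletes
		}
		if strings.Contains(t.Text, query) {
			hits = append(hits, t)
		}
	}
	return hits
}

func main() {
	traces := []Trace{
		{UID: "t1", Text: "rotate S3 keys"},
		{UID: "t2", Text: "rotate S3 keys (old procedure)", Retired: true},
	}
	for _, t := range search(traces, "rotate") {
		fmt.Println(t.UID)
	}
}
```

A load-bearing test needs a retired trace that would otherwise match the query, as `t2` does here; without one, deleting the filter changes nothing and the test is vacuous.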

GATE-2.4: vectord persistence works at scale (200K vectors @ d=768)
  $ ./scripts/g1p_scale_smoke.sh
  expected: exit 0; ingests 200K vectors, kills vectord, restarts, search returns dist≤1e-7
  fails if: any operation hits storaged 256 MiB cap or returns > tolerance distance

GATE-2.5: ADR-005 ratifies the storaged-cap fix path
  $ grep -q "ADR-005" docs/DECISIONS.md
  expected: ADR-005 documents B (split LHV1) vs C (multipart in storaged) decision
  fails if: absent

Sprint 3 — Agent Loop Reality Gate

GATE-3.1: ADR-002 defines observer fail-safe semantics
  $ grep -q "ADR-002" docs/DECISIONS.md
  expected: ADR-002 section: degraded-by-default on error, explicit env to opt into fail-open
  fails if: absent

GATE-3.2: observer rejects hallucinated claim
  $ go test -run TestObserver_HallucinatedClaim_Rejected ./internal/observer/
  expected: pass
  fails if: hallucinated-claim path returns accept

GATE-3.3: observer never auto-accepts on internal error
  $ go test -run TestObserver_InternalError_DegradedCycle ./internal/observer/
  expected: pass; response is {verdict: "cycle", degraded: true}
  fails if: any error path can produce {verdict: "accept"}
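
The fail-safe rule in GATE-3.3 can be expressed as a wrapper that maps every error, including a panic, to a degraded cycle. This is a sketch only: the observer is not yet ported, `observe` and `Verdict` are hypothetical, and the `"reject"` value is an assumption (the gates name only `"accept"` and `"cycle"`).

```go
package main

import (
	"errors"
	"fmt"
)

// Verdict mirrors the response shape quoted in GATE-3.3.
type Verdict struct {
	Verdict  string `json:"verdict"`
	Degraded bool   `json:"degraded"`
}

var errBackend = errors.New("backend unavailable")

// observe runs check and converts its outcome to a Verdict. Any error or
// panic yields a degraded cycle, so no failure path can produce "accept".
func observe(check func() (bool, error)) (v Verdict) {
	defer func() {
		if r := recover(); r != nil {
			v = Verdict{Verdict: "cycle", Degraded: true}
		}
	}()
	supported, err := check()
	switch {
	case err != nil: // checked first, so a true result with an error is ignored
		return Verdict{Verdict: "cycle", Degraded: true}
	case supported:
		return Verdict{Verdict: "accept"}
	default:
		return Verdict{Verdict: "reject"}
	}
}

func main() {
	fmt.Println(observe(func() (bool, error) { return true, nil }))
	fmt.Println(observe(func() (bool, error) { return false, nil }))
	fmt.Println(observe(func() (bool, error) { return true, errBackend }))
}
```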

GATE-3.4: end-to-end agent loop deterministic
  $ ./scripts/agent_loop_smoke.sh
  expected: exit 0; report file at /tmp/agent_loop_<sha>.json contains input_hash, output_hash, verdict, memory_receipt
  fails if: report missing any field or hashes don't match expected fixture

GATE-3.5: second-run retrieval surfaces prior playbook
  $ go test -run TestSecondRun_SurfacesPriorPlaybook ./internal/agent/
  expected: pass
  fails if: second run does not return the UID seen in first run

GATE-3.6: health endpoint content-type regression test
  $ go test ./internal/shared/ -run TestHealth_ContentType
  expected: pass; consumer pattern that called .json() on text/plain returns 502 loudly
  fails if: any /health consumer can silently null on type confusion
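
On the consumer side, the type confusion GATE-3.6 guards against can be caught by checking the media type before decoding. The sketch below is a Go analogue of that pattern; `decodeHealth` and `probe` are hypothetical, and turning the returned error into a 502 is left to the caller.

```go
package main

import (
	"encoding/json"
	"fmt"
	"mime"
	"net/http"
	"net/http/httptest"
)

// decodeHealth decodes a /health response as JSON, but only after
// confirming the server actually declared application/json.
func decodeHealth(resp *http.Response) (map[string]any, error) {
	defer resp.Body.Close()
	ct := resp.Header.Get("Content-Type")
	mt, _, err := mime.ParseMediaType(ct)
	if err != nil || mt != "application/json" {
		return nil, fmt.Errorf("health: unexpected content type %q", ct)
	}
	var out map[string]any
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, fmt.Errorf("health: invalid JSON body: %w", err)
	}
	return out, nil
}

// probe builds an in-memory response with the given content type and
// body, then runs it through decodeHealth.
func probe(contentType, body string) error {
	rec := httptest.NewRecorder()
	rec.Header().Set("Content-Type", contentType)
	rec.WriteString(body)
	_, err := decodeHealth(rec.Result())
	return err
}

func main() {
	fmt.Println("text/plain:      ", probe("text/plain; charset=utf-8", "ok"))
	fmt.Println("application/json:", probe("application/json", `{"status":"ok"}`))
}
```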

Sprint 4 — Deployment Gate

GATE-4.1: fresh-Debian doctor surfaces install commands
  $ docker run --rm -v "$PWD":/repo debian:13 bash -c "cd /repo && just doctor"
  expected: structured JSON with apt install / curl tarball commands per missing dep; exit 1
  fails if: silent claim of OK or vague "missing dep" without fix command

GATE-4.2: REPLICATION.md is executable
  $ awk '/^```bash$/,/^```$/' REPLICATION.md | grep -v '^```' | bash
  expected: every code block runs (may require deps; failure must be expected from doctor)
  fails if: REPLICATION contains pseudo-commands or hardcoded paths that don't match repo

GATE-4.3: env template covers every required key
  $ test -f secrets-go.toml.example && grep -q "access_key_id" secrets-go.toml.example
  expected: example file with documented keys; just doctor warns on placeholder values
  fails if: example absent or doesn't surface placeholder detection

GATE-4.4: systemd units present and correct
  $ ls deploy/systemd/*.service | wc -l
  expected: 7 files (one per binary)
  $ systemd-analyze verify deploy/systemd/*.service
  expected: exit 0
  fails if: any unit fails verify or has missing fields (After, Restart, MemoryMax)
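
`systemd-analyze verify` checks that a unit loads cleanly; it does not by itself insist on particular directives, so the "missing fields" half of GATE-4.4 needs its own check. A minimal sketch, assuming a plain key scan is enough (it ignores sections and drop-ins); `missingKeys` is a hypothetical helper and the sample unit is invented for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// requiredKeys are the directives GATE-4.4 names explicitly.
var requiredKeys = []string{"After", "Restart", "MemoryMax"}

// missingKeys returns the required directives that never appear as a
// "Key=value" line in the unit text. Comment lines are skipped.
func missingKeys(unit string) []string {
	seen := map[string]bool{}
	sc := bufio.NewScanner(strings.NewReader(unit))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, "#") || strings.HasPrefix(line, ";") {
			continue
		}
		if key, _, ok := strings.Cut(line, "="); ok {
			seen[strings.TrimSpace(key)] = true
		}
	}
	var missing []string
	for _, k := range requiredKeys {
		if !seen[k] {
			missing = append(missing, k)
		}
	}
	return missing
}

func main() {
	unit := `[Unit]
Description=example service
After=network-online.target

[Service]
ExecStart=/usr/local/bin/example
Restart=on-failure
`
	fmt.Println("missing:", missingKeys(unit))
}
```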

GATE-4.5: AWS S3 path works without code changes
  $ AWS_PROFILE=test ./scripts/d2_smoke_aws.sh
  expected: exit 0 against a real S3 bucket
  fails if: any code path assumes MinIO-specific behavior

Cross-sprint compound gate

GATE-FINAL: full clean-clone reproducibility
  $ rm -rf /tmp/golangLAKEHOUSE-test
  $ git clone <url> /tmp/golangLAKEHOUSE-test
  $ cd /tmp/golangLAKEHOUSE-test
  $ just doctor   # on non-zero exit: apply the fix commands it reports, then rerun
  $ just verify
  expected: green within 60s wall of `just verify` (excluding doctor remediation)
  fails if: any step requires undocumented manual intervention

This is the ultimate Sprint 0 test from SCRUM.md: "fresh clone can run just doctor; missing
env vars are reported clearly; no absolute path assumptions remain unless configured."

How a future audit verifies these gates

Re-run this audit's commands plus the new gates, then compare scores against the golang-lakehouse-scrum-test.md baseline (35/60). A net improvement is the proof that the sprints landed; a flat or declining score signals that the gates were treated as box-checking rather than internalized.
