golangLAKEHOUSE/reports/scrum/rerun-2026-04-29.md
root ff9823b871 scrum audit re-run: 35 → 43 / 60 after Phase A-E + S0.3
Re-runs the SCRUM.md framework against HEAD (4840c10) to score the
delta from the audit baseline at 91edd43. Composite +8.

Scoring deltas:
  Reproducibility       7 → 9  (just verify, just doctor, pre-push hook)
  Test Coverage         6 → 8  (168 proof harness assertions; Go-test
                                gaps in shared/storeclient remain)
  Trust Boundary        7 → 7  (no code change; R-001/R-007 open)
  Memory Correctness    3 → 4  (vectord persistence proven; Mem0
                                pathway/playbook still not ported)
  Deployment Readiness  4 → 5  (just doctor; REPLICATION/systemd open)
  Maintainability       8 → 8  (spine unchanged; harness obeys
                                CLAUDE_REFACTOR_GUARDRAILS)

Risk register changes:
  R-004 (smokes not gated)        CLOSED — just verify + pre-push hook
  R-005 (cmd/main.go untested)    partial — proof harness covers wiring
  R-012 (empty tests/ dir)        CLOSED — populated by harness
  R-001/R-002/R-003/R-006/R-007/R-008/R-009/R-010 unchanged

Sprint 0 progress:
  S0.1 just doctor               DONE
  S0.3 just verify + pre-push    DONE
  S0.6 tests/ dir cleanup        DONE
  S0.2 just smoke-fixtures       open
  S0.4 cmd/main_test × 6         partial (harness coverage; go-test gap)
  S0.5 shared/storeclient tests  open  (HIGH risks still unaddressed)

New finding from this rerun (worth recording):
  Queryd refresh-tick race in 04_query_correctness — cache-warm
  binaries fire SELECTs faster than queryd's 500ms refresh tick.
  Caught by integration mode going 104/0/1 → 102/1/1, fixed at
  4840c10 with proof_wait_for_sql helper. Exactly the failure-mode
  the harness was designed to catch.

Original 5 audit reports preserved as immutable history at
91edd43; this file documents the delta only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 05:37:45 -05:00

125 lines
8.4 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Audit Re-run — 2026-04-29 (after Phase E)
**Baseline audit:** `reports/scrum/golang-lakehouse-scrum-test.md` at commit `91edd43`. Composite score: **35 / 60.**
**Rerun head:** `4840c10` — 6 commits past baseline. Composite score: **43 / 60. Δ = +8.**
This is a delta document, not a replacement. The original audit's 5 reports (top-line, risk-register, claim-coverage, sprint-backlog, acceptance-gates) are immutable history. This file documents what changed and what didn't.
---
## What landed since the audit
| Commit | What |
|---|---|
| `91edd43` | (audit baseline — 5 reports under reports/scrum/) |
| `e316382` | S0.3 — `just verify` + `just doctor` + pre-push hook |
| `a81291e` | Proof Phase A — scaffolding + 00_health canary |
| `6d18394` | Proof Phase B — 4 contract cases · 53/0/1 |
| `1313eb2` | Proof Phase C — 6 integration cases · 104/0/1 |
| `175ad59` | Proof Phase D — perf baseline · 1000-row ingest, p50/p95 |
| `4bb6548` | Proof Phase E — FINAL_REPORT.md (9 mandated questions) |
| `4840c10` | Race fix in 04_query (this rerun caught it) |
All commits preserved `just verify` regression-green. Pre-push hook would have blocked any of them otherwise.
---
## Score delta with evidence
Same 6 dimensions, scored 0-10 each. Same "no vibes" rule — every line below cites a file or command.
| Dimension | Was | Now | Δ | Evidence for the move |
|---|---:|---:|---:|---|
| **Reproducibility** | 7 | **9** | +2 | `just verify` exists, runs vet+test+9-smokes in 33s wall (`scripts/d1..g2_smoke.sh`). `just doctor` probes Go/gcc/MinIO/Ollama/secrets-go.toml with structured output (`scripts/doctor.sh`). Pre-push hook installed by `just install-hooks` runs `just verify` before allowing push (`.git/hooks/pre-push`). **Still missing 1:** no `.github/workflows/`, no fixture-only smoke path (R-006). |
| **Test Coverage** | 6 | **8** | +2 | 168 assertions across 11 proof cases (53 contract + 104 integration + 110 perf). `tests/proof/reports/proof-<ts>/raw/cases/<CASE_ID>.jsonl` per-assertion evidence chain. Wiring regressions in `cmd/<bin>/main.go` now fail `just proof contract`. **Still missing 2:** `internal/shared` and `internal/storeclient` still zero Go tests (R-002 + R-003); 6 of 7 `cmd/<bin>/main_test.go` still absent (R-005). |
| **Trust Boundary Safety** | 7 | **7** | 0 | No code-level changes to auth, CORS, or SQL boundary. The harness exercises every route extensively — proves they behave under valid + invalid input — but cannot evaluate the auth posture (zero auth middleware is still an architectural decision pending ADR-003). R-001 / R-007 / R-010 unchanged. |
| **Agent Memory Correctness** | 3 | **4** | +1 | Vectord persistence now has a 7-assertion case (`07_vector_persistence_restart`) that kill+restarts vectord and verifies bit-identical top-1 distance. Mem0 / pathway / playbook / observer still not ported (Sprint 2 design bars unchanged). +1 reflects the persistence claim being proven, not the larger memory system being built. |
| **Deployment Readiness** | 4 | **5** | +1 | `just doctor` provides actionable per-dep install commands (`scripts/doctor.sh:30-89`). README has a "Task runner" section documenting `just install-hooks` on cold-start. **Still missing 5:** no `REPLICATION.md`, no `secrets-go.toml.example`, no `deploy/systemd/*.service`, no `Dockerfile`. Sprint 4 stories all open. |
| **Maintainability** | 8 | **8** | 0 | No spine-binary code touched. The proof harness is test code under `tests/proof/`; the 7-binary split + ADRs unchanged. The harness adds maintenance surface (24 claims to keep current) — but per CLAUDE_REFACTOR_GUARDRAILS.md, the guardrails ARE the maintenance discipline, and they were enforced through every Phase commit. |
**Composite: 35 → 43 (+8). 71.7% of max.**
---
## Risk register status updates
12 risks in `reports/scrum/risk-register.md`. Status changes at this SHA:
| Risk | Severity | Status before | Status now | Evidence |
|---|---|---|---|---|
| R-001 queryd /sql RCE-eq off-loopback | HIGH | open | open | unchanged — needs ADR-003 + auth middleware |
| R-002 internal/shared zero tests | HIGH | open | open | `go test ./internal/shared/` still "no test files" |
| R-003 internal/storeclient zero tests | HIGH | open | open | same shape |
| **R-004** smokes not gated | MED | open | **CLOSED** | `just verify` + `.git/hooks/pre-push` + README docs (`e316382`) |
| R-005 6/7 cmd/main.go untested | MED | open | **partial** | proof harness exercises every route via `00_health`, `08_gateway_contracts`, etc.; Go-test gap remains |
| R-006 no fixture-only smokes | MED | open | open | proof harness still requires real MinIO + Ollama; fixture-mode story is Sprint 0 follow-up |
| R-007 zero auth middleware | MED | open | open | unchanged — paired with R-001 |
| R-008 queryd/db.go untested | MED | open | open | unchanged — `sqlEscape` + `redactCreds` still no unit tests |
| R-009 registrar.go fmt.Sprintf SQL | LOW | open | open | regression test still not added |
| R-010 no CORS posture | LOW | open | open | unchanged |
| R-011 g2 smoke model assertion | LOW | (note only) | (note only) | unchanged |
| R-012 empty tests/ dir | LOW | open | **CLOSED** | `tests/proof/` populated with the harness (1313eb2 et al.) |
**Net: 2 closed, 1 partial, 9 unchanged.**
---
## Sprint backlog progress
From `reports/scrum/sprint-backlog.md`:
### Sprint 0 — Reproducibility Gate
| Story | Status |
|---|---|
| S0.1 `just doctor` | **DONE** (`e316382``scripts/doctor.sh` with --json) |
| S0.2 `just smoke-fixtures` (mock-mode) | open — fixture-mode interfaces not implemented |
| S0.3 `just verify` + pre-push hook | **DONE** (`e316382`) |
| S0.4 `cmd/<bin>/main_test.go` × 6 | partial — proof harness covers wiring; Go-test gap remains |
| S0.5 internal/shared, internal/storeclient, internal/queryd/db.go tests | open — three untested packages flagged HIGH-risk |
| S0.6 `tests/` dir cleanup | **DONE** — populated by proof harness |
3 of 6 done, 1 partial. Remaining: S0.2, S0.4 (Go-test layer), S0.5 (the highest-leverage gap).
### Sprint 1-4 — unchanged
Sprints 1 (trust boundary), 2 (memory correctness), 3 (agent loop), 4 (deployment) are all open. The proof harness validates *what the system claims today*; it does not advance any of these sprints' code.
---
## New finding from this rerun
Worth recording — exactly the kind of bug the harness exists for.
**Queryd refresh-tick race in 04_query_correctness.**
With cache-warm binaries, the proof harness's 04 case fires its first SELECT faster than queryd's 500ms refresh tick that picks up 03's just-ingested manifest. Q1 returned 400 ("table not found"); subsequent queries (after the tick) succeeded.
- Caught by: this audit re-run on `4bb6548`, integration mode 102 pass / 1 fail.
- Root cause: case execution speed exceeded queryd's eventual-consistency window after the binaries warmed up.
- Fix at `4840c10`: added `proof_wait_for_sql` helper to `tests/proof/lib/http.sh`; `04_query_correctness.sh` now waits up to 5s for the view before running queries.
- Why this is OK (not a retry): queryd's contract is "views appear within one tick of catalogd having the manifest." We're waiting for the contract, not retrying around a bug.
- Generalization: this race exists for any future case that follows an ingest. The helper is reusable.
**This is the harness self-improving on its first re-execution after Phase D shipped.** Worth noting in any future audit pass that uncovers similar timing-sensitive cases.
---
## What this rerun does NOT change
- The HIGH-risk findings are the highest-leverage work, and none of them are addressed by the harness.
- Auth posture decision still gating R-001 + R-007.
- Untested packages (`internal/shared`, `internal/storeclient`) still load-bearing-but-fragile.
- The harness adds a *detection* layer; *prevention* + *correctness* layers (typed handler tests, tighter validation, auth middleware) are still Sprint 0/1 work.
---
## Recommended next move
Same as `golang-lakehouse-scrum-test.md` "Top recommendations" section:
1. Tests for `internal/shared` and `internal/storeclient` (~1 hr). Closes R-002 + R-003. Highest-leverage two HIGH risks unaddressed by the harness.
2. ADR-002 observer fail-safe semantics + ADR-003 auth posture (~1 hr doc-only). Locks both decisions before R-001 + R-007 retrofit cost.
3. Fixture-mode smokes (R-006, S0.2) (~3 hr). Decouples CI / fresh-clone reviewers from MinIO + Ollama.
The proof harness is in maintenance posture — fix when failing, extend when adding service surfaces, otherwise leave alone.