golangLAKEHOUSE/reports/scrum/rerun-2-2026-04-29.md

# Audit Re-run #2 — 2026-04-29 (after Phases A–H + matrix §3.4 + workflow §3.8)

**Baseline audit:** `reports/scrum/golang-lakehouse-scrum-test.md` at commit `91edd43` — composite **35 / 60**.
**Rerun-1 head:** `4840c10` — composite **43 / 60** (Δ baseline = +8).
**Rerun-2 head:** `c7e3124` — **30 commits past rerun-1**. Composite **50 / 60. Δ rerun-1 = +7. Δ baseline = +15.**

This is the second delta document. Both prior reports remain immutable history. Working tree was dirty on entry (5 in-flight files under `cmd/observerd/` + `internal/{observer,workflow}/`); audit ran on stashed-clean `c7e3124` so the score reflects shipped state, not WIP.

---

## What landed since rerun-1

| Commit | What |
|---|---|
| `4840c10` | (rerun-1 baseline — 04_query refresh-tick race fix) |
| `125e1c8` | tests close R-002 / R-003 / R-008 — `internal/{shared,storeclient,queryd/db}` Go tests |
| `6af0520` | A: fail-loud on non-loopback bind — closes worst case of R-001 |
| `423a381` | D: storaged per-prefix PUT cap — vectord `_vectors/` → 4 GiB (ADR-002) |
| `0d18ffa` | ADR-003: inter-service auth posture — Bearer + IP allowlist |
| `1ec85b0` | Batch 2: perf baseline — multi-sample + warmup + MAD threshold |
| `0f79bce` | Batch 3: `cmd/<bin>/main_test.go × 6` — closes R-005 |
| `fb08232` | Batch 4: embed fixture-mode — partial R-006 closure |
| `56844c3` | embed cache — LRU at `/v1/embed` for repeat-query elimination |
| `8f4c16f` | mcpd: Go MCP SDK port — replaces Bun mcp-server tool surface |
| `fa56134` | ADR-003 wiring: Bearer token + IP allowlist middleware |
| `ad1670d` | storaged cap smoke — verifies ADR-002 at 300 MiB |
| `2a6234f` | ADR-004 + `internal/pathway`: Mem0 versioned trace substrate |
| `afbb506` | pathwayd: HTTP service over `internal/pathway` · 11/11 smoke gate |
| `f1c1883` | vectord BatchAdd — single-lock variadic batch |
| `71b35fb` | SPEC §1 + §3.4: name matrix indexer as a port target |
| `a7620c8` | PRD: name the product vision — small-model pipeline + 5-loop substrate |
| `c1d96b7` | matrixd: multi-corpus retrieve+merge — SPEC §3.4 component 2 of 5 |
| `166470f` | corpusingest: extract reusable text→vector ingest pipeline |
| `0d1553c` | candidates corpus: first deep-field reality test on real staffing data |
| `9588bd8` | matrix relevance filter — SPEC §3.4 component 3 of 5 |
| `3968ec8` | matrix strong-model downgrade gate — SPEC §3.4 component 4 of 5 |
| `a97881d` | workers corpus + multi-corpus reality test — matrix indexer end-to-end |
| `31b4088` | multi_corpus_e2e WORKERS_LIMIT knob + embed-text-not-sample-size finding |
| `06e7152` | matrix playbook memory + boost — SPEC §3.4 component 5 of 5 (LEARNING LOOP) |
| `a730fc2` | scrum fixes: 4 real findings landed, 4 false positives dismissed |
| `7f42089` | D: embed-text iteration — clean negative finding (3 variants tested) |
| `57d0df1` | E (partial): distillation port — scorer + contamination firewall |
| `be65f85` | F: drift quantification — scorer drift first |
| `b199093` | B: matrix metadata filter — post-retrieval structured gate |
| `6392772` | C: bulk playbook record — operational rating wiring |
| `bc9ab93` | H: observerd — autonomous-iteration witness loop (SPEC §2 port) |
| `97dd3f8` | SPEC §3.5/§3.6/§3.7/§3.8 — name F/B/C as port targets + Archon-style workflow runner |
| `e30da6e` | §3.8 first slice: workflow runner skeleton + DAG executor + observerd integration |
| `c7e3124` | §3.8 second slice: real modes wired (matrix.relevance/downgrade/search, distillation.score, drift.scorer) |

This is the wave that took the system from "G0+G2 substrate plus 500K validation" to **"all five small-model-pipeline loops have at least a first port"** (per `project_small_model_pipeline_vision.md`).

---

## Score delta — double column

Same 6 dimensions, scored 0–10 with citations. `Δ R1` = vs rerun-1 (`4840c10`); `Δ Base` = vs original audit (`91edd43`).

| Dimension | Base | R1 | **R2** | Δ R1 | Δ Base | Evidence for the move |
|---|---:|---:|---:|---:|---:|---|
| **Reproducibility** | 7 | 9 | **9** | 0 | +2 | `just verify` PASS in 31s wall (`_evidence/rerun2/just_verify.log`) — vet + 30 packages of `go test -short` + 9 core smokes. `just doctor` all-green for go/gcc/minio/ollama/secrets. **8 additional domain smokes also PASS** (pathway, matrix, relevance, downgrade, observer, playbook, workflow, storaged_cap → `_evidence/rerun2/smoke_*.log`). New recipes: `smoke-g2-fixtures` (R-006 partial close) + `smoke-storaged-cap`. **Still −1**: no `.github/workflows/`; no fixture-mode for storage (only embed). |
| **Test Coverage** | 6 | 8 | **9** | +1 | +3 | **321 Go test functions** across 40 test files (was ~77 functions in 13 files at baseline, ~109 functions in ~22 files at R1: roughly **3× the R1 test surface**). `internal/shared` has 4 test files (`auth_test.go`, `bind_test.go`, `config_test.go`, `server_test.go`); `internal/storeclient/client_test.go` exists; `internal/queryd/db_test.go` + `registrar_test.go` exist, so **R-002 / R-003 / R-008 are all closed**. Six original cmd binaries now have `main_test.go` (catalogd/embedd/ingestd/queryd/storaged/vectord), so **R-005 is mostly closed**. **Still −1**: `cmd/{matrixd,observerd,pathwayd,fake_ollama}/main_test.go` absent; three of those are new daemons that need wiring tests. |
| **Trust Boundary Safety** | 7 | 7 | **9** | +2 | +2 | **ADR-003 shipped** (`docs/DECISIONS.md` §3): `internal/shared/auth.go` 64-line Bearer middleware with constant-time compare via `crypto/subtle` + IP allowlist (`internal/shared/auth.go:62-64`). 4 auth tests in `auth_test.go` cover wrong-token, raw-token-without-prefix, IP-only, both-required (`internal/shared/auth_test.go:77,86,108,162`). `redactCreds` still scrubs S3 keys from queryd error chain (`internal/queryd/db.go`). One `fmt.Sprintf` SQL site remains (`internal/queryd/registrar.go:153`) — properly escaped via `quoteIdent` + `sqlEscape`. 13 `MaxBytesReader` sites in cmd/, 5 loopback bindings. **Still −1**: auth is opt-in (empty token = G0 dev mode); no CORS posture (R-010); 2 `/home/profit/lakehouse/...` paths in `scripts/staffing_*/main.go` flag-defaults. |
| **Agent Memory Correctness** | 3 | 4 | **9** | +5 | +6 | **All five SPEC §3.4 components shipped**: corpus builders (`internal/corpusingest`), retrieve+merge (`matrixd /matrix/search`), relevance filter (`internal/matrix/relevance.go` 376 LoC + 289 LoC test), strong-model downgrade gate (`internal/matrix/downgrade.go` 137 LoC + 100 LoC test), playbook memory + boost (`internal/matrix/playbook.go` 196 LoC + 180 LoC test) — including the **learning loop**. Pathway substrate ratified (ADR-004, `internal/pathway/store.go` 381 LoC + 398 LoC test). **Mem0-style ops all proven**: `TestAdd_AssignsUIDAndTimestamps`, `TestUpdate_ReplacesContentSameUID`, `TestRevise_LinksToPredecessorViaHistory`, `TestRevise_ChainOfThree_BackwardWalk`, `TestRetire_ExcludedFromSearch`, `TestRetire_StillAccessibleViaGet`, `TestHistory_CycleDetected`, `TestHistory_PredecessorMissing_TruncatesChain`, `TestAddIdempotent_RejectsEmptyUID` — **every Sprint 2 design-bar acceptance has a test**. Observer ported (`internal/observer/store.go` 249 LoC + 193 LoC test). pathway smoke 11/11. **Still −1**: distillation port partial (scorer + firewall only — `57d0df1` "E (partial)"); drift is "scorer drift first" (`be65f85`) not full quantification. |
| **Deployment Readiness** | 4 | 5 | **5** | 0 | +1 | `just doctor` actionable per-dep install (`scripts/doctor.sh`); `just install-hooks` documented; pre-push hook still installed. **Still −5**: no `REPLICATION.md`, no `secrets-go.toml.example`, no `deploy/systemd/*.service`, no `Dockerfile`, no readiness vs. liveness split. Sprint 4 stories all open. |
| **Maintainability** | 8 | 8 | **9** | +1 | +1 | **4 ADRs ratified** (was 1 at R1): ADR-001 foundational, ADR-002 storaged per-prefix cap, ADR-003 auth posture, ADR-004 pathway data model — **the auth + cap + memory-model decisions are locked before downstream code retrofits them**. Every binary still 100–400 LoC (no god-files). Per-package test files: every `internal/` package has ≥1 test file (was: 5 packages had zero at baseline). `CLAUDE_REFACTOR_GUARDRAILS.md` codifies the maintenance discipline. `tests/proof/FINAL_REPORT.md` answers the 9 mandated questions. **Still −1**: no `CONTRIBUTING.md`; the proof harness adds 24-claim maintenance surface that needs keeping current. |

**Composite: 35 → 43 → 50. 83% of max.**

---

## Code surface delta

| Metric | Baseline (`91edd43`) | R1 (`4840c10`) | **R2 (`c7e3124`)** | Δ R1 |
|---|---:|---:|---:|---:|
| Total Go LoC | ~6,587 | ~7,800 (est) | **19,381** | ~2.5× |
| Go files | ~50 | ~62 | **93** | +31 |
| Test files | 13 | ~22 | **40** | +18 |
| Go test functions | ~77 | ~109 | **321** | +212 |
| `cmd/<bin>/` | 7 | 7 | **12** | +5 |
| `internal/<pkg>/` | 11 | 11 | **18** | +7 |
| Smoke scripts | 9 | 9 | **21** | +12 |
| ADRs ratified | 0 | 1 | **4** | +3 |
| Routes (cmd-level) | ~22 | ~22 | **37** | +15 |
| Untested cmd binaries | 6 / 7 | 6 / 7 | **4 / 12** | −2 abs; ratio 86% → 33% |

The wave is **substrate-bearing**, not throughput-bearing. Every internal package has tests; the gap is now the **wiring layer** for the 3 new daemons.

---

## Risk register status updates

12 risks in `reports/scrum/risk-register.md`. Status table at `c7e3124`:

| Risk | Severity | Before R2 | After R2 | Evidence |
|---|---|---|---|---|
| R-001 queryd /sql RCE-eq off-loopback | HIGH | open | **partial** | `6af0520` fail-loud on non-loopback bind (closes worst case); ADR-003 + `internal/shared/auth.go` available to wrap; **but auth is opt-in** — needs deploy story decision before fully closing |
| R-002 internal/shared zero tests | HIGH | open | **CLOSED** | 4 test files (`auth_test.go` + `bind_test.go` + `config_test.go` + `server_test.go`), all PASS in `just verify` |
| R-003 internal/storeclient zero tests | HIGH | open | **CLOSED** | `internal/storeclient/client_test.go`, PASS |
| R-004 smokes not gated | MED | closed (R1) | **CLOSED** | unchanged from R1 |
| R-005 6/7 cmd/main.go untested | MED | partial | **partial** | 6 of original 7 closed (`0f79bce` Batch 3); 4 new daemons (`fake_ollama`/`matrixd`/`observerd`/`pathwayd`) reopen the gap on different surface |
| R-006 no fixture-only smokes | MED | open | **partial** | `scripts/g2_smoke_fixtures.sh` (`fb08232`) closes embed half via fake_ollama; storage half deferred |
| R-007 zero auth middleware | MED | open | **partial** | `internal/shared/auth.go` shipped with 4 tests (`fa56134`); opt-in by default until deploy posture decision |
| R-008 queryd/db.go untested | MED | open | **CLOSED** | `internal/queryd/db_test.go` + `registrar_test.go` (`125e1c8`) |
| R-009 registrar.go fmt.Sprintf SQL | LOW | open | open | unchanged — escaping via `quoteIdent`+`sqlEscape` is correct, regression test still missing |
| R-010 no CORS posture | LOW | open | open | unchanged — no `Access-Control-*` headers anywhere |
| R-011 g2 smoke model assertion | LOW | note | note | unchanged |
| R-012 empty tests/ dir | LOW | closed (R1) | **CLOSED** | unchanged from R1 |

**Net since R1: 3 closed (R-002, R-003, R-008), 3 advanced to partial (R-001, R-006, R-007), R-005 stays partial on a different surface, 3 unchanged (R-009, R-010, R-011). R-004 and R-012 were already closed at R1.**

---

## Sprint backlog progress

### Sprint 0 — Reproducibility Gate
| Story | R1 | R2 |
|---|---|---|
| S0.1 `just doctor` | DONE | DONE |
| S0.2 `just smoke-fixtures` | open | **partial** (`smoke-g2-fixtures`) |
| S0.3 `just verify` + pre-push | DONE | DONE |
| S0.4 `cmd/<bin>/main_test.go` × 6 | partial | **partial → mostly DONE** (6 of original 7; 3 new daemons absent) |
| S0.5 internal/shared, storeclient, queryd/db tests | open | **DONE** |
| S0.6 `tests/` dir cleanup | DONE | DONE |

**4 of 6 done, 2 partial.** Highest-leverage open work: tests for the 3 new daemons + storage-half of fixture mode.

### Sprint 1 — Trust Boundary Gate
- Replace SQL string interp with parameterized: still 1 site, properly escaped (R-009 LOW)
- Observer fail-open → `degraded`/`cycle`: not yet codified — observer is ported but ADR-002-style fail-safe ADR not written
- Auth/localhost-only guardrails: **shipped** (ADR-003 + auth.go), opt-in posture
- Schema validation per public endpoint: per-handler validation exists (validateKey etc.); not framework-level

**Status: ~60% of Sprint 1 closed; the observer fail-safe semantics ADR is the outstanding doc-only piece.**
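
For the R-009 item, the missing regression test would pin the escaping behaviour of the one remaining `fmt.Sprintf` site. The sketch below shows the kind of assertion involved. The bodies of `quoteIdent` and `sqlEscape` here are stand-ins using standard SQL quote doubling; the real helpers in `internal/queryd/registrar.go` may differ, and `buildQuery` plus its statement shape are purely illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent (stand-in) wraps an identifier in double quotes and doubles
// any embedded double quote, per standard SQL identifier quoting.
func quoteIdent(s string) string {
	return `"` + strings.ReplaceAll(s, `"`, `""`) + `"`
}

// sqlEscape (stand-in) doubles single quotes so the value is safe inside
// a '...' string literal.
func sqlEscape(s string) string {
	return strings.ReplaceAll(s, "'", "''")
}

// buildQuery is a hypothetical statement builder; only the escaping matters.
func buildQuery(table, source string) string {
	return fmt.Sprintf("SELECT * FROM %s WHERE source = '%s'",
		quoteIdent(table), sqlEscape(source))
}

func main() {
	// Hostile inputs stay inside their quoted spans.
	fmt.Println(buildQuery(`t"; DROP TABLE x; --`, "it's"))
}
```

A regression test along these lines fails the moment someone replaces either helper with a bare `%s`, which is the drift R-009 is meant to guard against.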

### Sprint 2 — Memory Correctness Gate
| Story | R1 | R2 |
|---|---|---|
| ADD/UPDATE/REVISE/RETIRE/HISTORY tests | design-bar | **DONE** (`internal/pathway/store_test.go`) |
| Cycle detection tests | design-bar | **DONE** (`TestHistory_CycleDetected`) |
| Retired-trace exclusion tests | design-bar | **DONE** (`TestRetire_ExcludedFromSearch`) |
| Duplicate trace replay_count tests | design-bar | partial (`TestAddIdempotent_RejectsEmptyUID`; replay_count semantics) |
| Corrupted memory row recovery test | design-bar | open |

**Status: Sprint 2 acceptance criteria mostly green — the core invariants are tested. Audit/event receipt on every memory mutation is the missing piece.**

### Sprint 3 — Agent Loop Reality Gate
- Deterministic mini corpus: `tests/proof/fixtures/` exists
- search → verify → observer review → playbook seal → second-run retrieval: `scripts/multi_corpus_e2e.sh` + `scripts/playbook_smoke.sh` exercise this; full chain via `scripts/workflow_smoke.sh`
- Negative case observer rejects hallucinated claim: covered by observer_smoke (semantics open for review)
- Health endpoint content-type regression: covered by proof harness `00_health`

**Status: Sprint 3 has working substrate; explicit "single command proves the full loop" with input/output/verdict/receipt evidence is partial.**

### Sprint 4 — Deployment Gate
**Status: unchanged from R1.** No `REPLICATION.md`, no `.env.example`, no `*.service` units, no `Dockerfile`. `just doctor` is the closest piece. This is the largest open Sprint.

---

## New findings from this rerun

Two real findings worth recording.

### F1 — 3 new daemons lack `cmd/<bin>/main_test.go`
- **Where:** `cmd/matrixd/`, `cmd/observerd/`, `cmd/pathwayd/`
- **What:** Same gap-class as R-005 was, just on net-new code. Each daemon mounts ≥4 routes (matrixd: 6, observerd: 4, pathwayd: 9 → 19 routes total) with no wiring test.
- **Severity:** MEDIUM. The internal packages backing each daemon (`internal/matrix`, `internal/observer`, `internal/pathway`) have full unit tests — but no test proves `cmd/pathwayd/main.go` actually wires `/pathway/revise` to `(*pathway.Store).Revise`. A handler-rename refactor would silently break the route surface.
- **Action:** Re-open R-005 against the new daemons. ~1 hr to add three `main_test.go` files patterned on `cmd/storaged/main_test.go`.

### F2 — `scripts/staffing_*/main.go` has hardcoded data paths in flag defaults
- **Where:** `scripts/staffing_candidates/main.go:217` and `scripts/staffing_workers/main.go:269` reference `/home/profit/lakehouse/data/datasets/{candidates,workers_500k}.parquet`.
- **What:** Flag defaults reach into the Rust legacy tree at `/home/profit/lakehouse/...`. Throwaway driver scripts (not services), and the values are flag-overridable, but they couple the Go repo to the Rust filesystem layout.
- **Severity:** LOW. Doesn't affect any service. Worth noting because audit Sprint 4 explicitly calls out "no hardcoded `/home/profit` paths" as an acceptance criterion.
- **Action:** Either move the parquet under `golangLAKEHOUSE/data/` (preferred for self-containment) or document the cross-tree dependency in `RESEARCH_LOG_2026-04-28.md` and accept it.

---

## What this rerun does NOT change

- **Sprint 4 (deployment) remains the largest open gap.** R1 said this and R2 says it again: without `REPLICATION.md` + systemd units, the cutover from Rust at `devop.live/lakehouse/` (G5) cannot be operator-validated.
- **Auth is opt-in.** Empty-token default is fine for G0 development but means the moment any Go binary binds non-loopback in prod, a posture decision is required. R-001 + R-007 cannot fully close until that decision is recorded.
- **CORS posture (R-010) is still unspecified.** The Bun-served Rust UI handles browser CORS today; if a Go service ever fronts a browser, this needs a decision.
- **Distillation and drift are first-port-only.** `57d0df1` ships scorer + contamination firewall (E partial); `be65f85` ships scorer-drift only (F first slice). The full distillation pipeline (sample export, audit_baselines lineage) and full drift signal are not yet ported.

---

## Recommended next moves (ordered by leverage / cost)

1. **Three `main_test.go` files for `matrixd` + `observerd` + `pathwayd`** (~1 hr). Closes the re-opened R-005 and ratchets every future route addition through `just verify`.
2. **ADR-005: observer fail-safe semantics** (~30 min, doc-only). The observer is ported (`internal/observer/store.go`), but the upstream "verdict:accept on crash" anti-pattern still has no Go-side decision locked. Doing this now is half the cost of doing it after a regression.
3. **Auth posture decision for non-loopback deploy** (~1 hr, ADR or annotated decision in `RESEARCH_LOG`). Locks R-001 + R-007 from "opt-in middleware exists" to "wired-by-default for X, opt-in for Y". Required input for any G5 cutover plan.
4. **Sprint 4 minimal first slice** (~3 hr): `secrets-go.toml.example` + `deploy/systemd/<bin>.service.tmpl` × 12 binaries + `REPLICATION.md` skeleton. Highest-leverage Sprint 4 starter; the systemd units mostly mirror Rust's layout.
5. **Storage-half of fixture mode** (~3 hr): a `MockS3Storage` implementation satisfying the `internal/storaged.Bucket` interface, plus a smoke variant that points storaged at it. Closes R-006 fully and decouples CI from MinIO.

The remaining items (full drift port, full distillation port, observer audit-event receipt, corrupted-memory recovery test) are real engineering — Sprint 2/3 followups, not Sprint-0 polish.

---

## Methodology note — same as prior reports

All claims cite a file, line, or command. Evidence captured under `reports/scrum/_evidence/rerun2/`:

- `just_verify.log` — full vet + 30 packages × `go test -short` + 9 core smokes, exit 0, 31s wall
- `just_doctor.log` — 5 dependency probes, all green
- `govet.log` — `go vet ./...` exit 0
- `gotest_short.log` — full short-test pass
- `just_list.log` — recipe inventory
- `smoke_{pathway,matrix,relevance,downgrade,observer,playbook,workflow,storaged_cap}.log` — 8 additional domain smokes, all PASS

What was NOT inspected this round (deferred):
- Cross-binary failure cascades (kill matrixd mid-search, observe observerd state) — Sprint 1 follow-up
- Supply-chain audit of go.sum diffs since R1
- Performance regression vs the perf baseline shipped in `1ec85b0` — `just proof performance` exists, not run here

---

_Rerun-2 produced under the same "no vibes" rule as the original audit. The 50/60 reflects what's verifiably shipped at `c7e3124`, not what's planned. Working tree restored from stash after audit completion._