root c41698acae scrum rerun-2 — 50/60 (Δ R1 +7, Δ baseline +15) at c7e3124

Audited stash-clean c7e3124 (30 commits past rerun-1 4840c10).
3 HIGH risks closed (R-002 internal/shared, R-003 internal/storeclient,
R-008 queryd/db.go). 3 advanced to partial (R-001 via fail-loud-bind +
opt-in auth, R-006 via g2_smoke_fixtures, R-007 via ADR-003 auth.go).

Biggest move: Agent Memory Correctness 4 → 9 — pathway Mem0 ops
(ADD/UPDATE/REVISE/RETIRE/HISTORY) all tested, including cycle-detection
and retired-trace-exclusion. Sprint 2 acceptance criteria are now
verified code, not design-bar work.

Two new findings:
- F1 (MED): cmd/{matrixd,observerd,pathwayd}/main_test.go absent —
  reopens R-005 against new daemons.
- F2 (LOW): scripts/staffing_*/main.go flag-defaults reach
  /home/profit/lakehouse/data/...

Evidence under reports/scrum/_evidence/rerun2/ (local; per
.gitkeep convention).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-29 23:13:01 -05:00

Audit Re-run #2 — 2026-04-29 (after Phases A–H + matrix §3.4 + workflow §3.8)

Baseline audit: reports/scrum/golang-lakehouse-scrum-test.md at commit 91edd43, composite 35 / 60.
Rerun-1 head: 4840c10, composite 43 / 60 (Δ baseline = +8).
Rerun-2 head: c7e3124, 30 commits past rerun-1. Composite 50 / 60. Δ rerun-1 = +7. Δ baseline = +15.

This is the second delta document. Both prior reports remain immutable history. Working tree was dirty on entry (5 in-flight files under cmd/observerd/ + internal/{observer,workflow}/); audit ran on stashed-clean c7e3124 so the score reflects shipped state, not WIP.

What landed since rerun-1

Commit	What
`4840c10`	(rerun-1 baseline — 04_query refresh-tick race fix)
`125e1c8`	tests close R-002 / R-003 / R-008 — `internal/{shared,storeclient,queryd/db}` Go tests
`6af0520`	A: fail-loud on non-loopback bind — closes worst case of R-001
`423a381`	D: storaged per-prefix PUT cap — vectord `_vectors/` → 4 GiB (ADR-002)
`0d18ffa`	ADR-003: inter-service auth posture — Bearer + IP allowlist
`1ec85b0`	Batch 2: perf baseline — multi-sample + warmup + MAD threshold
`0f79bce`	Batch 3: `cmd/<bin>/main_test.go × 6` — closes R-005
`fb08232`	Batch 4: embed fixture-mode — partial R-006 closure
`56844c3`	embed cache — LRU at `/v1/embed` for repeat-query elimination
`8f4c16f`	mcpd: Go MCP SDK port — replaces Bun mcp-server tool surface
`fa56134`	ADR-003 wiring: Bearer token + IP allowlist middleware
`ad1670d`	storaged cap smoke — verifies ADR-002 at 300 MiB
`2a6234f`	ADR-004 + `internal/pathway`: Mem0 versioned trace substrate
`afbb506`	pathwayd: HTTP service over `internal/pathway` · 11/11 smoke gate
`f1c1883`	vectord BatchAdd — single-lock variadic batch
`71b35fb`	SPEC §1 + §3.4: name matrix indexer as a port target
`a7620c8`	PRD: name the product vision — small-model pipeline + 5-loop substrate
`c1d96b7`	matrixd: multi-corpus retrieve+merge — SPEC §3.4 component 2 of 5
`166470f`	corpusingest: extract reusable text→vector ingest pipeline
`0d1553c`	candidates corpus: first deep-field reality test on real staffing data
`9588bd8`	matrix relevance filter — SPEC §3.4 component 3 of 5
`3968ec8`	matrix strong-model downgrade gate — SPEC §3.4 component 4 of 5
`a97881d`	workers corpus + multi-corpus reality test — matrix indexer end-to-end
`31b4088`	multi_corpus_e2e WORKERS_LIMIT knob + embed-text-not-sample-size finding
`06e7152`	matrix playbook memory + boost — SPEC §3.4 component 5 of 5 (LEARNING LOOP)
`a730fc2`	scrum fixes: 4 real findings landed, 4 false positives dismissed
`7f42089`	D: embed-text iteration — clean negative finding (3 variants tested)
`57d0df1`	E (partial): distillation port — scorer + contamination firewall
`be65f85`	F: drift quantification — scorer drift first
`b199093`	B: matrix metadata filter — post-retrieval structured gate
`6392772`	C: bulk playbook record — operational rating wiring
`bc9ab93`	H: observerd — autonomous-iteration witness loop (SPEC §2 port)
`97dd3f8`	SPEC §3.5/§3.6/§3.7/§3.8 — name F/B/C as port targets + Archon-style workflow runner
`e30da6e`	§3.8 first slice: workflow runner skeleton + DAG executor + observerd integration
`c7e3124`	§3.8 second slice: real modes wired (matrix.relevance/downgrade/search, distillation.score, drift.scorer)

This is the wave that took the system from "G0+G2 substrate plus 500K validation" to "all five small-model-pipeline loops have at least a first port" (per project_small_model_pipeline_vision.md).

Score delta — double column

Same 6 dimensions, scored 0–10 with citations. Δ R1 = vs rerun-1 (4840c10); Δ Base = vs original audit (91edd43).

Dimension	Base	R1	R2	Δ R1	Δ Base	Evidence for the move
Reproducibility	7	9	9	0	+2	`just verify` PASS in 31s wall (`_evidence/rerun2/just_verify.log`): vet + 30 packages of `go test -short` + 9 core smokes. `just doctor` all-green for go/gcc/minio/ollama/secrets. 8 additional domain smokes also PASS (pathway, matrix, relevance, downgrade, observer, playbook, workflow, storaged_cap → `_evidence/rerun2/smoke_*.log`). New recipes: `smoke-g2-fixtures` (R-006 partial close) + `smoke-storaged-cap`. Still −1: no `.github/workflows/`; no fixture-mode for storage (only embed).
Test Coverage	6	8	9	+1	+3	321 Go test functions across 40 test files (was ~77 functions in 13 files at baseline and ~109 functions at R1, roughly 3× the R1 test surface). `internal/shared` has 4 test files (`auth_test.go`, `bind_test.go`, `config_test.go`, `server_test.go`); `internal/storeclient/client_test.go` exists; `internal/queryd/db_test.go` + `registrar_test.go` exist, so R-002 / R-003 / R-008 are all closed. Six original cmd binaries now have `main_test.go` (catalogd/embedd/ingestd/queryd/storaged/vectord), so R-005 is mostly closed. Still −1: `cmd/{matrixd,observerd,pathwayd,fake_ollama}/main_test.go` absent; three of those are new daemons that need wiring tests.
Trust Boundary Safety	7	7	9	+2	+2	ADR-003 shipped (`docs/DECISIONS.md` §3): `internal/shared/auth.go` 64-line Bearer middleware with constant-time compare via `crypto/subtle` + IP allowlist (`internal/shared/auth.go:62-64`). 4 auth tests in `auth_test.go` cover wrong-token, raw-token-without-prefix, IP-only, both-required (`internal/shared/auth_test.go:77,86,108,162`). `redactCreds` still scrubs S3 keys from queryd error chain (`internal/queryd/db.go`). One `fmt.Sprintf` SQL site remains (`internal/queryd/registrar.go:153`) — properly escaped via `quoteIdent` + `sqlEscape`. 13 `MaxBytesReader` sites in cmd/, 5 loopback bindings. Still −1: auth is opt-in (empty token = G0 dev mode); no CORS posture (R-010); 2 `/home/profit/lakehouse/...` paths in `scripts/staffing_*/main.go` flag-defaults.
Agent Memory Correctness	3	4	9	+5	+6	All five SPEC §3.4 components shipped: corpus builders (`internal/corpusingest`), retrieve+merge (`matrixd /matrix/search`), relevance filter (`internal/matrix/relevance.go` 376 LoC + 289 LoC test), strong-model downgrade gate (`internal/matrix/downgrade.go` 137 LoC + 100 LoC test), playbook memory + boost (`internal/matrix/playbook.go` 196 LoC + 180 LoC test) — including the learning loop. Pathway substrate ratified (ADR-004, `internal/pathway/store.go` 381 LoC + 398 LoC test). Mem0-style ops all proven: `TestAdd_AssignsUIDAndTimestamps`, `TestUpdate_ReplacesContentSameUID`, `TestRevise_LinksToPredecessorViaHistory`, `TestRevise_ChainOfThree_BackwardWalk`, `TestRetire_ExcludedFromSearch`, `TestRetire_StillAccessibleViaGet`, `TestHistory_CycleDetected`, `TestHistory_PredecessorMissing_TruncatesChain`, `TestAddIdempotent_RejectsEmptyUID` — every Sprint 2 design-bar acceptance has a test. Observer ported (`internal/observer/store.go` 249 LoC + 193 LoC test). pathway smoke 11/11. Still −1: distillation port partial (scorer + firewall only — `57d0df1` "E (partial)"); drift is "scorer drift first" (`be65f85`) not full quantification.
Deployment Readiness	4	5	5	0	+1	`just doctor` actionable per-dep install (`scripts/doctor.sh`); `just install-hooks` documented; pre-push hook still installed. Still −5: no `REPLICATION.md`, no `secrets-go.toml.example`, no `deploy/systemd/*.service`, no `Dockerfile`, no readiness vs. liveness split. Sprint 4 stories all open.
Maintainability	8	8	9	+1	+1	4 ADRs ratified (was 1 at R1): ADR-001 foundational, ADR-002 storaged per-prefix cap, ADR-003 auth posture, ADR-004 pathway data model — the auth + cap + memory-model decisions are locked before downstream code retrofits them. Every binary still 100–400 LoC (no god-files). Per-package test files: every `internal/` package has ≥1 test file (was: 5 packages had zero at baseline). `CLAUDE_REFACTOR_GUARDRAILS.md` codifies the maintenance discipline. `tests/proof/FINAL_REPORT.md` answers the 9 mandated questions. Still −1: no `CONTRIBUTING.md`; the proof harness adds 24-claim maintenance surface that needs keeping current.

Composite: 35 → 43 → 50. 83% of max.

Code surface delta

Metric	Baseline (`91edd43`)	R1 (`4840c10`)	R2 (`c7e3124`)	Δ R1
Total Go LoC	~6,587	~7,800 (est)	19,381	~2.5×
Go files	~50	~62	93	+31
Test files	13	~22	40	+18
Go test functions	~77	~109	321	+212
`cmd/<bin>/`	7	7	12	+5
`internal/<pkg>/`	11	11	18	+7
Smoke scripts	9	9	21	+12
ADRs ratified	0	1	4	+3
Routes (cmd-level)	~22	~22	37	+15
Untested cmd binaries	6 / 7	6 / 7	4 / 12	−2 abs, −1/3 ratio

The wave is substrate-bearing, not throughput-bearing. Every internal package has tests; the gap is now the wiring layer for the 3 new daemons.

Risk register status updates

12 risks in reports/scrum/risk-register.md. Status table at c7e3124:

Risk	Severity	Before R2	After R2	Evidence
R-001 queryd /sql RCE-eq off-loopback	HIGH	open	partial	`6af0520` fail-loud on non-loopback bind (closes worst case); ADR-003 + `internal/shared/auth.go` available to wrap; but auth is opt-in — needs deploy story decision before fully closing
R-002 internal/shared zero tests	HIGH	open	CLOSED	4 test files (`auth_test.go` + `bind_test.go` + `config_test.go` + `server_test.go`), all PASS in `just verify`
R-003 internal/storeclient zero tests	HIGH	open	CLOSED	`internal/storeclient/client_test.go`, PASS
R-004 smokes not gated	MED	closed (R1)	CLOSED	unchanged from R1
R-005 6/7 cmd/main.go untested	MED	partial	partial	6 of original 7 closed (`0f79bce` Batch 3); 4 new binaries (`fake_ollama`/`matrixd`/`observerd`/`pathwayd`, three of them daemons) reopen the gap on a different surface
R-006 no fixture-only smokes	MED	open	partial	`scripts/g2_smoke_fixtures.sh` (`fb08232`) closes embed half via fake_ollama; storage half deferred
R-007 zero auth middleware	MED	open	partial	`internal/shared/auth.go` shipped with 4 tests (`fa56134`); opt-in by default until deploy posture decision
R-008 queryd/db.go untested	MED	open	CLOSED	`internal/queryd/db_test.go` + `registrar_test.go` (`125e1c8`)
R-009 registrar.go fmt.Sprintf SQL	LOW	open	open	unchanged — escaping via `quoteIdent`+`sqlEscape` is correct, regression test still missing
R-010 no CORS posture	LOW	open	open	unchanged — no `Access-Control-*` headers anywhere
R-011 g2 smoke model assertion	LOW	note	note	unchanged
R-012 empty tests/ dir	LOW	closed (R1)	CLOSED	unchanged from R1

Net since R1: 3 closed (R-002, R-003, R-008), 3 advanced to partial (R-001, R-006, R-007), R-005 stays partial on a different surface, 3 unchanged (R-009, R-010, R-011). R-004 and R-012 remain closed from R1.
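
The fail-loud bind check credited to `6af0520` in the R-001 row can be sketched roughly as follows. This is a minimal illustration, not the code in `internal/shared`: the helper name `requireLoopback` and the error wording are assumptions, and only the standard-library calls are real.

```go
package main

import (
	"fmt"
	"net"
)

// requireLoopback returns an error unless addr ("host:port") names a
// loopback host. Hypothetical helper; the repo's real check may differ.
func requireLoopback(addr string) error {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		return fmt.Errorf("bind %q: %w", addr, err)
	}
	if host == "localhost" {
		return nil
	}
	ip := net.ParseIP(host)
	if ip == nil || !ip.IsLoopback() {
		// An empty host (":8080") binds all interfaces, so it is rejected too.
		return fmt.Errorf("bind %q: refusing non-loopback address", addr)
	}
	return nil
}

func main() {
	for _, addr := range []string{"127.0.0.1:8080", "[::1]:8080", "0.0.0.0:8080", ":8080"} {
		fmt.Printf("%-16s ok=%v\n", addr, requireLoopback(addr) == nil)
	}
}
```

A daemon would call such a check before `http.ListenAndServe` and exit non-zero on error, which is what makes the failure loud rather than silent.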

Sprint backlog progress

Sprint 0 — Reproducibility Gate

Story	R1	R2
S0.1 `just doctor`	DONE	DONE
S0.2 `just smoke-fixtures`	open	partial (`smoke-g2-fixtures`)
S0.3 `just verify` + pre-push	DONE	DONE
S0.4 `cmd/<bin>/main_test.go` × 6	partial	partial → mostly DONE (6 of original 7; 3 new daemons absent)
S0.5 internal/shared, storeclient, queryd/db tests	open	DONE
S0.6 `tests/` dir cleanup	DONE	DONE

4 of 6 done, 2 partial. Highest-leverage open work: tests for the 3 new daemons + storage-half of fixture mode.

Sprint 1 — Trust Boundary Gate

Replace SQL string interp with parameterized: still 1 site, properly escaped (R-009 LOW)
Observer fail-open → degraded/cycle: not yet codified — observer is ported but ADR-002-style fail-safe ADR not written
Auth/localhost-only guardrails: shipped (ADR-003 + auth.go), opt-in posture
Schema validation per public endpoint: per-handler validation exists (validateKey etc.); not framework-level

Status: ~60% of Sprint 1 closed; the observer fail-safe semantics ADR is the outstanding doc-only piece.
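
R-009 stays open only because the escaping at the one remaining `fmt.Sprintf` SQL site has no regression test. The sketch below shows what such a test could pin down. It assumes `quoteIdent` double-quotes identifiers and doubles embedded double quotes, and that `sqlEscape` doubles single quotes (standard SQL quoting). The real helpers in `internal/queryd/registrar.go` are not shown in this audit and may differ, and `buildQuery` is a hypothetical stand-in for the Sprintf site.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent wraps an identifier in double quotes, doubling embedded quotes.
// Assumed behavior, named after the repo helper but not copied from it.
func quoteIdent(s string) string {
	return `"` + strings.ReplaceAll(s, `"`, `""`) + `"`
}

// sqlEscape doubles single quotes so the value stays inside its '...' literal.
func sqlEscape(s string) string {
	return strings.ReplaceAll(s, `'`, `''`)
}

// buildQuery is a hypothetical stand-in for the fmt.Sprintf SQL site.
func buildQuery(table, source string) string {
	return fmt.Sprintf("SELECT * FROM %s WHERE source = '%s'",
		quoteIdent(table), sqlEscape(source))
}

func main() {
	// Hostile inputs stay confined to the identifier and the string literal.
	fmt.Println(buildQuery(`t"; DROP TABLE x; --`, `x' OR '1'='1`))
}
```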

Sprint 2 — Memory Correctness Gate

Story	R1	R2
ADD/UPDATE/REVISE/RETIRE/HISTORY tests	design-bar	DONE (`internal/pathway/store_test.go`)
Cycle detection tests	design-bar	DONE (`TestHistory_CycleDetected`)
Retired-trace exclusion tests	design-bar	DONE (`TestRetire_ExcludedFromSearch`)
Duplicate trace replay_count tests	design-bar	partial (`TestAddIdempotent_RejectsEmptyUID`; replay_count semantics still untested)
Corrupted memory row recovery test	design-bar	open

Status: Sprint 2 acceptance criteria mostly green — the core invariants are tested. Audit/event receipt on every memory mutation is the missing piece.
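
The invariants behind `TestHistory_CycleDetected` and `TestHistory_PredecessorMissing_TruncatesChain` can be illustrated with a backward walk over predecessor links. This is a simplified sketch under stated assumptions: the `Trace` type, the map-backed store and the `History` signature are hypothetical, and the real `internal/pathway` data model (ADR-004) is likely richer.

```go
package main

import (
	"errors"
	"fmt"
)

// Trace is a hypothetical, simplified record: each revision points at
// the UID of the version it replaced.
type Trace struct {
	UID         string
	Predecessor string // empty for the first version
}

var ErrCycle = errors.New("history: cycle detected")

// History walks backward from uid through Predecessor links.
// A missing predecessor truncates the chain without error;
// revisiting a UID returns ErrCycle.
func History(store map[string]Trace, uid string) ([]string, error) {
	seen := make(map[string]bool)
	var chain []string
	for uid != "" {
		if seen[uid] {
			return chain, ErrCycle
		}
		t, ok := store[uid]
		if !ok {
			break // predecessor missing: truncate rather than fail
		}
		seen[uid] = true
		chain = append(chain, uid)
		uid = t.Predecessor
	}
	return chain, nil
}

func main() {
	store := map[string]Trace{
		"v3": {UID: "v3", Predecessor: "v2"},
		"v2": {UID: "v2", Predecessor: "v1"},
		"v1": {UID: "v1"},
	}
	chain, err := History(store, "v3")
	fmt.Println(chain, err) // [v3 v2 v1] <nil>
}
```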

Sprint 3 — Agent Loop Reality Gate

Deterministic mini corpus: tests/proof/fixtures/ exists
search → verify → observer review → playbook seal → second-run retrieval: scripts/multi_corpus_e2e.sh + scripts/playbook_smoke.sh exercise this; full chain via scripts/workflow_smoke.sh
Negative case observer rejects hallucinated claim: covered by observer_smoke (semantics open for review)
Health endpoint content-type regression: covered by proof harness 00_health

Status: Sprint 3 has working substrate; explicit "single command proves the full loop" with input/output/verdict/receipt evidence is partial.

Sprint 4 — Deployment Gate

Status: unchanged from R1. No REPLICATION.md, no .env.example, no *.service units, no Dockerfile. just doctor is the closest piece. This is the largest open Sprint.

New findings from this rerun

Two real findings worth recording.

F1 — 3 new daemons lack `cmd/<bin>/main_test.go`

Where: cmd/matrixd/, cmd/observerd/, cmd/pathwayd/
What: Same gap-class as R-005 was, just on net-new code. Each daemon mounts ≥4 routes (matrixd: 6, observerd: 4, pathwayd: 9 → 19 routes total) with no wiring test.
Severity: MEDIUM. The internal packages backing each daemon (internal/matrix, internal/observer, internal/pathway) have full unit tests — but no test proves cmd/pathwayd/main.go actually wires /pathway/revise to (*pathway.Store).Revise. A handler-rename refactor would silently break the route surface.
Action: Re-open R-005 against the new daemons. ~1 hr to add three main_test.go files patterned on cmd/storaged/main_test.go.
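
A wiring test of the kind F1 asks for could look like the sketch below. It is a minimal illustration, not the pattern in `cmd/storaged/main_test.go`, which this audit does not reproduce. `newMux` is a hypothetical stand-in for the mux built in `cmd/pathwayd/main.go`, and `/pathway/retire` is used here only as an example of a route that is not mounted.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newMux is a hypothetical stand-in for the daemon's route setup.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	ok := func(w http.ResponseWriter, r *http.Request) { w.WriteHeader(http.StatusOK) }
	mux.HandleFunc("/health", ok)
	mux.HandleFunc("/pathway/revise", ok)
	return mux
}

// unmounted returns the paths that the handler answers with 404.
func unmounted(h http.Handler, paths []string) []string {
	var missing []string
	for _, p := range paths {
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, p, nil))
		if rec.Code == http.StatusNotFound {
			missing = append(missing, p)
		}
	}
	return missing
}

func main() {
	// Only the last path is unmounted in this sketch.
	fmt.Println(unmounted(newMux(), []string{"/health", "/pathway/revise", "/pathway/retire"}))
}
```

In a real `main_test.go` the same probe would sit inside a `func TestRoutes(t *testing.T)` and call `t.Errorf` for each missing path. A probe like this only catches dropped or renamed routes; proving that `/pathway/revise` reaches `(*pathway.Store).Revise` would additionally need an assertion on the store's state after the request.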

F2 — `scripts/staffing_*/main.go` has hardcoded data paths in flag defaults

Where: scripts/staffing_candidates/main.go:217 and scripts/staffing_workers/main.go:269 reference /home/profit/lakehouse/data/datasets/{candidates,workers_500k}.parquet.
What: Flag defaults reach into the Rust legacy tree at /home/profit/lakehouse/.... Throwaway driver scripts (not services), and the values are flag-overridable, but they couple the Go repo to the Rust filesystem layout.
Severity: LOW. Doesn't affect any service. Worth noting because audit Sprint 4 explicitly calls out "no hardcoded /home/profit paths" as an acceptance criterion.
Action: Either move the parquet under golangLAKEHOUSE/data/ (preferred for self-containment) or document the cross-tree dependency in RESEARCH_LOG_2026-04-28.md and accept it.
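
If the first option is taken, a small guard can keep the Sprint 4 criterion from regressing. The sketch below is an assumption-laden illustration: the flag names, the repo-relative default and the helper names are hypothetical, and only the `flag` package calls are real.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// newFlags is a hypothetical stand-in for a driver script's flag setup.
// inputDefault is a parameter so the guard can be exercised with both
// a repo-relative and a legacy absolute default.
func newFlags(inputDefault string) *flag.FlagSet {
	fs := flag.NewFlagSet("staffing_candidates", flag.ContinueOnError)
	fs.String("input", inputDefault, "input parquet file")
	fs.Int("limit", 0, "max rows to ingest (0 = all)")
	return fs
}

// hardcodedHomeDefaults lists flags whose default value points under /home/.
func hardcodedHomeDefaults(fs *flag.FlagSet) []string {
	var bad []string
	fs.VisitAll(func(f *flag.Flag) {
		if strings.HasPrefix(f.DefValue, "/home/") {
			bad = append(bad, f.Name)
		}
	})
	return bad
}

func main() {
	fs := newFlags("data/datasets/candidates.parquet")
	fmt.Println(len(hardcodedHomeDefaults(fs))) // 0
}
```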

What this rerun does NOT change

Sprint 4 (deployment) remains the largest open gap. Rerun-1 said this and rerun-2 repeats it: without REPLICATION.md + systemd units, the cutover from Rust at devop.live/lakehouse/ (G5) cannot be operator-validated.
Auth is opt-in. Empty-token default is fine for G0 development but means the moment any Go binary binds non-loopback in prod, a posture decision is required. R-001 + R-007 cannot fully close until that decision is recorded.
CORS posture (R-010) is still unspecified. The Bun-served Rust UI handles browser CORS today; if a Go service ever fronts a browser, this needs a decision.
Distillation and drift are first-port-only. 57d0df1 ships scorer + contamination firewall (E partial); be65f85 ships scorer-drift only (F first slice). The full distillation pipeline (sample export, audit_baselines lineage) and full drift signal are not yet ported.
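
The opt-in auth posture described above can be sketched as a Bearer middleware in which an empty token disables the check. This is a minimal illustration and not the 64-line `internal/shared/auth.go`: the function names are hypothetical and the IP allowlist half of ADR-003 is omitted. The constant-time compare via `crypto/subtle` matches what the audit cites.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// bearerAuth wraps next with a Bearer-token check.
// An empty token returns next unchanged (dev mode, the opt-in default).
func bearerAuth(token string, next http.Handler) http.Handler {
	if token == "" {
		return next
	}
	want := []byte(token)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got, ok := strings.CutPrefix(r.Header.Get("Authorization"), "Bearer ")
		if !ok || subtle.ConstantTimeCompare([]byte(got), want) != 1 {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// okHandler answers every request with 200.
func okHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {})
}

// status sends one GET through h with the given Authorization header.
func status(h http.Handler, authz string) int {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	if authz != "" {
		req.Header.Set("Authorization", authz)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(status(bearerAuth("s3cret", okHandler()), "Bearer s3cret")) // 200
	fmt.Println(status(bearerAuth("s3cret", okHandler()), "s3cret"))        // 401: raw token without prefix
	fmt.Println(status(bearerAuth("", okHandler()), ""))                    // 200: empty token, check disabled
}
```

The third call is the case R-001 and R-007 hinge on: with no token configured, every request passes, so a non-loopback bind needs a recorded posture decision first.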

Recommended next moves (ordered by leverage / cost)

1. Three main_test.go files for matrixd + observerd + pathwayd (~1 hr). Closes the regenerated R-005, ratchets every future route addition through just verify.
2. ADR-005: observer fail-safe semantics (~30 min, doc-only). The observer is ported (internal/observer/store.go), but the upstream "verdict:accept on crash" anti-pattern still has no Go-side decision locked. Doing this now is half the cost of doing it after a regression.
3. Auth posture decision for non-loopback deploy (~1 hr, ADR or annotated decision in RESEARCH_LOG). Locks R-001 + R-007 from "opt-in middleware exists" to "wired-by-default for X, opt-in for Y". Required input for any G5 cutover plan.
4. Sprint 4 minimal first slice (~3 hr): secrets-go.toml.example + deploy/systemd/<bin>.service.tmpl × 12 binaries + REPLICATION.md skeleton. Highest-leverage Sprint 4 starter; the systemd units mostly mirror Rust's layout.
5. Storage-half of fixture mode (~3 hr): MockS3Storage interface satisfying internal/storaged.Bucket, smoke variant that points storaged at it. Closes R-006 fully and decouples CI from MinIO.
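
For the storage half of fixture mode, an in-memory stand-in could be as small as the sketch below. The `Bucket` interface shown here is an assumption made for illustration; the real `internal/storaged.Bucket` is not reproduced in this audit and may expose different methods (list, delete, streaming bodies).

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Bucket is a hypothetical minimal interface, not the repo's definition.
type Bucket interface {
	Put(key string, data []byte) error
	Get(key string) ([]byte, error)
}

var ErrNotFound = errors.New("bucket: key not found")

// MockS3Storage keeps objects in memory so smokes can run without MinIO.
type MockS3Storage struct {
	mu      sync.RWMutex
	objects map[string][]byte
}

func NewMockS3Storage() *MockS3Storage {
	return &MockS3Storage{objects: make(map[string][]byte)}
}

func (m *MockS3Storage) Put(key string, data []byte) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	// Copy so later mutation of the caller's slice cannot change stored bytes.
	m.objects[key] = append([]byte(nil), data...)
	return nil
}

func (m *MockS3Storage) Get(key string) ([]byte, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	data, ok := m.objects[key]
	if !ok {
		return nil, ErrNotFound
	}
	return append([]byte(nil), data...), nil
}

// Compile-time check that the mock satisfies the interface.
var _ Bucket = (*MockS3Storage)(nil)

func main() {
	var b Bucket = NewMockS3Storage()
	_ = b.Put("_vectors/a", []byte("hello"))
	got, err := b.Get("_vectors/a")
	fmt.Println(string(got), err) // hello <nil>
}
```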

The remaining items (full drift port, full distillation port, observer audit-event receipt, corrupted-memory recovery test) are real engineering — Sprint 2/3 followups, not Sprint-0 polish.

Methodology note — same as prior reports

All claims cite a file, line, or command. Evidence captured under reports/scrum/_evidence/rerun2/:

just_verify.log — full vet + 30 packages × go test -short + 9 core smokes, exit 0, 31s wall
just_doctor.log — 5 dependency probes, all green
govet.log — go vet ./... exit 0
gotest_short.log — full short-test pass
just_list.log — recipe inventory
smoke_{pathway,matrix,relevance,downgrade,observer,playbook,workflow,storaged_cap}.log — 8 additional domain smokes, all PASS

What was NOT inspected this round (deferred):

Cross-binary failure cascades (kill matrixd mid-search, observe observerd state) — Sprint 1 follow-up
Supply-chain audit of go.sum diffs since R1
Performance regression vs the perf baseline shipped in 1ec85b0 — just proof performance exists, not run here

Rerun-2 produced under the same "no vibes" rule as the original audit. The 50/60 reflects what's verifiably shipped at c7e3124, not what's planned. Working tree restored from stash after audit completion.
