golangLAKEHOUSE/reports/scrum/rerun-2-2026-04-29.md
root c41698acae scrum rerun-2 — 50/60 (Δ R1 +7, Δ baseline +15) at c7e3124
Audited stash-clean c7e3124 (30 commits past rerun-1 4840c10).
3 HIGH risks closed (R-002 internal/shared, R-003 internal/storeclient,
R-008 queryd/db.go). 3 advanced to partial (R-001 via fail-loud-bind +
opt-in auth, R-006 via g2_smoke_fixtures, R-007 via ADR-003 auth.go).

Biggest move: Agent Memory Correctness 4 → 9 — pathway Mem0 ops
(ADD/UPDATE/REVISE/RETIRE/HISTORY) all tested, including cycle-detection
and retired-trace-exclusion. Sprint 2 acceptance criteria are now
verified code, not design-bar work.

Two new findings:
- F1 (MED): cmd/{matrixd,observerd,pathwayd}/main_test.go absent —
  reopens R-005 against new daemons.
- F2 (LOW): scripts/staffing_*/main.go flag-defaults reach
  /home/profit/lakehouse/data/...

Evidence under reports/scrum/_evidence/rerun2/ (local; per
.gitkeep convention).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 23:13:01 -05:00


Audit Re-run #2 — 2026-04-29 (after Phases A–H + matrix §3.4 + workflow §3.8)

Baseline audit: reports/scrum/golang-lakehouse-scrum-test.md at commit 91edd43 — composite 35 / 60. Rerun-1 head: 4840c10 — composite 43 / 60 (Δ baseline = +8). Rerun-2 head: c7e3124 — 30 commits past rerun-1. Composite 50 / 60. Δ rerun-1 = +7. Δ baseline = +15.

This is the second delta document. Both prior reports remain immutable history. Working tree was dirty on entry (5 in-flight files under cmd/observerd/ + internal/{observer,workflow}/); audit ran on stashed-clean c7e3124 so the score reflects shipped state, not WIP.


What landed since rerun-1

| Commit | What |
|---|---|
| 4840c10 | (rerun-1 baseline — 04_query refresh-tick race fix) |
| 125e1c8 | tests close R-002 / R-003 / R-008 — internal/{shared,storeclient,queryd/db} Go tests |
| 6af0520 | A: fail-loud on non-loopback bind — closes worst case of R-001 |
| 423a381 | D: storaged per-prefix PUT cap — vectord _vectors/ → 4 GiB (ADR-002) |
| 0d18ffa | ADR-003: inter-service auth posture — Bearer + IP allowlist |
| 1ec85b0 | Batch 2: perf baseline — multi-sample + warmup + MAD threshold |
| 0f79bce | Batch 3: cmd/<bin>/main_test.go × 6 — closes R-005 |
| fb08232 | Batch 4: embed fixture-mode — partial R-006 closure |
| 56844c3 | embed cache — LRU at /v1/embed for repeat-query elimination |
| 8f4c16f | mcpd: Go MCP SDK port — replaces Bun mcp-server tool surface |
| fa56134 | ADR-003 wiring: Bearer token + IP allowlist middleware |
| ad1670d | storaged cap smoke — verifies ADR-002 at 300 MiB |
| 2a6234f | ADR-004 + internal/pathway: Mem0 versioned trace substrate |
| afbb506 | pathwayd: HTTP service over internal/pathway · 11/11 smoke gate |
| f1c1883 | vectord BatchAdd — single-lock variadic batch |
| 71b35fb | SPEC §1 + §3.4: name matrix indexer as a port target |
| a7620c8 | PRD: name the product vision — small-model pipeline + 5-loop substrate |
| c1d96b7 | matrixd: multi-corpus retrieve+merge — SPEC §3.4 component 2 of 5 |
| 166470f | corpusingest: extract reusable text→vector ingest pipeline |
| 0d1553c | candidates corpus: first deep-field reality test on real staffing data |
| 9588bd8 | matrix relevance filter — SPEC §3.4 component 3 of 5 |
| 3968ec8 | matrix strong-model downgrade gate — SPEC §3.4 component 4 of 5 |
| a97881d | workers corpus + multi-corpus reality test — matrix indexer end-to-end |
| 31b4088 | multi_corpus_e2e WORKERS_LIMIT knob + embed-text-not-sample-size finding |
| 06e7152 | matrix playbook memory + boost — SPEC §3.4 component 5 of 5 (LEARNING LOOP) |
| a730fc2 | scrum fixes: 4 real findings landed, 4 false positives dismissed |
| 7f42089 | D: embed-text iteration — clean negative finding (3 variants tested) |
| 57d0df1 | E (partial): distillation port — scorer + contamination firewall |
| be65f85 | F: drift quantification — scorer drift first |
| b199093 | B: matrix metadata filter — post-retrieval structured gate |
| 6392772 | C: bulk playbook record — operational rating wiring |
| bc9ab93 | H: observerd — autonomous-iteration witness loop (SPEC §2 port) |
| 97dd3f8 | SPEC §3.5/§3.6/§3.7/§3.8 — name F/B/C as port targets + Archon-style workflow runner |
| e30da6e | §3.8 first slice: workflow runner skeleton + DAG executor + observerd integration |
| c7e3124 | §3.8 second slice: real modes wired (matrix.relevance/downgrade/search, distillation.score, drift.scorer) |

This is the wave that took the system from "G0+G2 substrate plus 500K validation" to "all five small-model-pipeline loops have at least a first port" (per project_small_model_pipeline_vision.md).


Score delta — double column

Same 6 dimensions, scored 0–10 with citations. Δ R1 = vs rerun-1 (4840c10); Δ Base = vs original audit (91edd43).

| Dimension | Base | R1 | R2 | Δ R1 | Δ Base | Evidence for the move |
|---|---|---|---|---|---|---|
| Reproducibility | 7 | 9 | 9 | 0 | +2 | just verify PASS in 31s wall (_evidence/rerun2/just_verify.log) — vet + 30 packages of go test -short + 9 core smokes. just doctor all-green for go/gcc/minio/ollama/secrets. 8 additional domain smokes also PASS (pathway, matrix, relevance, downgrade, observer, playbook, workflow, storaged_cap → _evidence/rerun2/smoke_*.log). New recipes: smoke-g2-fixtures (R-006 partial close) + smoke-storaged-cap. Still 1 short: no .github/workflows/; no fixture-mode for storage (only embed). |
| Test Coverage | 6 | 8 | 9 | +1 | +3 | 321 Go test functions across 40 test files (was ~77 functions in 13 files at baseline, ~109 functions at R1 — 3× the R1 test surface). internal/shared has 4 test files (auth_test.go, bind_test.go, config_test.go, server_test.go); internal/storeclient/client_test.go exists; internal/queryd/db_test.go + registrar_test.go exist — R-002 / R-003 / R-008 all closed. Six original cmd binaries now have main_test.go (catalogd/embedd/ingestd/queryd/storaged/vectord) — R-005 mostly closed. Still 1 short: cmd/{matrixd,observerd,pathwayd,fake_ollama}/main_test.go absent — three of those are new daemons that need wiring tests. |
| Trust Boundary Safety | 7 | 7 | 9 | +2 | +2 | ADR-003 shipped (docs/DECISIONS.md §3): internal/shared/auth.go 64-line Bearer middleware with constant-time compare via crypto/subtle + IP allowlist (internal/shared/auth.go:62-64). 4 auth tests in auth_test.go cover wrong-token, raw-token-without-prefix, IP-only, both-required (internal/shared/auth_test.go:77,86,108,162). redactCreds still scrubs S3 keys from queryd error chain (internal/queryd/db.go). One fmt.Sprintf SQL site remains (internal/queryd/registrar.go:153) — properly escaped via quoteIdent + sqlEscape. 13 MaxBytesReader sites in cmd/, 5 loopback bindings. Still 1 short: auth is opt-in (empty token = G0 dev mode); no CORS posture (R-010); 2 /home/profit/lakehouse/... paths in scripts/staffing_*/main.go flag-defaults. |
| Agent Memory Correctness | 3 | 4 | 9 | +5 | +6 | All five SPEC §3.4 components shipped: corpus builders (internal/corpusingest), retrieve+merge (matrixd /matrix/search), relevance filter (internal/matrix/relevance.go 376 LoC + 289 LoC test), strong-model downgrade gate (internal/matrix/downgrade.go 137 LoC + 100 LoC test), playbook memory + boost (internal/matrix/playbook.go 196 LoC + 180 LoC test) — including the learning loop. Pathway substrate ratified (ADR-004, internal/pathway/store.go 381 LoC + 398 LoC test). Mem0-style ops all proven: TestAdd_AssignsUIDAndTimestamps, TestUpdate_ReplacesContentSameUID, TestRevise_LinksToPredecessorViaHistory, TestRevise_ChainOfThree_BackwardWalk, TestRetire_ExcludedFromSearch, TestRetire_StillAccessibleViaGet, TestHistory_CycleDetected, TestHistory_PredecessorMissing_TruncatesChain, TestAddIdempotent_RejectsEmptyUID — every Sprint 2 design-bar acceptance has a test. Observer ported (internal/observer/store.go 249 LoC + 193 LoC test). pathway smoke 11/11. Still 1 short: distillation port partial (scorer + firewall only — 57d0df1 "E (partial)"); drift is "scorer drift first" (be65f85) not full quantification. |
| Deployment Readiness | 4 | 5 | 5 | 0 | +1 | just doctor actionable per-dep install (scripts/doctor.sh); just install-hooks documented; pre-push hook still installed. Still 5 short: no REPLICATION.md, no secrets-go.toml.example, no deploy/systemd/*.service, no Dockerfile, no readiness vs. liveness split. Sprint 4 stories all open. |
| Maintainability | 8 | 8 | 9 | +1 | +1 | 4 ADRs ratified (was 1 at R1): ADR-001 foundational, ADR-002 storaged per-prefix cap, ADR-003 auth posture, ADR-004 pathway data model — the auth + cap + memory-model decisions are locked before downstream code retrofits them. Every binary still 100–400 LoC (no god-files). Per-package test files: every internal/ package has ≥1 test file (was: 5 packages had zero at baseline). CLAUDE_REFACTOR_GUARDRAILS.md codifies the maintenance discipline. tests/proof/FINAL_REPORT.md answers the 9 mandated questions. Still 1 short: no CONTRIBUTING.md; the proof harness adds 24-claim maintenance surface that needs keeping current. |

Composite: 35 → 43 → 50. 83% of max.


Code surface delta

| Metric | Baseline (91edd43) | R1 (4840c10) | R2 (c7e3124) | Δ R1 |
|---|---|---|---|---|
| Total Go LoC | ~6,587 | ~7,800 (est) | 19,381 | ~2.5× |
| Go files | ~50 | ~62 | 93 | +31 |
| Test files | 13 | ~22 | 40 | +18 |
| Go test functions | ~77 | ~109 | 321 | +212 |
| cmd/<bin>/ | 7 | 7 | 12 | +5 |
| internal/<pkg>/ | 11 | 11 | 18 | +7 |
| Smoke scripts | 9 | 9 | 21 | +12 |
| ADRs ratified | 0 | 1 | 4 | +3 |
| Routes (cmd-level) | ~22 | ~22 | 37 | +15 |
| Untested cmd binaries | 6 / 7 | 6 / 7 | 4 / 12 | −2 abs, 1/3 ratio |

The wave is substrate-bearing, not throughput-bearing. Every internal package has tests; the gap is now the wiring layer for the 3 new daemons.


Risk register status updates

12 risks in reports/scrum/risk-register.md. Status table at c7e3124:

| Risk | Severity | Before R2 | After R2 | Evidence |
|---|---|---|---|---|
| R-001 queryd /sql RCE-eq off-loopback | HIGH | open | partial | 6af0520 fail-loud on non-loopback bind (closes worst case); ADR-003 + internal/shared/auth.go available to wrap; but auth is opt-in — needs deploy story decision before fully closing |
| R-002 internal/shared zero tests | HIGH | open | CLOSED | 4 test files (auth_test.go + bind_test.go + config_test.go + server_test.go), all PASS in just verify |
| R-003 internal/storeclient zero tests | HIGH | open | CLOSED | internal/storeclient/client_test.go, PASS |
| R-004 smokes not gated | MED | closed (R1) | CLOSED | unchanged from R1 |
| R-005 6/7 cmd/main.go untested | MED | partial | partial | 6 of original 7 closed (0f79bce Batch 3); 4 new daemons (fake_ollama/matrixd/observerd/pathwayd) reopen the gap on different surface |
| R-006 no fixture-only smokes | MED | open | partial | scripts/g2_smoke_fixtures.sh (fb08232) closes embed half via fake_ollama; storage half deferred |
| R-007 zero auth middleware | MED | open | partial | internal/shared/auth.go shipped with 4 tests (fa56134); opt-in by default until deploy posture decision |
| R-008 queryd/db.go untested | MED | open | CLOSED | internal/queryd/db_test.go + registrar_test.go (125e1c8) |
| R-009 registrar.go fmt.Sprintf SQL | LOW | open | open | unchanged — escaping via quoteIdent+sqlEscape is correct, regression test still missing |
| R-010 no CORS posture | LOW | open | open | unchanged — no Access-Control-* headers anywhere |
| R-011 g2 smoke model assertion | LOW | note | note | unchanged |
| R-012 empty tests/ dir | LOW | closed (R1) | CLOSED | unchanged from R1 |

Net since R1: 3 closed (R-002, R-003, R-008), 3 advanced to partial (R-001, R-006, R-007), R-005 stays partial on a different surface, 3 still open or noted and unchanged (R-009, R-010, R-011). R-004 and R-012 remain closed from R1.


Sprint backlog progress

Sprint 0 — Reproducibility Gate

| Story | R1 | R2 |
|---|---|---|
| S0.1 just doctor | DONE | DONE |
| S0.2 just smoke-fixtures | open | partial (smoke-g2-fixtures) |
| S0.3 just verify + pre-push | DONE | DONE |
| S0.4 cmd/<bin>/main_test.go × 6 | partial | partial → mostly DONE (6 of original 7; 3 new daemons absent) |
| S0.5 internal/shared, storeclient, queryd/db tests | open | DONE |
| S0.6 tests/ dir cleanup | DONE | DONE |

4 of 6 done, 2 partial. Highest-leverage open work: tests for the 3 new daemons + storage-half of fixture mode.

Sprint 1 — Trust Boundary Gate

  • Replace SQL string interp with parameterized: still 1 site, properly escaped (R-009 LOW)
  • Observer fail-open → degraded/cycle: not yet codified — observer is ported but ADR-002-style fail-safe ADR not written
  • Auth/localhost-only guardrails: shipped (ADR-003 + auth.go), opt-in posture
  • Schema validation per public endpoint: per-handler validation exists (validateKey etc.); not framework-level

Status: ~60% of Sprint 1 closed, observer fail-safe semantics ADR is the outstanding doc-only piece.

Sprint 2 — Memory Correctness Gate

| Story | R1 | R2 |
|---|---|---|
| ADD/UPDATE/REVISE/RETIRE/HISTORY tests | design-bar | DONE (internal/pathway/store_test.go) |
| Cycle detection tests | design-bar | DONE (TestHistory_CycleDetected) |
| Retired-trace exclusion tests | design-bar | DONE (TestRetire_ExcludedFromSearch) |
| Duplicate trace replay_count tests | design-bar | partial (TestAddIdempotent_RejectsEmptyUID; replay_count semantics) |
| Corrupted memory row recovery test | design-bar | open |

Status: Sprint 2 acceptance criteria mostly green — the core invariants are tested. Audit/event receipt on every memory mutation is the missing piece.

Sprint 3 — Agent Loop Reality Gate

  • Deterministic mini corpus: tests/proof/fixtures/ exists
  • search → verify → observer review → playbook seal → second-run retrieval: scripts/multi_corpus_e2e.sh + scripts/playbook_smoke.sh exercise this; full chain via scripts/workflow_smoke.sh
  • Negative case observer rejects hallucinated claim: covered by observer_smoke (semantics open for review)
  • Health endpoint content-type regression: covered by proof harness 00_health

Status: Sprint 3 has working substrate; explicit "single command proves the full loop" with input/output/verdict/receipt evidence is partial.

Sprint 4 — Deployment Gate

Status: unchanged from R1. No REPLICATION.md, no .env.example, no *.service units, no Dockerfile. just doctor is the closest piece. This is the largest open Sprint.


New findings from this rerun

Two real findings worth recording.

F1 — 3 new daemons lack cmd/<bin>/main_test.go

  • Where: cmd/matrixd/, cmd/observerd/, cmd/pathwayd/
  • What: Same gap-class as R-005 was, just on net-new code. Each daemon mounts ≥4 routes (matrixd: 6, observerd: 4, pathwayd: 9 → 19 routes total) with no wiring test.
  • Severity: MEDIUM. The internal packages backing each daemon (internal/matrix, internal/observer, internal/pathway) have full unit tests — but no test proves cmd/pathwayd/main.go actually wires /pathway/revise to (*pathway.Store).Revise. A handler-rename refactor would silently break the route surface.
  • Action: Re-open R-005 against the new daemons. ~1 hr to add three main_test.go files patterned on cmd/storaged/main_test.go.
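
A wiring check of the kind F1 asks for can be sketched as follows. It is written as a standalone program so it runs as-is; in the repo the same loop would sit in a `main_test.go`. `newMux` and `unmounted` are hypothetical, the stub handlers stand in for the real ones, and `/pathway/retire` is an invented route used only to show a failure:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newMux stands in for the daemon's route registration (hypothetical; the
// real handlers would call into internal/pathway).
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/pathway/revise", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusNoContent)
	})
	mux.HandleFunc("/health", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	return mux
}

// unmounted returns every expected path that the handler answers with 404.
func unmounted(h http.Handler, paths []string) []string {
	var missing []string
	for _, p := range paths {
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, p, nil))
		if rec.Code == http.StatusNotFound {
			missing = append(missing, p)
		}
	}
	return missing
}

func main() {
	want := []string{"/pathway/revise", "/health", "/pathway/retire"}
	fmt.Println(unmounted(newMux(), want)) // prints: [/pathway/retire]
}
```

This only proves that each route is mounted. Proving that /pathway/revise reaches `(*pathway.Store).Revise` would additionally need an assertion on store state after the request.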

F2 — scripts/staffing_*/main.go has hardcoded data paths in flag defaults

  • Where: scripts/staffing_candidates/main.go:217 and scripts/staffing_workers/main.go:269 reference /home/profit/lakehouse/data/datasets/{candidates,workers_500k}.parquet.
  • What: Flag defaults reach into the Rust legacy tree at /home/profit/lakehouse/.... Throwaway driver scripts (not services), and the values are flag-overridable, but they couple the Go repo to the Rust filesystem layout.
  • Severity: LOW. Doesn't affect any service. Worth noting because audit Sprint 4 explicitly calls out "no hardcoded /home/profit paths" as an acceptance criterion.
  • Action: Either move the parquet under golangLAKEHOUSE/data/ (preferred for self-containment) or document the cross-tree dependency in RESEARCH_LOG_2026-04-28.md and accept it.

What this rerun does NOT change

  • Sprint 4 (deployment) remains the largest open gap. R-1 said this; R-2 says this; without REPLICATION.md + systemd units, the cutover from Rust at devop.live/lakehouse/ (G5) cannot be operator-validated.
  • Auth is opt-in. Empty-token default is fine for G0 development but means the moment any Go binary binds non-loopback in prod, a posture decision is required. R-001 + R-007 cannot fully close until that decision is recorded.
  • CORS posture (R-010) is still unspecified. The Bun-served Rust UI handles browser CORS today; if a Go service ever fronts a browser, this needs a decision.
  • Distillation and drift are first-port-only. 57d0df1 ships scorer + contamination firewall (E partial); be65f85 ships scorer-drift only (F first slice). The full distillation pipeline (sample export, audit_baselines lineage) and full drift signal are not yet ported.

Recommended next steps

  1. Three main_test.go files for matrixd + observerd + pathwayd (~1 hr). Closes the regenerated R-005, ratchets every future route addition through just verify.
  2. ADR-005: observer fail-safe semantics (~30 min, doc-only). The observer is ported (internal/observer/store.go), but the upstream "verdict:accept on crash" anti-pattern still has no Go-side decision locked. Doing this now is half the cost of doing it after a regression.
  3. Auth posture decision for non-loopback deploy (~1 hr, ADR or annotated decision in RESEARCH_LOG). Locks R-001 + R-007 from "opt-in middleware exists" to "wired-by-default for X, opt-in for Y". Required input for any G5 cutover plan.
  4. Sprint 4 minimal first slice (~3 hr): secrets-go.toml.example + deploy/systemd/<bin>.service.tmpl × 12 binaries + REPLICATION.md skeleton. Highest-leverage Sprint 4 starter; the systemd units mostly mirror Rust's layout.
  5. Storage-half of fixture mode (~3 hr): MockS3Storage interface satisfying internal/storaged.Bucket, smoke variant that points storaged at it. Closes R-006 fully and decouples CI from MinIO.

The remaining items (full drift port, full distillation port, observer audit-event receipt, corrupted-memory recovery test) are real engineering — Sprint 2/3 followups, not Sprint-0 polish.


Methodology note — same as prior reports

All claims cite a file, line, or command. Evidence captured under reports/scrum/_evidence/rerun2/:

  • just_verify.log — full vet + 30 packages × go test -short + 9 core smokes, exit 0, 31s wall
  • just_doctor.log — 5 dependency probes, all green
  • govet.log — go vet ./... exit 0
  • gotest_short.log — full short-test pass
  • just_list.log — recipe inventory
  • smoke_{pathway,matrix,relevance,downgrade,observer,playbook,workflow,storaged_cap}.log — 8 additional domain smokes, all PASS

What was NOT inspected this round (deferred):

  • Cross-binary failure cascades (kill matrixd mid-search, observe observerd state) — Sprint 1 follow-up
  • Supply-chain audit of go.sum diffs since R1
  • Performance regression vs the perf baseline shipped in 1ec85b0 — just proof performance exists, not run here
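
The perf baseline in 1ec85b0 is described only as "multi-sample + warmup + MAD threshold". For orientation, a regression gate built on the median absolute deviation could look like the sketch below. The function names, the sample values, and the multiplier `k = 3` are assumptions, not the shipped harness:

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// median returns the median of xs without modifying it. xs must be non-empty.
func median(xs []float64) float64 {
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	n := len(s)
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// mad is the median absolute deviation of xs around its median.
func mad(xs []float64) float64 {
	m := median(xs)
	dev := make([]float64, len(xs))
	for i, x := range xs {
		dev[i] = math.Abs(x - m)
	}
	return median(dev)
}

// regressed reports whether the current median exceeds the baseline median
// by more than k baseline MADs. k is an assumed tuning knob.
func regressed(baseline, current []float64, k float64) bool {
	return median(current) > median(baseline)+k*mad(baseline)
}

func main() {
	baseline := []float64{10, 12, 11, 13, 11} // ms per op, warmup runs discarded
	fmt.Println(regressed(baseline, []float64{11, 12, 12}, 3)) // prints: false
	fmt.Println(regressed(baseline, []float64{15, 16, 15}, 3)) // prints: true
}
```

Here the baseline median is 11 and its MAD is 1, so the gate trips above 14. Medians make the check insensitive to a single slow sample, which is the usual reason to prefer MAD over standard deviation for noisy timings.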

Rerun-2 produced under the same "no vibes" rule as the original audit. The 50/60 reflects what's verifiably shipped at c7e3124, not what's planned. Working tree restored from stash after audit completion.