lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
Three fixes after the playbook_only confabulation surfaced in
2026-04-26 experiment (8 'findings' on a 333-line file all citing
lines 378-945 — fully fabricated from pathway-memory pattern names).
(1) Aggregator regex bug — section detection failed on emoji-prefixed
markdown headers like `## 🔎 Ranked Findings`. The original regex
required word chars right after #{1,3}\s+, so the patches table
header `## 🛠️ Concrete Patch Suggestions` was never recognized as
a stop boundary, double-counting every finding. Fix: allow
non-letter chars (emoji/space) between # and the keyword.
(2) Grounding check — for each finding row in the response, extract
backtick-quoted symbols + cited line numbers; verify symbols exist
in the actual focus file and lines fall within EOF. Computes
grounded/total ratio per mode. Surfaces 'OOB' (out-of-bounds) count
explicitly so confabulation is visible at a glance. Confirms what
hand-grading found: codereview_playbook_only's 8 findings on
service.rs were 1/8 grounded with 7 OOB.
(3) Control mode tagging — codereview_null and codereview_playbook_only
are designed as falsifiers (baseline / lossy ceiling) and their
numerical wins should never be read as recommendations. Output
marks them with ⚗ glyph + warning footer.
Per-mode aggregate is now sorted by groundedness, not raw count.
Per-mode-vs-lakehouse comparison uses grounded findings, not raw —
so confabulation can no longer score a "win".
Updated SCRUM_MASTER_SPEC.md with refactor timeline pointing at
the 2026-04-25/26 commits (observer fix, relevance filter, retire
wire, mode router, enrichment runner, parameterized experiment).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
The matrix-agent-validated repo split + Ansible deploy pipeline are
otherwise undocumented in this repo. This new doc explains:
- the relationship between lakehouse and matrix-agent-validated
- where the playbook lives and what it provisions
- the critical distinction: matrix-test (10.111.129.50 Incus container)
is the REAL destination, while 192.168.1.145 is a smoke-test VPS only
(partial deploy, no services, do not treat as production)
- what landed today (observer fix, HANDOVER.md render, relevance filter)
Added a cross-link block at the top of SCRUM_MASTER_SPEC.md so the
canonical scrum handoff doc points at the new MATRIX_AGENT_HANDOVER doc.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
Fresh-session artifact so work is recoverable if the branch is reopened
in a new Claude Code session without context. Covers:
- 9-rung ladder (kimi-k2:1t through local qwen3.5:latest)
- tree-split reducer (files >6KB sharded + map→reduce)
- schema_v4 KB rows in data/_kb/scrum_reviews.jsonl
- auto-applier 5 hardened gates (confidence, size, cargo-green,
warning-count, rationale-diff)
- pathway_memory (ADR-021) — narrow fingerprint + hot-swap gate +
semantic-correctness layer (SemanticFlag, BugFingerprint)
- HTTP surface on gateway (/vectors/pathway/*)
- current state (12 traces, 11 fingerprints, 0 hot-swaps — probation)
- commit history on scrum/auto-apply-19814 since iter-5 baseline
- how-to-run (env vars, service restarts)
- where things live (code pointers table)
- known gotchas (LLM Team mode registry, restart requirements)
Paired updates (not in this commit, live outside the repo):
- /home/profit/CLAUDE.md — active workstream pointer + notes
- /root/.claude/skills/read-mem/SKILL.md — SCRUM_MASTER_SPEC.md added
to the loading list + ADR-021 glossary
- memory/project_scrum_pipeline.md — updated with iter-9 state
- memory/feedback_semantic_correctness_via_matrix.md — updated with
end-to-end proof evidence
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>