All checks were successful
lakehouse/auditor all checks passed (3 findings, all info)
Three artifacts in one PR:
1. docs/PYTHON_INVENTORY.md — every .py file in the repo classified:
Production (sidecar routers + 3 systemd services), Documented
(kb_measure, kb_staffer_report), Manual (one-off tools), Dead
(sidecar/sidecar/lab_ui.py + pipeline_lab.py are genuinely
not imported anywhere).
2. docs/COHESION_INTEGRATION_PLAN.md — the "smarter DB" loop J
called out as missing. Six phases A-F. Phase A ships here; B-F
are named + sequenced for follow-up PRs. Each phase adds ONE
wire of the loop; no single PR does them all.
3. Phase A wire (auditor verdicts → observer + KB):
- auditor/audit.ts: after assembleVerdict, fire-and-forget POST
to :3800/event with source="auditor" AND append to
data/_kb/outcomes.jsonl with kind="audit". Errors log + drop
— the verdict is still on disk at _auditor/verdicts/.
- mcp-server/observer.ts: extend source union to include
"auditor" | "bot" (was "mcp" | "scenario" only, which silently
coerced my first auditor POST to source="scenario"). Accept
body.ok OR body.success. Accept body.audit_duration_ms as a
fallback for duration_ms. Use body.one_liner as
output_summary when set.
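The observer-side change can be pictured with a small normalizer. This is only a sketch of the behavior described above, not the code in mcp-server/observer.ts: the function name `normalizeEvent`, the `ObserverEvent` shape, and the `fallback` parameter (standing in for the old coercion to "scenario") are all assumptions made for illustration.

```typescript
// Hypothetical sketch; the real observer.ts may be organized differently.
type Source = "mcp" | "scenario" | "auditor" | "bot";
const SOURCES: readonly string[] = ["mcp", "scenario", "auditor", "bot"];

interface ObserverEvent {
  source: Source;
  ok: boolean;
  duration_ms: number | null;
  output_summary: string | null;
}

function isSource(value: unknown): value is Source {
  return typeof value === "string" && SOURCES.includes(value);
}

// `fallback` is an assumed default for unrecognized sources.
function normalizeEvent(
  body: Record<string, unknown>,
  fallback: Source = "scenario",
): ObserverEvent {
  const source = body["source"];
  const ok = body["ok"];
  const success = body["success"];
  const duration = body["duration_ms"];
  const auditDuration = body["audit_duration_ms"];
  const oneLiner = body["one_liner"];
  const summary = body["output_summary"];

  return {
    source: isSource(source) ? source : fallback,
    // Accept body.ok, or body.success when ok is absent.
    ok: typeof ok === "boolean" ? ok : success === true,
    // audit_duration_ms is only a fallback for duration_ms.
    duration_ms:
      typeof duration === "number" ? duration
      : typeof auditDuration === "number" ? auditDuration
      : null,
    // one_liner wins over output_summary when it is set.
    output_summary:
      typeof oneLiner === "string" && oneLiner.length > 0 ? oneLiner
      : typeof summary === "string" ? summary
      : null,
  };
}
```

With the widened union, a body carrying source="auditor" keeps its source instead of falling through to the default.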
Live-verified after observer restart:
re-audit PR #6 → verdict=request_changes, 4 findings (1 warn)
observer: by_source={'auditor': 1} (previously coerced to 'scenario')
_kb/outcomes.jsonl tail: kind=audit sig=pr6-7fe47bab
pr=6 overall=request_changes
The shape of the loop is now visible to downstream consumers. Phase
B (auditor's kb_query check reads these audit rows for history)
lands in a follow-up PR. Phases C-F follow the same pattern.
NOT in this PR:
- Actually deleting lab_ui.py + pipeline_lab.py (operator decision,
called out in the inventory doc)
- Cleaning up the 5 overlapping Python scripts (same)
- Phases B-F of the cohesion plan (separate PRs per wire)
- Integration test that asserts "smarter DB" across runs (Phase F)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lakehouse Claim Auditor
A Bun sub-agent that watches open PRs on Gitea, reads the ship-claims in commit messages and PR bodies, and hard-blocks merges when the code doesn't back the claim.
Rationale: when "compiles + one curl works" gets called "phase shipped," placeholder code accumulates. This auditor runs every 90s, fetches each open PR, and subjects it to four checks:
- Static diff: grep/parse looking for placeholder patterns
- Dynamic: runs the never-before-executed hybrid test fixture
- Cloud inference: asks gpt-oss:120b via /v1/chat to identify gaps in the diff
- KB query: looks up data/_kb/ + observer for prior failure patterns on similar claims
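As an illustration of the static diff check, the sketch below scans only the added lines of a unified diff for placeholder markers. The pattern list, the `Finding` shape, and the "warn" severity are assumptions; the auditor's actual patterns are not listed in this README.

```typescript
// Hypothetical finding shape and patterns, for illustration only.
interface Finding {
  check: "static";
  severity: "info" | "warn" | "block";
  pattern: string;
  line: string;
}

// No "g" flag, so RegExp.test() stays stateless across calls.
const PLACEHOLDER_PATTERNS: RegExp[] = [
  /\bTODO\b/,
  /\bFIXME\b/,
  /not implemented/i,
  /\bplaceholder\b/i,
];

function scanDiff(diff: string): Finding[] {
  const findings: Finding[] = [];
  for (const raw of diff.split("\n")) {
    // Only added lines matter; "+++" marks a file header, not content.
    if (!raw.startsWith("+") || raw.startsWith("+++")) continue;
    const line = raw.slice(1);
    const hit = PLACEHOLDER_PATTERNS.find((p) => p.test(line));
    if (hit) {
      findings.push({
        check: "static",
        severity: "warn",
        pattern: hit.source,
        line: line.trim(),
      });
    }
  }
  return findings;
}
```

Removed lines are skipped on purpose: deleting a TODO should not count against the PR.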
Verdict is assembled, posted to Gitea as:
- A failing commit status (hard block — branch protection prevents merge)
- A review comment explaining every finding
Run manually
cd /home/profit/lakehouse
bun run auditor/index.ts
Defaults: polls every 90s; stops when an auditor.paused file is present.
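The default loop behaves roughly like the sketch below. It is not the code in auditor/index.ts: `pollUntilPaused`, its options, the location of auditor.paused relative to the working directory, and the choice to log and continue after a failed cycle are all assumptions.

```typescript
import { existsSync } from "node:fs";

// All option names here are hypothetical; they exist to make the loop testable.
interface PollOptions {
  intervalMs?: number;
  isPaused?: () => boolean;
  sleep?: (ms: number) => Promise<void>;
}

async function pollUntilPaused(
  auditOnce: () => Promise<void>,
  opts: PollOptions = {},
): Promise<number> {
  const intervalMs = opts.intervalMs ?? 90_000; // 90s default from the README
  // Assumes auditor.paused is looked up relative to the working directory.
  const isPaused = opts.isPaused ?? (() => existsSync("auditor.paused"));
  const sleep =
    opts.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  let cycles = 0;
  while (!isPaused()) {
    try {
      await auditOnce();
    } catch (err) {
      // Assumption: one failed cycle should not kill the watcher.
      console.error("audit cycle failed:", err);
    }
    cycles += 1;
    await sleep(intervalMs);
  }
  return cycles; // number of completed cycles before the pause was seen
}
```

Checking the pause condition at the top of each cycle means that creating the file takes effect within one polling interval.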
State
- data/_auditor/state.json: last-audited head SHA per PR
- data/_auditor/verdicts/{pr}-{sha}.json: per-run verdict record
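One way to read these two files is as a skip-if-unchanged cache plus an append-only record. The helpers below are a sketch under that reading; the JSON layout of state.json (a flat map from PR number to SHA) and the helper names are assumptions, and whether {sha} is the full or abbreviated SHA is not stated here.

```typescript
// Assumed layout of state.json: { "<pr number>": "<last audited head SHA>" }.
type AuditorState = Record<string, string>;

// A PR needs auditing when its head SHA differs from the recorded one.
function needsAudit(state: AuditorState, pr: number, headSha: string): boolean {
  return state[String(pr)] !== headSha;
}

// Returns a new state object rather than mutating the one passed in.
function markAudited(state: AuditorState, pr: number, headSha: string): AuditorState {
  return { ...state, [String(pr)]: headSha };
}

// Mirrors the documented pattern data/_auditor/verdicts/{pr}-{sha}.json.
function verdictPath(pr: number, sha: string): string {
  return `data/_auditor/verdicts/${pr}-${sha}.json`;
}
```

Keying on the head SHA means a new push to the PR triggers a fresh audit, while an unchanged PR is skipped on the next poll.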
Where YOU edit
auditor/policy.ts — the verdict assembler. Controls which findings
block vs warn vs inform. All other code is mechanical: fetching,
running checks, posting to Gitea.
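To make the block / warn / inform split concrete, here is a minimal sketch of what a verdict assembler could look like. It is not the contents of auditor/policy.ts: the `Finding` and `Verdict` shapes, the "block" severity label, the "approve" outcome, and the rule that only blocking findings produce request_changes are assumptions. The real thresholds are whatever policy.ts says.

```typescript
// Hypothetical types; only "request_changes", "warn", and "info" appear in this document.
type Severity = "info" | "warn" | "block";

interface Finding {
  check: string;
  severity: Severity;
  message: string;
}

interface Verdict {
  overall: "approve" | "request_changes";
  counts: Record<Severity, number>;
  findings: Finding[];
}

function assembleVerdict(findings: Finding[]): Verdict {
  const counts: Record<Severity, number> = { info: 0, warn: 0, block: 0 };
  for (const f of findings) counts[f.severity] += 1;
  return {
    // Assumed rule: any blocking finding fails the PR; warn and info do not.
    overall: counts.block > 0 ? "request_changes" : "approve",
    counts,
    findings,
  };
}
```

Keeping this decision in one pure function is what lets the rest of the pipeline stay mechanical: checks emit findings, and only the policy decides what they mean.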
Hard-block mechanism
- Commit status is posted as failure with context lakehouse/auditor
- If main branch protection requires lakehouse/auditor status to pass, Gitea prevents merge
- When code is fixed and re-audit passes, status flips to success, merge unblocks
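The status flip can be sketched as a single request against Gitea's commit status endpoint (POST /api/v1/repos/{owner}/{repo}/statuses/{sha}). The helper below only builds the request so it can be inspected without a network call; `buildStatusRequest`, the `baseUrl` and `token` parameters, and the description text are assumptions, not the auditor's actual code.

```typescript
interface StatusRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// baseUrl and token are hypothetical inputs, e.g. read from the environment.
function buildStatusRequest(
  baseUrl: string,
  token: string,
  sha: string,
  passed: boolean,
  description: string,
): StatusRequest {
  return {
    url: `${baseUrl}/api/v1/repos/profit/lakehouse/statuses/${sha}`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `token ${token}`,
      },
      body: JSON.stringify({
        // "failure" blocks the merge when the context is required; "success" unblocks it.
        state: passed ? "success" : "failure",
        context: "lakehouse/auditor",
        description,
      }),
    },
  };
}
```

A caller would pass the result straight to `fetch(url, init)`. Because the context string is fixed, each new status for the same commit supersedes the previous one under that context.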
Enable branch protection (one-time, via Gitea UI or API):
POST /repos/profit/lakehouse/branch_protections
{"branch_name": "main", "required_status_checks": {"contexts": ["lakehouse/auditor"]}}