profit efc7b5ac44 Auditor: dynamic + inference checks

auditor/checks/dynamic.ts — wraps runHybridFixture, maps layer
results to Findings. Placeholder-style errors (404/unimplemented/
slice N) → info; other failures → warn. Always emits a summary
finding with real numbers (shipped/placeholder phase counts + per-
layer latency). Live-tested against current stack: 2 info findings,
0 warnings — all shipped layers actually work.
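
The severity mapping described above could be sketched roughly as follows. This is an illustration only: the `LayerResult` and `Finding` shapes, the `PLACEHOLDER_ERROR` regex, and the function names are assumptions, not the actual contents of auditor/checks/dynamic.ts or auditor/types.ts.

```typescript
// Hypothetical shapes; the real types in auditor/types.ts may differ.
type Severity = "info" | "warn" | "block";

interface LayerResult {
  layer: string;
  ok: boolean;
  latencyMs: number;
  error?: string;
}

interface Finding {
  check: "dynamic";
  severity: Severity;
  message: string;
}

// Assumed patterns for placeholder-style errors: 404, "unimplemented", "slice N".
const PLACEHOLDER_ERROR = /\b404\b|unimplemented|\bslice \d+/i;

// A failed layer becomes info (placeholder-style) or warn (anything else).
function layerToFinding(r: LayerResult): Finding | null {
  if (r.ok) return null;
  const error = r.error ?? "unknown error";
  const severity: Severity = PLACEHOLDER_ERROR.test(error) ? "info" : "warn";
  return { check: "dynamic", severity, message: `${r.layer}: ${error}` };
}

// The summary finding is always emitted, even when every layer passes.
function summarize(results: LayerResult[]): Finding {
  const shipped = results.filter((r) => r.ok).length;
  const placeholder = results.filter(
    (r) => !r.ok && PLACEHOLDER_ERROR.test(r.error ?? ""),
  ).length;
  const latency = results.map((r) => `${r.layer}=${r.latencyMs}ms`).join(", ");
  return {
    check: "dynamic",
    severity: "info",
    message: `shipped=${shipped} placeholder=${placeholder} latency: ${latency}`,
  };
}

function mapResults(results: LayerResult[]): Finding[] {
  const findings = results
    .map(layerToFinding)
    .filter((f): f is Finding => f !== null);
  return [...findings, summarize(results)];
}
```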

auditor/checks/inference.ts — wraps the run_codereview reviewer
pattern from llm_team_ui.py, adapted for claim-vs-diff verification.
Calls /v1/chat provider=ollama_cloud model=gpt-oss:120b. Requests
strict JSON response with claim_verdicts[] and unflagged_gaps[]. A
strong claim marked "not backed" by cloud → BLOCK severity; moderate
→ warn; weak → info. Cloud-unreachable or unparseable-output → info
(never blocks on the reviewer being down).
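
A minimal sketch of that severity mapping, assuming the reviewer's JSON carries a `strength` and a `backed` field per claim verdict. Those two field names, the `Finding` shape, and the helper names are hypothetical; only `claim_verdicts[]` and `unflagged_gaps[]` come from the description above. How `unflagged_gaps` entries are graded is not stated, so the sketch leaves them out.

```typescript
type Severity = "info" | "warn" | "block";
type Strength = "strong" | "moderate" | "weak";

// Hypothetical per-claim fields; the real response schema may differ.
interface ClaimVerdict {
  claim: string;
  strength: Strength;
  backed: boolean;
}

interface ReviewerReply {
  claim_verdicts: ClaimVerdict[];
  unflagged_gaps: string[];
}

interface Finding {
  check: "inference";
  severity: Severity;
  message: string;
}

const SEVERITY_BY_STRENGTH: Record<Strength, Severity> = {
  strong: "block",
  moderate: "warn",
  weak: "info",
};

// Returns null for anything that is not the expected JSON object.
function parseReply(raw: string): ReviewerReply | null {
  try {
    const parsed = JSON.parse(raw);
    if (parsed === null || !Array.isArray(parsed.claim_verdicts)) return null;
    return parsed as ReviewerReply;
  } catch {
    return null;
  }
}

// raw === null models "cloud unreachable". Neither failure mode ever blocks.
function findingsFromReply(raw: string | null): Finding[] {
  if (raw === null) {
    return [{ check: "inference", severity: "info", message: "reviewer unreachable" }];
  }
  const reply = parseReply(raw);
  if (reply === null) {
    return [{ check: "inference", severity: "info", message: "reviewer output unparseable" }];
  }
  return reply.claim_verdicts
    .filter((v) => !v.backed)
    .map((v) => ({
      check: "inference" as const,
      severity: SEVERITY_BY_STRENGTH[v.strength],
      message: `claim not backed by diff: ${v.claim}`,
    }));
}
```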

Live-tested against PR #1 (this PR, 20 claims, 39KB diff):
  - 36.9s round-trip
  - 7 block + 23 warn + 2 info findings
  - gpt-oss:120b correctly flagged "Fully-functional auditor (tasks
    1-9 complete)" as not-backed (only 6/10 tasks done at that
    commit) — accurate catch
  - Some false positives from the original 15KB truncation threshold
    (cloud missed gitea.ts, flagged "no Gitea client present")
  - Bumped MAX_DIFF_CHARS from 15000 to 40000 to fit the full PR
    diff in context; reviewer precision improves accordingly
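
The truncation false positive can be illustrated with a small sketch that clips a diff and reports which files the reviewer can no longer see. Only `MAX_DIFF_CHARS` and its value come from this commit; `clipDiff`, its return shape, and the header regex are assumptions, and the file detection is approximate (a header cut partway through may still count as visible).

```typescript
const MAX_DIFF_CHARS = 40_000;

interface ClippedDiff {
  sent: string; // what the reviewer actually receives
  truncated: boolean;
  hidden: string[]; // files whose diff header fell past the cutoff
}

// Collects file paths from "diff --git a/<path> b/<path>" header lines.
function filesIn(text: string): string[] {
  return [...text.matchAll(/^diff --git a\/(\S+) b\//gm)].map((m) => m[1]);
}

function clipDiff(diff: string, maxChars: number = MAX_DIFF_CHARS): ClippedDiff {
  const sent = diff.slice(0, maxChars);
  const visible = new Set(filesIn(sent));
  const hidden = filesIn(diff).filter((f) => !visible.has(f));
  return { sent, truncated: diff.length > maxChars, hidden };
}
```

A non-empty `hidden` list is the situation behind the "no Gitea client present" finding: the file existed in the PR but fell past the cutoff.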

Tasks 5 + 6 completed. Remaining: #7 (KB query), #8 (verdict +
Gitea poster), #9 (poller), #10 (end-to-end proof), #12 (upsert
UPDATE-drops-doc_refs).

2026-04-22 03:54:18 -05:00

Name             Last commit                                                              Date
checks           Auditor: dynamic + inference checks                                      2026-04-22 03:54:18 -05:00
fixtures         Fixture: unique-per-run nonce eliminates state-pollution false positive  2026-04-22 03:50:46 -05:00
claim_parser.ts  Auditor: claim parser                                                    2026-04-22 03:28:06 -05:00
gitea.ts         Auditor scaffold: types + Gitea client + policy stub + README            2026-04-22 03:26:56 -05:00
policy.ts        Auditor scaffold: types + Gitea client + policy stub + README            2026-04-22 03:26:56 -05:00
README.md        Auditor scaffold: types + Gitea client + policy stub + README            2026-04-22 03:26:56 -05:00
types.ts         Auditor scaffold: types + Gitea client + policy stub + README            2026-04-22 03:26:56 -05:00

README.md

Lakehouse Claim Auditor

A Bun sub-agent that watches open PRs on Gitea, reads the ship-claims in commit messages and PR bodies, and hard-blocks merges when the code doesn't back the claim.

Rationale: when "compiles + one curl works" gets called "phase shipped," placeholder code accumulates. This auditor runs every 90s, fetches each open PR, and subjects it to four checks:

Static diff — grep/parse looking for placeholder patterns
Dynamic — runs the never-before-executed hybrid test fixture
Cloud inference — asks gpt-oss:120b via /v1/chat to identify gaps in the diff
KB query — looks up data/_kb/ + observer for prior failure patterns on similar claims
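
As an illustration of the static diff check, the sketch below scans only the added lines of a unified diff for placeholder markers. The pattern list, the `Hit` shape, and `scanDiff` are assumptions for the example; the auditor's real patterns may differ.

```typescript
// Assumed placeholder markers; extend to match the project's conventions.
const PLACEHOLDER_PATTERNS: RegExp[] = [
  /\bTODO\b/,
  /\bFIXME\b/,
  /not implemented/i,
  /\bunimplemented\b/i,
];

interface Hit {
  file: string;
  line: string;
}

function scanDiff(diff: string): Hit[] {
  const hits: Hit[] = [];
  let file = "(unknown)";
  for (const line of diff.split("\n")) {
    // "+++ b/<path>" names the file that the following hunks belong to.
    if (line.startsWith("+++ ")) {
      file = line.slice(4).replace(/^b\//, "");
      continue;
    }
    // Only added lines count; removed placeholders are an improvement.
    if (!line.startsWith("+")) continue;
    const added = line.slice(1);
    if (PLACEHOLDER_PATTERNS.some((p) => p.test(added))) {
      hits.push({ file, line: added.trim() });
    }
  }
  return hits;
}
```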

The verdict is assembled and posted to Gitea as:

A failing commit status (hard block — branch protection prevents merge)
A review comment explaining every finding

Run manually

cd /home/profit/lakehouse
bun run auditor/index.ts

Defaults: polls every 90s; stops when an auditor.paused file is present.
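
The poll-and-pause behavior could look roughly like the loop below. It is a sketch, not the code in auditor/index.ts: `pollLoop`, `LoopOptions`, and the injected `isPaused` and `sleep` functions are hypothetical, and it assumes "stops" means the loop exits rather than idles.

```typescript
interface LoopOptions {
  intervalMs: number;
  isPaused: () => boolean; // e.g. checks whether auditor.paused exists
  sleep: (ms: number) => Promise<void>;
}

const DEFAULT_INTERVAL_MS = 90_000;

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs auditOnce until the pause check reports true; returns completed cycles.
async function pollLoop(
  auditOnce: () => Promise<void>,
  isPaused: () => boolean,
  overrides: Partial<LoopOptions> = {},
): Promise<number> {
  const opts: LoopOptions = {
    intervalMs: DEFAULT_INTERVAL_MS,
    isPaused,
    sleep: defaultSleep,
    ...overrides,
  };
  let cycles = 0;
  // The pause check runs before every cycle, so creating the file
  // takes effect at the next wake-up.
  while (!opts.isPaused()) {
    await auditOnce();
    cycles += 1;
    await opts.sleep(opts.intervalMs);
  }
  return cycles;
}
```

In a real run, `isPaused` could be `() => existsSync("auditor.paused")` using `existsSync` from `node:fs`; the file's location relative to the working directory is an assumption here.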

State

data/_auditor/state.json — last-audited head SHA per PR
data/_auditor/verdicts/{pr}-{sha}.json — per-run verdict record
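
A small sketch of how these two files might be used, assuming state.json is a flat map from PR number to head SHA. The `AuditorState` shape and the helper names are hypothetical; only the paths come from the list above.

```typescript
// Assumed shape of data/_auditor/state.json: { "<pr number>": "<head sha>" }.
type AuditorState = Record<string, string>;

// A PR needs auditing when it is new or its head SHA has moved.
function needsAudit(state: AuditorState, pr: number, headSha: string): boolean {
  return state[String(pr)] !== headSha;
}

// Returns a new state object rather than mutating the one passed in.
function recordAudit(state: AuditorState, pr: number, headSha: string): AuditorState {
  return { ...state, [String(pr)]: headSha };
}

// Mirrors the documented data/_auditor/verdicts/{pr}-{sha}.json layout.
function verdictPath(pr: number, headSha: string): string {
  return `data/_auditor/verdicts/${pr}-${headSha}.json`;
}
```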

Where YOU edit

auditor/policy.ts — the verdict assembler. Controls which findings block vs warn vs inform. All other code is mechanical: fetching, running checks, posting to Gitea.
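
One possible shape for the assembler, shown only as a starting point: any block finding fails the verdict, and everything else passes. The `Finding` and `Verdict` types and the rule itself are assumptions, not the current contents of auditor/policy.ts.

```typescript
type Severity = "info" | "warn" | "block";

interface Finding {
  check: string;
  severity: Severity;
  message: string;
}

interface Verdict {
  state: "success" | "failure";
  blocking: Finding[];
  summary: string;
}

// Assumed policy: a single block finding is enough to fail the PR.
function assembleVerdict(findings: Finding[]): Verdict {
  const blocking = findings.filter((f) => f.severity === "block");
  const count = (s: Severity): number =>
    findings.filter((f) => f.severity === s).length;
  return {
    state: blocking.length > 0 ? "failure" : "success",
    blocking,
    summary: `${count("block")} block, ${count("warn")} warn, ${count("info")} info`,
  };
}
```

Tightening or loosening the policy (for example, failing on a threshold of warn findings) would only change this one function.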

Hard-block mechanism

Commit status is posted as failure with context lakehouse/auditor
If main branch protection requires lakehouse/auditor status to pass, Gitea prevents merge
When the code is fixed and a re-audit passes, the status flips to success and the merge unblocks
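
Posting the status might be sketched as below, using Gitea's commit status endpoint (POST /api/v1/repos/{owner}/{repo}/statuses/{sha}). The helper name, its parameters, and the description strings are assumptions; the sketch only builds the request so it can be inspected before anything is sent.

```typescript
interface CommitStatusBody {
  state: "success" | "failure";
  context: string;
  description: string;
}

interface StatusRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

const STATUS_CONTEXT = "lakehouse/auditor";

// Builds (but does not send) the request that sets the commit status.
function buildStatusRequest(
  baseUrl: string, // e.g. the Gitea instance root; hypothetical parameter
  repo: string, // "owner/name"
  sha: string,
  passed: boolean,
  token: string,
): StatusRequest {
  const body: CommitStatusBody = {
    state: passed ? "success" : "failure",
    context: STATUS_CONTEXT,
    description: passed ? "all claims backed" : "blocking findings present",
  };
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/api/v1/repos/${repo}/statuses/${sha}`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `token ${token}`,
      },
      body: JSON.stringify(body),
    },
  };
}
```

The result can be passed straight to `fetch(req.url, req.init)`. Because the context string stays the same across runs, a later success status supersedes the earlier failure for that commit.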

Enable branch protection (one-time, via Gitea UI or API):

POST /repos/profit/lakehouse/branch_protections
{"branch_name": "main", "enable_status_check": true, "status_check_contexts": ["lakehouse/auditor"]}