3 Commits

Author SHA1 Message Date
root
47776b07cd auditor: 2 fixes from kimi_architect on ebd9ab7 audit
The auditor's own audit on commit ebd9ab7 produced 10 kimi_architect
findings; 2 are real correctness issues that this commit lands. The
other 8 are documented in the commit body as triaged-skip with
rationale (false flags, defensible by current intent, or edge cases).

LANDED:

1. auditor/index.ts — atomic state mutation on audit count.
   `state.audit_count_per_pr[prKey] += 1` was held in memory until
   the cycle's saveState at the end. If the daemon was killed mid-
   cycle (SIGTERM, OOM, panic), the count was lost on restart while
   the on-disk last_audited still showed the SHA as audited — the cap
   silently leaked one audit per crash. Fix: persist state immediately
   after each successful audit so the increment survives a crash.
   saveState is idempotent + cheap (single JSON write); per-audit
   cost negligible.

2. auditor/checks/inference.ts — Number-coerce mode runner telemetry.
   `body?.latency_ms ?? 0` collapses null/undefined but passes through
   non-numeric values (string, NaN, etc.) which would poison downstream
   arithmetic in maxLatencyMs computation. Added a `num(v)` helper
   that does `Number(v)` with `isFinite` fallback to 0. Applied to
   latency_ms, enriched_prompt_chars, bug_fingerprints_count,
   matrix_chunks_kept.

SKIPPED with rationale:

- WARN kimi_architect.ts:211 "metrics appended even on empty verdict":
  this is intentional — observability shouldn't depend on whether
  parseFindings succeeded. Comment in the file explicitly notes this.
- WARN static.ts:270 "escaped-backslash-before-backtick edge case":
  real but extremely narrow (Rust raw strings with `\\\\\``). No
  observed false positives in production audits; defer.
- INFO kimi_architect.ts:333 "sync existsSync in async fn": existsSync
  is non-blocking syscall on Linux; not a real perf hit at audit
  scale (10s of findings per call).
- INFO kimi_architect.ts:105 "audit_index modulo wraparound at 50+
  audits": cap=3 means we never reach high counts on any PR.
- INFO inference.ts:366 "prompt injection delimiter risk": OUTPUT
  FORMAT delimiter is in our prompt template, not user input; user
  data goes inside content sections that don't contain the delimiter.
- WARN Cargo.lock:8739 "truth+validator no Cargo.toml in diff":
  false flag — Cargo.toml IS in workspace members (lines 17-18 of
  the workspace manifest).
- WARN config/modes.toml:1 "no schema validation": defensible — the
  load path validates structure (deserialize_string_or_vec at
  mode.rs:175) and falls back to safe default on parse error.
- INFO evidence_record.ts:124 "metadata accepts any keys": values are
  constrained to `string | number | boolean`; key-name validation
  not warranted for a domain-metadata field.

The 13 BLOCK-severity inference findings on this audit are all
"claim not backed" against historical commit messages from earlier
in the branch (8aa7ee9, bc698eb, 5bdd159, etc.). Those are
aspirational prose ("Verified end-to-end") that the deepseek
consensus can't verify from a static diff — known limitation, not
actionable as code fixes.

Verification:
  bun build auditor/index.ts                     compiles
  bun build auditor/checks/inference.ts          compiles
  systemctl restart lakehouse-auditor            active

Cap remains active on PR #11 (3/3) — daemon will not audit this
fix-commit. Reset state.audit_count_per_pr.11 to verify the fixes
land clean on a fresh audit when ready.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 07:45:40 -05:00
root
dc6dd1d30c auditor: per-PR audit cap (default 3) — daemon halts further audits until reset
Adds MAX_AUDITS_PER_PR (env LH_AUDITOR_MAX_AUDITS_PER_PR, default 3).
The poller increments a per-PR counter on each successful audit; when
the counter reaches the cap it skips that PR with a "capped" log line
until the operator manually clears state.audit_count_per_pr[<PR#>].

Why:
"I don't want it to continuously loop even if it finds a problem.
We need a maximum until we can come back."

Without this, the daemon polls every 90s and audits every new head
SHA. If each fix-commit surfaces new findings (which is what
kimi_architect is designed to do), the audit loop runs unbounded
while the operator is away. At ~$0.30/audit on Opus and 5-10 pushes
a day, that's $1-3/day idle burn — fine for a couple days, painful
for weeks.

Cap mechanics:
- Counter starts at 0 per PR (or whatever exists in state.json)
- Increments only on successful audit (failures don't count)
- Comparison is >= so cap=3 means audits 1, 2, 3 run; 4+ skip
- Skip is logged: "capped at N/M audits — clear state.json
  audit_count_per_pr.<N> to resume"
- New `cycles_skipped_capped` counter on State for observability

Reset:
  jq '.audit_count_per_pr = (.audit_count_per_pr - {"11": 4})' \
    /home/profit/lakehouse/data/_auditor/state.json > /tmp/s.json && \
    mv /tmp/s.json /home/profit/lakehouse/data/_auditor/state.json
- Daemon picks up the change on the next cycle (no restart needed —
  state is reloaded each cycle)
- Or set the entry to 0 if you want to keep the key

Disable cap: LH_AUDITOR_MAX_AUDITS_PER_PR=0
Reduce cap: LH_AUDITOR_MAX_AUDITS_PER_PR=1   (one audit per PR head, then pause)

Pre-existing PR audits today (4 on PR #11) are NOT seeded into the
counter by this commit — operator decides post-deploy whether to set
state.audit_count_per_pr.11 to today's actual count or leave at 0.
Setting to 4 (or 3) immediately halts further audits on PR #11.

Verification:
  bun build auditor/index.ts   compiles
  systemctl restart lakehouse-auditor   active

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 07:24:23 -05:00
profit
c33c1bcbc5 Auditor: poller + live end-to-end proof
All checks were successful
lakehouse/auditor all checks passed (4 findings, all info)
auditor/index.ts (task #9) — the top-level poller. 90s interval,
dedupes by head SHA via data/_auditor/state.json, supports --once
for CLI testing. Env gates: LH_AUDITOR_RUN_DYNAMIC=1 to include
the hybrid fixture (default off; it mutates live state),
LH_AUDITOR_SKIP_INFERENCE=1 for fast runs without cloud calls.

Single-shot run proof (task #10):

  cycle 1: 2 open PRs
    audit PR #2 f0a3ed68 "Fix: UpsertOutcome newtype serde panic"
       verdict=block, 9 findings (1 block, 5 warn, 3 info)
    audit PR #1 039ed324 "Auditor: PR-claim hard-block reviewer"
       verdict=approve, 4 findings (0 block, 0 warn, 4 info)
    audits_run=2, state persisted

Commit statuses and issue comments posted live to Gitea. PR #2 is
currently hard-blocked (lakehouse/auditor commit status = failure);
PR #1 has a passing status. State survives restart — next cycle
skips already-audited SHAs.

Both PRs now have the audit comment with per-check breakdown.
Operator can read the comment, fix blocking findings (or defend
them with a reply), push a new commit; auditor re-audits on new
SHA, verdict updates, merge gate responds accordingly.

The full loop J asked for is closed:
  1. static check caught own Phase 45 placeholder (b933334)
  2. hybrid fixture caught UpsertOutcome serde panic (9c893fb)
  3. LLM-Team-style codereview caught ternary bug (5bbcaf4)
  4. auditor poller now runs on every open PR, block/approve with
     evidence, re-audits on new SHAs

Tasks done: 1-11 (except 12, a scoped follow-up fix for UPDATE
branch dropping doc_refs). The auditor is running, catching real
bugs in its own build, and gating merges.
2026-04-22 04:02:36 -05:00