lakehouse

profit 1e00eb4472 auditor: inference temp=0, think=false — kill signature creep

A 9-run empirical test showed that 20 of 27 audit_lessons signatures
were singletons (count=1): the cloud was producing slightly different
summary phrasings for the SAME underlying claim on each audit, each
hashing to a fresh signature. That's the creep J flagged: not explosive,
but a steady ~2 new sigs per run, unbounded over hundreds of runs.
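
To illustrate the creep described above, here is a minimal sketch of how two phrasings of one claim end up with different signatures. The hashing scheme the auditor actually uses is not shown in this commit; `signatureOf` and both example summaries are hypothetical, and FNV-1a stands in for whatever hash the real code applies.

```typescript
// Hypothetical stand-in for the auditor's signature function (an assumption):
// a 32-bit FNV-1a hash of the summary text, rendered as 8 hex characters.
function signatureOf(summary: string): string {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < summary.length; i++) {
    hash ^= summary.charCodeAt(i);
    // Multiply by the FNV prime 16777619 using shifts, then wrap to 32 bits.
    hash =
      (hash + (hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24)) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Two made-up phrasings of the same underlying claim, as two audit runs
// might word it when sampling is not greedy.
const phrasingRunA = "Missing null check in the ingest handler";
const phrasingRunB = "The ingest handler lacks a null check";

const sigRunA = signatureOf(phrasingRunA);
const sigRunB = signatureOf(phrasingRunB);
const sigRunARepeat = signatureOf(phrasingRunA);

// Same text gives the same signature; reworded text gives a new one,
// so every rewording starts a fresh count=1 entry.
console.log(sigRunA === sigRunARepeat, sigRunA === sigRunB);
```

Under this sketch the only way to keep one claim on one signature is to make the summary text itself stable, which is what the fix in this commit targets.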

Root cause: temperature=0.2 + think=true was letting variable prose
leak into the classification output. Fix: temp=0 (greedy sampling, so
identical input should yield identical output on the same model
version), think=false (no reasoning-trace variance), max_tokens
3000→1500 (a tighter bound prevents tail wander).
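
The before and after settings can be written down as a small sketch. The values come from this commit message; the `InferenceOptions` shape, its field names, and the `isPinnedForClassification` helper are assumptions made for illustration and do not reflect the auditor's real request format.

```typescript
// Hypothetical option shape (an assumption); the real field names may differ.
interface InferenceOptions {
  temperature: number;
  think: boolean;
  maxTokens: number;
}

// Settings before this commit, per the message above.
const previousOptions: InferenceOptions = { temperature: 0.2, think: true, maxTokens: 3000 };

// Settings after this commit.
const pinnedOptions: InferenceOptions = { temperature: 0, think: false, maxTokens: 1500 };

// Hypothetical guard: true only when sampling is greedy and no reasoning
// trace is requested, the two sources of variance the commit names.
function isPinnedForClassification(options: InferenceOptions): boolean {
  return options.temperature === 0 && options.think === false;
}

console.log(isPinnedForClassification(previousOptions), isPinnedForClassification(pinnedOptions));
```

A guard like this could sit in the test setup so that a later change to the sampling settings fails loudly instead of silently reintroducing phrasing variance.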

The compounding policy itself was validated by the 9 runs:
  - 7 recurring claims (the legitimate signals) all at conf 0.08-0.20
  - ratingSeverity() correctly held them at info (below 0.3 threshold)
  - cross-PR signal test separately confirmed conf=1.00 → sev=block
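
The behaviour validated above can be sketched as a threshold mapping. Only two facts come from the commit message: confidence below 0.3 stays at info, and conf=1.00 maps to block. The function name `ratingSeveritySketch`, the intermediate "warn" tier, and the exact block cutoff are assumptions; the real ratingSeverity() may be shaped differently.

```typescript
type Severity = "info" | "warn" | "block";

// From the commit message: claims below this confidence are held at info.
const INFO_CEILING = 0.3;
// Assumption: the message only confirms conf=1.00 → block, so the sketch
// blocks at 1.0 and nowhere lower.
const BLOCK_FLOOR = 1.0;

// Hypothetical stand-in for ratingSeverity(); the "warn" tier is invented
// here to cover the range the commit message does not describe.
function ratingSeveritySketch(conf: number): Severity {
  if (conf < INFO_CEILING) return "info";
  if (conf >= BLOCK_FLOOR) return "block";
  return "warn";
}

// The recurring claims from the 9 runs sat at 0.08 to 0.20.
console.log(ratingSeveritySketch(0.08), ratingSeveritySketch(0.2), ratingSeveritySketch(1.0));
```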

Also: added the LH_AUDIT_RUNS env var so the test can validate with a smaller N.
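
A minimal sketch of how a test might read that variable follows. The helper name `auditRunCount` and the fallback of 9 (chosen to match the 9-run test above) are assumptions, as is the decision to ignore malformed values rather than fail.

```typescript
// Hypothetical parser for the LH_AUDIT_RUNS value. It takes the raw string
// so the sketch does not depend on Node's process typings; a caller would
// pass process.env.LH_AUDIT_RUNS.
function auditRunCount(raw: string | undefined, fallback: number = 9): number {
  // Accept only a plain positive integer; anything else uses the fallback.
  if (raw === undefined || !/^\d+$/.test(raw)) return fallback;
  const parsed = parseInt(raw, 10);
  return parsed > 0 ? parsed : fallback;
}

console.log(auditRunCount("3"), auditRunCount(undefined), auditRunCount("abc"));
```

Running the test with `LH_AUDIT_RUNS=3` would then exercise three audits instead of the full set.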

2026-04-22 22:09:35 -05:00

enrich_prd_pipeline.ts

tests/real-world: add task-level 6-retry loop (per J 2026-04-22)

2026-04-22 17:50:53 -05:00

hard_task_escalation.ts

tests/real-world: hard-task escalation — prove the ladder solves tasks local can't

2026-04-22 18:50:53 -05:00

nine_consecutive_audits.ts

auditor: inference temp=0, think=false — kill signature creep

2026-04-22 22:09:35 -05:00

scrum_master_pipeline.ts

scrum_master: tree-split + scrum_reviews.jsonl writer + truncation warning

2026-04-22 21:17:53 -05:00