# Distillation System: Recovery Runbook

**Version:** v1.0.0

**Audience:** Future operator (or future Claude session) inheriting this system in a broken state.

This is the failure-mode runbook. Read top-down, stop at the first symptom that matches.

## Symptom 1: `audit-full` exits non-zero

A required check failed. The report at `reports/distillation/phase8-full-audit-report.md` will name the failing check verbatim. Map by phase:

### P0: recon doc missing

**Cause:** repo state corrupted; the recon doc has never existed at this commit.

**Fix:**

```bash
git checkout distillation-v1.0.0 -- docs/recon/local-distillation-recon.md
```

### P0: tier-1 source stream missing

**Cause:** fresh-clone or post-rotation environment without source data.

**Severity:** informational only (audit-full reports as required=false). Pipeline will produce 0 rows but won't fail.

**Fix:** populate the source stream OR accept reduced output and note in the next report.

### P1: schema validators fail

**Cause:** somebody modified a Phase 1 schema and broke a validator.

**Diagnostic:**

```bash
bun test auditor/schemas/distillation/ 2>&1 | grep -A2 "fail"
```

**Fix:** revert the schema change. Phase 1 schemas are versioned (`_SCHEMA_VERSION = 1`); they bump deliberately, never silently.

```bash
git diff distillation-v1.0.0 -- auditor/schemas/distillation/   # see what changed
git checkout distillation-v1.0.0 -- auditor/schemas/distillation/
```

### P2: materializer dry-run fails / writes 0

**Cause:** every source jsonl is empty OR every transform is broken.

**Diagnostic:**

```bash
ls -la data/_kb/*.jsonl   # confirm sources have content
bun run scripts/distillation/build_evidence_index.ts --dry-run
# inspect skip reasons in data/_kb/distillation_skips.jsonl
```

**Fix:** identify the broken transform via per-source row counts. If a real-data shape changed (e.g. a new field name in a source jsonl), update the matching transform in `scripts/distillation/transforms.ts`.
Add a fixture row to `auditor/schemas/distillation/realdata.test.ts` covering the new shape.

### P3: scored-runs distribution empty

**Cause:** `data/scored-runs/` is missing OR the score categories are not landing.

**Fix:**

```bash
./scripts/distill score   # re-runs scorer if data/evidence/ is populated
```

If still empty, the scorer rules in `scripts/distillation/scorer.ts` may have changed. Re-run `bun test tests/distillation/scorer.test.ts`; if those fail, revert.

### P4: SFT contamination firewall caught a leak (CRITICAL)

**Severity:** alert. A `rejected` or `needs_human_review` row landed in SFT. This is a non-negotiable spec violation.

**Stop immediately:**

```bash
mv exports/sft/instruction_response.jsonl exports/sft/instruction_response.jsonl.QUARANTINED-$(date +%s)
```

**Diagnostic:** the audit report's "P4 SFT contamination firewall" check shows the count. The forbidden row is in the file you just renamed:

```bash
jq 'select(.quality_score != "accepted" and .quality_score != "partially_accepted")' exports/sft/instruction_response.jsonl.QUARANTINED-*
```

**Root cause is one of:**

1. SftSample schema validator was loosened: check `auditor/schemas/distillation/sft_sample.ts` against v1.0.0
2. Exporter `SFT_NEVER` constant was loosened: check `scripts/distillation/export_sft.ts`
3. Schema validation was bypassed (someone called `appendFileSync` directly): find the offending caller via `git log --oneline -p exports/sft/`

**Recovery:** revert the offending change. Re-run `./scripts/distill export-sft`. The fresh output should pass the audit's leak check.

### P4: Preference self-pair leaked

**Cause:** export_preference's pairing logic produced `chosen_run_id == rejected_run_id`. Schema validator should catch this.

**Diagnostic:**

```bash
jq 'select(.chosen_run_id == .rejected_run_id) | .id' exports/preference/chosen_rejected.jsonl
```

**Fix:** check `scripts/distillation/export_preference.ts::buildPair`; the equality guard at the top must remain.
Revert if missing.

### P5: RunSummary fails to validate

**Cause:** receipts harness emitted a malformed summary.

**Diagnostic:**

```bash
LATEST=$(ls -1t reports/distillation/*/summary.json | head -1)
bun -e "import {validateRunSummary} from './auditor/schemas/distillation/run_summary'; const s = JSON.parse(await Bun.file('$LATEST').text()); console.log(validateRunSummary(s))"
```

**Fix:** the field that failed validation is named in the validator output. Either fix the harness in `scripts/distillation/receipts.ts` or revert.

### P6: acceptance gate fails an invariant

**Cause:** something in Phases 1-5 changed in a way the fixture catches.

**Diagnostic:** run acceptance directly to see all 22 checks:

```bash
bun run scripts/distillation/acceptance.ts
# scan output; the failing check names what broke
cat reports/distillation/phase6-acceptance-report.md
```

**Fix:** the report's "Failures" section names the invariant. The 22 invariants are documented at the top of `scripts/distillation/acceptance.ts`. Locate the affected phase code, revert.

### P7: replay validation regressed

**Cause:** local model output failed the structural validator OR retrieval found 0 playbooks.

**Diagnostic:**

```bash
./scripts/distill replay --task "" --local-only
# inspect the validation_result.reasons
```

**Fix:**

- If validation reasons mention "filler/hedge phrase": the local model regressed (model swap?); revert `LH_REPLAY_LOCAL_MODEL` to default
- If retrieval is empty: `exports/rag/playbooks.jsonl` is empty; re-run export-rag

## Symptom 2: drift table flags `warn`

A metric moved >20% from baseline. This is a SOFT alert: not a failure, but worth investigating before treating outputs as stable.
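
The threshold above can be sketched as a small check. This is a minimal illustration, not the audit's own code: `drift_status` is a hypothetical helper, and it assumes drift means the absolute relative change against the baseline and that a zero baseline is flagged whenever the current value is non-zero. The runbook states only the 20% figure, so confirm the real formula in the audit script before relying on this.

```bash
# Hypothetical helper (not part of ./scripts/distill).
# Usage: drift_status BASELINE CURRENT [THRESHOLD_PCT]
# Assumption: drift = |current - baseline| / baseline, expressed in percent.
drift_status() {
  awk -v b="$1" -v c="$2" -v t="${3:-20}" 'BEGIN {
    if (b == 0) {
      # Relative change is undefined at baseline 0; flag any non-zero value.
      print (c == 0 ? "ok" : "warn")
      exit
    }
    d = (c - b) * 100 / b        # signed percentage change
    if (d < 0) d = -d            # compare on magnitude only
    print (d > t ? "warn" : "ok")
  }'
}
```

For example, `drift_status 1000 1300` prints `warn` (a 30% move), while `drift_status 1000 1100` prints `ok`.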

**Diagnostic:**

```bash
jq -c '.metrics' data/_kb/audit_baselines.jsonl | tail -5   # last 5 baselines, one compact row per line
cat reports/distillation/phase8-full-audit-report.md        # read the drift table
```

**Common causes:**

- New source data → record counts grow → expected, not a regression
- Scorer rules changed → category distribution shifted → confirm intentional
- Exporter filter loosened → SFT/RAG counts grow → CHECK contamination firewall first

**If the drift is intentional**, write a row to `data/_kb/audit_baselines.jsonl` documenting why (there is no schema for this; just append a JSON line with a `notes` field). Future audits will treat the new value as the new baseline.

## Symptom 3: `acceptance` exits non-zero but `audit-full` doesn't

This is rare: acceptance is stricter (22 invariants on a fixture vs audit-full's 16 required checks on real data). The failing acceptance check usually points to a broken assumption that real data hides.

**Diagnostic:**

```bash
bun run scripts/distillation/acceptance.ts 2>&1 | head -50
ls /tmp/distillation_phase6_acceptance/data/evidence/   # the failed run leaves its temp root
```

**Fix:** the acceptance script keeps the temp root on fail (cleans only on pass). Inspect `/tmp/distillation_phase6_acceptance/` to compare what the pipeline produced against what was expected. The fixture rows themselves are the contract: change the fixture deliberately, never to make a test pass.

## Symptom 4: `run-all` produces empty exports

**Cause:** Phase 2 or Phase 3 ran on empty input.

**Diagnostic order:**

1. `ls data/_kb/*.jsonl`: sources present?
2. `find data/evidence -name "*.jsonl" | xargs wc -l`: any rows materialized?
3. `find data/scored-runs -name "*.jsonl" | xargs wc -l`: any rows scored?
4. `wc -l data/_kb/distillation_skips.jsonl data/_kb/scoring_skips.jsonl`: anything skipped?

The first counter that's 0 names the broken phase.

## Symptom 5: hash mismatch on identical input

**Cause:** determinism violation. Same input → different output.
This is a CRITICAL bug.

**Diagnostic:**

```bash
# Wipe outputs but keep sources, run twice with same recorded_at, compare run_hash
rm -rf data/evidence data/scored-runs exports
RA="2026-04-27T00:00:00.000Z"
./scripts/distill run-all   # captures run_id_1
LATEST_1=$(ls -1t reports/distillation/ | grep -v phase | head -1)

rm -rf data/evidence data/scored-runs exports
./scripts/distill run-all
LATEST_2=$(ls -1t reports/distillation/ | grep -v phase | head -1)

jq .run_hash "reports/distillation/$LATEST_1/summary.json"
jq .run_hash "reports/distillation/$LATEST_2/summary.json"
# These MUST match if recorded_at is fixed.
```

**If they don't match:** something in the pipeline introduced non-determinism. Common causes:

- A `Date.now()` baked into output (other than the explicit `recorded_at`)
- `Math.random()` or `randomUUID()` in a path that should be deterministic
- A `Map` iteration order issue (rare in V8)
- Concurrent writes to the same file

Bisect against `distillation-v1.0.0`: find the commit that introduced the non-determinism, revert.

## Symptom 6: replay logs growing unbounded

Phase 7 replay appends to `data/_kb/replay_runs.jsonl` with no rotation. Acceptable until file >100MB, then:

```bash
# Move and start fresh
mv data/_kb/replay_runs.jsonl data/_kb/replay_runs.archive.$(date +%s).jsonl
gzip data/_kb/replay_runs.archive.*.jsonl
```

This is documented as a Phase 7 carry-over. A future Phase 10+ could add rotation.

## Last resort: nuclear restore

If nothing works:

```bash
git fetch --tags
git stash                       # save uncommitted work
git checkout distillation-v1.0.0
./scripts/distill audit-full    # confirm 16/16 pass at v1.0.0
./scripts/distill acceptance    # confirm 22/22 pass
# Now diff against the broken state to see what changed
git diff distillation-v1.0.0..scrum/auto-apply-19814 -- scripts/distillation/ auditor/schemas/distillation/
```

The diff is your bug.