Closes 4 of the 5 phases the initial audit-FULL port left as
deferred. The pattern: most "deferred" phases didn't actually need
the un-ported Rust pieces — they were observer-mode by design and
just needed to read existing on-disk artifacts.
Phase 1 (schema validators) → ported via exec.Command:
Invokes `go test ./internal/distillation/...` — the Go equivalent
of Rust's `bun test auditor/schemas/distillation/`. New
GoTestModule field on AuditFullOptions controls the package
pattern; empty disables the invocation (test mode, prevents
recursion when audit-full is invoked from inside `go test`).
Phase 2 (evidence materialization) → ported as observer:
Reads data/evidence/ directly and tallies rows + tier-1 source
hits. Doesn't re-run the materializer (which is Rust-side TS).
Emits p2_evidence_rows + p2_evidence_skips metrics matching
Rust shape — drop-in audit_baselines.jsonl entries possible.
Phase 5 (run summary) → ported as observer:
Reads reports/distillation/{run_id}/summary.json + 5 stage
receipts. Validates schema_version=1, run_hash sha256, git_commit
40-char hex, all stage receipts decode as JSON. Full schema
validation (StageReceipt schema) is intentionally NOT ported —
it would require porting the TS schemas/distillation/ validators
in full; basic shape checks catch the load-bearing invariants.
Phase 7 (replay log) → ported as observer:
Reads data/_kb/replay_runs.jsonl, validates last 50 rows parse
as JSON. Skips the live-replay invocation that Rust's phase 7
also does — porting Rust replay.ts is substantial and not in
scope. The "log shape sanity" check is what audit-full actually
needs; the live invocation is a separate concern.
Phase 6 (acceptance gate) — STILL SKIPPED:
Rust acceptance.ts is a TS-only fixture harness with bun-specific
deps. Porting the fixtures (tests/fixtures/distillation/acceptance/)
+ the 22-invariant runner to Go is an ADR-worth undertaking.
Documented in the header comment.
Live-data probe (against /home/profit/lakehouse):
Skips count: 4 → 1 (only phase 6).
Required checks: 6/6 → 12/12 PASS.
New metric: p2_evidence_rows=1055, BYTE-EQUAL to the Rust
pipeline's collect.records_out from the latest summary.json.
Cross-runtime parity now extends across phases 0/1/2/3/4/5/7.
6 new tests:
- TestPhase2_EvidenceTallyFromOnDisk: row + tier-1-hit tallying
- TestPhase5_FullSummaryFlow: complete run-summary fixture passes
- TestPhase5_ShortRunHashCaught: bad run_hash fails required check
- TestPhase7_ReplayLogReadsFromDisk: row-count reporting
- TestPhase7_MalformedTailRowsCaught: structural parse failure
- TestRunAuditFull_FullFixtureFlow updated to seed evidence/ +
reports/distillation/ for the phases now wired.
Cleanup: removed local sortStrings helper (replaced with sort.Strings
now that `sort` is imported for phase 5's mtime-sort).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2.6 KiB
2.6 KiB
Audit-FULL report (Go)
git HEAD: 55b8c76a8c21a6c3d3ea109cae8d06ccb66fae51
Verdict: PASS — 12/12 required checks passed; 1 phase(s) deferred.
Checks
| phase | name | expected | actual | required | passed |
|---|---|---|---|---|---|
| 0 | recon doc exists | docs/recon/local-distillation-recon.md present | true | no | ✓ |
| 0 | tier-1 source streams present | all 4 tier-1 jsonls on disk | all present | no | ✓ |
| 1 | schema validators (skipped — test invocation disabled) | go test ./internal/distillation/... | skipped | no | ✓ |
| note | caller passed empty GoTestModule — typically because we're already inside a test run | ||||
| 2 | evidence materialization output non-empty | >=1 row across all sources | 1055 rows · 0 skipped | yes | ✓ |
| 2 | tier-1 sources each materialize ≥1 row | 4/4: distilled_facts, scrum_reviews, audit_facts, mode_experiments | 4/4 hit (audit_facts, distilled_facts, mode_experiments, scrum_reviews) | no | ✓ |
| 3 | on-disk scored-runs distribution non-empty | >=1 accepted | acc=386 part=132 rej=57 hum=480 | yes | ✓ |
| 3 | scored-runs distribution sums positive | >0 total | 1055 total | no | ✓ |
| 4 | SFT contamination firewall: 0 forbidden quality_scores | 0 | 0 | yes | ✓ |
| note | this is the spec non-negotiable — rejected/needs_human_review must NEVER appear in SFT | ||||
| 4 | RAG firewall: 0 rejected leaks | 0 | 0 | yes | ✓ |
| 4 | Preference: 0 self-pairs (chosen_run_id != rejected_run_id) | 0 | 0 | yes | ✓ |
| 4 | Preference: 0 identical-text pairs | 0 | 0 | yes | ✓ |
| 4 | every export row carries valid sha256 provenance.sig_hash | 0 missing | 0 missing | yes | ✓ |
| 5 | latest run (3fa51d66-784c-4c7d-843d-6c48328a608c) has all 5 stage receipts | collect,score,export-rag,export-sft,export-preference | all present | yes | ✓ |
| 5 | every stage receipt parses as JSON | 0 invalid | 0 invalid | yes | ✓ |
| 5 | summary.schema_version == 1 | 1 | 1 | yes | ✓ |
| 5 | summary.git_commit is 40-char hex | /^[0-9a-f]{40}$/ | 68b6697bcb38ec15... | no | ✓ |
| 5 | run_hash is sha256 | /^[0-9a-f]{64}$/ | 2336b96c3638982d... | yes | ✓ |
| 7 | replay_runs.jsonl exists | exists with ≥1 row | 27 rows total | no | ✓ |
| 7 | replay_runs.jsonl tail rows parse as JSON | 0 malformed in last 50 | 0 malformed | yes | ✓ |
Metrics
| metric | value |
|---|---|
| p2_evidence_rows | 1055 |
| p2_evidence_skips | 0 |
| p3_accepted | 386 |
| p3_human | 480 |
| p3_partial | 132 |
| p3_rejected | 57 |
| p4_pref_pairs | 83 |
| p4_rag_rows | 448 |
| p4_sft_rows | 353 |
| p4_total_quarantined | 1325 |