Same shape of proof as embed_parity.sh for the embed endpoint:
take the just-shipped Go port (ca142b9) and validate it against
the actual production data the Rust legacy emits, not just unit-
test fixtures. Locks the cross-runtime parity that operators
running mixed pipelines depend on.
scripts/cutover/audit_baselines_validate.go:
- Reads /home/profit/lakehouse/data/_kb/audit_baselines.jsonl
- Parses every entry via the Go AuditBaseline struct
- Round-trips the last entry: encode → decode → field-by-field
  equality check (catches any silently-dropped JSON keys)
- Calls LoadLastBaseline against the live file (proves the public
  API works on real shapes, not just inline parsing)
- Computes BuildAuditDriftTable(first → last): full-window
  lineage drift over the captured baselines
Live-data probe results (reports/cutover/audit_baselines_roundtrip.md):
- 7 entries parse without error
- Round-trip is byte-equal on every metric + every header field
- Drift table fires the expected verdicts:
  - p2_evidence_rows 12→82 (+583%) → warn (above 20% threshold)
  - p3_accepted/partial/rejected/human 0→non-zero → warn (the
    zero-baseline edge case TestBuildAuditDriftTable_ZeroBaseline
    was designed to lock, now verified firing on real history)
  - p4_* metrics +0% → ok (stable across the window)
What this does NOT prove (documented in the report): the Go-side
audit-FULL pipeline that PRODUCES baselines doesn't exist yet.
Only the load/append/drift substrate is ported. Operators running
audit-full from Go would still need a metric-collection pass —
that's a separate port deliberately not in this wave.
reports/cutover/SUMMARY.md gains a new row alongside the embed
parity entries; cutover-prep verification log keeps the
discipline of "verified against real data, not just fixtures."
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Audit-baselines port: round-trip validation against live Rust data

Proves the Go port at `internal/distillation/audit_baseline.go`
parses, round-trips, and produces meaningful drift signal against
the live Rust-side `data/_kb/audit_baselines.jsonl`. Same shape of
proof as `embed_parity.sh` for the embed endpoint earlier in the
session: the port is verified against real-shape data, not just fixtures.

## Verdict

**PASS.** The Go port reads the live file end-to-end. JSON
round-trip is byte-equal on every field. `BuildAuditDriftTable`
produces the expected verdict tiers when fed real-history data.

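The round-trip claim can be illustrated with a minimal sketch. It is not the code in `scripts/cutover/audit_baselines_validate`: the JSON tag names, the nesting of metrics under a `metrics` key, and the helper name `roundTrip` are all assumptions made for illustration. It uses `DisallowUnknownFields` as one possible way to surface keys the struct would otherwise drop silently, then checks encode → decode equality.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"reflect"
)

// AuditBaseline is a hypothetical stand-in for the struct in
// internal/distillation/audit_baseline.go. The JSON tags and the
// "metrics" nesting are assumptions, not taken from the real port.
type AuditBaseline struct {
	RecordedAt string           `json:"recorded_at"`
	GitCommit  string           `json:"git_commit"`
	Metrics    map[string]int64 `json:"metrics"`
}

// roundTrip strictly decodes one JSONL line, re-encodes it, decodes the
// result again, and reports whether the two decoded values are equal.
// The strict decode fails on any top-level key the struct does not know.
func roundTrip(line []byte) (bool, error) {
	dec := json.NewDecoder(bytes.NewReader(line))
	dec.DisallowUnknownFields()

	var first AuditBaseline
	if err := dec.Decode(&first); err != nil {
		return false, fmt.Errorf("strict decode: %w", err)
	}

	encoded, err := json.Marshal(first)
	if err != nil {
		return false, fmt.Errorf("encode: %w", err)
	}

	var second AuditBaseline
	if err := json.Unmarshal(encoded, &second); err != nil {
		return false, fmt.Errorf("re-decode: %w", err)
	}
	return reflect.DeepEqual(first, second), nil
}

func main() {
	// Illustrative line only; the git_commit value is a placeholder.
	line := []byte(`{"recorded_at":"2026-04-27T04:47:30.220Z","git_commit":"example-sha","metrics":{"p2_evidence_rows":12,"p2_evidence_skips":2}}`)
	ok, err := roundTrip(line)
	fmt.Println(ok, err)
}
```

A plain decode followed by `DeepEqual` would not notice a key that was dropped on the first decode, which is why the sketch makes the first decode strict.
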
## Live-data probe output

```
loaded 7 baselines from /home/profit/lakehouse/data/_kb/audit_baselines.jsonl

✓ round-trip parity (encode → decode → match)
✓ LoadLastBaseline returns the most recent entry

Lineage drift: first (2026-04-27T04:47:30.220Z) → last (2026-04-27T15:43:38.019Z)
span: 7 entries

metric                  baseline   current       Δ%  flag
p2_evidence_rows              12        82  +583.3%  warn
p2_evidence_skips              2         2    +0.0%  ok
p3_accepted                    0       386        -  warn
p3_human                       0       480        -  warn
p3_partial                     0       132        -  warn
p3_rejected                    0        57        -  warn
p4_pref_pairs                 83        83    +0.0%  ok
p4_rag_rows                  448       448    +0.0%  ok
p4_sft_rows                  353       353    +0.0%  ok
p4_total_quarantined        1325      1325    +0.0%  ok

verdict: 5/10 metrics flagged warn, 0 first-run
```

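The first lines of that output (load all entries, then return the most recent one) correspond to a JSONL parse. The following is a rough sketch of that step, not the real `LoadLastBaseline`: the struct tags, the helper names `parseBaselines` and `lastBaseline`, and the assumption that "most recent" means "last line of an append-only file" are all hypothetical. In a real run the bytes would come from something like `os.ReadFile(path)`.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// AuditBaseline mirrors the header + metrics shape described in the report.
// Tag names and the "metrics" nesting are assumptions.
type AuditBaseline struct {
	RecordedAt string           `json:"recorded_at"`
	GitCommit  string           `json:"git_commit"`
	Metrics    map[string]int64 `json:"metrics"`
}

// parseBaselines decodes one JSON object per non-empty line and reports
// the 1-based line number of the first entry that fails to parse.
func parseBaselines(data []byte) ([]AuditBaseline, error) {
	var out []AuditBaseline
	for i, line := range bytes.Split(data, []byte("\n")) {
		line = bytes.TrimSpace(line)
		if len(line) == 0 {
			continue // tolerate blank lines and a trailing newline
		}
		var b AuditBaseline
		if err := json.Unmarshal(line, &b); err != nil {
			return nil, fmt.Errorf("line %d: %w", i+1, err)
		}
		out = append(out, b)
	}
	return out, nil
}

// lastBaseline returns the final entry; ok is false for an empty history.
func lastBaseline(all []AuditBaseline) (AuditBaseline, bool) {
	if len(all) == 0 {
		return AuditBaseline{}, false
	}
	return all[len(all)-1], true
}

// sample is illustrative data; the git_commit values are placeholders.
const sample = `{"recorded_at":"2026-04-27T04:47:30.220Z","git_commit":"example-sha-1","metrics":{"p2_evidence_rows":12}}
{"recorded_at":"2026-04-27T15:43:38.019Z","git_commit":"example-sha-2","metrics":{"p2_evidence_rows":82}}
`

func main() {
	all, err := parseBaselines([]byte(sample))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	last, ok := lastBaseline(all)
	fmt.Println(len(all), ok, last.RecordedAt)
}
```
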
## What this confirms

1. **Field-name parity is exact.** All 10 metric fields decode
   into the Go `AuditBaseline.Metrics map[string]int64` shape; no
   silently-dropped keys.
2. **Header fields map cleanly.** `recorded_at` + `git_commit` are
   the only non-Metrics fields in the Rust shape, both already
   present on the Go struct.
3. **The zero-baseline edge case fires correctly.** `p3_accepted`
   went 0→386 between first and last baseline, i.e. a metric that
   didn't exist in the early window. The drift table flagged it
   `warn` (zero→nonzero is always notable) without failing on
   the division by zero. This was the specific case
   `TestBuildAuditDriftTable_ZeroBaseline` was designed to lock,
   and the real data now exercises exactly that behavior.
4. **The +583% drift on `p2_evidence_rows` is honest signal.** The
   pipeline scaled from 12 to 82 evidence rows over the captured
   window, well above the 20% warn threshold. An operator running
   this in CI would see "the audit pipeline output 7× more
   evidence than baseline; investigate", which is exactly the
   point of audit_baselines.

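Items 3 and 4 describe two rules: a zero baseline is flagged without computing a percentage, and a non-zero baseline is flagged when the relative change exceeds 20%. A minimal sketch of that classification follows. It is not the body of `BuildAuditDriftTable`; the names `driftRow`, `classifyDrift`, and `formatDelta` are hypothetical, and treating negative drift as symmetric, the comparison as strict `>`, and 0→0 as `ok` are assumptions.

```go
package main

import (
	"fmt"
	"math"
)

// warnThreshold is the 20% threshold quoted in the report, in percent.
const warnThreshold = 20.0

// driftRow is a hypothetical row type for one metric comparison.
type driftRow struct {
	Metric   string
	Baseline int64
	Current  int64
	DeltaPct float64 // meaningful only when HasDelta is true
	HasDelta bool    // false when the baseline is zero
	Flag     string  // "ok" or "warn"
}

// classifyDrift compares one metric across two baselines.
func classifyDrift(metric string, baseline, current int64) driftRow {
	row := driftRow{Metric: metric, Baseline: baseline, Current: current, Flag: "ok"}
	if baseline == 0 {
		// No percentage exists for a zero baseline; zero→nonzero warns.
		if current != 0 {
			row.Flag = "warn"
		}
		return row
	}
	row.HasDelta = true
	row.DeltaPct = float64(current-baseline) / float64(baseline) * 100
	if math.Abs(row.DeltaPct) > warnThreshold {
		row.Flag = "warn"
	}
	return row
}

// formatDelta renders the Δ% column, using "-" when no percentage exists.
func formatDelta(r driftRow) string {
	if !r.HasDelta {
		return "-"
	}
	return fmt.Sprintf("%+.1f%%", r.DeltaPct)
}

func main() {
	// Values taken from the probe output in this report.
	rows := []driftRow{
		classifyDrift("p2_evidence_rows", 12, 82),
		classifyDrift("p3_accepted", 0, 386),
		classifyDrift("p4_pref_pairs", 83, 83),
	}
	for _, r := range rows {
		fmt.Printf("%-20s %8d %8d %9s  %s\n", r.Metric, r.Baseline, r.Current, formatDelta(r), r.Flag)
	}
}
```

Handling the zero baseline before any division keeps the percentage path free of special cases, and the separate `HasDelta` field avoids overloading `DeltaPct` with a sentinel value.
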
## Repro

```bash
go run ./scripts/cutover/audit_baselines_validate
# Or override path:
go run ./scripts/cutover/audit_baselines_validate \
  -path /path/to/audit_baselines.jsonl
```

## What this does NOT prove

- The Go-side audit-FULL pipeline that PRODUCES baselines doesn't
  exist yet; only the load/append/drift substrate is ported. Operators
  running audit-full from Go would still need a metric-collection
  pass equivalent to the Rust `auditPhase0..auditPhase7` chain.
  That's a separate port, deliberately not in this wave.
- The `git_commit` field carries Rust git history (commits like
  `ca7375ea` from the Rust legacy repo). A Go-side audit-full
  would stamp `golangLAKEHOUSE` SHAs. The two are separate
  lineages: the file format is shared, but the git-commit
  references trace back to whichever repo emitted the entry.