root 262a77a52a subject-audit parity (Step 8) — Go reader + cross-runtime probe
Per /home/profit/lakehouse/docs/specs/SUBJECT_MANIFESTS_ON_CATALOGD.md §5 Step 8.

Go side reads SubjectManifest and verifies the HMAC chain on per-subject
audit JSONL files using the same canonical-JSON + HMAC-SHA256 algorithm
as crates/catalogd/src/subject_audit.rs. A Rust-written chain now
verifies under Go and vice versa.
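
A minimal sketch of what chain verification over such rows can look like. It is not the code in internal/catalogd/subject.go: the chainRow type, the rowHMAC and verifyChain names, and the linking rule (each row's prev_chain_hash equals the hex HMAC of the previous row's canonical bytes, with "GENESIS" for the first row) are assumptions made for illustration.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// chainRow is a hypothetical in-memory view of one audit row.
type chainRow struct {
	Canonical     []byte // canonical-JSON bytes that get hashed
	PrevChainHash string // assumed link field; "GENESIS" on the first row
}

// rowHMAC returns hex(HMAC-SHA256(key, canonical)).
func rowHMAC(key, canonical []byte) string {
	mac := hmac.New(sha256.New, key)
	mac.Write(canonical)
	return hex.EncodeToString(mac.Sum(nil))
}

// verifyChain walks the rows in order and checks that every row points at
// the HMAC of the row before it. It returns the number of rows verified and
// the tip (the HMAC of the last verified row, or "GENESIS" if there is none).
func verifyChain(key []byte, rows []chainRow) (int, string, error) {
	tip := "GENESIS"
	for i, r := range rows {
		// Constant-time comparison of the expected and recorded link.
		if !hmac.Equal([]byte(r.PrevChainHash), []byte(tip)) {
			return i, tip, fmt.Errorf("row %d: prev_chain_hash does not match previous row", i)
		}
		tip = rowHMAC(key, r.Canonical)
	}
	return len(rows), tip, nil
}

func main() {
	key := []byte("0123456789abcdef0123456789abcdef") // 32-byte demo key, not a real secret
	first := chainRow{Canonical: []byte(`{"n":1}`), PrevChainHash: "GENESIS"}
	second := chainRow{Canonical: []byte(`{"n":2}`), PrevChainHash: rowHMAC(key, first.Canonical)}

	count, tip, err := verifyChain(key, []chainRow{first, second})
	fmt.Println(count, tip, err)
}
```

Because the only inputs to each HMAC are the key and the canonical bytes, two runtimes agree on the tip only if they agree on those bytes exactly.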

Files:
  - internal/catalogd/subject.go
      SubjectManifest, SubjectAuditRow, AuditAccessor, AuditLogEntry
      LoadSubjectManifest, LoadKeyFile (32-byte minimum, matches Rust)
      ReadAuditLog, VerifyChain
      canonicalRowBytesFromRaw (production), canonicalRowBytesFromStruct (tests)
      computeRowHMAC, CanonicalAndHmac (parity helper)
  - internal/catalogd/subject_test.go (10 unit tests)
  - scripts/cutover/parity/subject_audit_helper/main.go
      CLI helper mirroring crates/catalogd/src/bin/parity_subject_audit.rs
  - scripts/cutover/parity/subject_audit_parity.sh
      Two-phase probe: known-answer + every real audit log

Two real bugs caught + fixed by the probe authoring loop:

1. omitempty on AuditAccessor.TraceID stripped the field when empty,
   producing different canonical bytes than Rust (which always writes
   the field). Removed omitempty. Rust + Go now produce identical
   bytes for rows with trace_id="" (the common production case).

2. time.RFC3339Nano strips trailing zeros from the fractional seconds,
   producing "...46143921" where Rust's chrono SecondsFormat::AutoSi
   produces "...461439210". Hashing the parsed-then-re-marshaled struct
   therefore breaks the chain on any row whose nanos end in 0. Fixed by
   canonicalizing from the RAW LINE BYTES, which preserves the original
   timestamp string byte-for-byte. TestVerifyChain_RawBytesPreserveTimePrecision
   regression-locks this with a hand-crafted nanos=461439210 row.
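
Both pitfalls above can be reproduced in isolation with the Go standard library. The struct names and the timestamp's date and time are made up for the demonstration; only the nanosecond value 461439210 comes from the regression test described above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// withOmit mirrors the buggy tagging: an empty TraceID is dropped.
type withOmit struct {
	Daemon  string `json:"daemon"`
	TraceID string `json:"trace_id,omitempty"`
}

// alwaysWritten mirrors the fix: trace_id is emitted even when empty.
type alwaysWritten struct {
	Daemon  string `json:"daemon"`
	TraceID string `json:"trace_id"`
}

// marshal is a small helper that panics on error to keep the demo short.
func marshal(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

// roundTrip parses a timestamp and formats it again, the way a
// parse-then-re-marshal path would.
func roundTrip(ts string) (string, error) {
	t, err := time.Parse(time.RFC3339Nano, ts)
	if err != nil {
		return "", err
	}
	return t.Format(time.RFC3339Nano), nil
}

func main() {
	fmt.Println(marshal(withOmit{Daemon: "gateway"}))      // {"daemon":"gateway"}
	fmt.Println(marshal(alwaysWritten{Daemon: "gateway"})) // {"daemon":"gateway","trace_id":""}

	raw := "2026-05-03T09:16:14.461439210Z" // hypothetical timestamp, nanos end in 0
	out, err := roundTrip(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)        // 2026-05-03T09:16:14.46143921Z
	fmt.Println(out == raw) // false
}
```

Either difference changes the bytes fed to the HMAC, so a chain written by one runtime fails to verify under the other even though the rows are semantically equal.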

Live verification (6 / 6 byte-identical assertions):
  - Phase 1 known-answer: canonical bytes (266) + HMAC match
  - Phase 2 real logs: WORKER-1..5 audit JSONL all verify under both
    runtimes with identical (count, tip, verified, error) output

Report: reports/cutover/gauntlet_2026-05-02/parity/subject_audit_parity.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 04:17:15 -05:00


subject_audit_parity

Generated: 2026-05-03 09:16:14 UTC
Spec: /home/profit/lakehouse/docs/specs/SUBJECT_MANIFESTS_ON_CATALOGD.md §5 Step 8
Rust helper: /home/profit/lakehouse/target/release/parity_subject_audit
Go helper: ./bin/subject_audit_helper
Audit dir: /home/profit/lakehouse/data/_catalog/subjects

Phase 1 — Known-answer vector

Hardcoded fixture row, identical inputs, byte-compare canonical-JSON + HMAC.

MATCH

{"mode":"known_answer","canonical":"{\"accessor\":{\"daemon\":\"gateway\",\"kind\":\"gateway_lookup\",\"purpose\":\"parity_test\",\"trace_id\":\"trace-fixed\"},\"candidate_id\":\"WORKER-FIXED\",\"fields_accessed\":[\"name\"],\"prev_chain_hash\":\"GENESIS\",\"result\":\"success\",\"schema\":\"subject_audit.v1\",\"ts\":\"2026-05-03T12:00:00Z\"}","hmac":"f730fa038c847c27386b92eb1939ec64c62086c0a92617ac0bdf9f650c390b96","canonical_bytes_len":266}

Phase 2 — Real production audit logs

Every *.audit.jsonl in /home/profit/lakehouse/data/_catalog/subjects verified by both runtimes.

Audit log             Rust verified   Go verified     Result
WORKER-1.audit.jsonl  1 rows (true)   1 rows (true)   MATCH
WORKER-2.audit.jsonl  1 rows (true)   1 rows (true)   MATCH
WORKER-3.audit.jsonl  1 rows (true)   1 rows (true)   MATCH
WORKER-4.audit.jsonl  1 rows (true)   1 rows (true)   MATCH
WORKER-5.audit.jsonl  1 rows (true)   1 rows (true)   MATCH

Summary

6 / 6 parity assertions passed.

Status: PARITY — every Rust assertion matches Go byte-for-byte.