3 Commits

Author SHA1 Message Date
root
0e530f4436 drift fix: validatord in start_go_stack + parity refresh
Two anchor-vs-reality drifts found during /read-mem audit:

1. start_go_stack.sh never started validatord :3221, even though
   validatord shipped 2026-05-02 (f9e7241) and STATE_OF_PLAY claims it as
   part of the persistent stack. Cold-boot quietly omitted it,
   leaving /v1/iterate unreachable on the persistent gateway.
   Fix: factored chatd's conditional-start block into a start_shared
   helper, called for both chatd :3220 and validatord :3221. Same
   shared-with-smokes posture as chatd (no S3 / JSONL-only state,
   no temp-toml override needed).

2. STATE_OF_PLAY header claimed 3 parity probes / 32 assertions.
   Reality is 6 probes / 38 assertions since subject_audit landed
   in 262a77a (2026-05-03). Header refreshed; cross-references
   the three runtime-divergence classes documented at
   lakehouse/STATE_OF_PLAY.md lines 36-39.

Parity reports regenerated as verification artifact (all 6 still
green: 8+12+2+4+6+6 = 38). Same pattern as c0a55b1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 03:27:47 -05:00
root
c0a55b1182 parity reports: regenerated 2026-05-03 morning verification
All 6 probes re-run post-restart for today's verification:
  validator(6/6) + extract_json(12/12) + session_log(4/4) +
  materializer(2/2) + embed(8/8) + subject_audit(6/6) = 38/38
2026-05-03 05:27:14 -05:00
root
262a77a52a subject-audit parity (Step 8) — Go reader + cross-runtime probe
Per /home/profit/lakehouse/docs/specs/SUBJECT_MANIFESTS_ON_CATALOGD.md §5 Step 8.

Go side reads SubjectManifest + verifies HMAC chain on per-subject
audit JSONL files using IDENTICAL canonical-JSON + HMAC-SHA256 algorithm
to crates/catalogd/src/subject_audit.rs. A Rust-written chain now
verifies under Go and vice versa.

Files:
  - internal/catalogd/subject.go
      SubjectManifest, SubjectAuditRow, AuditAccessor, AuditLogEntry
      LoadSubjectManifest, LoadKeyFile (32-byte minimum, matches Rust)
      ReadAuditLog, VerifyChain
      canonicalRowBytesFromRaw (production), canonicalRowBytesFromStruct (tests)
      computeRowHMAC, CanonicalAndHmac (parity helper)
  - internal/catalogd/subject_test.go (10 unit tests)
  - scripts/cutover/parity/subject_audit_helper/main.go
      CLI helper mirroring crates/catalogd/src/bin/parity_subject_audit.rs
  - scripts/cutover/parity/subject_audit_parity.sh
      Two-phase probe: known-answer + every real audit log

Two real bugs caught + fixed by the probe authoring loop:

1. omitempty on AuditAccessor.TraceID stripped the field when empty,
   producing different canonical bytes than Rust (which always writes
   the field). Removed omitempty. Rust + Go now produce identical
   bytes for rows with trace_id="" (the common production case).

2. time.RFC3339Nano strips trailing zeros from nanoseconds, producing
   "...46143921" where Rust's chrono AutoSi produces "...461439210".
   Hashing through the parsed-then-re-marshaled struct breaks the
   chain on any row whose nanos end in 0. Fixed by canonicalizing
   from the RAW LINE BYTES (preserves the original timestamp string
   byte-for-byte). Test TestVerifyChain_RawBytesPreserveTimePrecision
   regression-locks this with a hand-crafted nanos=461439210 row.
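
Both bugs above can be reproduced in isolation with the standard library. The sketch below is illustrative only: the struct names, the "ts" field name and the date portion of the timestamp are hypothetical, and only the nanosecond value 461439210 comes from the regression test described above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// accessorOmit mirrors the buggy shape: an empty trace_id is dropped.
type accessorOmit struct {
	TraceID string `json:"trace_id,omitempty"`
}

// accessorAlways mirrors the fixed shape: trace_id is always written.
type accessorAlways struct {
	TraceID string `json:"trace_id"`
}

// tsRow is a hypothetical row holding only a timestamp field.
type tsRow struct {
	TS time.Time `json:"ts"`
}

// mustJSON marshals v and panics on error (acceptable in a demo).
func mustJSON(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

// reMarshal parses a JSON line into tsRow and marshals it back.
// This is the lossy path: time.Time re-encodes with RFC3339Nano,
// which drops trailing zeros from the fractional seconds.
func reMarshal(line string) string {
	var r tsRow
	if err := json.Unmarshal([]byte(line), &r); err != nil {
		panic(err)
	}
	return mustJSON(r)
}

// rawField returns the untouched bytes of one top-level field,
// so the original timestamp string survives byte-for-byte.
func rawField(line, key string) string {
	var m map[string]json.RawMessage
	if err := json.Unmarshal([]byte(line), &m); err != nil {
		panic(err)
	}
	return string(m[key])
}

func main() {
	fmt.Println(mustJSON(accessorOmit{}))   // {}
	fmt.Println(mustJSON(accessorAlways{})) // {"trace_id":""}

	line := `{"ts":"2026-05-03T04:17:15.461439210Z"}`
	fmt.Println(reMarshal(line))      // {"ts":"2026-05-03T04:17:15.46143921Z"}
	fmt.Println(rawField(line, "ts")) // "2026-05-03T04:17:15.461439210Z"
}
```

Either difference changes the bytes fed to the HMAC, so a single affected row is enough to break verification of everything after it.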

Live verification (6 / 6 byte-identical assertions):
  - Phase 1 known-answer: canonical bytes (266) + HMAC match
  - Phase 2 real logs: WORKER-1..5 audit JSONL all verify under both
    runtimes with identical (count, tip, verified, error) output

Report: reports/cutover/gauntlet_2026-05-02/parity/subject_audit_parity.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 04:17:15 -05:00