3 Commits

Author SHA1 Message Date
root
5d3996b51d STATE_OF_PLAY: Rust is not maintenance-only as of 2026-05-02
Frames the Rust system accurately — it's receiving parity work +
infrastructure (Lance gauntlet, sidecar drop, observability parity),
not just security fixes. Points readers at lakehouse/STATE_OF_PLAY.md
+ docs/ARCHITECTURE_COMPARISON.md for the cross-runtime view.

Also commits today's parity probe report regenerations (5/5 still
32/32 post-Lance work).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 22:24:14 -05:00
root
a21a34b057 docs: close 2 cross-runtime parity gaps + document unified log
Companion to lakehouse 98b6647. Architecture comparison decisions
tracker now captures:

  - Go validatord direct header read (fixes 6847bbc): closes the
    case where Langfuse-off middleware passthrough silently dropped
    forwarded X-Lakehouse-Trace-Id
  - Rust IterateResponse trace_id echo (fixes 98b6647): closes the
    asymmetry where Go's response carried the join key and Rust's
    didn't
  - Unified longitudinal log demonstrated end-to-end: both daemons
    co-writing /tmp/lakehouse-validator/sessions.jsonl, distinct
    daemon tags, one DuckDB query covers both
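
The direct header read and the trace_id echo can be illustrated with a minimal Go sketch. Only the `X-Lakehouse-Trace-Id` header and the `trace_id` field name come from this message; the handler name, the `/iterate` path, and the response struct are hypothetical stand-ins, not the validatord code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// traceHeader is the header named in the commit message.
const traceHeader = "X-Lakehouse-Trace-Id"

// iterateResponse is a hypothetical response shape; only the trace_id
// field name is taken from the commit message.
type iterateResponse struct {
	TraceID string `json:"trace_id"`
}

// handleIterate reads the trace ID straight from the request header, so the
// value survives even when no tracing middleware has copied it elsewhere.
func handleIterate(w http.ResponseWriter, r *http.Request) {
	resp := iterateResponse{TraceID: r.Header.Get(traceHeader)}
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(resp)
}

// echoTraceID drives the handler in-process and returns the echoed ID.
// The "/iterate" path is an assumption for illustration.
func echoTraceID(id string) string {
	req := httptest.NewRequest(http.MethodPost, "/iterate", nil)
	if id != "" {
		req.Header.Set(traceHeader, id)
	}
	rec := httptest.NewRecorder()
	handleIterate(rec, req)

	var out iterateResponse
	_ = json.NewDecoder(rec.Body).Decode(&out)
	return out.TraceID
}

func main() {
	fmt.Println(echoTraceID("abc123")) // prints the forwarded ID back
}
```

Echoing the ID in the response body gives callers the same join key from either runtime, which is what lets a single query correlate rows in the shared session log.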

24/24 parity assertions (validator 6/6, extract_json 12/12,
session_log 4/4, materializer 2/2) hold against live :3100 + :4110.
Both runtimes deployed with today's full stack.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 06:25:21 -05:00
root
b0c8a3f227 parity probes: materializer + extract_json (caught + fixed real bug)
Two new cross-runtime parity probes joining the validator probe from
the gauntlet wave. Pattern: feed identical input through Rust and Go;
diff outputs. Each probe surfaced a different signal.

## Materializer parity probe
scripts/cutover/parity/materializer_parity.sh runs the Bun and Go
materializers against an identical synthetic data/_kb/ root, then checks
the resulting evidence/ JSONL for byte equivalence (modulo
provenance.recorded_at).
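
A comparison "modulo provenance.recorded_at" can be sketched in Go as below. This is not the probe script itself: the function names and the `id` field are hypothetical, and because the sketch re-encodes each row it also normalizes key order, which makes it slightly looser than a true byte-level diff.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// normalize decodes one JSONL row, removes provenance.recorded_at, and
// re-encodes it. encoding/json sorts map keys, so key order is normalized too.
func normalize(line []byte) (string, error) {
	dec := json.NewDecoder(bytes.NewReader(line))
	dec.UseNumber() // keep numbers as written rather than converting to float64
	var row map[string]any
	if err := dec.Decode(&row); err != nil {
		return "", err
	}
	if prov, ok := row["provenance"].(map[string]any); ok {
		delete(prov, "recorded_at")
	}
	out, err := json.Marshal(row)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

// rowsMatch reports whether two rows agree once recorded_at is ignored.
func rowsMatch(a, b []byte) (bool, error) {
	na, err := normalize(a)
	if err != nil {
		return false, err
	}
	nb, err := normalize(b)
	if err != nil {
		return false, err
	}
	return na == nb, nil
}

func main() {
	// Hypothetical rows: same content, different recording timestamps.
	left := []byte(`{"id":"r1","provenance":{"line_offset":0,"recorded_at":"2026-05-02T04:00:00Z"}}`)
	right := []byte(`{"id":"r1","provenance":{"recorded_at":"2026-05-02T04:00:01Z","line_offset":0}}`)
	ok, err := rowsMatch(left, right)
	fmt.Println(ok, err)
}
```

A row that lacks a key entirely still compares as different from one that carries the key with a zero value, so the check stays sensitive to dropped fields.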

**First run: 0/2 match.** Real finding: Go's Provenance.LineOffset
had `json:"line_offset,omitempty"` which strips the field when value
is 0. Line offset 0 is the FIRST ROW of every source file — a real
semantic value, not absent. Bun side always emits it.

Fix: drop `omitempty` from Provenance.LineOffset and update the field
comment to explain why.

**Re-run: 2/2 match.** On-wire JSON parity holds.
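
The `omitempty` behaviour behind this finding is easy to reproduce in isolation. In the sketch below, the `line_offset` tag is the one quoted above; the struct names, the `source_file` field, and the `int` type are assumptions for illustration rather than the real Provenance definition.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// provenanceBefore uses the tag quoted in the commit message.
// With omitempty, a zero LineOffset is dropped from the output.
type provenanceBefore struct {
	SourceFile string `json:"source_file"` // hypothetical field
	LineOffset int    `json:"line_offset,omitempty"`
}

// provenanceAfter drops omitempty, so offset 0 is always emitted.
type provenanceAfter struct {
	SourceFile string `json:"source_file"` // hypothetical field
	LineOffset int    `json:"line_offset"`
}

// encode marshals v to a JSON string, returning the error text on failure.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// First row of a source file: offset 0 is a real value.
	fmt.Println(encode(provenanceBefore{SourceFile: "a.jsonl", LineOffset: 0}))
	// {"source_file":"a.jsonl"}
	fmt.Println(encode(provenanceAfter{SourceFile: "a.jsonl", LineOffset: 0}))
	// {"source_file":"a.jsonl","line_offset":0}
}
```

Where a field genuinely needs to distinguish "absent" from "zero", a pointer type with `omitempty` is the usual alternative; here offset 0 is always meaningful, so removing the option is the simpler fix.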

## extract_json parity probe
scripts/cutover/parity/extract_json_parity.sh feeds 12 fixture
strings through both runtimes' extract_json:
  - fenced ```json``` blocks
  - unfenced ``` blocks
  - bare braces with prose around
  - first-balanced-of-many
  - nested objects
  - unicode in string values
  - escaped quotes
  - empty object
  - top-level array (both return first inner object)
  - no JSON
  - depth-balanced but invalid syntax
  - trailing garbage

Substrate gate: cargo test -p gateway extract_json PASS before probe.

**Result: 12/12 match.** Algorithms genuinely equivalent.
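
For readers unfamiliar with the technique, a first-balanced-object scan of the kind these fixtures exercise can be sketched as follows. This is an illustrative stand-in, not validator.ExtractJSON: the function name is hypothetical, and reporting "no match" for a balanced-but-invalid span is an assumption, since this message does not state what the real implementations return in that case.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// extractFirstObject returns the first brace-balanced span in s if it is
// valid JSON. Braces inside string literals and escaped quotes are skipped.
// Scanning bytes is safe for UTF-8 input: the delimiters are all ASCII.
func extractFirstObject(s string) (string, bool) {
	start := strings.IndexByte(s, '{')
	if start < 0 {
		return "", false // no JSON at all
	}
	depth := 0
	inString := false
	escaped := false
	for i := start; i < len(s); i++ {
		c := s[i]
		if inString {
			switch {
			case escaped:
				escaped = false // this byte was escaped; consume it
			case c == '\\':
				escaped = true
			case c == '"':
				inString = false
			}
			continue
		}
		switch c {
		case '"':
			inString = true
		case '{':
			depth++
		case '}':
			depth--
			if depth == 0 {
				candidate := s[start : i+1]
				// Assumption: an invalid span counts as no match.
				if json.Valid([]byte(candidate)) {
					return candidate, true
				}
				return "", false
			}
		}
	}
	return "", false // braces never balanced
}

func main() {
	for _, in := range []string{
		"Sure, here it is: {\"a\": {\"b\": 1}} hope that helps",
		`[{"x":1},{"y":2}]`,
		"no json here",
	} {
		v, ok := extractFirstObject(in)
		fmt.Printf("matched=%v value=%q\n", ok, v)
	}
}
```

Because the scan starts at the first `{`, fence markers and surrounding prose need no special handling in this sketch, and a top-level array yields its first inner object.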

## scripts/cutover/parity/extract_json_helper/main.go
Tiny Go binary that reads stdin, calls validator.ExtractJSON, prints
{matched, value} JSON. Counterpart to the Rust parity_extract_json
binary in golangLAKEHOUSE's sibling lakehouse repo (separate commit).

## Pattern crystallized
Every cross-runtime port should land with a parity probe. Three
probes now exist:
  - validator (5/6 wire-format gap captured 2026-05-02)
  - materializer (caught + fixed real bug 2026-05-02)
  - extract_json (12/12 match 2026-05-02)

The instrument is reusable — each new shared HTTP/CLI surface gets
a probe row added.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 04:43:54 -05:00