Lance went from "deferred until corpus exceeds 5M rows" to verified
production-ready at 10M in a single wave (lakehouse repo commits
7594725 + 5d30b3d). Captures the 4-pack (sanitizer + tests + smoke +
bench) + the root-cause fix (auto-build doc_id btree in migrate handler).
Reframes the strategic question: HNSW at 10M doesn't fit RAM, so the
real choice is Lance vs Parquet+HNSW-with-spilling, deferred until we
have a workload where the Parquet path is the bottleneck.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Companion to lakehouse 98b6647. Architecture comparison decisions
tracker now captures:
- Go validatord direct header read (fixes 6847bbc): closes the
case where Langfuse-off middleware passthrough silently dropped
forwarded X-Lakehouse-Trace-Id
- Rust IterateResponse trace_id echo (fixes 98b6647): closes the
asymmetry where Go's response carried the join key and Rust's
didn't
- Unified longitudinal log demonstrated end-to-end: both daemons
co-writing /tmp/lakehouse-validator/sessions.jsonl, distinct
daemon tags, one DuckDB query covers both
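A minimal sketch of the direct header read, assuming a plain net/http handler. The route `/v1/iterate`, the handler name, and the `trace_id` response key placement are illustrative guesses, not the validatord code; only the header name comes from the text above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Header name as quoted in the commit message.
const traceHeader = "X-Lakehouse-Trace-Id"

// handle reads the trace id straight from the request header, so it does not
// depend on any middleware having copied it into the context first.
// The response shape is hypothetical.
func handle(w http.ResponseWriter, r *http.Request) {
	traceID := r.Header.Get(traceHeader)
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(map[string]string{"trace_id": traceID})
}

// echo sends one in-memory request through the handler and returns the body.
func echo(traceID string) string {
	req := httptest.NewRequest(http.MethodPost, "/v1/iterate", nil) // hypothetical route
	req.Header.Set(traceHeader, traceID)
	rec := httptest.NewRecorder()
	handle(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Print(echo("abc123"))
}
```

Reading the header in the handler itself keeps the join key intact even when the tracing middleware is disabled and only passes requests through.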
24/24 parity assertions (validator 6/6, extract_json 12/12,
session_log 4/4, materializer 2/2) hold against live :3100 + :4110.
Both runtimes deployed with today's full stack.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Companion to lakehouse commit 57bde63 (Rust gateway gains
trace-id propagation + coordinator session JSONL). The
cross-runtime parity probe is the regression gate that prevents
silent schema drift between the two runtimes.
scripts/cutover/parity/session_log_parity.sh:
- 4 fixtures (accepted_grounded, max_iter_exhausted, infra_error,
unicode_in_prompt) feed identical input to both helpers
- jq -e validity gate + non-trivial-equal guard prevents the
"both sides fail identically → spurious match" failure mode
(caught one IFS='||' bug during initial authoring — recorded
in the script comment)
- normalize() strips timestamp + daemon (legitimate per-producer
differences); everything else must be byte-equal
- Result: 4/4 fixtures match, including unicode
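The normalize-then-compare step can be sketched in Go as follows. This is an illustration only: the probe itself is a shell script, and the JSON key names `timestamp` and `daemon`, plus the sample fields in `main`, are assumptions.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// normalize drops the per-producer fields and re-encodes the record.
// encoding/json writes map keys in sorted order, so top-level key order
// no longer matters; nested values are compared as emitted.
// The key names "timestamp" and "daemon" are assumptions for this sketch.
func normalize(line []byte) ([]byte, error) {
	var rec map[string]json.RawMessage
	if err := json.Unmarshal(line, &rec); err != nil {
		return nil, err // validity gate: malformed output is a failure, not a match
	}
	delete(rec, "timestamp")
	delete(rec, "daemon")
	if len(rec) == 0 {
		// Non-trivial guard: two empty (or null) records must not count as equal.
		return nil, errors.New("nothing left to compare")
	}
	return json.Marshal(rec)
}

// parityMatch reports whether two session-log lines agree after normalization.
func parityMatch(rustLine, goLine []byte) (bool, error) {
	a, err := normalize(rustLine)
	if err != nil {
		return false, fmt.Errorf("rust side: %w", err)
	}
	b, err := normalize(goLine)
	if err != nil {
		return false, fmt.Errorf("go side: %w", err)
	}
	return bytes.Equal(a, b), nil
}

func main() {
	rust := []byte(`{"daemon":"rust","timestamp":"t1","verdict":"accepted"}`)
	goOut := []byte(`{"verdict":"accepted","timestamp":"t2","daemon":"go"}`)
	ok, err := parityMatch(rust, goOut)
	fmt.Println(ok, err)
}
```

Returning an error when either side is invalid or empty is what keeps "both sides fail identically" from being reported as a match.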
scripts/cutover/parity/session_log_helper/main.go:
- Tiny stdin/stdout Go helper that round-trips a fixture
through validator.SessionRecord serde
- Counterpart to crates/gateway/src/bin/parity_session_log.rs
docs/ARCHITECTURE_COMPARISON.md decisions tracker:
- "Rust observability parity" row added (DONE 2026-05-02)
- Cross-runtime probe documented as reusable gate
STATE_OF_PLAY refreshed.
Both observability pieces (trace-id propagation, session JSONL)
now exist on both runtimes. Operators who point Rust gateway and
Go validatord at the same session-log path get a unified
longitudinal stream queryable via DuckDB.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the 2026-05-02 parity finding: validator_parity probe found
5/6 body shapes diverging because Go emitted {"Kind":"...","Field":"...","Reason":"..."}
while Rust emits the externally-tagged-enum {"Schema":{"field":"...","reason":"..."}}.
A caller parsing the error envelope would break silently in cutover.
## Changes
internal/validator/types.go:
- Custom MarshalJSON emits the Rust shape:
    Schema:       {"Schema":{"field":"x","reason":"y"}}
    Completeness: {"Completeness":{"reason":"y"}}
    Consistency:  {"Consistency":{"reason":"y"}}
    Policy:       {"Policy":{"reason":"y"}}
- Custom UnmarshalJSON accepts BOTH the new Rust shape AND the legacy
flat shape (migration safety for any persisted error rows).
- Unknown variants (e.g. a future Rust addition Go hasn't learned)
surface as an Unmarshal error, not a silent default.
internal/validator/types_test.go:
- 4 pinning tests anchor the wire format. Failing them = wire-format
drift; the parity probe is the secondary line of defense.
scripts/validatord_smoke.sh:
- Updated probes to read the new variant-name shape (jq keys[0],
.Schema.field) instead of legacy .Kind/.Field.
## Verification
- internal/validator unit tests: PASS (4 new + all existing).
- cmd/validatord HTTP tests: PASS (UnmarshalJSON falls through to flat
shape so existing tests reading ValidationError still work).
- validatord_smoke.sh: 5/5 PASS through gateway :3110.
- validator parity probe re-run: **6/6 match** (was 1/6).
## Pattern
Per architecture_comparison's "use the dual-implementation as a
measurement instrument" thesis: a parity probe surfaced this gap;
50 LOC of MarshalJSON closed it; 4 pinning tests prevent regression;
the probe is the longitudinal gate. Cutover-friendly direction (Go
matches Rust) chosen because Rust is the existing production
contract.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two new cross-runtime parity probes joining the validator probe from
the gauntlet wave. Pattern: feed identical input through Rust and Go;
diff outputs. Each probe surfaced a different signal.
## Materializer parity probe
scripts/cutover/parity/materializer_parity.sh runs Bun + Go
materializer against an identical synthetic data/_kb/ root, diffs the
resulting evidence/ JSONL for byte equivalence (modulo provenance.recorded_at).
**First run: 0/2 match.** Real finding: Go's Provenance.LineOffset
had `json:"line_offset,omitempty"` which strips the field when value
is 0. Line offset 0 is the FIRST ROW of every source file — a real
semantic value, not absent. Bun side always emits it.
Fix: drop `omitempty` on Provenance.LineOffset. Updated the field comment
to explain why.
**Re-run: 2/2 match.** On-wire JSON parity holds.
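The `omitempty` behaviour behind this finding can be reproduced in isolation. The two structs below are illustrative stand-ins for Provenance; the `source_file` field and the `int64` type are assumptions, only the `line_offset` tag comes from the text above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// withOmit mirrors the pre-fix tag: a zero offset is treated as "absent".
type withOmit struct {
	SourceFile string `json:"source_file"` // hypothetical field
	LineOffset int64  `json:"line_offset,omitempty"`
}

// withoutOmit mirrors the post-fix tag: offset 0 is always written.
type withoutOmit struct {
	SourceFile string `json:"source_file"` // hypothetical field
	LineOffset int64  `json:"line_offset"`
}

// encode marshals v and panics on error, which cannot happen for these types.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(encode(withOmit{SourceFile: "a.jsonl", LineOffset: 0}))
	// {"source_file":"a.jsonl"}
	fmt.Println(encode(withoutOmit{SourceFile: "a.jsonl", LineOffset: 0}))
	// {"source_file":"a.jsonl","line_offset":0}
}
```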
## extract_json parity probe
scripts/cutover/parity/extract_json_parity.sh feeds 12 fixture
strings through both runtimes' extract_json:
- fenced ```json``` blocks
- bare ``` fences (no language tag)
- bare braces with prose around
- first-balanced-of-many
- nested objects
- unicode in string values
- escaped quotes
- empty object
- top-level array (both return first inner object)
- no JSON
- depth-balanced but invalid syntax
- trailing garbage
Substrate gate: cargo test -p gateway extract_json PASS before probe.
**Result: 12/12 match.** Algorithms genuinely equivalent.
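For orientation, a first-balanced-object extractor can be sketched as below. This is not validator.ExtractJSON or the Rust implementation; in particular, what to do with a depth-balanced but invalid candidate (here: keep scanning from the next brace) is this sketch's own choice.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// balancedEnd returns the index of the '}' that closes the '{' at start,
// ignoring braces inside string literals and honouring backslash escapes.
func balancedEnd(s string, start int) (int, bool) {
	depth, inString, escaped := 0, false, false
	for i := start; i < len(s); i++ {
		c := s[i]
		switch {
		case escaped:
			escaped = false // this byte was escaped, take it literally
		case inString && c == '\\':
			escaped = true
		case c == '"':
			inString = !inString
		case inString:
			// braces inside strings do not affect depth
		case c == '{':
			depth++
		case c == '}':
			depth--
			if depth == 0 {
				return i, true
			}
		}
	}
	return 0, false
}

// extractJSON returns the first balanced {...} span that is valid JSON.
func extractJSON(s string) (string, bool) {
	for start := 0; start < len(s); start++ {
		if s[start] != '{' {
			continue
		}
		end, ok := balancedEnd(s, start)
		if !ok {
			continue
		}
		if candidate := s[start : end+1]; json.Valid([]byte(candidate)) {
			return candidate, true
		}
	}
	return "", false
}

func main() {
	value, matched := extractJSON(`Sure, here it is: {"a":{"b":1}} hope that helps`)
	fmt.Println(matched, value)
}
```

Because the scan starts at the first `{`, a top-level array yields its first inner object, which is the behaviour the fixture list records for both runtimes.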
## scripts/cutover/parity/extract_json_helper/main.go
Tiny Go binary that reads stdin, calls validator.ExtractJSON, prints
{matched, value} JSON. Counterpart to the Rust parity_extract_json
binary in golangLAKEHOUSE's sibling lakehouse repo (separate commit).
## Pattern crystallized
Every cross-runtime port should land with a parity probe. Three
probes now exist:
- validator (5/6 wire-format gap captured 2026-05-02)
- materializer (caught + fixed real bug 2026-05-02)
- extract_json (12/12 match 2026-05-02)
The instrument is reusable — each new shared HTTP/CLI surface gets
a probe row added.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Production-readiness gauntlet exploiting the dual Rust/Go
implementation as a measurement instrument.
## Phase 1 — Full smoke chain
21/21 PASS in ~60s. Substrate intact across the full service surface.
## Phase 2 — Per-component scrum (token-volume fix)
Prior wave (165KB diff): Kimi 62 tokens out, Qwen 297 → no useful
analysis. This wave splits today's commits into 4 focused bundles
(36-71KB each):
c1 validatord (46KB) → 0 convergent / 11 distinct
c2 vectord substrate (36KB) → 0 convergent / 10 distinct
c3 materializer (71KB) → 0 convergent / 6 distinct (Opus emitted
a BLOCK then self-retracted in same response)
c4 replay (45KB) → 0 convergent / 10 distinct
Reviewer engagement vs prior wave: Kimi went 62 → ~250 tokens out
once bundles dropped below 60KB.
scripts/scrum_review.sh hardening:
* Diff-size guard (warn >60KB, hard-fail >100KB,
SCRUM_FORCE_OVERSIZE=1 override)
* Tightened prompt — file path must appear EXACTLY as in diff
so post-processor can grep WHERE: lines reliably
* Auto-tally step dedupes by (reviewer, location); convergence
counts distinct lineages (closes the prior `opus+opus+opus`
false-convergence bug)
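The guard and tally rules above can be sketched in Go for illustration. scripts/scrum_review.sh is a shell script, so nothing here is its code; the `finding` struct, the 1KB = 1024 bytes convention, and "convergent means at least two distinct lineages at one location" are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	warnBytes = 60 * 1024  // warn above 60KB
	failBytes = 100 * 1024 // hard-fail above 100KB unless forced
)

var errOversize = errors.New("diff too large for review")

// sizeGuard reports whether a warning is due; force plays the role of
// SCRUM_FORCE_OVERSIZE=1 and downgrades the hard failure to a warning.
func sizeGuard(diffBytes int, force bool) (warn bool, err error) {
	if diffBytes > failBytes && !force {
		return true, errOversize
	}
	return diffBytes > warnBytes, nil
}

// finding is a hypothetical parsed review line.
type finding struct {
	Reviewer string // who reported it
	Lineage  string // model family the reviewer belongs to
	Location string // file path exactly as it appears in the diff
}

// tally dedupes by (reviewer, location) and counts a location as convergent
// only when at least two distinct lineages flagged it.
func tally(fs []finding) (distinct, convergent int) {
	seen := map[[2]string]bool{}
	lineages := map[string]map[string]bool{}
	for _, f := range fs {
		key := [2]string{f.Reviewer, f.Location}
		if seen[key] {
			continue // same reviewer repeating itself adds nothing
		}
		seen[key] = true
		if lineages[f.Location] == nil {
			lineages[f.Location] = map[string]bool{}
		}
		lineages[f.Location][f.Lineage] = true
	}
	for _, set := range lineages {
		if len(set) >= 2 {
			convergent++
		}
	}
	return len(seen), convergent
}

func main() {
	repeated := []finding{
		{"opus", "opus", "internal/vectord/index.go"},
		{"opus", "opus", "internal/vectord/index.go"},
		{"opus", "opus", "internal/vectord/index.go"},
	}
	d, c := tally(repeated)
	fmt.Printf("%d convergent / %d distinct\n", c, d)
}
```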
## Phase 3 — Cross-runtime validator parity probe (the headline finding)
scripts/cutover/parity/validator_parity.sh sends 6 identical
/v1/validate cases to Rust :3100 AND Go :4110, compares status+body.
Result: **6/6 status codes match · 5/6 body shapes diverge.**
Rust returns serde-tagged enum: {"Schema":{"field":"x","reason":"y"}}
Go returns flat exported-fields: {"Kind":"schema","Field":"x","Reason":"y"}
Both round-trip inside their own runtime; a caller swapping one for
the other would break parsing silently. Captured as new _open_ row
in docs/ARCHITECTURE_COMPARISON.md decisions tracker.
This is the "use the dual-implementation as a measurement instrument"
return — single-repo scrums can't catch this class of cross-runtime
drift.
## Phase 4 — Production assessment
ship-with-known-gap. Validator wire-format gap is documented, not
regressed. ~50 LOC future fix on Go side (custom MarshalJSON on
ValidationError to match Rust's serde shape).
Persistent stack config (/tmp/lakehouse-persistent.toml) gains
validatord on :3221 + persistent-validatord binary so operators
bringing up the persistent stack get the new daemon automatically.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three threads landing together; the doc edits interleave, so they ship
in a single commit.
1. **vectord substrate fix verified at original scale** (closes the
2026-05-01 thread). Re-ran multitier 5min @ conc=50: 132,211
scenarios at 438/sec, 6/6 classes at 0% failure (was 4/6 pre-fix).
Throughput dropped 1,115 → 438/sec because previously-broken
scenarios now do real HNSW Add work — honest cost of correctness.
The fix (i.vectors side-store + safeGraphAdd recover wrappers +
smallIndexRebuildThreshold=32 + saveTask coalescing) holds at the
footprint that originally surfaced the bug.
2. **Materializer port** — internal/materializer + cmd/materializer +
scripts/materializer_smoke.sh. Ports scripts/distillation/transforms.ts
(12 transforms) + build_evidence_index.ts (idempotency, day-partition,
receipt). On-wire JSON shape matches TS so Bun and Go runs are
interchangeable. 14 tests green.
3. **Replay port** — internal/replay + cmd/replay +
scripts/replay_smoke.sh. Ports scripts/distillation/replay.ts
(retrieve → bundle → /v1/chat → validate → log). Closes audit-FULL
phase 7 live invocation on the Go side. Both runtimes append to the
same data/_kb/replay_runs.jsonl (schema=replay_run.v1). 14 tests green.
Side effect on internal/distillation/types.go: EvidenceRecord gained
prompt_tokens, completion_tokens, and metadata fields to mirror the TS
shape the materializer transforms produce.
STATE_OF_PLAY refreshed to 2026-05-02; ARCHITECTURE_COMPARISON decisions
tracker moves the materializer + replay items from _open_ to DONE and
adds the substrate-fix scale verification row.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
J asked for a much more sophisticated test using the 100k corpus from
the Rust legacy database. This commit ships:
scripts/cutover/multitier/main.go — 6-scenario harness with weighted
random selection per goroutine. Mixes search, email/SMS/fill
validators (in-process via internal/validator), profile swap with
ExcludeIDs, repeat-cache exercise, and playbook record/replay.
Scenarios + weights (per-scenario fractions; they sum to 100%):
35% cold_search_email — search + email outreach + EmailValidator
15% surge_fill_validate — search + fill proposal + FillValidator + record
15% profile_swap — original search + ExcludeIDs swap + no-overlap check
15% repeat_cache — same query × 5 (cache effectiveness)
10% sms_validate — SMS draft (≤160 chars, phone for SSN-FP guard)
10% playbook_record_replay — cold → record → warm w/ use_playbook=true
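Weighted selection over these six scenarios can be sketched as a walk over cumulative weights. The names and percentages are taken from the list above; the `pick` function, the per-goroutine `rand.Rand`, and the seed are assumptions rather than the harness code.

```go
package main

import (
	"fmt"
	"math/rand"
)

type scenario struct {
	name   string
	weight int // percentage; the six weights sum to 100
}

var scenarios = []scenario{
	{"cold_search_email", 35},
	{"surge_fill_validate", 15},
	{"profile_swap", 15},
	{"repeat_cache", 15},
	{"sms_validate", 10},
	{"playbook_record_replay", 10},
}

// pick maps a roll in [0, 100) onto a scenario by accumulating weights
// until the running total exceeds the roll.
func pick(roll int) string {
	cum := 0
	for _, s := range scenarios {
		cum += s.weight
		if roll < cum {
			return s.name
		}
	}
	return scenarios[len(scenarios)-1].name // out-of-range rolls fall in the last bucket
}

func main() {
	// One RNG per goroutine avoids contention on a shared source.
	rng := rand.New(rand.NewSource(1))
	counts := map[string]int{}
	for i := 0; i < 100000; i++ {
		counts[pick(rng.Intn(100))]++
	}
	for _, s := range scenarios {
		fmt.Printf("%-24s %d\n", s.name, counts[s.name])
	}
}
```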
Test results (5-min sustained, conc=50, 100k workers indexed):
TOTAL 335,257 scenarios @ 1,115/sec
cold_search_email 117k @ 0.0% fail · p50 2.2ms · p99 8.6ms
surge_fill_validate 50k @ 98.8% fail (substrate bug below)
profile_swap 50k @ 0.0% fail · p50 4.5ms · ExcludeIDs verified
repeat_cache 50k × 5 = 252k searches @ 0.0% fail · p50 11.7ms
sms_validate 33k @ 0.0% fail · phone-pattern guard works
playbook_record_replay 33k @ 96.8% fail (substrate bug below)
Total successful workflows: ~250k+
Validator integration verified at load:
150,930 EmailValidator passes across cold_search_email + sms_validate
35 + 1,061 successful FillValidator + playbook_record (where the bug
didn't fire)
zero false positives on the SSN-pattern guard against phone numbers
Resource footprint at 100k:
vectord 1.23GB RSS (linear with 100k vectors)
matrixd 26MB, 75% CPU (1-core saturated at conc=50)
Total across 11 daemons: 1.7GB
Compare to Rust at 14.9GB: roughly 9× less even at 100k.
SUBSTRATE BUG SURFACED: coder/hnsw v0.6.1 nil-deref in
layerNode.search at graph.go:95. Triggers on /v1/matrix/playbooks/record
under sustained writes to the small playbook_memory index. Both Add
and Search paths can panic.
Workaround applied (this commit) in internal/vectord/index.go
BatchAdd: recover() guard converts panic to error; daemon stays up
instead of crashing the request handler.
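The recover() pattern behind this workaround looks roughly like the sketch below. `safeAdd` and the nil-pointer stand-in are hypothetical; the real guard lives in BatchAdd and wraps the coder/hnsw call.

```go
package main

import "fmt"

// safeAdd runs add and converts any panic into an ordinary error, so a
// fault inside the index library fails one request instead of the daemon.
func safeAdd(add func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("index add panicked: %v", r)
		}
	}()
	add()
	return nil
}

// node is a stand-in for a graph node; a nil *node reproduces a nil deref.
type node struct{ neighbors []int }

func main() {
	var n *node
	err := safeAdd(func() { _ = len(n.neighbors) }) // nil pointer dereference
	fmt.Println("recovered as error:", err != nil)

	err = safeAdd(func() {}) // the normal path is unaffected
	fmt.Println("clean add error:", err)
}
```

The named return value is what lets the deferred function overwrite the result after the panic has unwound the call to `add`.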
Operator recovery procedure (also documented in the report):
curl -X DELETE http://localhost:4215/vectors/index/playbook_memory
Next record recreates the index fresh.
Real fix DEFERRED — open in docs/ARCHITECTURE_COMPARISON.md
Decisions tracker. Three options:
a) upstream patch to coder/hnsw
b) custom small-index Add path that always rebuilds when len < threshold
c) alternate store for playbook_memory (Lance? in-memory map?)
Evidence: reports/cutover/multitier_100k.md (full methodology +
results + repro + bug analysis). docs/ARCHITECTURE_COMPARISON.md
Decisions tracker updated.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per J's request: move the parallel-runtime comparison from
reports/cutover/ (where it lived as cutover-prep evidence) into
docs/ as the source-of-truth file. J will keep updating it as
fixes ship on either side.
Restructured for living-document use:
- Status header (last refresh date, owner, update triggers)
- 'How to update this doc' section with explicit dos and don'ts
- Decisions tracker at top — actioned items with commit refs
+ open backlog with LOC estimates
- Each comparison section now has 'Last verified' columns where
numbers are time-sensitive
- Change log section at bottom for one-line entries on every
meaningful refresh
The original at reports/cutover/architecture_comparison.md gains
a 'THIS IS A SNAPSHOT' header pointing at the docs/ source. Kept
as historical record but no longer the place to update.
Added a sister pointer file at /home/profit/lakehouse/docs/ARCHITECTURE_COMPARISON.md
so the doc is reachable from either repo. That file explicitly
says the source lives in golangLAKEHOUSE and warns against
putting authoritative content in the pointer.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>