Ports the metric-collection passes from scripts/distillation/audit_full.ts. The substrate that PRODUCES audit_baselines.jsonl entries — the half OPEN #2 left as "deferred to next wave" after the read/write substrate landed in ca142b9.

Phase coverage:
  Phase 0 (file presence)             ported
  Phase 1 (schema validators)         skipped (Go's `go test` covers it)
  Phase 2 (materializer dry-run)      deferred (Go materializer not yet ported)
  Phase 3 (scored-runs distribution)  ported
  Phase 4 (contamination firewall)    ported
  Phase 5 (receipts validation)       deferred (Go run-summary JSON not yet emitted)
  Phase 6 (replay sanity)             deferred (Go replay tool not ported)
  Phase 7 (run summary lineage)       deferred (same)

Cross-runtime parity verified end-to-end: Go-side audit-full against /home/profit/lakehouse produced metrics IDENTICAL to the last Rust-emitted audit_baselines.jsonl entry. All 8 ported metrics match byte-for-byte:

  p3_accepted=386, p3_partial=132, p3_rejected=57, p3_human=480,
  p4_sft_rows=353, p4_rag_rows=448, p4_pref_pairs=83,
  p4_total_quarantined=1325

6/6 required checks pass on live data.

Components:
- internal/distillation/audit_full.go: PhaseCheck struct (mirrors Rust shape), PhaseCheckReport aggregation, RunAuditFull orchestrator, auditPhase0/3/4 implementations, FormatAuditFullReport Markdown writer.
- cmd/audit_full/main.go: CLI binary with -root, -out, -json, -append-baseline flags. Operators run "./bin/audit_full -append-baseline" to grow the longitudinal log alongside the Rust pipeline (entries are interchangeable — same envelope shape).
- 6 new tests: empty-root failure handling, full-fixture clean PASS (locks all 8 metrics + all 6 required checks), SFT firewall contamination detection, preference self-pair detection, sig_hash regex correctness (rejects wrong-length + uppercase), Markdown formatter smoke.

Live-data probe captured at reports/cutover/audit_full_go_vs_rust.md (linked from reports/cutover/SUMMARY.md). Same shape as the audit_baselines round-trip evidence — both Go-side ports of the distillation surface are now validated against real Rust data, not just fixtures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
62 lines
3.7 KiB
Markdown
# G5 cutover prep — verified-parity log

What works on the Go gateway, what has been compared side by side with Rust,
and what is safe to flip. Append a row when a new endpoint clears parity.

| Endpoint | Date | Rust path | Go path | Verdict | Notes |
|---|---|---|---|---|---|
| `embed` (forced v1) | 2026-04-30 | `/ai/embed` | `/v1/embed` | ✅ PASS 5/5 cos=1.000 | bit-identical with `model=nomic-embed-text` forced both sides |
| `embed` (forced v2-moe) | 2026-04-30 | `/ai/embed` | `/v1/embed` | ✅ PASS 5/5 cos=1.000 | bit-identical with `model=nomic-embed-text-v2-moe` forced both sides — both Ollamas have the model |
| `audit_baselines.jsonl` | 2026-05-01 | `data/_kb/audit_baselines.jsonl` | `internal/distillation` `LoadLastBaseline` / `AppendBaseline` / `BuildAuditDriftTable` | ✅ PASS round-trip | Live Rust file (7 entries) parses + round-trips byte-equal; lineage drift table fires correctly on zero-baseline metrics. See `audit_baselines_roundtrip.md`. |
| `audit-FULL` (phases 0/3/4) | 2026-05-01 | `scripts/distillation/audit_full.ts` | `cmd/audit_full` + `internal/distillation` `RunAuditFull` | ✅ PASS metric-equal | Go-side run against live Rust root: all 8 ported metrics (p3_*, p4_*) byte-equal to the last Rust-emitted `audit_baselines.jsonl` entry. 6/6 required checks pass. Phase 1 skipped (covered by `go test`); 4 phases (2, 5, 6, 7) deferred — depend on broader Rust-side pieces (materializer / replay / run-summaries) not yet ported. See `audit_full_go_vs_rust.md`. |

## Wire-format drift catalog

The Go gateway is *not* a literal nginx-swap drop-in for the Rust
gateway. Anything that flips needs a wire-shape adapter. Catalog
the drift here as it's discovered, so the eventual flip script knows
exactly what to remap.

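As a rough illustration of what a wire-shape adapter could look like, the sketch below re-encodes a Go-gateway embed response using the Rust field names cataloged for `embed` in this log (`vectors` to `embeddings`, `dimension` to `dimensions`). The struct names and the `toRustShape` helper are hypothetical and exist only for this example; the real flip script may remap at a different layer, and this sketch assumes the response carries no fields beyond the three listed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// goEmbedResponse uses the Go gateway field names from the embed drift table.
// Hypothetical type: the real gateway structs are not shown in this log.
type goEmbedResponse struct {
	Vectors   [][]float64 `json:"vectors"`
	Dimension int         `json:"dimension"`
	Model     string      `json:"model"`
}

// rustEmbedResponse uses the Rust gateway field names from the same table.
type rustEmbedResponse struct {
	Embeddings [][]float64 `json:"embeddings"`
	Dimensions int         `json:"dimensions"`
	Model      string      `json:"model"`
}

// toRustShape decodes a Go-shaped embed response body and re-encodes it
// with Rust field names. It renames fields only; vector values (and their
// magnitudes) pass through unchanged.
func toRustShape(goBody []byte) ([]byte, error) {
	var in goEmbedResponse
	if err := json.Unmarshal(goBody, &in); err != nil {
		return nil, fmt.Errorf("decode go embed response: %w", err)
	}
	out := rustEmbedResponse{
		Embeddings: in.Vectors,
		Dimensions: in.Dimension,
		Model:      in.Model,
	}
	return json.Marshal(out)
}

func main() {
	goBody := []byte(`{"vectors":[[0.5,0.25]],"dimension":2,"model":"nomic-embed-text"}`)
	rustBody, err := toRustShape(goBody)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(rustBody))
}
```

Because the adapter only renames fields, a caller that depends on unit-length vectors would still see raw magnitudes from the Go side.
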
### embed

| Field | Rust | Go |
|---|---|---|
| URL prefix | `/ai/embed` | `/v1/embed` |
| Response: vectors field | `embeddings` | `vectors` |
| Response: dim field | `dimensions` | `dimension` |
| Response: model field | `model` | `model` ✓ same |
| Request shape | `{texts, model?}` | `{texts, model?}` ✓ same |
| L2 normalization | unit vectors (‖v‖ ≈ 1.0) | raw Ollama output (‖v‖ ≈ 20-23) |

**The L2 normalization difference is real but currently harmless:** vectors
point in identical directions (cos=1.000) but Go has raw magnitudes. Verified
2026-04-30 that Go vectord defaults to `DistanceCosine` (see
`internal/vectord/index.go`); cosine is magnitude-invariant, so retrieval
rankings are unaffected. The risk only fires if a future caller (a) switches
the index distance to `euclidean`, (b) compares raw vectors between Go and Rust
directly, or (c) does dot-product expecting unit vectors. Adding a
normalization step in `internal/embed/embed.go` would make the cutover safer
and is cheap — but not blocking.

## Repro

```bash
./scripts/cutover/embed_parity.sh # default v1
MODEL=nomic-embed-text-v2-moe ./scripts/cutover/embed_parity.sh # measure embedder
```

Each run drops a per-date verdict at `reports/cutover/embed_parity_<DATE>.md`.

## What's *not* yet probed

- `/v1/sql` ↔ Rust `/query` — query shape parity
- `/v1/vectors/search` ↔ Rust `/vectors/search` — recall@k parity
- `/v1/matrix/retrieve` ↔ Rust `/vectors/hybrid` — semantic retrieve parity (highest-leverage)
- `/v1/storage/*` ↔ Rust `/storage/*` — direct S3 abstraction parity
- `/v1/chat` — both sides expose this, but providers + token shape differ; Phase 4 already declared chatd parity-tested

The matrix-retrieve probe is the highest-leverage next step because it's
the actual user-facing retrieval path. Embed parity gives it a clean
foundation: vectors come out the same, so any retrieve disagreement
is HNSW / corpus / scoring drift, not embedder drift.
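
One way a retrieve probe might score agreement is top-k overlap between the ranked document IDs each side returns for the same query, treating the Rust result as the reference. This is a minimal sketch under that assumption: `overlapAtK` and the sample IDs are hypothetical, and this log does not define the metric or threshold the eventual probe will use.

```go
package main

import "fmt"

// topK returns at most the first k entries of ids.
func topK(ids []string, k int) []string {
	if k > len(ids) {
		k = len(ids)
	}
	return ids[:k]
}

// overlapAtK returns the fraction of the reference top-k IDs that also
// appear in the candidate top-k, ignoring rank order within the top k.
func overlapAtK(reference, candidate []string, k int) float64 {
	if k <= 0 {
		return 0
	}
	ref := topK(reference, k)
	if len(ref) == 0 {
		return 0
	}
	seen := make(map[string]struct{}, len(ref))
	for _, id := range ref {
		seen[id] = struct{}{}
	}
	hits := 0
	for _, id := range topK(candidate, k) {
		if _, ok := seen[id]; ok {
			hits++
			delete(seen, id) // count each reference ID at most once
		}
	}
	return float64(hits) / float64(len(ref))
}

func main() {
	// Hypothetical ranked results for one query from each gateway.
	rustIDs := []string{"d1", "d2", "d3", "d4"}
	goIDs := []string{"d2", "d1", "d5", "d3"}
	fmt.Printf("overlap@3=%.3f\n", overlapAtK(rustIDs, goIDs, 3))
}
```

A score below 1.0 with identical embeddings would point at index, corpus, or scoring differences rather than the embedder.
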