golangLAKEHOUSE/reports/cutover/embed_parity_20260430_v1.md
root 5687ec65c2 G5 cutover prep: embed parity probe — Rust /ai/embed ↔ Go /v1/embed verified
First concrete cutover artifact: scripts/cutover/embed_parity.sh
brings up Go embedd + gateway alongside the live Rust gateway,
hits both /ai/embed and /v1/embed with the same forced model, and
emits a per-date verdict report under reports/cutover/.

Why embed first: the parity invariant is a single math identity
(cosine similarity of both sides' vectors for the same input).
Retrieve has thousands of edge
cases. If embed parity holds, all downstream vector consumers
inherit confidence; if it doesn't, we catch it in 30s instead of
after a flip.

Verdict 2026-04-30: 5/5 samples cosine=1.000000 with model forced
to nomic-embed-text (v1). Same with nomic-embed-text-v2-moe (both
Ollamas have it loaded). Vector direction matches to six decimal
places across the gateway plumbing on every sample tested.

Drift catalog (reports/cutover/SUMMARY.md):
- URL: Rust /ai/embed vs Go /v1/embed
- Wire: Rust {embeddings, dimensions} (plural) vs Go {vectors,
  dimension} (singular). Wire-format adapter is the only real
  cutover work for this endpoint.
- L2 norm: Rust unit vectors (~1.0); Go raw Ollama (~20-23). Same
  direction (cos=1.0); harmless under cosine-distance HNSW (which
  is Go vectord's default), but worth fixing in internal/embed/
  before extending to euclidean indexes.

reports/cutover/ now tracked (joined the scrum/ + reality-tests/
exemptions in .gitignore).

Next probe: /v1/matrix/retrieve ↔ Rust /vectors/hybrid for the
real user-facing retrieve path. Embed parity gives that probe a
clean foundation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 20:07:04 -05:00


Embed parity probe — 2026-04-30T20:04-05:00

Forced model: nomic-embed-text on both sides (isolates plumbing from default-model drift; Rust default = v1, Go default = v2-moe).

| # | Sample (head) | Dim R/G | Cosine | L2 R | L2 G | Max \|Δ\| |
|---|---|---|---|---|---|---|
| 1 | hello | 768 / 768 | 1.000000 | 1.000000 | 23.500095 | 3.943768 |
| 2 | forklift operator with OSHA cert | 768 / 768 | 1.000000 | 1.000000 | 21.434569 | 3.591839 |
| 3 | Need 5 production workers in Aurora IL f | 768 / 768 | 1.000000 | 1.000000 | 20.636244 | 3.937658 |
| 4 | résumé: 12 yrs warehouse — pick/pack | 768 / 768 | 1.000000 | 1.000000 | 19.695088 | 3.624522 |
| 5 | Q: who's available next Friday? A: Bob, | 768 / 768 | 1.000000 | 1.000000 | 22.052443 | 3.720862 |

Verdict

PASS — 5/5 samples ≥ 0.9990 cosine similarity. Gateway plumbing is at-parity for embed.

First-flip ready: nginx-side or Bun-side routing of /ai/embed to Go's /v1/embed (with the wire-format remap described under Drift notes below) is safe to attempt.

Drift notes

  • URL prefix: Rust uses /ai/embed (nested under /ai); Go uses /v1/embed (gateway strips /v1 then forwards to embedd at :3216/embed).
  • Wire format: Rust returns {embeddings, model, dimensions} (plural); Go returns {vectors, model, dimension} (singular). A flip needs either a wire-shape adapter on the Go side, or callers updated to handle both shapes.
  • Default model: Rust default = nomic-embed-text (v1, 137M); Go default = nomic-embed-text-v2-moe (v2 MoE, 475M). This probe forces v1 on both to isolate plumbing parity. The v2-moe upgrade is intentional and a separate concern from plumbing parity.
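
A wire-shape adapter on the Go side could look roughly like the sketch below. Only the JSON key names come from this report. The struct names, the `toRustShape` function, and the assumption that `dimension` is an integer are illustrative. The vectors are passed through as raw JSON so the adapter never re-encodes the floats.

```go
package main

import (
	"encoding/json"
	"errors"
)

// goEmbedResponse mirrors the Go /v1/embed keys listed in the report.
// Field types are assumptions; the vectors stay as raw JSON.
type goEmbedResponse struct {
	Vectors   json.RawMessage `json:"vectors"`
	Model     string          `json:"model"`
	Dimension int             `json:"dimension"`
}

// rustEmbedResponse mirrors the Rust /ai/embed keys (plural forms).
type rustEmbedResponse struct {
	Embeddings json.RawMessage `json:"embeddings"`
	Model      string          `json:"model"`
	Dimensions int             `json:"dimensions"`
}

// toRustShape rewrites a Go-shaped response body into the Rust shape
// that existing /ai/embed callers expect.
func toRustShape(goBody []byte) ([]byte, error) {
	var in goEmbedResponse
	if err := json.Unmarshal(goBody, &in); err != nil {
		return nil, err
	}
	if len(in.Vectors) == 0 {
		return nil, errors.New("response has no vectors field")
	}
	return json.Marshal(rustEmbedResponse{
		Embeddings: in.Vectors,
		Model:      in.Model,
		Dimensions: in.Dimension,
	})
}
```

Given `{"vectors":[[1.5,2]],"model":"nomic-embed-text","dimension":2}`, this returns `{"embeddings":[[1.5,2]],"model":"nomic-embed-text","dimensions":2}`. The alternative named above, updating callers to accept both shapes, avoids the extra hop but touches every consumer.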

Repro

```bash
cd /home/profit/golangLAKEHOUSE
./scripts/cutover/embed_parity.sh                 # default: model=nomic-embed-text
MODEL=nomic-embed-text-v2-moe ./scripts/cutover/embed_parity.sh  # measure embedder drift
```