First concrete cutover artifact: scripts/cutover/embed_parity.sh
brings up Go embedd + gateway alongside the live Rust gateway,
hits both /ai/embed and /v1/embed with the same forced model, and
emits a per-date verdict report under reports/cutover/.

Why embed first: the parity invariant is a single numeric check (cosine
similarity between the two gateways' vectors for the same input).
Retrieve has thousands of edge cases. If embed parity holds, all
downstream vector consumers inherit confidence; if it doesn't, we
catch it in 30s instead of after a flip.

Verdict 2026-04-30: 5/5 samples cosine=1.000000 with model forced
to nomic-embed-text (v1). Same with nomic-embed-text-v2-moe (both
Ollamas have it loaded). On these samples the vectors are identical
in direction, so the math is equivalent (up to vector scale) across
the gateway plumbing.

Drift catalog (reports/cutover/SUMMARY.md):
- URL: Rust /ai/embed vs Go /v1/embed
- Wire: Rust {embeddings, dimensions} (plural) vs Go {vectors,
  dimension} (singular). Wire-format adapter is the only real
  cutover work for this endpoint.
- L2 norm: Rust returns unit vectors (~1.0); Go returns raw Ollama
  output (~20-23). Same direction (cos=1.0); harmless under
  cosine-distance HNSW (which is Go vectord's default), but worth
  fixing in internal/embed/ before extending to euclidean indexes.

reports/cutover/ is now tracked (joined the scrum/ + reality-tests/
exemptions in .gitignore).

Next probe: /v1/matrix/retrieve ↔ Rust /vectors/hybrid for the
real user-facing retrieve path. Embed parity gives that probe a
clean foundation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
34 lines
1.9 KiB
Markdown
# Embed parity probe — 2026-04-30T20:04-05:00

Forced model: `nomic-embed-text-v2-moe` on both sides (isolates plumbing from
default-model drift; Rust default = v1, Go default = v2-moe).

| # | Sample (head) | Dim Rust/Go | Cosine | L2 Rust | L2 Go | Max\|Δ\| |
|---|---|---|---|---|---|---|
| 1 | `hello` | 768 / 768 | 1.000000 | 1.000000 | 13.441846 | 3.401060 |
| 2 | `forklift operator with OSHA cert` | 768 / 768 | 1.000000 | 1.000000 | 13.752762 | 1.640122 |
| 3 | `Need 5 production workers in Aurora IL f` | 768 / 768 | 1.000000 | 1.000000 | 13.882167 | 1.529343 |
| 4 | `résumé: 12 yrs warehouse — pick/pack` | 768 / 768 | 1.000000 | 1.000000 | 12.507687 | 1.533719 |
| 5 | `Q: who's available next Friday? A: Bob, ` | 768 / 768 | 1.000000 | 1.000000 | 13.978468 | 1.829930 |

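The table's metric columns can be recomputed from any pair of response vectors. The Python sketch below is illustrative only: it is not the code in `scripts/cutover/embed_parity.sh`, and the helper names and the toy 3-dimensional vectors are hypothetical stand-ins rather than probe data. It shows why a cosine of 1.0 can coexist with a large Max|Δ| when one side returns unit vectors and the other returns raw magnitudes, and that L2-normalizing the raw side closes the gap.

```python
import math


def l2_norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def cosine(a, b):
    """Cosine similarity; insensitive to the scale of either vector."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (l2_norm(a) * l2_norm(b))


def max_abs_delta(a, b):
    """Largest element-wise absolute difference; sensitive to scale."""
    return max(abs(x - y) for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = l2_norm(v)
    return [x / n for x in v] if n > 0 else list(v)


# Hypothetical toy vectors: same direction, different magnitude.
rust_vec = normalize([3.0, 4.0, 12.0])   # unit length, like the Rust side
go_vec = [x * 13.0 for x in rust_vec]    # raw magnitude, like the Go side

print(f"cosine         = {cosine(rust_vec, go_vec):.6f}")
print(f"L2 Rust / Go   = {l2_norm(rust_vec):.6f} / {l2_norm(go_vec):.6f}")
print(f"max|delta| raw = {max_abs_delta(rust_vec, go_vec):.6f}")
print(f"max|delta| normalized = {max_abs_delta(rust_vec, normalize(go_vec)):.6f}")
```

Because cosine ignores magnitude, it is the right parity check for a cosine-distance index, while Max|Δ| mostly reflects the norm difference until both sides are normalized.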
## Verdict

**PASS**: 5/5 samples ≥ 0.9990 cosine similarity. Gateway plumbing is at-parity for embed.

First-flip ready: nginx-side or Bun-side routing of `/ai/embed` to Go's `/v1/embed`
(with the wire-format remap noted in §Drift notes below) is safe to attempt.

## Drift notes

- **URL prefix**: Rust uses `/ai/embed` (nested under `/ai`); Go uses `/v1/embed` (gateway strips `/v1` then forwards to embedd at `:3216/embed`).
- **Wire format**: Rust returns `{embeddings, model, dimensions}` (plural); Go returns `{vectors, model, dimension}` (singular). A flip needs either a wire-shape adapter on the Go side, or callers updated to handle both shapes.
- **Default model**: Rust default = `nomic-embed-text` (v1, 137M); Go default = `nomic-embed-text-v2-moe` (v2 MoE, 475M). This probe forces the same model on both sides (`nomic-embed-text-v2-moe` in this run) to isolate plumbing parity. The v2-moe upgrade is intentional and a separate concern.

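The two options in the wire-format note can be sketched in a few lines of Python. The field names come from the bullet above; the function names are hypothetical, and the sketch assumes that `vectors` and `embeddings` carry the same list-of-vectors payload and that `dimension` and `dimensions` carry the same integer, which this report does not verify.

```python
def go_to_rust_shape(go_resp: dict) -> dict:
    """Option 1: remap a Go-style embed response to the Rust-style field names."""
    return {
        "embeddings": go_resp["vectors"],
        "model": go_resp["model"],
        "dimensions": go_resp["dimension"],
    }


def read_embeddings(resp: dict) -> list:
    """Option 2: a caller-side reader that tolerates either response shape."""
    if "embeddings" in resp:
        return resp["embeddings"]
    if "vectors" in resp:
        return resp["vectors"]
    raise KeyError("response has neither 'embeddings' nor 'vectors'")


# Hypothetical Go-style response with a toy 3-dimensional vector.
go_resp = {"vectors": [[0.1, 0.2, 0.3]], "model": "nomic-embed-text", "dimension": 3}

rust_like = go_to_rust_shape(go_resp)
print(sorted(rust_like))            # field names after the remap
print(read_embeddings(go_resp) == read_embeddings(rust_like))
```

An adapter on the Go side keeps existing callers untouched during the flip; the tolerant reader is the alternative when callers are easier to change than the gateway.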
## Repro

```bash
cd /home/profit/golangLAKEHOUSE
./scripts/cutover/embed_parity.sh                                  # default: model=nomic-embed-text
MODEL=nomic-embed-text-v2-moe ./scripts/cutover/embed_parity.sh   # measure embedder drift
```