# G5 cutover prep — verified-parity log

What works on the Go gateway, what's been side-by-side compared to Rust, and what's safe to flip. Append a row when a new endpoint clears parity.

| Endpoint | Date | Rust path | Go path | Verdict | Notes |
|---|---|---|---|---|---|
| `embed` (forced v1) | 2026-04-30 | `/ai/embed` | `/v1/embed` | ✅ PASS 5/5 cos=1.000 | bit-identical with `model=nomic-embed-text` forced on both sides |
| `embed` (forced v2-moe) | 2026-04-30 | `/ai/embed` | `/v1/embed` | ✅ PASS 5/5 cos=1.000 | bit-identical with `model=nomic-embed-text-v2-moe` forced on both sides — both Ollamas have the model |

## Wire-format drift catalog

The Go gateway is *not* a literal nginx-swap drop-in for the Rust gateway. Anything that flips needs a wire-shape adapter. Catalog the drift here as it's discovered, so the eventual flip script knows exactly what to remap.

### embed

| Field | Rust | Go |
|---|---|---|
| URL prefix | `/ai/embed` | `/v1/embed` |
| Response: vectors field | `embeddings` | `vectors` |
| Response: dim field | `dimensions` | `dimension` |
| Response: model field | `model` | `model` ✓ same |
| Request shape | `{texts, model?}` | `{texts, model?}` ✓ same |
| L2 normalization | unit vectors (‖v‖ ≈ 1.0) | raw Ollama output (‖v‖ ≈ 20–23) |

**The L2 normalization difference is real but currently harmless:** vectors point in identical directions (cos=1.000), but Go carries raw magnitudes. Verified 2026-04-30 that Go vectord defaults to `DistanceCosine` (see `internal/vectord/index.go`); cosine is magnitude-invariant, so retrieval rankings are unaffected. The risk only fires if a future caller (a) switches the index distance to `euclidean`, (b) compares raw vectors between Go and Rust directly, or (c) takes dot products expecting unit vectors. Adding a normalization step in `internal/embed/embed.go` would make the cutover safer and is cheap — but not blocking.
## Repro

```bash
./scripts/cutover/embed_parity.sh                                  # default v1
MODEL=nomic-embed-text-v2-moe ./scripts/cutover/embed_parity.sh    # measure embedder
```

Each run drops a per-date verdict at `reports/cutover/embed_parity_.md`.

## What's *not* yet probed

- `/v1/sql` ↔ Rust `/query` — query shape parity
- `/v1/vectors/search` ↔ Rust `/vectors/search` — recall@k parity
- `/v1/matrix/retrieve` ↔ Rust `/vectors/hybrid` — semantic retrieve parity (highest-leverage)
- `/v1/storage/*` ↔ Rust `/storage/*` — direct S3 abstraction parity
- `/v1/chat` — both sides expose this, but providers + token shape differ; Phase 4 already declared chatd parity-tested

The matrix-retrieve probe is the next-highest leverage because it's the actual user-facing retrieval path. Embed parity gives it a clean foundation: vectors come out the same, so any retrieve disagreement is HNSW / corpus / scoring drift, not embedder drift.
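For the flip script, the embed drift cataloged above reduces to a field remap. A sketch of that shim, assuming the response shapes exactly as listed in the drift table (the struct names here are illustrative, not taken from either codebase):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// goEmbedResp mirrors the Go gateway's /v1/embed response shape
// per the drift table: vectors, dimension, model.
type goEmbedResp struct {
	Vectors   [][]float32 `json:"vectors"`
	Dimension int         `json:"dimension"`
	Model     string      `json:"model"`
}

// rustEmbedResp mirrors the Rust gateway's /ai/embed response shape:
// embeddings, dimensions, model.
type rustEmbedResp struct {
	Embeddings [][]float32 `json:"embeddings"`
	Dimensions int         `json:"dimensions"`
	Model      string      `json:"model"`
}

// adaptEmbed remaps a Go-shaped embed response into the Rust wire
// shape so callers keyed to the old field names keep working.
func adaptEmbed(in []byte) ([]byte, error) {
	var g goEmbedResp
	if err := json.Unmarshal(in, &g); err != nil {
		return nil, err
	}
	return json.Marshal(rustEmbedResp{
		Embeddings: g.Vectors,
		Dimensions: g.Dimension,
		Model:      g.Model,
	})
}

func main() {
	out, err := adaptEmbed([]byte(`{"vectors":[[1,0]],"dimension":2,"model":"nomic-embed-text"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Note the shim only renames fields; it does not paper over the L2 normalization gap, which lives in the vector values themselves.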