First concrete cutover artifact: scripts/cutover/embed_parity.sh
brings up Go embedd + gateway alongside the live Rust gateway,
hits both /ai/embed and /v1/embed with the same forced model, and
emits a per-date verdict report under reports/cutover/.

Why embed first: the parity invariant is a single math identity (the
cosine similarity of both gateways' vectors for the same input should
be 1.0). Retrieve has thousands of edge cases. If embed parity holds,
all downstream vector consumers inherit that confidence; if it
doesn't, we catch it in 30s instead of after a flip.
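
The parity invariant above can be sketched in Go as follows. This is a minimal illustration, not the code in scripts/cutover/embed_parity.sh: the names Cosine and ParityOK, the float64 element type, and the 1e-6 tolerance are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// Cosine returns the cosine similarity of a and b.
// Hypothetical helper; it errors on empty, mismatched, or zero-norm input.
func Cosine(a, b []float64) (float64, error) {
	if len(a) == 0 || len(a) != len(b) {
		return 0, errors.New("vectors must be non-empty and of equal length")
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0, errors.New("zero-norm vector has no direction")
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb)), nil
}

// ParityOK reports whether two embeddings point in the same direction
// within tol. The tolerance is an assumed value, not one from the report.
func ParityOK(a, b []float64, tol float64) bool {
	c, err := Cosine(a, b)
	return err == nil && 1-c <= tol
}

func main() {
	unit := []float64{0.6, 0.8} // unit-norm vector
	raw := []float64{12, 16}    // same direction, norm 20
	c, err := Cosine(unit, raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("cosine=%.6f parity=%v\n", c, ParityOK(unit, raw, 1e-6))
}
```

Because cosine similarity divides out both norms, the check is insensitive to vector length, which is why a unit-norm vector and an unnormalized one can still pass.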

Verdict 2026-04-30: 5/5 samples cosine=1.000000 with model forced
to nomic-embed-text (v1). Same result with nomic-embed-text-v2-moe
(both Ollama instances have it loaded). On every sample tested, the
math is equivalent across the gateway plumbing.

Drift catalog (reports/cutover/SUMMARY.md):
- URL: Rust /ai/embed vs Go /v1/embed
- Wire: Rust {embeddings, dimensions} (plural) vs Go {vectors,
  dimension} (singular). A wire-format adapter is the only real
  cutover work for this endpoint.
- L2 norm: Rust returns unit vectors (norm ~1.0); Go returns raw
  Ollama vectors (norm ~20-23). Same direction (cos=1.0), so this is
  harmless under cosine-distance HNSW (Go vectord's default), but
  worth normalizing in internal/embed/ before extending to Euclidean
  indexes.
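
A rough sketch of what the wire-format adapter and the L2 fix could look like in Go. Only the JSON field names (embeddings/dimensions, vectors/dimension) come from the drift catalog; the struct names, the [][]float64 and int value types, and the helpers ToRustShape and L2Normalize are assumptions for illustration and are not taken from internal/embed/.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// RustEmbedResponse uses the plural field names listed for Rust /ai/embed.
// Value types are assumed.
type RustEmbedResponse struct {
	Embeddings [][]float64 `json:"embeddings"`
	Dimensions int         `json:"dimensions"`
}

// GoEmbedResponse uses the singular field names listed for Go /v1/embed.
// Value types are assumed.
type GoEmbedResponse struct {
	Vectors   [][]float64 `json:"vectors"`
	Dimension int         `json:"dimension"`
}

// L2Normalize returns a unit-length copy of v.
// A zero vector is returned as a zero vector of the same length.
func L2Normalize(v []float64) []float64 {
	var sum float64
	for _, x := range v {
		sum += x * x
	}
	out := make([]float64, len(v))
	norm := math.Sqrt(sum)
	if norm == 0 {
		return out
	}
	for i, x := range v {
		out[i] = x / norm
	}
	return out
}

// ToRustShape renames the fields and normalizes each vector, so a
// consumer of the Rust wire format would see unit vectors.
func ToRustShape(g GoEmbedResponse) RustEmbedResponse {
	out := RustEmbedResponse{
		Embeddings: make([][]float64, len(g.Vectors)),
		Dimensions: g.Dimension,
	}
	for i, v := range g.Vectors {
		out.Embeddings[i] = L2Normalize(v)
	}
	return out
}

func main() {
	// Illustrative payload only; real embeddings have far more dimensions.
	raw := []byte(`{"vectors":[[12,16]],"dimension":2}`)
	var g GoEmbedResponse
	if err := json.Unmarshal(raw, &g); err != nil {
		panic(err)
	}
	b, err := json.Marshal(ToRustShape(g))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

Normalizing inside the adapter is one option; doing it once where vectors leave the embed service would keep Euclidean indexes consistent regardless of which gateway produced the vector.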

reports/cutover/ is now tracked (it joins the scrum/ and
reality-tests/ exemptions in .gitignore).

Next probe: /v1/matrix/retrieve ↔ Rust /vectors/hybrid for the
real user-facing retrieve path. Embed parity gives that probe a
clean foundation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
61 lines
1.4 KiB
Plaintext
# Go
*.exe
*.exe~
*.dll
*.so
*.dylib
*.test
*.out
go.work
go.work.sum
vendor/

# Build artifacts
/bin/
/dist/

# Editor / OS
.DS_Store
.idea/
.vscode/
*.swp
*~

# Local data — these directories follow the Rust lakehouse pattern;
# regenerated by services on demand. Do not commit runtime artifacts.
/data/_auditor/
/data/_kb/
/data/_pathway_memory/
/data/_errors/
/data/_imagecache/
/data/datasets/
/data/vectors/
/data/headshots/
/data/lance/
/exports/
/logs/
# /reports/ holds runtime artifacts by default (matches Rust lakehouse
# convention) — but reports/scrum/ is intentional audit documentation.
# Use /reports/* + un-ignore so git can traverse into reports/.
/reports/*
!/reports/scrum/
!/reports/reality-tests/
!/reports/cutover/
# Inside the audit directory, the per-run _evidence/ dump (smoke logs,
# command output) IS runtime — track the dir, ignore its contents.
/reports/scrum/_evidence/*
!/reports/scrum/_evidence/.gitkeep
# Reality-test JSON evidence is runtime — track the dir + MD reports
# (committed deliberately as outcome record), ignore per-run JSON.
/reports/reality-tests/*.json

# Proof harness runtime output — same pattern as reports/scrum/_evidence.
# Track the directory but ignore per-run subdirs.
/tests/proof/reports/*
!/tests/proof/reports/.gitkeep

# Secrets — never commit. Resolved via SecretsProvider per ADR-001 §1.x.
*.env
secrets.toml
secrets-go.toml