2 Commits

Author SHA1 Message Date
root
5368aca4d4 docs: sync ADR-019 + PRD + DECISIONS with 2026-05-02 substrate changes
ADR-019: closed the "re-bench when 10M corpus exists" follow-up. Added
"Follow-up: 10M re-bench (2026-05-02)" section with the post-fix numbers
(search ~20ms warm / ~46ms cold, doc-fetch ~5ms post-btree). Documented
the lance-bench-bypassing-IndexMeta bug + 2-layer fix + gauntlet
(7 unit + 12 sanitize + 10 smoke probes). Reframed the strategic
question as "Lance vs Parquet+HNSW-with-spilling" since HNSW doesn't
fit RAM at 10M.

DECISIONS: added ADR-022 — drop Python sidecar from Rust hot path.
Captures the rationale (236× embed perf gap was pure overhead),
co-shipped LRU cache, the dev-only Python that survives, cross-runtime
parity verification, and the operator runbook signal (no Python sidecar
process in `ps -ef` post-deploy).
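
The co-shipped LRU cache could look like the following minimal sketch
(names like `EmbedCache` are hypothetical; the shipped cache in the
aibridge crate may differ — this only illustrates the shape of an
embed-result cache with least-recently-used eviction):

```rust
use std::collections::{HashMap, VecDeque};

/// Minimal LRU cache for embedding vectors. Hypothetical sketch —
/// not the actual ADR-022 implementation.
struct EmbedCache {
    capacity: usize,
    map: HashMap<String, Vec<f32>>,
    order: VecDeque<String>, // front = least recently used
}

impl EmbedCache {
    fn new(capacity: usize) -> Self {
        Self { capacity, map: HashMap::new(), order: VecDeque::new() }
    }

    /// Look up a cached embedding, refreshing its recency on hit.
    fn get(&mut self, key: &str) -> Option<&Vec<f32>> {
        if self.map.contains_key(key) {
            self.order.retain(|k| k.as_str() != key);
            self.order.push_back(key.to_string());
            self.map.get(key)
        } else {
            None
        }
    }

    /// Insert an embedding, evicting the least-recently-used entry
    /// once capacity is exceeded.
    fn put(&mut self, key: String, embedding: Vec<f32>) {
        if self.map.insert(key.clone(), embedding).is_some() {
            self.order.retain(|k| *k != key);
        }
        self.order.push_back(key);
        while self.map.len() > self.capacity {
            if let Some(lru) = self.order.pop_front() {
                self.map.remove(&lru);
            }
        }
    }
}

fn main() {
    let mut cache = EmbedCache::new(2);
    cache.put("doc-a".into(), vec![0.1; 4]);
    cache.put("doc-b".into(), vec![0.2; 4]);
    cache.get("doc-a");                      // refresh doc-a
    cache.put("doc-c".into(), vec![0.3; 4]); // evicts doc-b
    assert!(cache.get("doc-b").is_none());
    assert!(cache.get("doc-a").is_some());
    println!("lru ok");
}
```

The `retain` on lookup is O(n) per hit; fine for a sketch, while a
production cache would keep a proper recency list or use an lru crate.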

PRD: updated the AI Boundary table line + the aibridge crate description
to reflect the direct Ollama path (was: Python FastAPI sidecar → Ollama).
Both lines reference ADR-022 for the full rationale.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 00:44:57 -05:00
root
76f6fba5de Phase B: Lance pilot — hybrid decision with measured benchmark
Added a standalone benchmark crate `crates/lance-bench` that runs Lance 4.0
against our Parquet+HNSW stack at 100K × 768d (resumes_100k_v2), measuring
8 dimensions of the comparison.

Results (see docs/ADR-019-vector-storage.md for full scorecard):

  Cold load:        Parquet 0.17s   vs Lance 0.13s   (tie — not ≥2× threshold)
  Disk size:        330.3 MB        vs 330.4 MB      (tie)
  Search p50:       873us           vs 2229us        (Parquet 2.55× faster)
  Search p95:       1413us          vs 4998us        (Parquet 3.54× faster)
  Index build:      230s (ec=80)    vs 16s (IVF_PQ)  (Lance 14× faster)
  Random access:    35ms (scan)     vs 311us         (Lance 112× faster)
  Append 10K rows:  full rewrite    vs 0.08s/+31MB   (Lance structural win)
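
The search p50/p95 rows above are standard sorted-latency percentiles;
a sketch of how such a harness might compute them (the helper name and
nearest-rank method are assumptions, not the actual lance-bench code):

```rust
/// Nearest-rank percentile over latency samples in microseconds.
/// Hypothetical helper — the real lance-bench harness may differ.
fn percentile_us(samples: &mut [u64], p: f64) -> u64 {
    assert!(!samples.is_empty(), "need at least one sample");
    samples.sort_unstable();
    // Nearest-rank method: rank = ceil(p/100 * n), clamped to [1, n].
    let rank = ((p / 100.0) * samples.len() as f64).ceil() as usize;
    samples[rank.clamp(1, samples.len()) - 1]
}

fn main() {
    // 100 synthetic latencies: 1..=100 µs.
    let mut latencies: Vec<u64> = (1..=100).collect();
    assert_eq!(percentile_us(&mut latencies, 50.0), 50);
    assert_eq!(percentile_us(&mut latencies, 95.0), 95);
    println!("percentiles ok");
}
```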

Decision (ADR-019): hybrid, not migrate-or-reject.

- Parquet+HNSW stays primary — our HNSW at ec=80 es=30 recall=1.00 is
  2.55× faster than Lance IVF_PQ at 100K in-RAM scale
- Lance joins as second backend per-profile for workloads where it wins
  architecturally: random row access (RAG text fetch), append-heavy
  pipelines (Phase C), hot-swap generations (Phase 16, 14× faster
  builds), and indexes past the ~5M RAM ceiling
- Phase 17 ModelProfile gets vector_backend: Parquet | Lance field
- Ceiling table in PRD updated — the 5M ceiling row now says "switch to
  Lance" instead of "migrate", since Lance runs alongside the Parquet
  stack rather than replacing it
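
The per-profile switch above could be modeled as a plain enum; a sketch
under assumptions (the surrounding `ModelProfile` fields and the
`default_backend` helper are hypothetical — only the
`vector_backend: Parquet | Lance` field itself comes from this commit):

```rust
/// Per-profile vector backend selection (sketch of the Phase 17 field).
#[derive(Debug, Clone, Copy, PartialEq)]
enum VectorBackend {
    Parquet, // primary: in-RAM HNSW, fastest search below the ceiling
    Lance,   // secondary: random access, appends, past the ~5M ceiling
}

/// Hypothetical slice of a Phase 17 ModelProfile.
struct ModelProfile {
    name: String,
    vector_backend: VectorBackend,
}

impl ModelProfile {
    /// Hypothetical default: Lance past the ~5M in-RAM HNSW ceiling,
    /// Parquet below it.
    fn default_backend(row_count: u64) -> VectorBackend {
        if row_count > 5_000_000 {
            VectorBackend::Lance
        } else {
            VectorBackend::Parquet
        }
    }
}

fn main() {
    let profile = ModelProfile {
        name: "resumes_100k_v2".into(),
        vector_backend: ModelProfile::default_backend(100_000),
    };
    assert_eq!(profile.vector_backend, VectorBackend::Parquet);
    assert_eq!(ModelProfile::default_backend(10_000_000), VectorBackend::Lance);
    println!("profile: {} -> {:?}", profile.name, profile.vector_backend);
}
```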

Isolation: lance-bench is a standalone workspace crate with its own dep
tree (Lance pulls DataFusion 52 + Arrow 57 incompatible with main stack
DataFusion 47 + Arrow 55). Kept off the critical path until the API is
stable enough to promote into vectord::lance_store.
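
The isolation only requires lance-bench to be its own workspace member;
a sketch of the layout (member paths other than `crates/lance-bench` are
assumptions):

```toml
# Workspace Cargo.toml (sketch — actual member list assumed)
[workspace]
members = [
    "crates/vectord",      # main stack: DataFusion 47 + Arrow 55
    "crates/lance-bench",  # pilot: Lance 4.0 -> DataFusion 52 + Arrow 57
]
```

Because DataFusion 47/52 and Arrow 55/57 are different major versions,
Cargo keeps both lines in the shared lockfile without unifying them, so
the incompatible trees never mix until the promotion into
vectord::lance_store.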

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 02:37:11 -05:00