lakehouse/crates/vectord/Cargo.toml
root 352f99de0f Hybrid SQL+Vector search — the gap is closed
POST /vectors/hybrid takes a question + SQL WHERE clause. Pipeline:
1. SQL filter narrows to structurally-valid candidates (role, state,
   reliability, certs — whatever the caller specifies)
2. Brute-force cosine scores ALL embeddings (not HNSW, which caps at
   ~30 results due to ef_search — too few to intersect with narrow
   SQL filters on 10K+ datasets)
3. Filter vector results to only SQL-verified IDs
4. LLM generates answer from verified-correct records

Tested on the exact query that failed the staffing simulation:
"forklift operators in IL with reliability > 0.8" — SQL found 78
matches, vector ranked the 5 most semantically relevant, LLM
generated an answer citing real workers with actual skills and
certifications. Every source marked sql_verified=true.

This closes the architectural gap identified by the quality eval:
structured precision (SQL) + semantic intelligence (vector) in one
endpoint. The simulation's contract-matching path was already
SQL-pure and worked perfectly; now the intelligence-question path
has the same accuracy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 22:49:48 -05:00

27 lines
818 B
TOML

[package]
name = "vectord"
version = "0.1.0"
edition = "2024"
[dependencies]
shared = { path = "../shared" }
storaged = { path = "../storaged" }
aibridge = { path = "../aibridge" }
catalogd = { path = "../catalogd" }
queryd = { path = "../queryd" }
# ADR-019 firewall — vectord-lance owns its own Arrow 57 / Lance 4 deps.
# Public API uses only std types so no version conflict propagates here.
vectord-lance = { path = "../vectord-lance" }
tokio = { workspace = true }
axum = { workspace = true }
serde = { workspace = true }
serde_json = { workspace = true }
tracing = { workspace = true }
bytes = { workspace = true }
object_store = { workspace = true }
parquet = { workspace = true }
arrow = { workspace = true }
chrono = { workspace = true }
instant-distance = { workspace = true }
uuid = { workspace = true }