lance-bench: also build doc_id btree post-IVF — match gateway's migrate behavior
The bench's own measure_random_access_lance uses take(row_position), so it does not need the btree. But datasets written by this bench are commonly queried downstream via /vectors/lance/doc/<name>/<doc_id>, and without the btree that path falls back to a full table scan.

Building the index inline keeps bench-produced datasets immediately production-shape and removes a footgun (the same one that kept scale_test_10m's doc-fetch at ~100ms until commit 5d30b3d fixed it via the migrate handler path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent 5d30b3da89
commit 044650a1da
@@ -456,6 +456,26 @@ async fn build_lance_vector_index(path: &str, _dims: usize) -> Result<()> {
         .await
         .context("create_index")?;
 
+    // Also build the scalar btree on doc_id. This bench's
+    // measure_random_access_lance uses take(row_position) which doesn't
+    // need the btree, but the dataset this bench writes is also queried
+    // downstream by /vectors/lance/doc/<name>/<doc_id> (the production
+    // lookup path) — without this index that path falls back to a full
+    // table scan. Cheap to build (~1.2s on 10M rows) and matches the
+    // gateway's lance_migrate handler behavior so bench-produced datasets
+    // are immediately production-shape.
+    use lance_index::scalar::ScalarIndexParams;
+    dataset
+        .create_index(
+            &["doc_id"],
+            IndexType::Scalar,
+            Some("doc_id_btree".into()),
+            &ScalarIndexParams::default(),
+            true,
+        )
+        .await
+        .context("create_index doc_id btree")?;
+
     Ok(())
 }
 
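Note on the trade-off described in the commit message: the difference between a doc_id lookup with and without an ordered index can be illustrated with a toy model. The sketch below is not the Lance API and makes no claim about how Lance stores its scalar index; it uses std's BTreeMap as a stand-in, and the Row struct, make_rows, scan_lookup, and build_doc_id_index are hypothetical names invented for this illustration. It only shows why a lookup by doc_id has to inspect rows one by one when no index exists, while take(row_position) style access never needs the mapping at all.

```rust
use std::collections::BTreeMap;

/// One row of a toy table. Hypothetical schema, for illustration only.
struct Row {
    doc_id: String,
    payload: u32,
}

/// Generate `n` rows whose doc_id is a zero-padded counter.
fn make_rows(n: usize) -> Vec<Row> {
    (0..n)
        .map(|i| Row { doc_id: format!("doc-{i:08}"), payload: i as u32 })
        .collect()
}

/// Lookup without an index: inspect rows in order until doc_id matches.
/// Returns the matching row position (if any) and how many rows were inspected.
fn scan_lookup(rows: &[Row], doc_id: &str) -> (Option<usize>, usize) {
    let mut inspected = 0;
    for (pos, row) in rows.iter().enumerate() {
        inspected += 1;
        if row.doc_id == doc_id {
            return (Some(pos), inspected);
        }
    }
    (None, inspected)
}

/// Build an ordered map from doc_id to row position.
/// This plays the role of the btree index in the toy model.
fn build_doc_id_index(rows: &[Row]) -> BTreeMap<&str, usize> {
    rows.iter()
        .enumerate()
        .map(|(pos, row)| (row.doc_id.as_str(), pos))
        .collect()
}

fn main() {
    let rows = make_rows(100_000);
    let index = build_doc_id_index(&rows);

    // Worst case for the scan: the target is the last row.
    let target = "doc-00099999";
    let (scan_pos, inspected) = scan_lookup(&rows, target);
    let indexed_pos = index.get(target).copied();

    // Both paths must agree on which row holds the document.
    assert_eq!(scan_pos, indexed_pos);

    if let Some(pos) = indexed_pos {
        // Once the position is known, fetching the row is direct,
        // which is the access pattern a positional take relies on.
        println!(
            "scan inspected {inspected} rows; index resolved position {pos} (payload {})",
            rows[pos].payload
        );
    }
}
```

In this model the index costs one extra pass over the rows at build time, which is consistent in spirit with the "cheap to build" remark in the code comment, though the ~1.2s figure there is the author's measurement on the real dataset and is not reproduced by this sketch.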