2026-04-17 01:06:09 ═══ Scale test heartbeat: step= ═══
2026-04-17 01:06:09 Unknown state: . Resetting to start.
2026-04-17 01:06:09 Heartbeat done. Next step: start
2026-04-17 01:06:21 ═══ Scale test heartbeat: step=start ═══
2026-04-17 01:06:21 Step 1: Registering 10M vector index in catalog...
2026-04-17 01:06:21 Parquet exists: 29G
2026-04-17 01:06:21 Heartbeat done. Next step: migrate_lance
2026-04-17 01:08:01 ═══ Scale test heartbeat: step=migrate_lance ═══
2026-04-17 01:08:01 Step 2: Migrating 10M vectors Parquet → Lance...
2026-04-17 01:08:01 This will take several minutes for 28.8 GB...
2026-04-17 01:08:01 Migration via API needs index registered. Using direct Lance path...
Lance migration needs to read 28.8GB Parquet — this takes time...
Starting migration...
Error: HTTP Error 404: Not Found
Attempting direct Lance write...
2026-04-17 01:08:01 Heartbeat done. Next step: check_lance
error: externally-managed-environment
× This environment is externally managed
╰─> To install Python packages system-wide, try apt install python3-xyz, where xyz is the package you are trying to install.
    If you wish to install a non-Debian-packaged Python package, create a virtual environment using python3 -m venv path/to/venv. Then use path/to/venv/bin/python and path/to/venv/bin/pip. Make sure you have python3-full installed.
    If you wish to install a non-Debian packaged Python application, it may be easiest to use pipx install xyz, which will manage a virtual environment for you. Make sure you have pipx installed.
    See /usr/share/doc/python3.13/README.venv for more information.
note: If you believe this is a mistake, please contact your Python installation or OS distribution provider.
You can override this, at the risk of breaking your Python installation or OS, by passing --break-system-packages.
hint: See PEP 668 for the detailed specification.
Traceback (most recent call last):
  File "", line 11, in
    import lance
ModuleNotFoundError: No module named 'lance'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "", line 19, in
    import lance
ModuleNotFoundError: No module named 'lance'
Missing dep: No module named 'lance'
Installing lance...
2026-04-17 01:10:01 ═══ Scale test heartbeat: step=check_lance ═══
2026-04-17 01:10:01 Step 2b: Checking Lance dataset status...
2026-04-17 01:10:02 Lance dataset not ready yet. Will retry on next heartbeat.
2026-04-17 01:10:02 Heartbeat done. Next step: check_lance
2026-04-17 01:12:01 ═══ Scale test heartbeat: step=check_lance ═══
2026-04-17 01:12:01 Step 2b: Checking Lance dataset status...
2026-04-17 01:12:01 Lance dataset: 8000000 rows
2026-04-17 01:12:01 Heartbeat done. Next step: build_index
Migrating 10,000,000 vectors to Lance...
500,000 / 10,000,000 (117,052/sec ETA 81s)
1,000,000 / 10,000,000 (121,674/sec ETA 74s)
1,500,000 / 10,000,000 (123,846/sec ETA 69s)
2,000,000 / 10,000,000 (124,296/sec ETA 64s)
2,500,000 / 10,000,000 (124,056/sec ETA 60s)
3,000,000 / 10,000,000 (124,131/sec ETA 56s)
3,500,000 / 10,000,000 (124,769/sec ETA 52s)
4,000,000 / 10,000,000 (125,028/sec ETA 48s)
4,500,000 / 10,000,000 (125,375/sec ETA 44s)
5,000,000 / 10,000,000 (125,476/sec ETA 40s)
5,500,000 / 10,000,000 (125,140/sec ETA 36s)
6,000,000 / 10,000,000 (124,899/sec ETA 32s)
6,500,000 / 10,000,000 (124,355/sec ETA 28s)
7,000,000 / 10,000,000 (123,762/sec ETA 24s)
7,500,000 / 10,000,000 (123,050/sec ETA 20s)
8,000,000 / 10,000,000 (122,744/sec ETA 16s)
8,500,000 / 10,000,000 (122,164/sec ETA 12s)
9,000,000 / 10,000,000 (121,839/sec ETA 8s)
9,500,000 / 10,000,000 (121,655/sec ETA 4s)
10,000,000 / 10,000,000 (121,529/sec ETA 0s)
Done: 10,000,000 rows in 82s
Verified: 10,000,000 rows in Lance
2026-04-17 01:14:01 ═══ Scale test heartbeat: step=build_index ═══
2026-04-17 01:14:01 Step 3: Building IVF_PQ index on 10M Lance dataset...
2026-04-17 01:14:01 Using tuned config: 3162 partitions (√10M), 8 bits, 192 sub_vectors
Fri Apr 17 01:16:01 AM CDT 2026 Already running (pid 957071)
[2026-04-17T06:16:02Z WARN lance::index::vector::builder] partition 2174 is empty, skipping
Dataset: 10,000,000 rows
Building IVF_PQ: 3162 partitions, 8 bits, 192 sub_vectors...
Index built in 173s
=== Search benchmark: 10 queries on 10M vectors ===
First query: 19ms, 10 hits
Top hit: VEC-2662261
p50=5ms p95=19ms avg=6ms
All 10 searches completed on 10M vectors
═══════════════════════════════════════════════════════════
10M VECTOR SCALE TEST — RESULTS
═══════════════════════════════════════════════════════════
Vectors: 10,000,000
Dimensions: 768
Storage: 30 GB (Lance on disk)
IVF_PQ build: 173 seconds (3162 partitions, 192 sub_vectors)
Search p50: 5ms
Search p95: 19ms
HNSW at 10M would need: 29 GB RAM (past ceiling)
Lance at 10M: 30 GB disk, 5ms search
THIS IS THE PROOF: Lance handles what HNSW can't.
═══════════════════════════════════════════════════════════
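
Note on the sizing figures: the sketch below reproduces the arithmetic behind the logged config and results (3162 partitions from √10M, 8 bits, 192 sub_vectors, 768 dimensions). It is a back-of-the-envelope check, not the script that produced this log. The function name `ivf_pq_sizing`, its parameter names, and the 4-bytes-per-component (float32) storage are assumptions; the log does not state the vector dtype.

```python
import math


def ivf_pq_sizing(num_vectors, dims, num_sub_vectors, num_bits=8, bytes_per_component=4):
    """Estimate IVF_PQ sizing figures (hypothetical helper, not part of Lance).

    bytes_per_component=4 assumes float32 vectors, which the log does not confirm.
    """
    if dims % num_sub_vectors != 0:
        raise ValueError("dims must be divisible by num_sub_vectors")
    return {
        # Square-root rule quoted in the log: partitions = floor(sqrt(N)).
        "num_partitions": math.isqrt(num_vectors),
        # Each sub-vector covers this many of the original dimensions.
        "dims_per_sub_vector": dims // num_sub_vectors,
        # Uncompressed vector payload, ignoring IDs and file overhead.
        "raw_bytes": num_vectors * dims * bytes_per_component,
        # PQ stores one num_bits-wide code per sub-vector per vector.
        "pq_code_bytes": num_vectors * num_sub_vectors * num_bits // 8,
    }


sizing = ivf_pq_sizing(num_vectors=10_000_000, dims=768, num_sub_vectors=192)
for name, value in sizing.items():
    print(f"{name}: {value:,}")
```

Under these assumptions the raw vectors come to 30.72 GB (about 28.6 GiB), in line with the roughly 29 to 30 GB reported for Parquet, Lance, and the HNSW estimate, while the PQ codes come to 1.92 GB.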