Enabled lance feature "aws" for S3-compatible storage via opendal.
BucketRegistry: added with_allow_http(true) so that MinIO and other
non-TLS S3 endpoints are accepted (fixes the "builder error" raised on
HTTP endpoints). lakehouse.toml gains a [[storage.buckets]] entry with
name = "s3:lakehouse" and the S3 backend config.
lance_backend.rs: adopts an S3 bucket naming convention in which
buckets whose name starts with "s3:" emit s3:// URIs for Lance
datasets. AWS_* env vars set in the systemd unit supply credentials
to Lance's internal object_store.
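The naming convention above can be sketched roughly as follows. This is
an illustration only, not the code in lance_backend.rs: the function
name lance_dataset_uri, the local_root parameter, and the exact path
layout are assumptions made for the example.

```rust
/// Hypothetical helper (not the actual lance_backend.rs code): maps a
/// registry bucket name plus a dataset path to the URI handed to Lance.
///
/// Buckets named "s3:<bucket>" yield "s3://<bucket>/<dataset>"; any other
/// bucket is treated as a local directory rooted at `local_root`.
fn lance_dataset_uri(bucket_name: &str, local_root: &str, dataset: &str) -> String {
    // Normalize so we never emit a double slash between the parts.
    let dataset = dataset.trim_start_matches('/');
    match bucket_name.strip_prefix("s3:") {
        // "s3:lakehouse" -> "s3://lakehouse/<dataset>"
        Some(s3_bucket) => format!("s3://{}/{}", s3_bucket, dataset),
        // No prefix: fall back to a plain filesystem path (assumed layout).
        None => format!("{}/{}", local_root.trim_end_matches('/'), dataset),
    }
}
```

Under these assumptions, lance_dataset_uri("s3:lakehouse", "/data",
"vectors.lance") gives "s3://lakehouse/vectors.lance", while a bucket
without the prefix resolves under the local root.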
Verified end-to-end against a real MinIO instance with 100K real 768d vectors:
- Migrate Parquet → Lance on S3: 1.7s (vs 0.57s local)
- Build IVF_PQ: 16.4s (CPU-bound, essentially same as local)
- Search: ~58ms p50 (vs 11ms local; the gap comes from S3 partition reads)
- Random doc fetch: 13ms (vs 3.5ms local)
- Recall@10: 0.835 (IVF_PQ training is randomized; consistent with local 0.805)
- Total S3 footprint: 637 MiB (vectors + index + lance metadata)
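For a quick read on the S3 overhead, the timings above can be turned
into slowdown factors. The sketch below only restates the numbers from
this message; the names TIMINGS and slowdown are made up for the
example.

```rust
/// (operation, S3 time, local time) taken from the measurements above.
/// Units differ per row (seconds vs milliseconds) but cancel in the ratio.
const TIMINGS: [(&str, f64, f64); 3] = [
    ("migrate Parquet -> Lance (s)", 1.7, 0.57),
    ("search p50 (ms)", 58.0, 11.0),
    ("random doc fetch (ms)", 13.0, 3.5),
];

/// Slowdown factor of S3 relative to local storage.
fn slowdown(s3: f64, local: f64) -> f64 {
    s3 / local
}

/// Formats one "operation: N.Nx" line per measurement.
fn slowdown_report() -> Vec<String> {
    TIMINGS
        .iter()
        .map(|(op, s3, local)| format!("{}: {:.1}x", op, slowdown(*s3, *local)))
        .collect()
}
```

This works out to roughly 3.0x for migration, 5.3x for search, and
3.7x for random doc fetch; the index build is omitted because no
separate local figure is given for it.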
The "public storage" claim from the PRD is now demonstrated: the hybrid
Parquet+HNSW ⊕ Lance architecture works on S3-compatible object
storage, not just on the local filesystem.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>