lakehouse/data/vectors/meta/resumes_100k_v2.json
root d61096e26f 100K embedding COMPLETE: 177/sec, 9.5 min, zero failures
- Supervisor 4-pipeline: 100,000 chunks embedded successfully
- Peak throughput: 177 chunks/sec (4.1x vs single-pipeline 43/sec); average over the full run: 174.8 chunks/sec
- Total time: 572s (9.5 minutes)
- Storage: 315 MB Parquet
- Brute-force search over 100K vectors: 4.5s
- Index metadata registered: nomic-embed-text, 768d, build stats
- Zero failures — supervisor retry handled all transient errors
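
To make the brute-force search figure above concrete, here is a minimal sketch of an exhaustive nearest-neighbour scan. It assumes cosine similarity as the metric (the commit does not say which metric is used), and the function names `cosine` and `brute_force_search` are hypothetical. Toy 3-dimensional vectors stand in for the 768-dimensional embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def brute_force_search(query, vectors, k=3):
    """Score every stored vector against the query; return the top-k (index, score) pairs."""
    scored = [(i, cosine(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy stand-ins for the real embeddings (assumption: illustrative data only).
vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
top = brute_force_search([1.0, 0.0, 0.0], vectors, k=2)
```

Every query touches every stored vector, so cost grows linearly with the number of chunks; that is the trade-off a brute-force scan accepts in exchange for exact results.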

Previous attempt (single pipeline): failed at 97K after 38 min
This attempt (supervisor): completed 100K in 9.5 min with retry
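
The supervisor-with-retry pattern described above might look roughly like the sketch below. This is not the project's implementation: `TransientError`, `run_with_retry`, `supervise`, and `flaky_embed` are hypothetical names, a thread pool stands in for the four pipelines, and the retry limit of 3 is an assumption.

```python
import concurrent.futures as cf
import threading


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying (timeouts, dropped connections)."""


def run_with_retry(task, batch, max_attempts=3):
    """Call task(batch), retrying on TransientError up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task(batch)
        except TransientError:
            if attempt == max_attempts:
                raise  # give up: the failure is no longer treated as transient


def supervise(task, batches, pipelines=4, max_attempts=3):
    """Fan batches out over a fixed number of workers; return results in input order."""
    with cf.ThreadPoolExecutor(max_workers=pipelines) as pool:
        futures = [pool.submit(run_with_retry, task, b, max_attempts) for b in batches]
        return [f.result() for f in futures]


_seen, _lock = set(), threading.Lock()


def flaky_embed(batch):
    """Stand-in for an embedding call: fails once per batch, then succeeds."""
    key = tuple(batch)
    with _lock:
        first_time = key not in _seen
        _seen.add(key)
    if first_time:
        raise TransientError("simulated timeout")
    return [len(text) for text in batch]


results = supervise(flaky_embed, [["a", "bb"], ["ccc"], ["dddd", "e"]])
```

Retrying per batch means a single transient error costs one batch of rework rather than ending the whole run, which is the failure mode the single-pipeline attempt hit at 97K.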

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 09:53:47 -05:00

15 lines | 399 B | JSON

{
  "index_name": "resumes_100k_v2",
  "source": "candidates",
  "model_name": "nomic-embed-text",
  "model_version": "latest",
  "dimensions": 768,
  "chunk_count": 100000,
  "doc_count": 100000,
  "chunk_size": 500,
  "overlap": 50,
  "storage_key": "vectors/resumes_100k_v2.parquet",
  "created_at": "2026-03-27T14:52:38.131450260Z",
  "build_time_secs": 572.1167,
  "chunks_per_sec": 174.78952
}
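
The build stats in this file can be cross-checked against each other: `chunks_per_sec` should equal `chunk_count / build_time_secs`, and `overlap` should be smaller than `chunk_size`. The sketch below runs those checks on the values shown above. The function `check_metadata` and the tolerance of 1e-4 are assumptions for illustration, not part of the project.

```python
import math

# Fields copied from the metadata file above.
meta = {
    "dimensions": 768,
    "chunk_count": 100000,
    "doc_count": 100000,
    "chunk_size": 500,
    "overlap": 50,
    "build_time_secs": 572.1167,
    "chunks_per_sec": 174.78952,
}


def check_metadata(m, rel_tol=1e-4):
    """Return a list of human-readable problems; an empty list means the stats agree."""
    problems = []
    if m["dimensions"] <= 0:
        problems.append("dimensions must be positive")
    if not 0 <= m["overlap"] < m["chunk_size"]:
        problems.append("overlap must be non-negative and smaller than chunk_size")
    if m["chunk_count"] < m["doc_count"]:
        problems.append("chunk_count is smaller than doc_count")
    # Average throughput implied by the other two fields.
    implied = m["chunk_count"] / m["build_time_secs"]
    if not math.isclose(implied, m["chunks_per_sec"], rel_tol=rel_tol):
        problems.append(f"chunks_per_sec {m['chunks_per_sec']} != implied {implied:.5f}")
    return problems


problems = check_metadata(meta)
build_minutes = meta["build_time_secs"] / 60  # about 9.5 minutes
```

For this file the list comes back empty: the recorded 174.78952 chunks/sec is the run-wide average, which is why it sits slightly below the 177 chunks/sec peak quoted in the commit message.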