100K embedding COMPLETE: 177/sec, 9.5 min, zero failures

- Supervisor 4-pipeline: 100,000 chunks embedded successfully
- Peak throughput: 177 chunks/sec (4.1x vs single-pipeline 43/sec); average over the full run: 174.8 chunks/sec
- Total time: 572s (9.5 minutes)
- Storage: 315 MB Parquet
- Brute-force search over 100K vectors: 4.5s
- Index metadata registered: nomic-embed-text, 768d, build stats
- Zero failures: supervisor retry handled all transient errors
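
For reference, the brute-force search timed above amounts to scoring every stored vector against the query and keeping the best few. The sketch below is a minimal pure-Python illustration, not the code in this commit. Cosine similarity and the names `cosine` and `brute_force_search` are assumptions; the commit does not state which metric or implementation is used.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def brute_force_search(query, vectors, k=3):
    """Score every stored vector against the query and return the top k.

    Returns a list of (score, index) pairs, best match first.
    """
    scored = ((cosine(query, vec), idx) for idx, vec in enumerate(vectors))
    return heapq.nlargest(k, scored)


# Toy 3-dimensional data; the real index holds 100,000 vectors of 768 dims.
vectors = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
]
top = brute_force_search([1.0, 0.0, 0.0], vectors, k=2)
top_indices = [idx for _, idx in top]
```

Every query touches all vectors, so cost grows linearly with index size: at 100,000 x 768 that is about 76.8 million multiply-adds per query before any indexing structure is introduced.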

Previous attempt (single pipeline): failed at 97K after 38 min
This attempt (supervisor): completed 100K in 9.5 min with retry
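
The supervisor pattern credited above can be sketched roughly as follows: split the job into ranges, fan them out over four workers, and retry a range when it raises. This is a hedged illustration only. The function names, the thread pool, the retry limit of 3, and the 2,500-chunk range size (borrowed from the checkpoint file removed in this commit) are assumptions, not the actual supervisor implementation.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def make_ranges(total, size):
    """Split [0, total) into half-open [start, end) ranges of at most `size`."""
    return [(start, min(start + size, total)) for start in range(0, total, size)]


def run_with_retry(embed_range, rng, max_attempts):
    """Call embed_range(start, end), retrying on any exception.

    Returns (rng, True) on success, (rng, False) once attempts are exhausted.
    """
    for _ in range(max_attempts):
        try:
            embed_range(*rng)
            return rng, True
        except Exception:
            continue  # treated as transient; a real supervisor would back off and log
    return rng, False


def supervise(embed_range, total, range_size=2500, pipelines=4, max_attempts=3):
    """Fan ranges out over `pipelines` workers and collect the outcome."""
    ranges = make_ranges(total, range_size)
    completed, failed = [], []
    with ThreadPoolExecutor(max_workers=pipelines) as pool:
        results = pool.map(lambda r: run_with_retry(embed_range, r, max_attempts), ranges)
        for rng, ok in results:
            (completed if ok else failed).append(rng)
    return completed, failed


# Hypothetical stand-in for the embedding call: fails once per range, then succeeds.
_seen = set()
_lock = threading.Lock()


def flaky_embed(start, end):
    with _lock:
        first_time = (start, end) not in _seen
        _seen.add((start, end))
    if first_time:
        raise RuntimeError("simulated transient error")


completed, failed = supervise(flaky_embed, total=10_000)
```

With retry in place, one transient error costs a single range rather than the whole run, which is the difference between the two attempts described above.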

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
root 2026-03-27 09:53:47 -05:00
parent e5b7663c20
commit d61096e26f
3 changed files with 15 additions and 1 deletions


@@ -1 +0,0 @@
{"job_id":"job-1774622586005","index_name":"resumes_100k_v2","total_chunks":100000,"completed_ranges":[[92500,95000],[95000,97500],[90000,92500],[97500,100000],[85000,87500],[87500,90000],[80000,82500],[82500,85000],[75000,77500],[77500,80000],[70000,72500],[72500,75000]],"failed_ranges":[],"embedded_count":30000}


@@ -0,0 +1,15 @@
{
  "index_name": "resumes_100k_v2",
  "source": "candidates",
  "model_name": "nomic-embed-text",
  "model_version": "latest",
  "dimensions": 768,
  "chunk_count": 100000,
  "doc_count": 100000,
  "chunk_size": 500,
  "overlap": 50,
  "storage_key": "vectors/resumes_100k_v2.parquet",
  "created_at": "2026-03-27T14:52:38.131450260Z",
  "build_time_secs": 572.1167,
  "chunks_per_sec": 174.78952
}
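
The build stats in this metadata can be cross-checked against each other and against the commit message. The sketch below recomputes the average rate and the build time in minutes, and estimates the raw vector payload. The 4-bytes-per-value figure assumes float32 storage, which the metadata does not state.

```python
import json

# The index metadata added in this commit, copied verbatim.
metadata = json.loads("""
{
  "index_name": "resumes_100k_v2",
  "source": "candidates",
  "model_name": "nomic-embed-text",
  "model_version": "latest",
  "dimensions": 768,
  "chunk_count": 100000,
  "doc_count": 100000,
  "chunk_size": 500,
  "overlap": 50,
  "storage_key": "vectors/resumes_100k_v2.parquet",
  "created_at": "2026-03-27T14:52:38.131450260Z",
  "build_time_secs": 572.1167,
  "chunks_per_sec": 174.78952
}
""")

# Average rate over the whole build; should match the recorded chunks_per_sec.
derived_rate = metadata["chunk_count"] / metadata["build_time_secs"]

# Build time in minutes, for comparison with the "9.5 min" in the message.
build_minutes = metadata["build_time_secs"] / 60

# Raw vector payload in MB (10^6 bytes), ASSUMING 4-byte float32 components.
raw_vector_mb = metadata["chunk_count"] * metadata["dimensions"] * 4 / 1e6
```

The recorded `chunks_per_sec` is the whole-run average, which is why it sits slightly below the 177 chunks/sec peak in the commit message. Under the float32 assumption the raw vectors come to 307.2 MB, roughly in line with the 315 MB Parquet file once other columns and file overhead are included.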

Binary file not shown.