lakehouse/data/_catalog/manifests/1339f3d6-7677-47fb-8182-5f8e43f27cde.json
root b2cd54e941 100K embedding: supervisor achieves 67.6/sec (57% faster than single pipeline)
- 4 parallel pipelines on i9 + A4000 via Ollama
- Previous single-pipeline: 43/sec, 39min for 100K
- Supervisor: 67.6/sec, 22min for 100K
- Previous 100K attempt failed at 97K (no retry) — supervisor handles this
- Checkpointing every 1000 chunks for crash recovery
- Round-robin retry on batch failure (3 attempts)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 09:45:59 -05:00
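
The retry and checkpoint behaviour described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the repository's supervisor: `embed_with_retry`, `run`, the checkpoint file layout, and `batch_size` are hypothetical names and values, and batches run one at a time here instead of across 4 parallel pipelines. Only the 3-attempt limit and the 1000-chunk checkpoint interval come from the message.

```python
import json
from pathlib import Path

MAX_ATTEMPTS = 3         # from the commit message: 3 attempts per batch
CHECKPOINT_EVERY = 1000  # from the commit message: checkpoint every 1000 chunks


def embed_with_retry(batch, pipelines, start):
    """Try a batch on up to MAX_ATTEMPTS pipelines, rotating round-robin.

    `pipelines` is a list of callables (hypothetical stand-ins for the
    Ollama-backed workers); `start` selects which pipeline is tried first.
    """
    last_error = None
    for attempt in range(MAX_ATTEMPTS):
        pipeline = pipelines[(start + attempt) % len(pipelines)]
        try:
            return pipeline(batch)
        except Exception as exc:  # any failure moves on to the next pipeline
            last_error = exc
    raise RuntimeError(f"batch failed after {MAX_ATTEMPTS} attempts") from last_error


def run(chunks, pipelines, checkpoint_path, batch_size=100):
    """Embed chunks sequentially, resuming from the checkpoint if one exists.

    Returns the embeddings produced in this call only; chunks already
    covered by the checkpoint are skipped.
    """
    done = 0
    if checkpoint_path.exists():
        done = json.loads(checkpoint_path.read_text())["done"]
    last_saved = done
    results = []
    for batch_index, offset in enumerate(range(done, len(chunks), batch_size)):
        batch = chunks[offset:offset + batch_size]
        results.extend(embed_with_retry(batch, pipelines, batch_index))
        done = offset + len(batch)
        # Persist progress every CHECKPOINT_EVERY chunks and at the very end.
        if done - last_saved >= CHECKPOINT_EVERY or done == len(chunks):
            checkpoint_path.write_text(json.dumps({"done": done}))
            last_saved = done
    return results
```

With a checkpoint in place, a crash late in a run (such as the earlier failure at 97K) would cost at most the chunks processed since the last checkpoint write, assuming the embeddings themselves are stored elsewhere as they are produced.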

{
  "id": "1339f3d6-7677-47fb-8182-5f8e43f27cde",
  "name": "job_orders",
  "schema_fingerprint": "auto",
  "objects": [
    {
      "bucket": "data",
      "key": "datasets/job_orders.parquet",
      "size_bytes": 905534,
      "created_at": "2026-03-27T14:42:38.935718195Z"
    }
  ],
  "created_at": "2026-03-27T14:42:38.935724058Z",
  "updated_at": "2026-03-27T14:42:38.935724058Z",
  "description": "",
  "owner": "",
  "sensitivity": null,
  "columns": [],
  "lineage": null,
  "freshness": null,
  "tags": [],
  "row_count": null
}
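
A manifest like this one can be inspected with a few lines of Python. The sketch below is illustrative only: `summarize_manifest` and `OPTIONAL_FIELDS` are hypothetical names, not part of the lakehouse code, and the field names are copied from the manifest above. It totals the object sizes and lists the metadata fields that are still empty or null.

```python
import json

# Metadata fields that are empty or null in the manifest above.
OPTIONAL_FIELDS = (
    "description", "owner", "sensitivity", "columns",
    "lineage", "freshness", "tags", "row_count",
)


def summarize_manifest(text):
    """Parse a manifest JSON string and report size and unset metadata."""
    manifest = json.loads(text)
    objects = manifest["objects"]
    # A field counts as unset when it is missing, null, "" or an empty list.
    unset = [
        name for name in OPTIONAL_FIELDS
        if manifest.get(name) in (None, "", [])
    ]
    return {
        "name": manifest["name"],
        "object_count": len(objects),
        "total_bytes": sum(obj["size_bytes"] for obj in objects),
        "unset_fields": unset,
    }
```

Applied to the `job_orders` manifest, this would report a single object of 905534 bytes and flag every field in `OPTIONAL_FIELDS` as unset.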