lakehouse/data/_catalog/manifests/e959ca11-9f6b-4843-864a-cc3f50a8aa60.json
root b2cd54e941 100K embedding: supervisor achieves 67.6/sec (57% faster than single pipeline)
- 4 parallel pipelines on i9 + A4000 via Ollama
- Previous single-pipeline: 43/sec, 39min for 100K
- Supervisor: 67.6/sec, 22min for 100K
- Previous 100K attempt failed at 97K (no retry) — supervisor handles this
- Checkpointing every 1000 chunks for crash recovery
- Round-robin retry on batch failure (3 attempts)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 09:45:59 -05:00

23 lines
538 B
JSON

{
"id": "e959ca11-9f6b-4843-864a-cc3f50a8aa60",
"name": "email_log",
"schema_fingerprint": "auto",
"objects": [
{
"bucket": "data",
"key": "datasets/email_log.parquet",
"size_bytes": 16768671,
"created_at": "2026-03-27T14:42:49.271082991Z"
}
],
"created_at": "2026-03-27T14:42:49.271091077Z",
"updated_at": "2026-03-27T14:42:49.271091077Z",
"description": "",
"owner": "",
"sensitivity": null,
"columns": [],
"lineage": null,
"freshness": null,
"tags": [],
"row_count": null
}