lakehouse/data/_catalog/manifests/0fd78303-9ad4-45fd-90d7-db95607d9ab1.json
root b2cd54e941 100K embedding: supervisor achieves 67.6/sec (57% faster than single pipeline)
- 4 parallel pipelines on i9 + A4000 via Ollama
- Previous single-pipeline: 43/sec, 39min for 100K
- Supervisor: 67.6/sec, 22min for 100K
- Previous 100K attempt failed at 97K (no retry); the supervisor now retries failed batches
- Checkpointing every 1000 chunks for crash recovery
- Round-robin retry on batch failure (3 attempts)
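
The retry and checkpoint behaviour listed above could look roughly like the sketch below. It is an illustration only, not the supervisor's actual code: `run_with_retry`, `workers`, and `on_checkpoint` are hypothetical names, and the loop runs batches sequentially for clarity rather than across 4 parallel pipelines.

```python
from typing import Callable, List, Optional, Sequence


def run_with_retry(
    batches: Sequence[Sequence],
    workers: Sequence[Callable[[Sequence], object]],
    max_attempts: int = 3,          # "3 attempts" from the commit message
    checkpoint_every: int = 1000,   # "every 1000 chunks" from the commit message
    on_checkpoint: Optional[Callable[[int], None]] = None,
) -> List[object]:
    """Process batches in order; a failed batch is retried on the next worker."""
    results: List[object] = []
    done = 0             # chunks processed so far
    last_checkpoint = 0  # chunk count at the most recent checkpoint
    for i, batch in enumerate(batches):
        start = i % len(workers)  # spread first attempts across workers
        for attempt in range(max_attempts):
            # Round-robin: each retry moves on to the next worker in the ring.
            worker = workers[(start + attempt) % len(workers)]
            try:
                results.append(worker(batch))
                break
            except Exception:
                if attempt == max_attempts - 1:
                    raise  # all attempts exhausted; surface the last error
        done += len(batch)
        if on_checkpoint is not None and done - last_checkpoint >= checkpoint_every:
            on_checkpoint(done)  # e.g. persist progress for crash recovery
            last_checkpoint = done
    return results
```

A caller would pass one callable per pipeline and a checkpoint hook that records the processed-chunk count, so a restart can skip completed work.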

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 09:45:59 -05:00

{
"id": "0fd78303-9ad4-45fd-90d7-db95607d9ab1",
"name": "timesheets",
"schema_fingerprint": "auto",
"objects": [
{
"bucket": "data",
"key": "datasets/timesheets.parquet",
"size_bytes": 17539932,
"created_at": "2026-03-27T14:42:43.922019299Z"
}
],
"created_at": "2026-03-27T14:42:43.922025703Z",
"updated_at": "2026-03-27T14:42:43.922025703Z",
"description": "",
"owner": "",
"sensitivity": null,
"columns": [],
"lineage": null,
"freshness": null,
"tags": [],
"row_count": null
}
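
As a reading aid, a manifest of this shape can be summarized with a few lines of Python. This is a minimal sketch that assumes only the field names visible in the JSON above; `summarize_manifest` is a hypothetical helper and is not part of the lakehouse codebase.

```python
import json

# Metadata fields that appear in the manifest above and may be left unset.
OPTIONAL_FIELDS = ("description", "owner", "sensitivity", "lineage", "freshness", "row_count")


def summarize_manifest(text: str) -> dict:
    """Return the dataset name, object totals, and which metadata is unset."""
    manifest = json.loads(text)
    objects = manifest.get("objects", [])
    return {
        "name": manifest["name"],
        "object_count": len(objects),
        "total_bytes": sum(obj["size_bytes"] for obj in objects),
        # Treat null and empty string as "not filled in"; 0 counts as a value.
        "missing_metadata": sorted(
            key for key in OPTIONAL_FIELDS if manifest.get(key) in (None, "")
        ),
    }
```

For the `timesheets` manifest this reports one object of 17539932 bytes and flags all six optional fields as unset.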