lakehouse/data/_catalog/manifests/d8170213-d6af-4478-ae23-59f06fda3165.json
root 6a532cb248 Background job system for embedding — fixes 100K timeout
- JobTracker: create/update/complete/fail jobs with progress tracking
- POST /vectors/index now returns immediately with job_id (HTTP 202)
- Embedding runs in tokio::spawn background task
- GET /vectors/jobs/{id} returns live progress (chunks embedded, rate, ETA)
- GET /vectors/jobs lists all jobs
- Progress logged every 100 batches with chunks/sec and ETA
- 100K embedding job running successfully at 44 chunks/sec
- System stays responsive during embedding (queries in 23ms)
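
The job lifecycle listed above (create, update, complete, fail, plus rate and ETA reporting) can be sketched roughly as follows. This is a minimal illustration under assumptions, not the code from this commit: the `u64` job ids, the field names, and the `start_index` helper are all hypothetical, and `std::thread::spawn` stands in for `tokio::spawn` so the sketch needs only the standard library.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Instant;

#[derive(Clone, Debug, PartialEq)]
pub enum JobStatus {
    Running,
    Completed,
    Failed(String),
}

// Hypothetical job record; the real JobTracker's fields are not shown in the commit.
#[derive(Clone, Debug)]
pub struct Job {
    pub status: JobStatus,
    pub chunks_done: u64,
    pub chunks_total: u64,
    pub started: Instant,
}

impl Job {
    /// Chunks per second over the given elapsed time.
    pub fn rate(&self, elapsed_secs: f64) -> f64 {
        if elapsed_secs > 0.0 { self.chunks_done as f64 / elapsed_secs } else { 0.0 }
    }

    /// Estimated seconds remaining, or None while no progress has been made.
    pub fn eta_secs(&self, elapsed_secs: f64) -> Option<f64> {
        let rate = self.rate(elapsed_secs);
        let remaining = self.chunks_total.saturating_sub(self.chunks_done) as f64;
        if rate > 0.0 { Some(remaining / rate) } else { None }
    }
}

#[derive(Default)]
struct Inner {
    next_id: u64,
    jobs: HashMap<u64, Job>,
}

/// Cheaply cloneable handle shared by request handlers and the worker.
#[derive(Clone, Default)]
pub struct JobTracker {
    inner: Arc<Mutex<Inner>>,
}

impl JobTracker {
    pub fn create(&self, chunks_total: u64) -> u64 {
        let mut g = self.inner.lock().unwrap();
        g.next_id += 1;
        let id = g.next_id;
        let job = Job { status: JobStatus::Running, chunks_done: 0, chunks_total, started: Instant::now() };
        g.jobs.insert(id, job);
        id
    }

    pub fn update(&self, id: u64, chunks_done: u64) {
        if let Some(job) = self.inner.lock().unwrap().jobs.get_mut(&id) {
            job.chunks_done = chunks_done;
        }
    }

    pub fn complete(&self, id: u64) {
        if let Some(job) = self.inner.lock().unwrap().jobs.get_mut(&id) {
            job.status = JobStatus::Completed;
        }
    }

    pub fn fail(&self, id: u64, reason: &str) {
        if let Some(job) = self.inner.lock().unwrap().jobs.get_mut(&id) {
            job.status = JobStatus::Failed(reason.to_string());
        }
    }

    /// Snapshot of one job, as a GET /vectors/jobs/{id} handler might return it.
    pub fn get(&self, id: u64) -> Option<Job> {
        self.inner.lock().unwrap().jobs.get(&id).cloned()
    }

    /// Ids of all known jobs, sorted, as a GET /vectors/jobs handler might list them.
    pub fn list(&self) -> Vec<u64> {
        let mut ids: Vec<u64> = self.inner.lock().unwrap().jobs.keys().copied().collect();
        ids.sort();
        ids
    }
}

/// Registers a job and returns its id immediately; the work runs in the background.
/// The embedding step itself is elided here.
pub fn start_index(tracker: &JobTracker, chunks: Vec<String>) -> (u64, thread::JoinHandle<()>) {
    let id = tracker.create(chunks.len() as u64);
    let t = tracker.clone();
    let handle = thread::spawn(move || {
        for (i, _chunk) in chunks.iter().enumerate() {
            // Embedding of `_chunk` would happen here.
            t.update(id, (i + 1) as u64);
        }
        t.complete(id);
    });
    (id, handle)
}
```

In a sketch like this, the handler for POST /vectors/index would call `start_index`, drop or store the handle, and respond with the id, while the status endpoints only take the lock briefly to read a snapshot. That short lock scope is one plausible reason queries can stay responsive while embedding runs.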

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 09:03:07 -05:00

{
  "id": "d8170213-d6af-4478-ae23-59f06fda3165",
  "name": "job_orders",
  "schema_fingerprint": "auto",
  "objects": [
    {
      "bucket": "data",
      "key": "datasets/job_orders.parquet",
      "size_bytes": 905534,
      "created_at": "2026-03-27T14:00:35.780022147Z"
    }
  ],
  "created_at": "2026-03-27T14:00:35.780029168Z",
  "updated_at": "2026-03-27T14:00:35.780029168Z"
}
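
For readers working with this manifest from Rust, its shape could be mirrored by plain structs along these lines. This is only a sketch: the type names, the `total_bytes` and `file_name` helpers, and the choice to keep timestamps as strings are assumptions, and real code would more likely derive deserialization with a JSON library than build values by hand.

```rust
/// One stored object referenced by a manifest (field names follow the JSON keys).
#[derive(Clone, Debug, PartialEq)]
pub struct ObjectRef {
    pub bucket: String,
    pub key: String,
    pub size_bytes: u64,
    pub created_at: String, // RFC 3339 timestamp kept as text in this sketch
}

/// Hypothetical in-memory form of a dataset manifest.
#[derive(Clone, Debug, PartialEq)]
pub struct Manifest {
    pub id: String,
    pub name: String,
    pub schema_fingerprint: String,
    pub objects: Vec<ObjectRef>,
    pub created_at: String,
    pub updated_at: String,
}

impl Manifest {
    /// Sum of the sizes of all objects in the dataset.
    pub fn total_bytes(&self) -> u64 {
        self.objects.iter().map(|o| o.size_bytes).sum()
    }

    /// File name for this manifest, assuming it is named after its id
    /// (as is the case for the file shown here).
    pub fn file_name(&self) -> String {
        format!("{}.json", self.id)
    }
}

/// Builds the manifest shown above by hand, for illustration.
pub fn sample_manifest() -> Manifest {
    Manifest {
        id: "d8170213-d6af-4478-ae23-59f06fda3165".to_string(),
        name: "job_orders".to_string(),
        schema_fingerprint: "auto".to_string(),
        objects: vec![ObjectRef {
            bucket: "data".to_string(),
            key: "datasets/job_orders.parquet".to_string(),
            size_bytes: 905_534,
            created_at: "2026-03-27T14:00:35.780022147Z".to_string(),
        }],
        created_at: "2026-03-27T14:00:35.780029168Z".to_string(),
        updated_at: "2026-03-27T14:00:35.780029168Z".to_string(),
    }
}
```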