root
|
21e8015b60
|
Phase 37: Hot-swap async + Phase 38: Universal API skeleton
- JobTracker extended with JobType::ProfileActivation + Embed
- activate_profile returns job_id immediately, work spawns in background
- /v1/chat, /v1/usage, /v1/sessions endpoints (OpenAI-compatible)
- Langfuse trace integration (Phase 40 early deliverable)
- 12 gateway unit tests green, curl gates pass
|
2026-04-23 01:56:17 -05:00 |
|
root
|
3b695cd592
|
Dual-pipeline supervisor for embedding ingestion
- 4 parallel pipelines (tuned for i9 + A4000)
- Range-based work splitting (2500 chunks per range)
- Round-robin retry on failure (3 attempts before dead-letter)
- Checkpointing to disk every 1000 chunks (crash recovery)
- On restart, loads checkpoint and skips completed ranges
- Dead-letter queue for permanently failed ranges
- Vectors assembled in order after all pipelines finish
- Batch size 64 for GPU throughput
Architecture:
Supervisor → splits 100K chunks into 40 ranges
├── Pipeline 0: grabs range, embeds, reports progress
├── Pipeline 1: grabs range, embeds, reports progress
├── Pipeline 2: grabs range, embeds, reports progress
└── Pipeline 3: grabs range, embeds, reports progress
Failed range → back to queue → next available pipeline retries
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-03-27 09:06:28 -05:00 |
|