root
d61096e26f
100K embedding COMPLETE: 177/sec, 9.5 min, zero failures
...
- Supervisor 4-pipeline: 100,000 chunks embedded successfully
- Peak throughput: 177 chunks/sec (4.1x vs single-pipeline 43/sec)
- Total time: 572s (9.5 minutes)
- Storage: 315 MB Parquet
- Brute-force search over 100K vectors: 4.5s
- Index metadata registered: nomic-embed-text, 768d, build stats
- Zero failures — supervisor retry handled all transient errors
Previous attempt (single pipeline): failed at 97K after 38 min
This attempt (supervisor): completed 100K in 9.5 min with retry
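The supervisor itself isn't shown in this commit; below is a minimal sketch of the
pattern under stated assumptions: a shared work queue, plain std threads instead of
whatever runtime the real pipelines use, and a hypothetical embed() stub standing in
for the actual Ollama client.

    use std::sync::{mpsc, Arc, Mutex};
    use std::thread;

    const PIPELINES: usize = 4;
    const MAX_RETRIES: usize = 3;

    // Hypothetical stand-in for the real embedding call (Ollama, nomic-embed-text).
    fn embed(_chunk: &str) -> Result<Vec<f32>, String> {
        Ok(vec![0.0; 768]) // 768-d vectors, per the index metadata above
    }

    fn main() {
        let work: Vec<String> = (0..1_000).map(|i| format!("chunk {i}")).collect();
        let queue = Arc::new(Mutex::new(work));
        let (tx, rx) = mpsc::channel::<Vec<f32>>();

        // Supervisor: run N pipelines in parallel over one shared queue.
        let pipelines: Vec<_> = (0..PIPELINES)
            .map(|_| {
                let queue = Arc::clone(&queue);
                let tx = tx.clone();
                thread::spawn(move || loop {
                    // Pull the next chunk; exit when the queue is drained.
                    let chunk = match queue.lock().unwrap().pop() {
                        Some(c) => c,
                        None => return,
                    };
                    // Retry transient errors instead of aborting the whole run.
                    for attempt in 1..=MAX_RETRIES {
                        match embed(&chunk) {
                            Ok(vector) => {
                                tx.send(vector).unwrap();
                                break;
                            }
                            Err(e) if attempt < MAX_RETRIES => {
                                eprintln!("transient error (attempt {attempt}): {e}");
                            }
                            Err(e) => eprintln!("chunk dropped after retries: {e}"),
                        }
                    }
                })
            })
            .collect();

        drop(tx); // rx.iter() ends once every pipeline's sender is gone
        let embedded = rx.iter().count();
        for p in pipelines {
            p.join().unwrap();
        }
        println!("embedded {embedded} chunks across {PIPELINES} pipelines");
    }

The real pipelines presumably batch chunks and write Parquet as they go; this only
shows the fan-out plus per-chunk retry shape that let the run survive transient errors.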
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 09:53:47 -05:00
root
6df904a03c
Phase 8: Hot cache + incremental delta updates
...
- MemCache: LRU in-memory cache for hot datasets (configurable max, default 16GB)
  Pin/evict/stats endpoints: POST /query/cache/pin, POST /query/cache/evict, GET /query/cache/stats
- Delta store: append-only delta Parquet files for row-level updates
  Write deltas without rewriting base files; merge at query time
- Compaction: POST /query/compact merges deltas into base Parquet
- Query engine: checks cache first, falls back to Parquet, merges deltas (sketch below)
- Benchmarked on 2.47M rows:
  1M row JOIN: 854ms cold → 96ms hot (8.9x speedup)
  100K filter: 62ms cold → 21ms hot (3x speedup)
  1.1M rows cached in 408MB RAM
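None of this code appears in the commit; the following is a minimal sketch of the
read path under stated assumptions: Row, read_base_parquet, and QueryEngine are
hypothetical stand-ins, the real engine would read through the parquet/arrow crates,
and a plain HashMap stands in for the actual LRU cache.

    use std::collections::HashMap;

    type Row = (u64, String); // hypothetical row: (row id, payload)

    // Stub for the real base-file reader (would go through the parquet/arrow crates).
    fn read_base_parquet(_dataset: &str) -> Vec<Row> {
        vec![(1, "alice".into()), (2, "bob".into())]
    }

    struct QueryEngine {
        cache: HashMap<String, Vec<Row>>,  // hot datasets in RAM (real one: LRU, 16GB cap)
        deltas: HashMap<String, Vec<Row>>, // append-only row-level updates per dataset
    }

    impl QueryEngine {
        fn scan(&mut self, dataset: &str) -> Vec<Row> {
            // 1. Check the hot cache first.
            let base = match self.cache.get(dataset) {
                Some(rows) => rows.clone(),
                // 2. Fall back to the base Parquet file and warm the cache.
                None => {
                    let rows = read_base_parquet(dataset);
                    self.cache.insert(dataset.to_string(), rows.clone());
                    rows
                }
            };
            // 3. Merge deltas at query time: a delta row shadows the base row with
            //    the same id. Compaction performs this same merge once, rewriting
            //    the base file so the delta list can be cleared.
            let mut merged: HashMap<u64, String> = base.into_iter().collect();
            for (id, payload) in self.deltas.get(dataset).into_iter().flatten() {
                merged.insert(*id, payload.clone());
            }
            let mut rows: Vec<Row> = merged.into_iter().collect();
            rows.sort_by_key(|(id, _)| *id);
            rows
        }
    }

    fn main() {
        let mut engine = QueryEngine { cache: HashMap::new(), deltas: HashMap::new() };
        engine.deltas.entry("candidates".to_string()).or_default().push((2, "bob v2".into()));
        println!("{:?}", engine.scan("candidates")); // cold: Parquet read + delta merge
        println!("{:?}", engine.scan("candidates")); // hot: served from cache
    }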
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 08:37:28 -05:00
root
eae51977ab
Scale test: 2.47M rows + 10K vector index benchmarked
...
Benchmarks on 128GB RAM server:
- 100K candidate filter (skills+city+status): 257ms
- 1M timesheet aggregation (revenue by client): 942ms
- 800K call log cross-reference (cold leads): 642ms
- Triple JOIN recruiter performance: 487ms
- 500K email open rate aggregation: 259ms
- COUNT all 2.47M rows: 84ms
- 10K vector search (cosine similarity): ~450ms
- Embedding throughput: 49 chunks/sec via Ollama
- RAG correctly refuses to hallucinate when no match exists
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 08:31:37 -05:00
root
26fc98c885
Phase 7: Vector index + RAG pipeline
...
- vectord crate: chunk → embed → store → search → RAG
- chunker: configurable chunk size + overlap, sentence-boundary aware splitting (sketch below)
- store: embeddings as Parquet (binary blob f32 vectors), portable format
- search: brute-force cosine similarity (works up to ~100K vectors; sketch below)
- rag: full pipeline — embed question → search index → retrieve context → LLM answer
- Endpoints: POST /vectors/index, /vectors/search, /vectors/rag
- Gateway wired with vectord service
- Tested: 200 candidate resumes indexed in 5.4s, semantic search + RAG working
- 20 unit tests passing (chunker, search, ingestd, shared)
- AI gives honest "no match found" when context doesn't support an answer
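The chunker's splitting rule is only described above; here is a minimal sketch of
sentence-boundary-aware chunking with overlap, assuming sentences end at '.', '!'
or '?'. The real chunker's boundary detection and size accounting may differ.

    // Split text on sentence boundaries, then pack sentences into chunks of at
    // most `max_chars`, carrying `overlap` sentences forward between chunks.
    fn chunk(text: &str, max_chars: usize, overlap: usize) -> Vec<String> {
        // Naive boundary detection: a sentence ends at '.', '!' or '?'.
        let mut sentences = Vec::new();
        let mut start = 0;
        for (i, c) in text.char_indices() {
            if matches!(c, '.' | '!' | '?') {
                sentences.push(text[start..=i].trim().to_string());
                start = i + c.len_utf8();
            }
        }
        if start < text.len() {
            let tail = text[start..].trim();
            if !tail.is_empty() {
                sentences.push(tail.to_string());
            }
        }

        // Pack sentences greedily; seed each new chunk with the last `overlap`
        // sentences of the previous one for context continuity.
        let mut chunks: Vec<Vec<String>> = Vec::new();
        let mut current: Vec<String> = Vec::new();
        for s in sentences {
            let len: usize = current.iter().map(|x| x.len() + 1).sum();
            if !current.is_empty() && len + s.len() > max_chars {
                let carry = current[current.len().saturating_sub(overlap)..].to_vec();
                chunks.push(std::mem::replace(&mut current, carry));
            }
            current.push(s);
        }
        if !current.is_empty() {
            chunks.push(current);
        }
        chunks.into_iter().map(|c| c.join(" ")).collect()
    }

    fn main() {
        let text = "Rust is fast. Parquet is columnar. Vectors are dense. RAG needs context.";
        for (i, c) in chunk(text, 40, 1).iter().enumerate() {
            println!("chunk {i}: {c}");
        }
    }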
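Likewise, a sketch of the brute-force search; the function names are illustrative,
but cosine similarity over f32 vectors is the operation the commit names. The scan
is O(n·d), which at 768 dimensions is why ~100K vectors is a practical ceiling
before an approximate index would be needed.

    // Cosine similarity between two equal-length f32 vectors.
    fn cosine(a: &[f32], b: &[f32]) -> f32 {
        let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
        let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
        let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
        if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
    }

    // Brute-force top-k: score every stored vector, keep the k best.
    fn search<'a>(query: &[f32], index: &'a [(String, Vec<f32>)], k: usize) -> Vec<(&'a str, f32)> {
        let mut scored: Vec<(&str, f32)> = index
            .iter()
            .map(|(id, v)| (id.as_str(), cosine(query, v)))
            .collect();
        // Sort descending by similarity and keep the top k.
        scored.sort_by(|a, b| b.1.total_cmp(&a.1));
        scored.truncate(k);
        scored
    }

    fn main() {
        let index = vec![
            ("resume-1".to_string(), vec![0.9, 0.1, 0.0]),
            ("resume-2".to_string(), vec![0.1, 0.9, 0.0]),
        ];
        for (id, score) in search(&[1.0, 0.0, 0.0], &index, 1) {
            println!("{id}: {score:.3}");
        }
    }

The RAG endpoint chains this search with the embed step: embed the question, run
the top-k search above, and pass the retrieved chunks to the LLM as context.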
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 08:12:28 -05:00