Captures everything needed to stand this architecture up on a fresh Debian 13 box with NO local AI (cloud-only via OpenRouter for generation + OpenAI/Voyage/Cohere for embeddings).

Includes:
- Required external accounts (OpenRouter, OpenAI for embeddings, MinIO, Postgres+pgvector, optional Langfuse)
- The cloud-only embedding decision (nomic-embed-text via local Ollama is the one piece that MUST be swapped; OpenAI text-embedding-3-small is recommended as the default cloud path)
- System packages, toolchains (Rust + Bun), Postgres setup
- All required env vars for gateway, sidecar, observer
- Configuration files (lakehouse.toml, providers.toml, secrets.toml)
- systemd unit for the gateway
- Validation steps (curl probes for gateway, sidecar, observer, /v1/chat through OpenRouter, embedding round-trip, vectorize a small corpus, run the agent test)
- Exact code spots to modify for cloud-only port (5 files, none fundamental: Phase 39 ProviderAdapter makes this provider-agnostic by design)

Heavy test data (.parquet files, ~470 MB) is deliberately excluded from this snapshot. REPLICATION.md documents how to regenerate it via the dump_raw_corpus + vectorize_raw_corpus scripts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
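
As an illustration of the "embedding round-trip" validation step, here is a minimal Python sketch that builds a request for OpenAI's `/v1/embeddings` endpoint with `text-embedding-3-small` and checks the shape of the reply. It is not taken from this repository: the helper names (`build_embedding_request`, `parse_embeddings`, `embed`) are hypothetical, and the actual gateway and sidecar code paths may differ. The 1536 dimension is the model's default output size. Vectors produced by a different embedding model would most likely need to be regenerated after the swap.

```python
import json
import urllib.request

OPENAI_EMBEDDINGS_URL = "https://api.openai.com/v1/embeddings"
DEFAULT_MODEL = "text-embedding-3-small"
EXPECTED_DIM = 1536  # default output size of text-embedding-3-small


def build_embedding_request(texts, api_key, model=DEFAULT_MODEL):
    """Build (but do not send) a POST request for a batch of texts.

    Hypothetical helper for illustration; not part of the repository.
    """
    body = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        OPENAI_EMBEDDINGS_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embeddings(payload, expected_dim=EXPECTED_DIM):
    """Return vectors in input order, rejecting unexpected dimensions."""
    rows = sorted(payload["data"], key=lambda row: row["index"])
    vectors = [row["embedding"] for row in rows]
    for vec in vectors:
        if len(vec) != expected_dim:
            raise ValueError(
                f"expected {expected_dim} dimensions, got {len(vec)}"
            )
    return vectors


def embed(texts, api_key):
    """Send the request and parse the reply (needs network and a valid key)."""
    req = build_embedding_request(texts, api_key)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return parse_embeddings(json.load(resp))
```

Calling `embed(["hello"], api_key)` with a real key should return one 1536-element vector. The dimension check is there to catch a mismatch before anything is written to a pgvector column sized for a different model.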
Description
Architectural checkpoint: matrix-driven agent loop with Mem0 versioning + deletion validated end-to-end on Chicago permit data
Languages
Rust 44.2%
TypeScript 30.4%
HTML 12.5%
Python 8.6%
JavaScript 1.7%
Other 2.6%