Five threads of work landing as one milestone — all individually
verified end-to-end against real data, full release build clean,
46 unit tests pass.
## Phase 16.2 / 16.5 — autotune agent + ingest triggers
`vectord::agent` is a long-running tokio task that watches the trial
journal and autonomously proposes + runs new HNSW configs. Distinct
from `autotune::run_autotune` (synchronous one-shot grid). Triggered
on POST /vectors/agent/enqueue/{idx} or by the periodic wake; ingest
paths now push DatasetAppended events when an index's source dataset
gets re-ingested. Rate-limited (max_trials_per_hour) and cooldown-
gated so it can't saturate Ollama under live load.
The proposer is ε-greedy around the current champion: with probability 0.25
it samples uniformly from the full bounds, otherwise it perturbs the champion
by a small delta on both axes. Proposals are deduped against history.
Deterministic — the RNG is seeded from history.len(), so the same journal
state proposes the same next config (helps offline replay debugging).
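A sketch of that proposal loop, with hypothetical parameter names, bounds, and
a splitmix64 stand-in for the real RNG (the actual `vectord::agent` types
differ):

```rust
// Illustrative ε-greedy proposer. HnswParams, the bounds, and the delta
// ranges are assumptions for the sketch, not the real agent's values.
#[derive(Clone, Copy, Debug, PartialEq)]
struct HnswParams { m: u32, ef_construction: u32 }

const M_BOUNDS: (u32, u32) = (4, 64);
const EF_BOUNDS: (u32, u32) = (32, 512);

// Tiny deterministic PRNG (splitmix64): same seed, same sequence.
fn next(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

fn clamp(v: i64, (lo, hi): (u32, u32)) -> u32 {
    v.clamp(lo as i64, hi as i64) as u32
}

fn propose(champion: HnswParams, history_len: usize) -> HnswParams {
    // Seed from journal length: the same journal state proposes the same config.
    let mut rng = history_len as u64;
    if next(&mut rng) % 4 == 0 {
        // Explore (prob 0.25): uniform sample from the full bounds.
        HnswParams {
            m: M_BOUNDS.0 + (next(&mut rng) % ((M_BOUNDS.1 - M_BOUNDS.0 + 1) as u64)) as u32,
            ef_construction: EF_BOUNDS.0
                + (next(&mut rng) % ((EF_BOUNDS.1 - EF_BOUNDS.0 + 1) as u64)) as u32,
        }
    } else {
        // Exploit: perturb the champion by a small delta on both axes.
        let dm = (next(&mut rng) % 5) as i64 - 2;    // -2..=2
        let def = (next(&mut rng) % 33) as i64 - 16; // -16..=16
        HnswParams {
            m: clamp(champion.m as i64 + dm, M_BOUNDS),
            ef_construction: clamp(champion.ef_construction as i64 + def, EF_BOUNDS),
        }
    }
}
```

Seeding from the journal length rather than wall clock is what makes replay
reproducible: re-running the agent against a copied journal walks the same
proposal sequence.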
`[agent]` config section in lakehouse.toml; opt-in via enabled=true.
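The opt-in shape, with the remaining knobs spelled out at their shipped
defaults (values match the `AgentSettings` defaults in the shared config):

```toml
[agent]
enabled = true                    # off by default; the agent never runs unless set
cycle_interval_secs = 60          # periodic wake
cooldown_between_trials_secs = 30
min_recall = 0.9
max_trials_per_hour = 30          # rate limit so it can't saturate Ollama
```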
## Federation Layer 2 — runtime bucket lifecycle + per-index scoping
`BucketRegistry.buckets` moved to `std::sync::RwLock<HashMap>` so
buckets can be added/removed after startup. POST /storage/buckets
provisions at runtime; DELETE /storage/buckets/{name} unregisters
(refuses primary/rescue with 403). Local-backend buckets get their
root directory auto-created.
`IndexMeta.bucket` (default "primary" via serde) records each index's
home bucket. `TrialJournal` and `PromotionRegistry` now hold
Arc<BucketRegistry> + IndexRegistry; they resolve target store per-
index via IndexMeta.bucket. PromotionRegistry::list_all scans every
bucket and dedups by index_name. Pre-federation indexes keep working
unchanged — they just default to primary.
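The resolver pattern reduces to a lookup through the registry's lock; a
minimal sketch with simplified types (the real registry holds backend
handles behind `Arc`, not plain paths):

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Simplified stand-ins for the real vectord types.
struct IndexMeta { bucket: String }

struct BucketRegistry {
    // RwLock so buckets can be added/removed after startup.
    buckets: RwLock<HashMap<String, String /* root path or URI */>>,
}

impl BucketRegistry {
    // Resolve an index's home bucket to its store root, if registered.
    fn resolve(&self, meta: &IndexMeta) -> Option<String> {
        self.buckets.read().unwrap().get(&meta.bucket).cloned()
    }
}
```

Pre-federation indexes deserialize with `bucket = "primary"`, so they resolve
through the same path without any migration step.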
`ModelProfile.bucket: Option<String>` declares per-profile artifact
home. POST /vectors/profile/{id}/activate auto-provisions the
profile's bucket under storage.profile_root if not yet registered.
EvalSets stay primary-only for now — noted gap, low-risk to extend
later with the same resolver pattern.
## Phase 17 — VRAM-aware two-profile gate
Sidecar gains POST /admin/unload (Ollama keep_alive=0 trick — forces
immediate VRAM release), POST /admin/preload (keep_alive=5m with
empty prompt, takes the slot warm), and GET /admin/vram (combines
nvidia-smi snapshot with Ollama /api/ps). Exposed via aibridge as
unload_model / preload_model / vram_snapshot.
`VectorState.active_profile` is the GPU-slot singleton —
Arc<RwLock<Option<ActiveProfileSlot>>>. activate_profile checks for
a previous profile with a different ollama_name and unloads it
before preloading the new one; same-model reactivations skip the
unload (Ollama no-ops). New routes: POST /vectors/profile/{id}/
deactivate (unload + clear slot), GET /vectors/profile/active.
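The swap decision itself is small; a hedged sketch with illustrative types
(the real path then awaits the sidecar's unload/preload calls):

```rust
// Illustrative slot type; the actual ActiveProfileSlot carries more fields.
#[derive(Clone, Debug, PartialEq)]
struct ActiveProfileSlot { profile_id: String, ollama_name: String }

/// Returns the model to unload (if any) before preloading `next`.
fn plan_swap(current: Option<&ActiveProfileSlot>, next: &ActiveProfileSlot) -> Option<String> {
    match current {
        // A different model holds the GPU slot: unload it first.
        Some(prev) if prev.ollama_name != next.ollama_name => Some(prev.ollama_name.clone()),
        // Same model, or empty slot: skip the unload (Ollama no-ops anyway).
        _ => None,
    }
}
```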
Verified live: staffing-recruiter (qwen2.5) → docs-assistant
(mistral) swap freed qwen2.5 from VRAM and loaded mistral. nomic-
embed-text persists across swaps because both profiles use it —
free optimization that fell out of the design. Scoped search
correctly 403s cross-profile in both directions.
## MySQL streaming connector
`crates/ingestd/src/my_stream.rs` mirrors pg_stream.rs for MySQL.
Pure-Rust `mysql_async` driver (default-features=false to avoid C
deps). Same OFFSET pagination, same Parquet-streaming write shape.
Type mapping per ADR-010: int/bigint → Int32/Int64, decimal/float
→ Float64, tinyint(1)/bool → Boolean, everything else → Utf8 with
fallback parsers for date/time/json/uuid via Display.
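An illustrative sketch of that mapping, with a local enum standing in for the
arrow `DataType` so the example stays self-contained:

```rust
// Stand-in for arrow's DataType; only the variants ADR-010 targets.
#[derive(Debug, PartialEq)]
enum ColumnType { Int32, Int64, Float64, Boolean, Utf8 }

// Map a MySQL column type name to its ADR-010 target. Real code matches on
// the driver's column metadata; string matching here is for illustration.
fn map_mysql_type(ty: &str) -> ColumnType {
    match ty.to_ascii_lowercase().as_str() {
        "tinyint(1)" | "bool" => ColumnType::Boolean,
        "int" => ColumnType::Int32,
        "bigint" => ColumnType::Int64,
        "decimal" | "float" => ColumnType::Float64,
        // date/time/json/uuid and anything else: Utf8 via Display fallback.
        _ => ColumnType::Utf8,
    }
}
```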
POST /ingest/mysql parallel to /ingest/db. Same PII auto-detection,
same lineage capture (source_system="mysql"), same agent-trigger
hook. `redact_dsn` generalized — was hardcoded to "postgresql://"
length, now works for any scheme://user:pass@host/path URL (latent
PII leak fix for MySQL DSNs).
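A sketch of the generalized redaction, assuming the scheme-agnostic behavior
described above rather than the literal `redact_dsn` source: only credentials
in the authority part are masked, and DSNs without credentials pass through.

```rust
// Mask "user:pass" between "scheme://" and "@" for any URL-shaped DSN.
fn redact_dsn(dsn: &str) -> String {
    let Some(scheme_end) = dsn.find("://") else { return dsn.to_string() };
    let rest = &dsn[scheme_end + 3..];
    // Only the authority (before the first '/') may hold credentials.
    let authority_end = rest.find('/').unwrap_or(rest.len());
    match rest[..authority_end].rfind('@') {
        Some(at) => format!("{}://***@{}", &dsn[..scheme_end], &rest[at + 1..]),
        None => dsn.to_string(),
    }
}
```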
Verified live against MariaDB on localhost: 10 rows × 9 columns of
test data round-tripped through datatypes int/varchar/decimal/
tinyint/datetime/text. PII detection auto-flagged name + email.
Aggregation queries through DataFusion match the source values
exactly.
## Phase 18 — Hybrid Parquet+HNSW ⊕ Lance backend (ADR-019)
`vectord-lance` is a new firewall crate. Lance pulls Arrow 57 and
DataFusion 52 — incompatible with the rest of the workspace's
Arrow 55 / DataFusion 47. The firewall isolates that dep tree:
public API uses only std types (Vec<f32>, Vec<String>, Hit, Row,
*Stats), so no Arrow types cross the crate boundary and nothing
propagates to vectord. This is the ADR-019 path that didn't ship until now.
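The boundary can be pictured as a trait over std types only; this is an
illustrative sketch with a toy in-memory backend, not the actual
`vectord-lance` API:

```rust
// Plain std types on the public surface: nothing Arrow-shaped escapes.
pub struct Hit { pub doc_id: String, pub distance: f32 }

pub trait VectorBackend {
    fn append(&mut self, doc_id: String, vector: Vec<f32>);
    fn search(&self, query: &[f32], k: usize) -> Vec<Hit>;
}

// Toy stand-in, enough to exercise the boundary shape (squared-L2 scan).
#[derive(Default)]
pub struct MemBackend { rows: Vec<(String, Vec<f32>)> }

impl VectorBackend for MemBackend {
    fn append(&mut self, doc_id: String, vector: Vec<f32>) {
        self.rows.push((doc_id, vector));
    }
    fn search(&self, query: &[f32], k: usize) -> Vec<Hit> {
        let mut hits: Vec<Hit> = self.rows.iter().map(|(id, v)| Hit {
            doc_id: id.clone(),
            distance: v.iter().zip(query).map(|(a, b)| (a - b).powi(2)).sum::<f32>(),
        }).collect();
        hits.sort_by(|a, b| a.distance.total_cmp(&b.distance));
        hits.truncate(k);
        hits
    }
}
```

Because `Hit` and the vector arguments are plain std types, the Arrow 57 /
DataFusion 52 tree stays an implementation detail of whatever sits behind the
trait.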
`vectord::lance_backend::LanceRegistry` lazy-creates a
LanceVectorStore per index, resolving bucket → URI via the
conventional local-bucket layout. `IndexMeta.vector_backend` and
`ModelProfile.vector_backend` carry the choice (default Parquet so
existing indexes unchanged).
Six routes under /vectors/lance/*:
- migrate/{idx}: convert binary-blob Parquet → Lance FixedSizeList
- index/{idx}: build IVF_PQ
- search/{idx}: vector search (embed via sidecar)
- doc/{idx}/{doc_id}: random row fetch
- append/{idx}: native fragment append
- stats/{idx}: row count + index presence
Verified live on the real resumes_100k_v2 corpus (100K × 768d):
- Migrate: 0.57s
- Build IVF_PQ index: 16.2s (matches ADR-019 bench; 14× faster than
HNSW's 230s for the same data)
- Search end-to-end (Ollama embed + Lance scan): 23-53ms
- Random doc_id fetch: 5-7ms (filter scan; faster than Parquet's
~35ms full-file scan, slower than the bench's 311us positional
take — would close that gap with a scalar btree on doc_id)
- Append 100 rows: 3.3ms / +320KB on disk vs Parquet's required
full ~330MB rewrite — the structural win
- Index survives append; both backends coexist cleanly
## Known follow-ups not in this milestone
- ModelProfile.vector_backend doesn't yet auto-route /vectors/profile/
{id}/search to Lance; callers go through /vectors/lance/* directly
- Scalar btree on doc_id (closes the 5-7ms → ~300us gap)
- vectord-lance built default-features=false → no S3 yet
- IVF_PQ recall not measured (ADR-019 caveat) — needs a Lance-aware
variant of the eval harness
- Watcher-path ingest doesn't push agent triggers (HTTP paths do)
- EvalSets still primary-only (federation gap)
- No PATCH endpoint to move an existing index between buckets
- The pre-existing storaged::append_log doctest fails to compile
(malformed `{prefix}/` parses as code fence) — pre-existing bug,
left for a focused fix
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
use serde::Deserialize;
use std::path::Path;

#[derive(Debug, Clone, Deserialize)]
pub struct Config {
    pub gateway: GatewayConfig,
    pub storage: StorageConfig,
    #[serde(default)]
    pub catalog: CatalogConfig,
    #[serde(default)]
    pub query: QueryConfig,
    pub sidecar: SidecarConfig,
    #[serde(default)]
    pub ai: AiConfig,
    #[serde(default)]
    pub auth: AuthConfig,
    #[serde(default)]
    pub observability: ObservabilityConfig,
    #[serde(default)]
    pub agent: AgentSettings,
}

/// Phase 16.2 — background autotune agent settings.
///
/// Duplicated from `vectord::agent::AgentConfig` because `shared` can't
/// depend on `vectord` (vectord already depends on shared). The gateway
/// copies these into the vectord config at startup.
#[derive(Debug, Clone, Deserialize)]
pub struct AgentSettings {
    #[serde(default)]
    pub enabled: bool,
    #[serde(default = "default_cycle_interval_secs")]
    pub cycle_interval_secs: u64,
    #[serde(default = "default_cooldown_secs")]
    pub cooldown_between_trials_secs: u64,
    #[serde(default = "default_min_recall")]
    pub min_recall: f32,
    #[serde(default = "default_max_trials_per_hour")]
    pub max_trials_per_hour: u32,
}

impl Default for AgentSettings {
    fn default() -> Self {
        Self {
            enabled: false,
            cycle_interval_secs: default_cycle_interval_secs(),
            cooldown_between_trials_secs: default_cooldown_secs(),
            min_recall: default_min_recall(),
            max_trials_per_hour: default_max_trials_per_hour(),
        }
    }
}

fn default_cycle_interval_secs() -> u64 { 60 }
fn default_cooldown_secs() -> u64 { 30 }
fn default_min_recall() -> f32 { 0.9 }
fn default_max_trials_per_hour() -> u32 { 30 }

#[derive(Debug, Clone, Deserialize)]
pub struct GatewayConfig {
    #[serde(default = "default_host")]
    pub host: String,
    #[serde(default = "default_gateway_port")]
    pub port: u16,
}

#[derive(Debug, Clone, Deserialize)]
pub struct StorageConfig {
    /// Legacy single-backend root. If `buckets` is empty, this is used to
    /// create an implicit `primary` bucket at this path — preserves the
    /// pre-federation config shape.
    #[serde(default = "default_storage_root")]
    pub root: String,

    /// Where profile buckets are rooted when auto-provisioned.
    #[serde(default = "default_profile_root")]
    pub profile_root: String,

    /// Name of the bucket used for read fallback when a target bucket is
    /// unreachable. If `None`, no fallback — reads fail hard.
    #[serde(default)]
    pub rescue_bucket: Option<String>,

    /// Explicitly configured buckets. Empty = backward-compat single-bucket
    /// mode driven by `root`.
    #[serde(default)]
    pub buckets: Vec<BucketConfig>,
}

#[derive(Debug, Clone, Deserialize)]
pub struct BucketConfig {
    pub name: String,
    pub backend: String, // "local" | "s3"
    /// Local filesystem root (for backend = "local")
    pub root: Option<String>,
    /// S3 bucket name (for backend = "s3")
    pub bucket: Option<String>,
    pub region: Option<String>,
    pub endpoint: Option<String>,
    /// Handle for the secrets provider — never the literal credential.
    pub secret_ref: Option<String>,
}

#[derive(Debug, Clone, Deserialize, Default)]
pub struct CatalogConfig {
    #[serde(default = "default_manifest_prefix")]
    pub manifest_prefix: String,
}

#[derive(Debug, Clone, Deserialize, Default)]
pub struct QueryConfig {
    pub max_rows_per_query: Option<usize>,
}

#[derive(Debug, Clone, Deserialize)]
pub struct SidecarConfig {
    #[serde(default = "default_sidecar_url")]
    pub url: String,
}

#[derive(Debug, Clone, Deserialize, Default)]
pub struct AiConfig {
    #[serde(default = "default_embed_model")]
    pub embed_model: String,
    #[serde(default = "default_gen_model")]
    pub gen_model: String,
    #[serde(default = "default_rerank_model")]
    pub rerank_model: String,
}

#[derive(Debug, Clone, Deserialize, Default)]
pub struct AuthConfig {
    #[serde(default)]
    pub enabled: bool,
    pub api_key: Option<String>,
}

#[derive(Debug, Clone, Deserialize, Default)]
pub struct ObservabilityConfig {
    #[serde(default = "default_exporter")]
    pub exporter: String,
    #[serde(default = "default_service_name")]
    pub service_name: String,
}

// Defaults
fn default_host() -> String { "0.0.0.0".to_string() }
fn default_gateway_port() -> u16 { 3100 }
fn default_storage_root() -> String { "./data".to_string() }
fn default_profile_root() -> String { "./data/_profiles".to_string() }
fn default_manifest_prefix() -> String { "_catalog/manifests".to_string() }
fn default_sidecar_url() -> String { "http://localhost:3200".to_string() }
fn default_embed_model() -> String { "nomic-embed-text".to_string() }
fn default_gen_model() -> String { "qwen2.5".to_string() }
fn default_rerank_model() -> String { "qwen2.5".to_string() }
fn default_exporter() -> String { "stdout".to_string() }
fn default_service_name() -> String { "lakehouse".to_string() }

impl Config {
    pub fn load(path: &str) -> Result<Self, String> {
        let path = Path::new(path);
        if !path.exists() {
            return Err(format!("config file not found: {}", path.display()));
        }
        let content = std::fs::read_to_string(path)
            .map_err(|e| format!("failed to read config: {e}"))?;
        toml::from_str(&content)
            .map_err(|e| format!("failed to parse config: {e}"))
    }

    pub fn load_or_default() -> Self {
        // Try lakehouse.toml in current dir, then /etc/lakehouse/lakehouse.toml
        for path in &["lakehouse.toml", "/etc/lakehouse/lakehouse.toml"] {
            if let Ok(config) = Self::load(path) {
                tracing::info!("loaded config from {path}");
                return config;
            }
        }
        tracing::warn!("no config file found, using defaults");
        Self::default()
    }
}

impl Default for Config {
    fn default() -> Self {
        Self {
            gateway: GatewayConfig { host: default_host(), port: default_gateway_port() },
            storage: StorageConfig {
                root: default_storage_root(),
                profile_root: default_profile_root(),
                rescue_bucket: None,
                buckets: Vec::new(),
            },
            catalog: CatalogConfig::default(),
            query: QueryConfig::default(),
            sidecar: SidecarConfig { url: default_sidecar_url() },
            ai: AiConfig::default(),
            auth: AuthConfig::default(),
            observability: ObservabilityConfig::default(),
            agent: AgentSettings::default(),
        }
    }
}