lakehouse/lakehouse.toml
root 8de94eba08
cleanup: bump qwen2.5 → qwen3.5:latest in active defaults
The stronger local rung is now the small-model-pipeline tier-1 default
across both the Rust legacy and the Go rewrite (cf. golangLAKEHOUSE
phase 1). Same JSON-clean property as qwen2.5, more capacity. Ollama
still serves both side-by-side; rollback is a 4-line revert if a
workload regresses.
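
The revert mentioned above would touch only the [ai] defaults in
lakehouse.toml. A sketch of what it might look like — the exact prior
tag is an assumption (the commit only says "qwen2.5"); check git
history for the real previous value:

```toml
# Rollback sketch: restore the previous tier-1 local defaults if a
# workload regresses on qwen3.5. The ":latest" suffix on the old tag
# is assumed, not confirmed by this commit.
[ai]
gen_model = "qwen2.5:latest"
rerank_model = "qwen2.5:latest"
```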

active-default sites:
- lakehouse.toml [ai] gen_model + rerank_model → qwen3.5:latest
- mcp-server/observer.ts diagnose call (Phase 44 /v1/chat path) → qwen3.5:latest
- mcp-server/index.ts model roster doc → qwen3.5:latest first
- crates/vectord/src/rag.rs ContinuableOpts + RagResponse.model → qwen3.5:latest

skipped: execution_loop/mod.rs comments describing historic qwen2.5
tool_call quirks — those are documentation of past behavior, not
active defaults. data/_catalog/profiles/*.json are runtime-generated
(gitignored), not in scope for tracked changes.

cargo check -p vectord: clean. no behavioral change in the audit
pipeline — same JSON-clean local model, same think=Some(false)
posture, just stronger upstream.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 00:10:57 -05:00

# Lakehouse Configuration

[gateway]
host = "0.0.0.0"
port = 3100

[storage]
root = "./data"
profile_root = "./data/_profiles"
rescue_bucket = "rescue"

[[storage.buckets]]
name = "primary"
backend = "local"
root = "./data"

[[storage.buckets]]
name = "rescue"
backend = "local"
root = "./data/_rescue"

[[storage.buckets]]
name = "testing"
backend = "local"
root = "./data/_testing"

# S3 bucket via MinIO. The name "s3:lakehouse" is the convention
# lance_backend.rs uses to emit s3:// URIs for Lance datasets.
# Credentials resolved via environment (AWS_ACCESS_KEY_ID etc.) or
# the secrets provider.
[[storage.buckets]]
name = "s3:lakehouse"
backend = "s3"
bucket = "lakehouse"
endpoint = "http://localhost:9000"
region = "us-east-1"
secret_ref = "minio-lakehouse"

[catalog]
# Manifests persisted to object storage under this prefix
manifest_prefix = "_catalog/manifests"

[query]
# max_rows_per_query = 10000

[sidecar]
url = "http://localhost:3200"

[ai]
embed_model = "nomic-embed-text"
# Local-tier defaults bumped 2026-04-30: qwen3.5:latest is the
# stronger local rung in the 5-loop substrate (per
# project_small_model_pipeline_vision.md). Same JSON-clean property
# as qwen2.5, more capacity. Ollama still serves both — bump back
# in this file if a workload regressed.
gen_model = "qwen3.5:latest"
rerank_model = "qwen3.5:latest"

[auth]
enabled = false
# api_key = "changeme"

[observability]
# Export traces to stdout (set to "otlp" for OpenTelemetry collector)
exporter = "stdout"
service_name = "lakehouse"

[agent]
# Phase 16.2 — background autotune agent. Opt-in: set enabled = true to
# let the agent continuously propose + trial HNSW configs and auto-promote
# winners. Defaults are conservative so it stays out of the way of live
# search traffic on shared Ollama.
enabled = true
cycle_interval_secs = 120          # periodic wake if no triggers
cooldown_between_trials_secs = 10  # min gap between trials
min_recall = 0.9                   # never promote below this
max_trials_per_hour = 20           # hard budget cap

# Model roster — available for profile hot-swap
# qwen3.5:latest: stronger local rung — JSON-clean, 8K+ context,
#   default for gen_model and rerank_model
# qwen3: 8.2B, 40K context, thinking+tools, best for reasoning tasks
# qwen2.5: 7B, 8K context, fast — kept loaded for the 2026-04 era
#   comparison runs; new defaults use qwen3.5:latest
# nomic-embed-text: 137M, embedding-only, used by all profiles