Adds the missing piece for the staffing co-pilot: turning text inputs into
vectord-shaped vectors. Standalone cmd/embedd on :3216, fronted by the
gateway at /v1/embed. Pluggable embed.Provider interface (G2 ships
Ollama; OpenAI/Voyage swap in via the same interface in G3+).
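A minimal sketch of what a pluggable provider could look like. Only the name embed.Provider comes from this commit; the Embed method, its signature, and the fakeProvider stand-in are assumptions made for illustration.

```go
package main

import (
	"context"
	"fmt"
)

// Provider is a hypothetical shape for embed.Provider: one method that
// turns a batch of texts into float32 vectors for a named model.
type Provider interface {
	Embed(ctx context.Context, model string, texts []string) ([][]float32, error)
}

// fakeProvider is a stand-in backend (not Ollama) that returns zero
// vectors of a fixed width. It exists only to exercise the interface.
type fakeProvider struct{ dim int }

func (f fakeProvider) Embed(ctx context.Context, model string, texts []string) ([][]float32, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	out := make([][]float32, len(texts))
	for i := range texts {
		out[i] = make([]float32, f.dim)
	}
	return out, nil
}

// describe calls any Provider and reports how many vectors came back
// and how wide they are. It depends only on the interface.
func describe(p Provider, texts []string) (count, dim int, err error) {
	vecs, err := p.Embed(context.Background(), "nomic-embed-text", texts)
	if err != nil {
		return 0, 0, err
	}
	if len(vecs) > 0 {
		dim = len(vecs[0])
	}
	return len(vecs), dim, nil
}

func main() {
	n, dim, err := describe(fakeProvider{dim: 768}, []string{"forklift operator", "night shift nurse"})
	fmt.Println(n, dim, err)
}
```

Because describe takes the interface rather than a concrete type, a different backend can be substituted without touching the caller.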
Wire format:
POST /v1/embed {"texts":[...], "model":"..."} // model optional
→ 200 {"model","dimension","vectors":[[...]]}
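The wire format above could map onto Go types roughly as follows. The JSON keys are taken from the commit; the Go type and field names (EmbedRequest, EmbedResponse) and the helper functions are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// EmbedRequest mirrors the POST /v1/embed body. Model is optional,
// so it is omitted from the JSON when empty.
type EmbedRequest struct {
	Texts []string `json:"texts"`
	Model string   `json:"model,omitempty"`
}

// EmbedResponse mirrors the 200 response body.
type EmbedResponse struct {
	Model     string      `json:"model"`
	Dimension int         `json:"dimension"`
	Vectors   [][]float32 `json:"vectors"`
}

// encodeRequest serializes a request body; it returns "" on failure.
func encodeRequest(texts []string, model string) string {
	b, err := json.Marshal(EmbedRequest{Texts: texts, Model: model})
	if err != nil {
		return ""
	}
	return string(b)
}

// decodeResponse parses a response body into EmbedResponse.
func decodeResponse(body string) (EmbedResponse, error) {
	var r EmbedResponse
	err := json.Unmarshal([]byte(body), &r)
	return r, err
}

func main() {
	fmt.Println(encodeRequest([]string{"warehouse picker"}, "")) // {"texts":["warehouse picker"]}

	r, err := decodeResponse(`{"model":"nomic-embed-text","dimension":3,"vectors":[[0.1,0.2,0.3]]}`)
	fmt.Println(r.Model, r.Dimension, len(r.Vectors), err)
}
```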
Default model: nomic-embed-text (768-d). Ollama returns float64;
provider converts to float32 at the boundary so vectors flow through
vectord/HNSW without re-conversion.
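The boundary conversion can be sketched as a single narrowing pass. The function name toFloat32 is an assumption; the commit says only that the provider converts float64 to float32.

```go
package main

import "fmt"

// toFloat32 narrows one embedding from float64 (as decoded from the
// upstream JSON) to float32. Values that float32 cannot represent
// exactly are rounded to the nearest float32.
func toFloat32(in []float64) []float32 {
	out := make([]float32, len(in))
	for i, v := range in {
		out[i] = float32(v)
	}
	return out
}

func main() {
	vec := toFloat32([]float64{0.5, -1.25, 0.1})
	fmt.Println(len(vec), vec[0], vec[1]) // 3 0.5 -1.25
}
```

Doing this once at the provider keeps every downstream consumer on a single element type.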
Acceptance smoke 5/5 PASS — including the architectural payoff:
end-to-end embed → vectord add → search by re-embedded text returns
recall=1 at distance 5.96e-8 (float32 precision noise on identical
unit vectors). The staffing co-pilot pipeline (text → vector →
similarity search) is now functional end-to-end.
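The reported distance of 5.96e-8 matches 2^-24, the gap between 1.0 and the largest float32 below it. The sketch below shows where that number comes from, assuming the distance is cosine distance computed as 1 - dot(a, b); the commit does not state which metric vectord uses.

```go
package main

import (
	"fmt"
	"math"
)

// cosineDistance returns 1 - dot(a, b), accumulating in float32.
// It assumes both inputs are already unit length and equally long.
func cosineDistance(a, b []float32) float32 {
	var dot float32
	for i := range a {
		dot += a[i] * b[i]
	}
	return 1 - dot
}

// ulpBelowOne is the gap between 1.0 and the next float32 below it.
// A dot product that rounds one step short of 1.0 yields this distance.
func ulpBelowOne() float32 {
	return 1 - math.Nextafter32(1, 0)
}

func main() {
	fmt.Println(ulpBelowOne()) // 2^-24, about 5.96e-8

	// A vector compared with itself: exact for this axis-aligned case.
	axis := []float32{1, 0, 0}
	fmt.Println(cosineDistance(axis, axis)) // 0
}
```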
All 9 smokes (D1-D6 + G1 + G1P + G2) PASS deterministically.
Cross-lineage scrum on shipped code:
- Opus 4.7 (opencode): 0 BLOCK + 4 WARN + 3 INFO
- Kimi K2-0905 (openrouter): 0 BLOCK + 2 WARN + 1 INFO
- Qwen3-coder (openrouter): "No BLOCKs" (3 tokens)
Fixed (2 — 1 convergent + 1 single-reviewer):
C1 (Opus + Kimi convergent WARN): the 60s timeout applied per text, so
an N-text batch could run up to N×60s with no batch-level cap. A
stuck Ollama backend would stall the handler for that whole span. Fix:
context.WithTimeout(r.Context(), 60*time.Second) wraps the entire batch.
O-W3 (Opus WARN): empty strings in texts went to Ollama unchecked,
producing version-dependent garbage. Fix: reject "" with 400 at
the handler boundary so callers get a deterministic answer
instead of an upstream-conditional 502.
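Both fixes can be sketched together without the HTTP plumbing. Everything named here (embedFunc, validateTexts, embedBatch, statusFor, stuckEmbed) is hypothetical, and mapping a timeout to 502 is an assumption; the commit states only the 400 for empty strings and the batch-wide context.WithTimeout.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"time"
)

// embedFunc is a hypothetical per-text backend call.
type embedFunc func(ctx context.Context, text string) ([]float32, error)

var errEmptyText = errors.New("texts must not contain empty strings")

// validateTexts rejects "" before anything is sent upstream (O-W3).
func validateTexts(texts []string) error {
	for i, t := range texts {
		if t == "" {
			return fmt.Errorf("texts[%d]: %w", i, errEmptyText)
		}
	}
	return nil
}

// embedBatch puts one deadline around the whole batch instead of one
// per text (C1), so total time is bounded by limit, not N x limit.
func embedBatch(parent context.Context, limit time.Duration, texts []string, embed embedFunc) ([][]float32, error) {
	ctx, cancel := context.WithTimeout(parent, limit)
	defer cancel()
	out := make([][]float32, 0, len(texts))
	for _, t := range texts {
		v, err := embed(ctx, t)
		if err != nil {
			return nil, err
		}
		out = append(out, v)
	}
	return out, nil
}

// statusFor maps outcomes to HTTP codes: bad input is 400, any other
// failure is treated as an upstream problem (502, an assumption).
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, errEmptyText):
		return http.StatusBadRequest
	default:
		return http.StatusBadGateway
	}
}

// stuckEmbed simulates a hung backend: it returns only when ctx ends.
func stuckEmbed(ctx context.Context, _ string) ([]float32, error) {
	<-ctx.Done()
	return nil, ctx.Err()
}

// stuckBatchStatus runs a three-text batch against the hung backend
// with a short deadline and reports the status and whether it timed out.
func stuckBatchStatus() (int, bool) {
	_, err := embedBatch(context.Background(), 50*time.Millisecond, []string{"a", "b", "c"}, stuckEmbed)
	return statusFor(err), errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println(statusFor(validateTexts([]string{"ok", ""}))) // 400
	code, timedOut := stuckBatchStatus()
	fmt.Println(code, timedOut) // 502 true
}
```

In a real handler the parent context would be r.Context(), so a client disconnect also cancels the batch.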
Deferred (4): drainAndClose 64KiB cap (matches G0 pattern), no
concurrency limit on /embed (single-tenant G2), missing Accept
header (exotic-proxy concern), MaxBytesError string-match
redundancy (paranoia layer kept consistent across codebase).
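For reference, a capped drain of the kind named in the first deferred item could look like this. Only the name drainAndClose and the 64KiB figure come from the commit; the signature, the returned byte count, and the drainedFrom helper are assumptions.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

const maxDrain = 64 << 10 // 64 KiB cap on leftover bytes read before closing

// drainAndClose discards at most maxDrain unread bytes so the
// connection can be reused, then closes the body. Anything past the
// cap is left unread. The byte count is returned for illustration.
func drainAndClose(body io.ReadCloser) (int64, error) {
	n, _ := io.Copy(io.Discard, io.LimitReader(body, maxDrain))
	return n, body.Close()
}

// drainedFrom builds an in-memory body of the given size and reports
// how many bytes drainAndClose consumed from it.
func drainedFrom(size int) int64 {
	body := io.NopCloser(strings.NewReader(strings.Repeat("x", size)))
	n, _ := drainAndClose(body)
	return n
}

func main() {
	fmt.Println(drainedFrom(10), drainedFrom(100<<10)) // 10 65536
}
```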
Zero false positives this round — Qwen returned 3 tokens "No BLOCKs"
and the other two reviewers' findings were all real.
Setup confirmed: Ollama 0.21.0 on :11434 with nomic-embed-text loaded.
Per-text /api/embeddings used (forward-compat with 0.21+); newer
0.4+ /api/embed batch endpoint can swap in via the Provider interface.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
60 lines
1.9 KiB
TOML
# Lakehouse-Go config — G0 dev defaults. Overrides via env are a
# G1+ concern; for G0 edit this file and restart the affected service.

# G0 dev ports — shifted to 3110+ so the Go services run alongside
# the live Rust lakehouse on 3100/3201-3204 without colliding. G5
# (demo cutover) flips gateway back to 3100 when Rust retires.
[gateway]
bind = "127.0.0.1:3110"
storaged_url = "http://127.0.0.1:3211"
catalogd_url = "http://127.0.0.1:3212"
ingestd_url = "http://127.0.0.1:3213"
queryd_url = "http://127.0.0.1:3214"
vectord_url = "http://127.0.0.1:3215"
embedd_url = "http://127.0.0.1:3216"

[storaged]
bind = "127.0.0.1:3211"

[catalogd]
bind = "127.0.0.1:3212"
storaged_url = "http://127.0.0.1:3211"

[ingestd]
bind = "127.0.0.1:3213"
storaged_url = "http://127.0.0.1:3211"
catalogd_url = "http://127.0.0.1:3212"
# CSV uploads are ~4-6× the resulting Parquet. 256 MiB cap keeps the in-memory
# parse + Arrow + Parquet output footprint bounded. Bump for known large
# datasets (e.g. workers_500k → 344 MiB CSV needs 512 MiB).
max_ingest_bytes = 268435456

[vectord]
bind = "127.0.0.1:3215"
# Optional — set to empty string to disable persistence (dev/test).
storaged_url = "http://127.0.0.1:3211"

[embedd]
bind = "127.0.0.1:3216"
# G2: Ollama local. G3+ may swap in OpenAI/Voyage by changing
# this URL + the wire format inside the provider.
provider_url = "http://localhost:11434"
default_model = "nomic-embed-text"

[queryd]
bind = "127.0.0.1:3214"
catalogd_url = "http://127.0.0.1:3212"
secrets_path = "/etc/lakehouse/secrets-go.toml"
refresh_every = "30s"

[s3]
endpoint = "http://localhost:9000"
region = "us-east-1"
bucket = "lakehouse-go-primary" # G0 dedicated bucket so Rust + Go coexist
access_key_id = "" # populated by SecretsProvider from /etc/lakehouse/secrets-go.toml
secret_access_key = "" # ditto
use_path_style = true

[log]
level = "info"