Replaces the per-item Add loop in the HTTP handler with one call to
Index.BatchAdd, which acquires the write-lock once and pushes the
whole batch through coder/hnsw's variadic Graph.Add. Pre-validation
stays in the handler so per-item error messages keep their item-index
precision.
Microbench (internal/vectord/batch_bench_test.go) at d=768, cosine:
  N      SingleAdd     BatchAdd      speedup
  16     283µs/op      170µs/op      1.66×
  128    7.9ms/op      7.5ms/op      1.05×
  1024   87.5ms/op     83.4ms/op     1.05×
The win is biggest at staffing-driver batch sizes (N=16), where
per-call lock + validation overhead is a meaningful fraction of the
cost. At larger N the per-insert HNSW neighborhood search dominates.
That is the load-bearing finding for Option B (sharded indexes):
the throughput ceiling lives inside the library, not at the lock,
so sharding across multiple parallel Graphs is the only path to true
concurrent-Add throughput.
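A minimal sketch of the locking difference described above, assuming a wrapper that guards the graph with a sync.RWMutex. The node and graph types here are hypothetical stand-ins that only mimic the variadic call shape (they are not coder/hnsw), and the locks counter exists purely to make the difference visible:

```go
package main

import (
	"fmt"
	"sync"
)

// node is a hypothetical stand-in for the library's node type.
type node struct {
	Key string
	Vec []float32
}

// graph is a hypothetical stand-in for the underlying HNSW graph. Its
// variadic Add mimics the call shape only, not the real insert logic.
type graph struct {
	nodes map[string][]float32
}

func (g *graph) Add(ns ...node) {
	for _, n := range ns {
		g.nodes[n.Key] = n.Vec
	}
}

// Index guards the graph with a RWMutex and counts write-lock
// acquisitions so the two insert paths can be compared.
type Index struct {
	mu    sync.RWMutex
	g     *graph
	locks int
}

func NewIndex() *Index {
	return &Index{g: &graph{nodes: make(map[string][]float32)}}
}

// Add takes the write lock once per item.
func (i *Index) Add(n node) {
	i.mu.Lock()
	defer i.mu.Unlock()
	i.locks++
	i.g.Add(n)
}

// BatchAdd takes the write lock once for the whole batch.
func (i *Index) BatchAdd(ns []node) {
	i.mu.Lock()
	defer i.mu.Unlock()
	i.locks++
	i.g.Add(ns...)
}

// makeBatch builds n dummy nodes with distinct keys.
func makeBatch(n int) []node {
	batch := make([]node, n)
	for k := range batch {
		batch[k] = node{Key: fmt.Sprintf("w-%03d", k), Vec: []float32{float32(k), 1}}
	}
	return batch
}

func main() {
	batch := makeBatch(16)

	single := NewIndex()
	for _, n := range batch {
		single.Add(n)
	}

	batched := NewIndex()
	batched.BatchAdd(batch)

	fmt.Println("per-item lock acquisitions:", single.locks)
	fmt.Println("batch lock acquisitions:", batched.locks)
}
```

The sketch only removes per-call overhead. It does nothing about the per-insert search cost that dominates at larger N.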
g1, g1p, g2 smokes all PASS post-change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds optional persistence to vectord (G1's HNSW vector search). Each
index is stored as a single framed file, which eliminates the
torn-write class identified by the 3-way convergent scrum finding:
  _vectors/<name>.lhv1, a single binary blob:
    [4 bytes magic "LHV1"]
    [4 bytes envelope_len uint32 BE]
    [envelope bytes: JSON params + metadata + version]
    [graph bytes: raw hnsw.Graph.Export]
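The layout above can be sketched as a pair of encode/decode helpers. This is an illustration under stated assumptions, not the shipped Persistor code: encodeFrame, decodeFrame, and errBadFrame are hypothetical names, the envelope fields in main are made up, and the graph bytes are a placeholder for whatever the graph export produces.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// magic is the 4-byte file signature from the layout above.
var magic = [4]byte{'L', 'H', 'V', '1'}

// errBadFrame is a hypothetical sentinel for any malformed blob.
var errBadFrame = errors.New("lhv1: malformed frame")

// encodeFrame lays out: magic, big-endian envelope length, envelope, graph.
func encodeFrame(envelope, graph []byte) []byte {
	buf := make([]byte, 0, 8+len(envelope)+len(graph))
	buf = append(buf, magic[:]...)
	buf = binary.BigEndian.AppendUint32(buf, uint32(len(envelope)))
	buf = append(buf, envelope...)
	buf = append(buf, graph...)
	return buf
}

// decodeFrame splits a blob back into envelope and graph bytes,
// rejecting a wrong magic or a length that overruns the blob.
func decodeFrame(blob []byte) (envelope, graph []byte, err error) {
	if len(blob) < 8 || !bytes.Equal(blob[:4], magic[:]) {
		return nil, nil, errBadFrame
	}
	n := binary.BigEndian.Uint32(blob[4:8])
	if uint64(n) > uint64(len(blob)-8) {
		return nil, nil, errBadFrame
	}
	end := 8 + int(n)
	return blob[8:end], blob[end:], nil
}

func main() {
	envelope := []byte(`{"version":1,"dim":768}`) // made-up envelope fields
	graph := []byte{0xde, 0xad, 0xbe, 0xef}       // placeholder for exported graph bytes

	blob := encodeFrame(envelope, graph)
	env, g, err := decodeFrame(blob)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("frame=%d bytes, envelope=%s, graph=%d bytes\n", len(blob), env, len(g))
}
```

Because envelope and graph travel in one blob, a single PUT either lands both or neither, which is the property the framing is meant to provide.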
Pre-extraction: internal/catalogd/store_client.go → internal/storeclient/
shared package, since both catalogd and vectord need it. Same pattern as
the pre-D5 catalogclient extraction.
Optional via [vectord].storaged_url config (empty = ephemeral mode).
On startup: List + Load each persisted index. After Create / batch Add /
DELETE: Save (or Delete from storaged). Save failures are logged, not
fatal: in-memory state is the source of truth in flight.
Acceptance smoke G1P 8/8 PASS — kill+restart preserves state, post-
restart search returns dist=0 (graph round-trips exactly), DELETE
removes the file, post-delete restart shows count=0.
All 8 smokes (D1-D6 + G1 + G1P) PASS deterministically. The g1_smoke
gained scripts/g1_smoke.toml that disables persistence so the
in-memory API test stays decoupled from any rehydrate-from-storaged
state contamination.
Cross-lineage scrum on shipped code:
- Opus 4.7 (opencode): 1 BLOCK + 5 WARN + 3 INFO
- Kimi K2-0905 (openrouter): 1 BLOCK + 2 WARN
- Qwen3-coder (openrouter): 2 BLOCK + 2 WARN + 1 INFO
Fixed (3 — 1 convergent + 2 single-reviewer):
C1 (Opus + Kimi + Qwen 3-WAY CONVERGENT WARN): Save was non-atomic
across two PUTs — envelope-succeeds + graph-fails left a half-
saved index that passed the "both present" List filter and
silently mismatched metadata against vectors on Load.
Fix: collapse to single framed file (no torn-write window
possible).
O-B1 (Opus BLOCK): isNotFound substring-matched "key not found"
against the wrapped error message — brittle, any 5xx body
containing that text would silently misclassify as missing.
Fix: errors.Is(err, storeclient.ErrKeyNotFound).
O-I3 (Opus INFO): handleAdd pre-validation only covered id+dim;
NaN/Inf/zero-norm could still fail mid-batch leaving partial
commits. Fix: extend pre-validation to call ValidateVector
(newly exported) per item before any commit.
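The O-B1 fix can be illustrated with a small comparison of the two checks. This is a sketch, assuming the client wraps a sentinel with %w. ErrKeyNotFound below is a local stand-in for storeclient.ErrKeyNotFound, and both error values are invented for the example:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrKeyNotFound is a local stand-in for storeclient.ErrKeyNotFound.
var ErrKeyNotFound = errors.New("key not found")

// isNotFoundBySubstring is the brittle check: it matches on message text.
func isNotFoundBySubstring(err error) bool {
	return err != nil && strings.Contains(err.Error(), "key not found")
}

// isNotFound matches only errors that wrap the sentinel.
func isNotFound(err error) bool {
	return errors.Is(err, ErrKeyNotFound)
}

// wrappedMissing simulates a genuine missing-key error from the client.
func wrappedMissing() error {
	return fmt.Errorf("get _vectors/demo.lhv1: %w", ErrKeyNotFound)
}

// serverError simulates a 5xx whose body happens to contain the phrase.
func serverError() error {
	return errors.New(`storaged 502: upstream said "key not found in cache"`)
}

func main() {
	for _, err := range []error{wrappedMissing(), serverError()} {
		fmt.Printf("substring=%v errors.Is=%v  %v\n",
			isNotFoundBySubstring(err), isNotFound(err), err)
	}
}
```

The substring check classifies both errors as missing, while errors.Is accepts only the one that actually wraps the sentinel.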
Dismissed (3 false positives):
K-B1 + Q-B1 ("safeKey double-escapes %2F segments") — false
convergent. Wire-protocol escape is decoded by storaged's chi
router on the way in; on-disk key is the original literal.
%2F round-trips correctly through PathEscape → URL → chi decode
→ S3 key.
Q-B2 ("List vulnerable to race conditions") — vectord is single-
process; no concurrent Save against List in the same vectord.
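The round-trip argument behind the K-B1 + Q-B1 dismissal can be checked with the standard library alone. The sketch below assumes the receiving side decodes the path segment exactly once. roundTrip is a hypothetical helper, not the safeKey implementation, and url.PathUnescape stands in for the router's decode step:

```go
package main

import (
	"fmt"
	"net/url"
)

// roundTrip escapes a key for use as a single URL path segment, then
// decodes it once, as a router is assumed to do on the way in.
func roundTrip(key string) (wire, decoded string, err error) {
	wire = url.PathEscape(key)
	decoded, err = url.PathUnescape(wire)
	return wire, decoded, err
}

func main() {
	for _, key := range []string{"idx/a", "idx%2Fa"} {
		wire, decoded, err := roundTrip(key)
		if err != nil {
			fmt.Println("unescape failed:", err)
			continue
		}
		fmt.Printf("key=%q wire=%q decoded=%q\n", key, wire, decoded)
	}
}
```

A key that already contains a literal %2F is escaped to %252F on the wire, and a single decode restores the original literal, so nothing is double-decoded.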
Deferred (3): rehydrate per-index timeout (G2+ multi-index scale),
saveAfter request ctx (matches G0 timeout deferral), Encode RLock
during slow writer (documented as buffer-only API).
The C1 finding is the strongest signal of the cross-lineage filter:
three independent reviewers all flagged the same torn-write hazard.
Single-file framing eliminates the class — there's now no Persistor
state where envelope and graph can disagree.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First G1+ piece. Standalone vectord service with in-memory HNSW
indexes keyed by string IDs and optional opaque JSON metadata.
Wraps github.com/coder/hnsw v0.6.1 (pure Go, no cgo). New port
:3215 with /v1/vectors/* routed through gateway.
API:
  POST   /v1/vectors/index                  create
  GET    /v1/vectors/index                  list
  GET    /v1/vectors/index/{name}           get info
  DELETE /v1/vectors/index/{name}
  POST   /v1/vectors/index/{name}/add       (batch)
  POST   /v1/vectors/index/{name}/search
Acceptance smoke 7/7 PASS — including recall=1 on inserted vector
w-042 (cosine distance 5.96e-8, float32 precision noise), 200-
vector batch round-trip, dim mismatch → 400, missing index → 404,
duplicate create → 409.
Two upstream library quirks worked around in the wrapper:
  1. coder/hnsw.Add panics with "node not added" on re-adding an
     existing key (the length invariant fires because the internal
     delete+re-add doesn't change Len). Deleting the key before
     Add fixes this for n>1.
  2. Delete of the LAST node leaves layers[0] non-empty but
     entryless; the next Add SIGSEGVs in Dims(). Workaround: when
     re-adding to a 1-node graph, recreate the underlying graph
     fresh via resetGraphLocked().
Cross-lineage scrum on shipped code:
- Opus 4.7 (opencode): 0 BLOCK + 4 WARN + 3 INFO
- Kimi K2-0905 (openrouter): 2 BLOCK + 2 WARN + 1 INFO
- Qwen3-coder (openrouter): "No BLOCKs" (4 tokens)
Fixed (4 real + 2 cleanup):
O-W1: Lookup returned the raw []float32 from coder/hnsw — caller
mutation would corrupt index. Now copies before return.
O-W3: NaN/Inf vectors poison HNSW (distance comparisons return
false for both < and >, breaking heap invariants). Zero-norm
under cosine produces NaN. Now validated at Add time.
K-B1: Re-adding with nil metadata silently cleared the existing
entry — JSON-omitted "metadata" field deserializes as nil,
making upsert non-idempotent. Now nil = "leave alone"; explicit
{} or Delete to clear.
O-W4: Batch Add with mid-batch failure left items 0..N-1
committed and item N rejected. Now pre-validates all IDs+dims
before any Add.
O-I1: hand-rolled jsonItoa replaced with strconv.Itoa; the
hand-roll had no measured allocation win.
O-I2: distanceFn re-resolved per Search → use stored i.g.Distance.
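The O-W3 and O-W4 fixes combine into a validate-everything-then-commit pattern. The sketch below shows that pattern on a plain map, not the vectord handler: item, validateVector, and batchAdd are hypothetical names, and the assumption is that rejecting NaN, Inf, and (under cosine) zero-norm vectors is enough to keep the commit phase from failing.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// item is a hypothetical batch entry.
type item struct {
	ID  string
	Vec []float32
}

// validateVector rejects wrong dimensions, NaN/Inf components, and,
// when cosine is set, zero-norm vectors (whose cosine distance is NaN).
func validateVector(v []float32, dim int, cosine bool) error {
	if len(v) != dim {
		return fmt.Errorf("dim %d, want %d", len(v), dim)
	}
	var sumSq float64
	for _, x := range v {
		f := float64(x)
		if math.IsNaN(f) || math.IsInf(f, 0) {
			return errors.New("vector contains NaN or Inf")
		}
		sumSq += f * f
	}
	if cosine && sumSq == 0 {
		return errors.New("zero-norm vector under cosine distance")
	}
	return nil
}

// batchAdd validates every item first and commits only if all pass,
// so a bad item N can never leave items 0..N-1 behind.
func batchAdd(store map[string][]float32, items []item, dim int) error {
	for k, it := range items {
		if it.ID == "" {
			return fmt.Errorf("item %d: empty id", k)
		}
		if err := validateVector(it.Vec, dim, true); err != nil {
			return fmt.Errorf("item %d: %w", k, err)
		}
	}
	for _, it := range items {
		store[it.ID] = append([]float32(nil), it.Vec...) // store a copy
	}
	return nil
}

func main() {
	store := map[string][]float32{}
	bad := []item{
		{ID: "a", Vec: []float32{1, 0}},
		{ID: "b", Vec: []float32{float32(math.NaN()), 1}},
	}
	fmt.Println("bad batch:", batchAdd(store, bad, 2), "stored:", len(store))

	good := []item{{ID: "a", Vec: []float32{1, 0}}, {ID: "b", Vec: []float32{0, 1}}}
	fmt.Println("good batch:", batchAdd(store, good, 2), "stored:", len(store))
}
```

Keeping the item index in the error message preserves per-item precision for the caller even though nothing is committed on failure.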
Dismissed (2 false positives):
K-B2 "MaxBytesReader applied after full read" — false, applied
BEFORE Decode in decodeJSON
K-W1 "Search distances under read lock might see invalidated
slices from concurrent Add" — false, RWMutex serializes
write-lock during Add against read-lock during Search
Deferred (3): HTTP server timeouts (consistent G0 punt),
Content-Type validation (internal service behind gateway), Lookup
dim assertion (in-memory state can't drift).
The K-B1 finding is worth pausing on: nil metadata on re-add is
the kind of API ergonomics bug only a code-reading reviewer
catches — smoke would never detect it because the smoke always
sends explicit metadata. Three lines changed in Add; the resulting
API matches what callers actually expect.
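A minimal sketch of the nil-means-leave-alone rule, assuming metadata is carried as json.RawMessage so that an omitted field decodes to nil. addItem, metaStore, and mustDecode are hypothetical names, and the JSON payloads are made up:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// addItem is a hypothetical request item. An omitted "metadata" field
// leaves Metadata nil after decoding.
type addItem struct {
	ID       string          `json:"id"`
	Metadata json.RawMessage `json:"metadata"`
}

// metaStore is a hypothetical per-index metadata map.
type metaStore map[string]json.RawMessage

// upsert treats nil metadata as "leave the existing entry alone";
// any explicit value, including {}, replaces it.
func (m metaStore) upsert(it addItem) {
	if it.Metadata == nil {
		return
	}
	m[it.ID] = it.Metadata
}

// mustDecode parses one item and panics on malformed input.
func mustDecode(s string) addItem {
	var it addItem
	if err := json.Unmarshal([]byte(s), &it); err != nil {
		panic(err)
	}
	return it
}

func main() {
	m := metaStore{}
	m.upsert(mustDecode(`{"id":"w-042","metadata":{"role":"nurse"}}`))
	m.upsert(mustDecode(`{"id":"w-042"}`)) // metadata omitted: entry kept
	fmt.Println("after re-add without metadata:", string(m["w-042"]))

	m.upsert(mustDecode(`{"id":"w-042","metadata":{}}`)) // explicit clear
	fmt.Println("after explicit {}:", string(m["w-042"]))
}
```

With this rule a re-add that repeats only the vector is idempotent with respect to metadata.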
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>