Closes the documented 500K-test gap (memory project_golang_lakehouse:
"storaged 256 MiB PUT cap blocks single-file LHV1 persistence above
~150K vectors at d=768"). Vectord persistence under "_vectors/" now
gets a 4 GiB cap; everything else (parquets, manifests, ingest)
keeps the 256 MiB default.
Why per-prefix and not "raise globally":
- The 256 MiB cap is real DoS protection: runaway clients can't
drain the daemon. Raising it for ALL traffic would expand the
attack surface on routine paths that have no need for it.
- Per-prefix preserves existing protection while opening the one
documented production-scale path.
Why not split LHV1 across multiple keys (the alternative):
- G1P shipped a single-Put framed format SPECIFICALLY to eliminate
the torn-write class (memory: "Single Put eliminates the
torn-write class that the 3-way convergent scrum finding
identified").
- Multi-key LHV1 would re-introduce the half-saved-state failure
mode we just paid to fix. Streaming via existing manager.Uploader
is the better architectural answer.
Why not bump the cap operationally via env/config:
- A future operator-driven cap can drop in cleanly via the
maxPutBytesFor function. This commit hardcodes 4 GiB to stay
small; a config knob is a follow-up if production workloads
diverge from the documented 500K-vector ceiling.
manager.Uploader is already streaming-multipart on the outbound
S3 side; the inbound MaxBytesReader cap is a safety gate, not a
memory bottleneck. Raising it for vectord just lets the existing
streaming path actually flow without introducing new memory
pressure. The worst case is 4 semaphore slots × 4 GiB = 16 GiB,
reached only if all slots max out simultaneously, which is
vanishingly unlikely.
Implementation:
  cmd/storaged/main.go:
    new constant maxPutBytesVectors = 4 GiB (covers >700K vectors @ d=768)
    new constant vectorsPrefix = "_vectors/" (synced with vectord.VectorPrefix)
    new function maxPutBytesFor(key) → cap-by-prefix
    handlePut: ContentLength check + MaxBytesReader use the per-key cap
  cmd/storaged/main_test.go (3 new test funcs):
    TestMaxPutBytesFor: 7 cases incl. nested prefix,
      substring-but-not-prefix, empty key, parquet/manifest paths.
    TestVectorPrefixSyncWithVectord: regression test that asserts
      vectorsPrefix == vectord.VectorPrefix. A future rename surfaces
      here instead of silently bypassing the larger cap.
    TestVectorCapAccommodates500KStaffingTest: bounds the cap above
      the documented production workload (~700 MiB conservative).
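
A minimal sketch of the cap-by-prefix lookup listed above, assuming a
plain strings.HasPrefix match. maxPutBytesVectors, vectorsPrefix and
maxPutBytesFor are the names this commit introduces; defaultMaxPutBytes
and the sample keys are hypothetical, and the real code in
cmd/storaged/main.go may differ in detail.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	// defaultMaxPutBytes is a hypothetical name for the 256 MiB default cap.
	defaultMaxPutBytes int64 = 256 << 20
	// maxPutBytesVectors is the 4 GiB cap for vectord persistence.
	maxPutBytesVectors int64 = 4 << 30
	// vectorsPrefix is expected to stay equal to vectord.VectorPrefix.
	vectorsPrefix = "_vectors/"
)

// maxPutBytesFor returns the PUT size cap for a key, chosen by key prefix.
// Only keys that START with vectorsPrefix get the larger cap; a key that
// merely contains "_vectors/" somewhere in the middle keeps the default.
func maxPutBytesFor(key string) int64 {
	if strings.HasPrefix(key, vectorsPrefix) {
		return maxPutBytesVectors
	}
	return defaultMaxPutBytes
}

func main() {
	// Illustrative keys only; none of these paths come from the repo.
	for _, key := range []string{
		"_vectors/idx.lhv1",
		"_vectors/a/b/idx.lhv1",
		"data/_vectors/idx.lhv1",
		"",
		"tables/t.parquet",
	} {
		fmt.Printf("%q -> %d MiB\n", key, maxPutBytesFor(key)>>20)
	}
}
```

Keeping the decision in one function is what lets a later config knob
replace the hardcoded constant without touching handlePut.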
Verified:
go test ./cmd/storaged/ — all green (was 1 func, now 4)
just verify — 9 smokes still pass · 32s wall
just proof contract — 53/0/1 unchanged
Out of scope for this commit (deserves its own):
- Heavy integration smoke: 200K dim=768 synthetic vectors → ~700
MiB LHV1 → kill+restart vectord → recall=1. ~5-10 min wall;
follow-up if you want production-scale persistence verified
end-to-end. Unit tests + existing g1p_smoke cover the wiring.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase G0 Day 2 ships storaged: aws-sdk-go-v2 wrapper + chi routes
binding 127.0.0.1:3211 with 256 MiB MaxBytesReader, Content-Length
up-front 413, and a 4-slot non-blocking semaphore returning 503 +
Retry-After:5 when full. Acceptance smoke (6/6 probes) PASSES against
the dedicated MinIO bucket lakehouse-go-primary, isolated from the
Rust system's lakehouse bucket during coexistence.
Cross-lineage scrum on the shipped code:
- Opus 4.7 (opencode): 1 BLOCK + 3 WARN + 3 INFO
- Qwen3-coder (openrouter): 2 BLOCK + 1 WARN + 1 INFO (3 false positives)
- Kimi K2-0905 (openrouter, after route-shopping past opencode's 4k
cap and the direct adapter's empty-content reasoning bug):
1 BLOCK + 2 WARN + 1 INFO
Fixed:
  C1 buildRegistry ctx cancel footgun → context.Background()
     (Opus + Kimi convergent; future credential refresh chains)
  C2 MaxBytesReader unwrap through manager.Uploader multipart
     goroutines → Content-Length up-front 413 + string-suffix fallback
     (Opus + Kimi convergent; latent 500-instead-of-413 in 5-256 MiB range)
  C3 Bucket.List unbounded accumulation → MaxListResults=10_000 cap
     (Opus + Kimi convergent; OOM guard)
  S1 PUT response Content-Type: application/json (Opus single-reviewer)
Strict validateKey policy (J approved): rejects empty, >1024B, NUL,
leading "/", ".." path components, CR/LF/tab control characters.
DELETE exposed at HTTP layer (J approved option A) for symmetry +
smoke ergonomics.
Build clean, vet clean, all unit tests pass, smoke 6/6 PASS after
every fix round. go.mod 1.23 → 1.24 (required by aws-sdk-go-v2).
Process finding worth recording: opencode caps non-streaming Kimi at
max_tokens=4096; the direct kimi.com adapter consumed 8192 tokens of
reasoning but surfaced empty content; openrouter/moonshotai/kimi-k2-0905
delivered structured output in ~33s. Future Kimi scrums should default
to that route.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>