2 Commits

Author SHA1 Message Date
root
423a3817c5 D: storaged per-prefix PUT cap — vectord _vectors/ → 4 GiB
Closes the documented 500K-test gap (memory project_golang_lakehouse:
"storaged 256 MiB PUT cap blocks single-file LHV1 persistence above
~150K vectors at d=768"). Vectord persistence under "_vectors/" now
gets a 4 GiB cap; everything else (parquets, manifests, ingest)
keeps the 256 MiB default.

Why per-prefix and not "raise globally":
  - The 256 MiB cap is real DoS protection: one runaway client
    cannot tie up the daemon with an oversized upload. Raising it
    for ALL traffic would widen the attack surface on routine paths
    that never need bodies that large.
  - Per-prefix preserves existing protection while opening the one
    documented production-scale path.

Why not split LHV1 across multiple keys (the alternative):
  - G1P shipped a single-Put framed format SPECIFICALLY to eliminate
    the torn-write class (memory: "Single Put eliminates the torn-
    write class that the 3-way convergent scrum finding identified").
  - Multi-key LHV1 would re-introduce the half-saved-state failure
    mode we just paid to fix. Streaming via existing manager.Uploader
    is the better architectural answer.

Why not bump the cap operationally via env/config:
  - A future operator-driven cap can drop in cleanly behind the
    maxPutBytesFor function. This commit hardcodes 4 GiB to stay
    small; a config knob is a follow-up if production workloads
    diverge from the documented 500K-vector ceiling.

manager.Uploader is already streaming-multipart on the outbound
S3 side; the inbound MaxBytesReader cap is a safety gate, not a
memory bottleneck. So raising it for vectord just lets the
existing streaming path actually flow, without introducing new
memory pressure (4-slot semaphore × 4 GiB worst case = 16 GiB
only if all slots simultaneously max out — vanishingly unlikely).

Implementation:
  cmd/storaged/main.go:
    new constant maxPutBytesVectors = 4 GiB (covers >700K vectors @ d=768)
    new constant vectorsPrefix = "_vectors/" (synced with vectord.VectorPrefix)
    new function maxPutBytesFor(key) → cap-by-prefix
    handlePut: ContentLength check + MaxBytesReader use the per-key cap
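
A minimal sketch of the cap-by-prefix gate described above, not the
shipped code. maxPutBytesVectors, vectorsPrefix, maxPutBytesFor and
handlePut are named in this commit; maxPutBytesDefault, statusFor, the
key-from-path routing, the example keys, and io.Discard standing in for
manager.Uploader are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

const (
	maxPutBytesDefault int64 = 256 << 20 // 256 MiB (assumed constant name)
	maxPutBytesVectors int64 = 4 << 30   // 4 GiB
	vectorsPrefix            = "_vectors/"
)

// maxPutBytesFor returns the PUT body cap for a key. Only keys that
// START with vectorsPrefix get the larger cap; a key that merely
// contains the substring keeps the default.
func maxPutBytesFor(key string) int64 {
	if strings.HasPrefix(key, vectorsPrefix) {
		return maxPutBytesVectors
	}
	return maxPutBytesDefault
}

// handlePut applies the per-key cap twice: an up-front Content-Length
// check, then MaxBytesReader as the backstop while the body streams.
func handlePut(w http.ResponseWriter, r *http.Request) {
	key := strings.TrimPrefix(r.URL.Path, "/") // assumed routing
	limit := maxPutBytesFor(key)
	if r.ContentLength > limit {
		http.Error(w, "payload too large", http.StatusRequestEntityTooLarge)
		return
	}
	body := http.MaxBytesReader(w, r.Body, limit)
	// io.Discard stands in for the streaming upload to S3.
	if _, err := io.Copy(io.Discard, body); err != nil {
		var tooLarge *http.MaxBytesError
		if errors.As(err, &tooLarge) {
			http.Error(w, "payload too large", http.StatusRequestEntityTooLarge)
			return
		}
		http.Error(w, "read failed", http.StatusBadRequest)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// statusFor sends a tiny body but declares contentLength, so the
// up-front check can be exercised without allocating a large payload.
func statusFor(key string, contentLength int64) int {
	req := httptest.NewRequest(http.MethodPut, "/"+key, strings.NewReader("x"))
	req.ContentLength = contentLength
	rec := httptest.NewRecorder()
	handlePut(rec, req)
	return rec.Code
}

func main() {
	for _, key := range []string{"_vectors/idx.lhv1", "tables/a.parquet", "data/_vectors/x"} {
		fmt.Printf("%-20s cap=%d status(300 MiB declared)=%d\n",
			key, maxPutBytesFor(key), statusFor(key, 300<<20))
	}
}
```

Keeping the decision in one pure function is what lets a later config
knob replace the constants without touching the handler.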

  cmd/storaged/main_test.go (3 new test funcs):
    TestMaxPutBytesFor: 7 cases incl. nested prefix, substring-but-not-
      prefix, empty key, parquet/manifest paths.
    TestVectorPrefixSyncWithVectord: regression test that asserts
      vectorsPrefix == vectord.VectorPrefix. A future rename surfaces
      here instead of silently bypassing the larger cap.
    TestVectorCapAccommodates500KStaffingTest: bounds the cap above
      the documented production workload (~700 MiB conservative).

Verified:
  go test ./cmd/storaged/ — all green (was 1 func, now 4)
  just verify             — 9 smokes still pass · 32s wall
  just proof contract     — 53/0/1 unchanged

Out of scope for this commit (deserves its own commit):
  - Heavy integration smoke: 200K dim=768 synthetic vectors → ~700
    MiB LHV1 → kill+restart vectord → recall=1. ~5-10 min wall;
    follow-up if you want production-scale persistence verified
    end-to-end. Unit tests + existing g1p_smoke cover the wiring.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 06:00:09 -05:00
root
8cfcdb8e5f G0 D2: storaged S3 GET/PUT/LIST/DELETE · 3-lineage scrum · 4 fixes applied
Phase G0 Day 2 ships storaged: aws-sdk-go-v2 wrapper + chi routes
binding 127.0.0.1:3211 with 256 MiB MaxBytesReader, Content-Length
up-front 413, and a 4-slot non-blocking semaphore returning 503 +
Retry-After:5 when full. Acceptance smoke (6/6 probes) PASSES against
the dedicated MinIO bucket lakehouse-go-primary, isolated from the
Rust system's lakehouse bucket during coexistence.
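
The non-blocking semaphore described above can be sketched as a small
middleware. This is an illustration under assumptions, not the shipped
code: limitInFlight and overflowStatus are hypothetical names, and the
real service wires its limiter into chi routes rather than a bare
http.Handler.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// limitInFlight admits at most n concurrent requests. When every slot
// is taken it does not queue: it answers 503 with Retry-After: 5.
func limitInFlight(n int, next http.Handler) http.Handler {
	slots := make(chan struct{}, n)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case slots <- struct{}{}: // slot acquired
			defer func() { <-slots }()
			next.ServeHTTP(w, r)
		default: // all slots busy, fail fast
			w.Header().Set("Retry-After", "5")
			http.Error(w, "server busy", http.StatusServiceUnavailable)
		}
	})
}

// overflowStatus holds n requests inside the handler, sends one more,
// and reports the status and Retry-After header that extra one got.
func overflowStatus(n int) (int, string) {
	entered := make(chan struct{})
	release := make(chan struct{})
	slow := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		entered <- struct{}{} // tell the driver a slot is occupied
		<-release             // hold the slot until released
	})
	h := limitInFlight(n, slow)

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			h.ServeHTTP(httptest.NewRecorder(),
				httptest.NewRequest(http.MethodGet, "/", nil))
		}()
	}
	for i := 0; i < n; i++ {
		<-entered // wait until all n slots are held
	}

	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))

	close(release)
	wg.Wait()
	return rec.Code, rec.Header().Get("Retry-After")
}

func main() {
	code, retry := overflowStatus(4)
	fmt.Println("overflow request:", code, "Retry-After:", retry)
}
```

Failing fast instead of queueing keeps a burst of large uploads from
piling up behind the four active ones.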

Cross-lineage scrum on the shipped code:
  - Opus 4.7 (opencode): 1 BLOCK + 3 WARN + 3 INFO
  - Qwen3-coder (openrouter): 2 BLOCK + 1 WARN + 1 INFO (3 false positives)
  - Kimi K2-0905 (openrouter, after route-shopping past opencode's 4k
    cap and the direct adapter's empty-content reasoning bug):
    1 BLOCK + 2 WARN + 1 INFO

Fixed:
  C1 buildRegistry ctx cancel footgun → context.Background()
     (Opus + Kimi convergent; future credential refresh chains)
  C2 MaxBytesReader unwrap through manager.Uploader multipart
     goroutines → Content-Length up-front 413 + string-suffix fallback
     (Opus + Kimi convergent; latent 500-instead-of-413 in 5-256 MiB range)
  C3 Bucket.List unbounded accumulation → MaxListResults=10_000 cap
     (Opus + Kimi convergent; OOM guard)
  S1 PUT response Content-Type: application/json (Opus single-reviewer)

Strict validateKey policy (J approved): rejects empty, >1024B, NUL,
leading "/", ".." path components, CR/LF/tab control characters.
DELETE exposed at HTTP layer (J approved option A) for symmetry +
smoke ergonomics.

Build clean, vet clean, all unit tests pass, smoke 6/6 PASS after
every fix round. go.mod 1.23 → 1.24 (required by aws-sdk-go-v2).

Process finding worth recording: opencode caps non-streaming Kimi at
max_tokens=4096; the direct kimi.com adapter consumed 8192 tokens of
reasoning but surfaced empty content; openrouter/moonshotai/kimi-k2-0905
delivered structured output in ~33s. Future Kimi scrums should default
to that route.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 23:23:03 -05:00