profit/golangLAKEHOUSE

Commit Graph

Author	SHA1	Message	Date


root

0f79bce948

Batch 3: cmd/<bin>/main_test.go × 6 — closes R-005

Adds main_test.go for each of the 6 cmd binaries that lacked them
(storaged already had main_test.go; that's where the pattern came
from). Each test file focuses on the cmd-specific surface — route
mounts, body caps, decode/validation paths — without re-testing
internal package logic that's covered elsewhere.

cmd/catalogd/main_test.go — 6 funcs
  TestRoutesMounted: chi.Walk asserts /catalog/{register,manifest/*,list}
  TestHandleRegister_BodyTooLarge: 5 MiB body → 4xx
  TestHandleRegister_MalformedJSON: 400
  TestHandleRegister_EmptyName_400: ErrEmptyName surfaces as 400
  TestHandleGetManifest_404 + TestHandleList_EmptyShape

cmd/embedd/main_test.go — 8 funcs
  stubProvider implements embed.Provider deterministically
  TestRoutesMounted, MalformedJSON_400, EmptyTextRejected_400 (per
    scrum O-W3), UpstreamError_502 (provider error → 502, not 500),
    HappyPath_ProviderEcho, BodyTooLarge (4xx range), TestItoa
    (covers the no-strconv helper)

cmd/gateway/main_test.go — 4 funcs
  TestMustParseUpstream_HappyPaths: 3 valid URLs
  TestMustParseUpstream_FailureExits: re-execs the test binary in a
    subprocess with env flag (standard pattern for testing os.Exit
    callers); subprocess invokes mustParseUpstream("127.0.0.1:3211")
    [missing scheme]; expects exit non-zero. Same pattern for garbage.
  TestUpstreamConfigKeys_DocumentedShape: locks the 6 _url keys
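
The re-exec pattern mentioned above can be sketched roughly as follows. This is an illustration written as a plain program rather than a `_test.go` file, and it is not the gateway's code: the `mustParseUpstream` body, the `UPSTREAM_UNDER_TEST` variable name, and the `childExitCode` helper are all assumptions. In a test file, the environment check would normally sit at the top of the test function instead of in `init`.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"os"
	"os/exec"
)

// envKey is a hypothetical flag name; any otherwise unused variable works.
const envKey = "UPSTREAM_UNDER_TEST"

// mustParseUpstream is a stand-in for the gateway helper: it exits
// non-zero unless raw parses as a URL with both scheme and host.
func mustParseUpstream(raw string) *url.URL {
	u, err := url.Parse(raw)
	if err != nil || u.Scheme == "" || u.Host == "" {
		fmt.Fprintf(os.Stderr, "bad upstream %q\n", raw)
		os.Exit(1)
	}
	return u
}

// In the child process the env flag is set, so call the os.Exit caller
// and stop before any normal work starts.
func init() {
	if raw, ok := os.LookupEnv(envKey); ok {
		mustParseUpstream(raw)
		os.Exit(0)
	}
}

// childExitCode re-executes the current binary with the env flag set
// and reports the child's exit code.
func childExitCode(raw string) (int, error) {
	exe, err := os.Executable()
	if err != nil {
		return 0, err
	}
	cmd := exec.Command(exe)
	cmd.Env = append(os.Environ(), envKey+"="+raw)
	err = cmd.Run()
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return exitErr.ExitCode(), nil
	}
	return 0, err // a nil error means the child exited with code 0
}

func main() {
	for _, raw := range []string{"http://127.0.0.1:3211", "127.0.0.1:3211"} {
		code, err := childExitCode(raw)
		fmt.Println(raw, code, err)
	}
}
```

The parent never calls the exiting function itself, so it survives to inspect the child's exit status.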

cmd/ingestd/main_test.go — 7 funcs
  Stubs both storaged and catalogd via httptest.Server so the cmd
  layer can be exercised without bringing the full chain up.
  TestHandleIngest_MissingNameQueryParam: 400 with "name" in body
  TestHandleIngest_MalformedMultipart: 400
  TestHandleIngest_MissingFormFile: 400 (valid multipart, wrong field)
  TestHandleIngest_BodyTooLarge: 4xx
  TestEscapeKeyPath: 6-case URL-escape table (apostrophe, space, etc.)
  TestParquetKeyPath_Format: locks the datasets/<n>/<fp>.parquet shape
    per scrum C-DRIFT (any rename breaks idempotent re-ingest)
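
For illustration, one way such key helpers could look. Both function bodies are assumptions: the log only fixes the `datasets/<n>/<fp>.parquet` shape, and reading `<n>` as a dataset name and `<fp>` as a fingerprint is a guess. The escaping rule here (percent-escape each path segment, keep the slashes) is a common choice, not necessarily what `ingestd` does.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// escapeKeyPath percent-escapes each "/"-separated segment of an object
// key while leaving the separators intact (hypothetical implementation).
func escapeKeyPath(key string) string {
	segs := strings.Split(key, "/")
	for i, s := range segs {
		segs[i] = url.PathEscape(s)
	}
	return strings.Join(segs, "/")
}

// parquetKeyPath builds datasets/<name>/<fingerprint>.parquet. The
// parameter names are assumptions about what <n> and <fp> stand for.
func parquetKeyPath(name, fingerprint string) string {
	return "datasets/" + name + "/" + fingerprint + ".parquet"
}

func main() {
	key := parquetKeyPath("q1 sales", "abc123")
	fmt.Println(key)
	fmt.Println(escapeKeyPath(key))
}
```

A test that pins the exact string returned by the key builder is what makes a rename show up as a failing test instead of a silent change in where re-ingested data lands.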

cmd/queryd/main_test.go — 6 funcs
  Tests pre-DB paths (decode, body cap, empty SQL); db.QueryContext
  itself needs DuckDB so it's covered by GOLAKE-040 in the proof
  harness, not unit tests. handlers.db = nil here is intentional.
  TestHandleSQL_EmptySQL_400: 3 cases (empty, whitespace, mixed-WS)
  TestMaxSQLBodyBytes_Reasonable: locks the 64 KiB constant in a
    sane range so a refactor can't blow it open
  TestPrimaryBucket_Constant: locks "primary" — secrets lookup uses
    this; rename = silent secret-resolution failure at boot
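
A rough sketch of the pre-DB validation path, assuming a JSON body with an `sql` field (the field name, the handler body, and the 204 success code are illustrative choices, not taken from the repository). It shows why the empty, whitespace-only, and mixed-whitespace cases can share one check.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// handleSQL rejects undecodable bodies and blank SQL before any
// database work would start.
func handleSQL(w http.ResponseWriter, r *http.Request) {
	var req struct {
		SQL string `json:"sql"` // field name is an assumption
	}
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "bad request body", http.StatusBadRequest)
		return
	}
	// TrimSpace covers "", spaces only, and mixed tabs/newlines alike.
	if strings.TrimSpace(req.SQL) == "" {
		http.Error(w, "sql must not be empty", http.StatusBadRequest)
		return
	}
	// A real handler would run the query here; the sketch stops first.
	w.WriteHeader(http.StatusNoContent)
}

// sqlStatus posts body to the handler and returns the response code.
func sqlStatus(body string) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/sql", strings.NewReader(body))
	handleSQL(rec, req)
	return rec.Code
}

func main() {
	for _, body := range []string{
		`{"sql":""}`,
		`{"sql":"   "}`,
		`{"sql":" \t\n "}`,
		`{"sql":"SELECT 1"}`,
	} {
		fmt.Println(sqlStatus(body))
	}
}
```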

cmd/vectord/main_test.go — 14 funcs
  All 6 routes verified mounted. handlers.persist = nil = pure
  in-memory mode; persistence is GOLAKE-070 in the proof harness.
  Coverage of every error branch in handleCreate/Add/Search/Delete:
    missing index → 404, dim mismatch → 400, empty items → 400,
    empty id → 400, malformed JSON → 400, body too large → 4xx,
    happy create → 201, happy list → 200.

One real finding caught during writing:
  Body-cap rejection is sometimes 413 (typed MaxBytesError survives
  unwrap) and sometimes 400 (decoder wraps it as a generic decode
  error). Both are valid client-error contracts; the contract isn't
  "exactly 413" but "fails loud as 4xx, never silent 200 or 5xx."
  Tests assert 4xx range. The proof harness's
  proof_assert_status_4xx already had this shape — just bringing
  the unit tests in line with it.

Verified:
  go test -count=1 -short ./cmd/...  — all 7 packages green
  just verify                         — vet + test + 9 smokes 35s

Closes audit risk R-005 (6/7 cmd/main.go untested). Combined with
the proof harness's wiring coverage, every cmd-level handler now
has both unit-test and integration-test coverage of the wiring
layer. R-005 → CLOSED.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-29 06:18:46 -05:00

root

9ee7fc5550

G2: embedd — text → vector via Ollama · 2 scrum fixes

Bridges the missing piece for the staffing co-pilot: text inputs to
vectord-shaped vectors. Standalone cmd/embedd on :3216 fronted by
gateway at /v1/embed. Pluggable embed.Provider interface (G2 ships
Ollama; OpenAI/Voyage swap in via the same interface in G3+).

Wire format:
  POST /v1/embed {"texts":[...], "model":"..."}  // model optional
  → 200 {"model","dimension","vectors":[[...]]}

Default model: nomic-embed-text (768-d). Ollama returns float64;
provider converts to float32 at the boundary so vectors flow through
vectord/HNSW without re-conversion.
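
The boundary conversion is small enough to sketch in full. The function name `toFloat32` is hypothetical, and this assumes the upstream values arrive as a `[]float64` slice after JSON decoding.

```go
package main

import "fmt"

// toFloat32 narrows an upstream float64 vector once, at the provider
// boundary, so downstream code only ever sees float32.
func toFloat32(in []float64) []float32 {
	out := make([]float32, len(in))
	for i, v := range in {
		out[i] = float32(v) // rounds to the nearest representable float32
	}
	return out
}

func main() {
	fmt.Println(toFloat32([]float64{0.5, -1, 0.25}))
}
```

Narrowing in one place means the rounding happens exactly once per value instead of at every hand-off between services.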

Acceptance smoke 5/5 PASS — including the architectural payoff:
end-to-end embed → vectord add → search by re-embedded text returns
recall=1 at distance 5.96e-8 (float32 precision noise on identical
unit vectors). The staffing co-pilot pipeline (text → vector →
similarity search) is now functional end-to-end.

All 9 smokes (D1-D6 + G1 + G1P + G2) PASS deterministically.

Cross-lineage scrum on shipped code:
  - Opus 4.7 (opencode):                    0 BLOCK + 4 WARN + 3 INFO
  - Kimi K2-0905 (openrouter):              0 BLOCK + 2 WARN + 1 INFO
  - Qwen3-coder (openrouter):               "No BLOCKs" (3 tokens)

Fixed (2 — 1 convergent + 1 single-reviewer):
  C1 (Opus + Kimi convergent WARN): per-text 60s timeout × N-text
    batch was up to N×60s with no batch-level cap. One stuck Ollama
    backend could stall the whole handler for the full N×60s. Fix:
    context.WithTimeout(r.Context(), 60s) wraps the entire batch.
  O-W3 (Opus WARN): empty strings in texts went to Ollama unchecked,
    producing version-dependent garbage. Fix: reject "" with 400 at
    the handler boundary so callers get a deterministic answer
    instead of an upstream-conditional 502.

Deferred (4): drainAndClose 64KiB cap (matches G0 pattern), no
concurrency limit on /embed (single-tenant G2), missing Accept
header (exotic-proxy concern), MaxBytesError string-match
redundancy (paranoia layer kept consistent across codebase).

Zero false positives this round — Qwen returned 3 tokens "No BLOCKs"
and the other two reviewers' findings were all real.

Setup confirmed: Ollama 0.21.0 on :11434 with nomic-embed-text loaded.
Per-text /api/embeddings used (forward-compat with 0.21+); newer
0.4+ /api/embed batch endpoint can swap in via the Provider interface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-29 01:42:27 -05:00

2 Commits