Phase G0 Kickoff Plan
Goal: smallest end-to-end ingest+query path in Go. Upload a CSV
via POST /ingest, query it via POST /sql, get rows back. No
vector, no profile, no UI yet.
Estimated duration: 1 engineer-week (5 working days + gate day + cleanup). Plan is calibrated for solo work; cut by ~40% with two engineers in parallel on storaged/catalogd vs ingestd/queryd.
Cutoff for G0: the closing acceptance gate (Day 6) passes
end-to-end against workers_500k.csv as the test fixture.
Day 0 — One-time setup
Done by an operator with sudo on the dev box. ~15 minutes.
| # | Step | Verify |
|---|---|---|
| 0.1 | Install Go 1.23+: `curl -L https://go.dev/dl/go1.23.linux-amd64.tar.gz \| sudo tar -C /usr/local -xz` | `go version` shows 1.23+ |
| 0.2 | Add /usr/local/go/bin to PATH (in ~/.bashrc) | new shell sees `go` |
| 0.3 | Install cgo toolchain: `apt install build-essential` | `gcc --version` works |
| 0.4 | Clone repo: `git clone https://git.agentview.dev/profit/golangLAKEHOUSE.git` | `cd golangLAKEHOUSE && go version` from inside |
| 0.5 | Bring up MinIO locally (or point at existing) | `mc ls local/` lists buckets, or whatever the dev S3 is |
| 0.6 | Verify DuckDB cgo path with a real compile-and-run smoke (NOT `go install pkg@latest` — that requires a main package and would fail on the duckdb library before exercising cgo). Steps: `mkdir /tmp/duckdb-smoke && cd /tmp/duckdb-smoke && go mod init smoke && go get github.com/duckdb/duckdb-go/v2 && cat > main.go <<<'package main; import (_ "github.com/duckdb/duckdb-go/v2"; "database/sql"); func main(){db,_:=sql.Open("duckdb","");db.Ping();db.Close()}' && go run main.go` — proves the cgo linker chain + static-linked duckdb-go-bindings work on this platform | exits 0, no link errors |
Day 0 acceptance: go version shows 1.23+, gcc --version works,
MinIO reachable on localhost:9000, and the cgo compile-and-run
smoke above succeeded. (go mod tidy is intentionally NOT run here —
no imports yet; verification moves to D1.)
Day 1 — Skeleton + chi + /health × 5 binaries
Goal: five binaries build, each binds to its port, /health
returns {"status":"ok","service":"<name>"}.
| # | File | What |
|---|---|---|
| 1.1 | internal/shared/server.go | chi router factory, slog setup, graceful shutdown via signal.NotifyContext |
| 1.2 | internal/shared/config.go | TOML loader using pelletier/go-toml/v2, default + override pattern |
| 1.3 | cmd/gateway/main.go | port 3110, /health |
| 1.4 | cmd/storaged/main.go | port 3211, /health |
| 1.5 | cmd/catalogd/main.go | port 3212, /health |
| 1.6 | cmd/ingestd/main.go | port 3213, /health |
| 1.7 | cmd/queryd/main.go | port 3214, /health |
Port shift note: Original SPEC said 3100/3201–3204; D1 runtime
caught a collision — the live Rust lakehouse owns 3100. Shifted Go
dev ports to 3110+ so both systems can run concurrently during the
migration. G5 cutover flips gateway back to 3100 when Rust retires.
| 1.8 | lakehouse.toml | bind addresses, log level — sample committed |
| 1.9 | Makefile | build, run-gateway, etc. — convenience |
| 1.10 | cmd/gateway/main.go adds STUB routes POST /v1/ingest and POST /v1/sql returning 501 Not Implemented with a header X-Lakehouse-Stub: g0. Real reverse-proxy wiring lands on Day 6, but the routes exist from D1 so D6 is just behavior change, not new endpoints. | curl -X POST :3110/v1/ingest returns 501 with the stub header |
Acceptance D1: go mod tidy populates go.sum cleanly; go build ./cmd/... exits 0; running each binary in a separate terminal,
curl :3110/health and :3211/health through :3214/health all return
200 OK with the expected JSON; gateway's stub /v1/* routes return 501.
Dependencies pulled: go-chi/chi/v5, pelletier/go-toml/v2.
Day 2 — storaged: S3 GET/PUT/LIST
Goal: put a file, get it back, list it.
| # | File | What |
|---|---|---|
| 2.1 | internal/storaged/bucket.go | aws-sdk-go-v2/service/s3 wrapper — Get, Put, List, Delete |
| 2.2 | internal/storaged/registry.go | BucketRegistry skeleton (per Rust ADR-017) — single bucket only in G0; multi-bucket lands in G2 |
| 2.3 | internal/secrets/provider.go | SecretsProvider interface + FileSecretsProvider reading /etc/lakehouse/secrets.toml |
| 2.4 | cmd/storaged/main.go | wire routes — GET /storage/get/{key}, PUT /storage/put/{key}, GET /storage/list?prefix=.... Bind to 127.0.0.1:3211 only (G0 is dev-only, no auth). Apply http.MaxBytesReader with a 256 MiB per-request cap (reduced from 2 GiB per Qwen3-coder Q1 — 2 GiB × N concurrent = blow the box) + a buffered semaphore capping concurrent in-flight PUTs at 4. PUTs exceeding the cap → 413; PUTs blocked on the semaphore → 503 with Retry-After: 5 |
Acceptance D2: curl -T sample.csv 127.0.0.1:3211/storage/put/test/sample.csv
returns 200; curl 127.0.0.1:3211/storage/get/test/sample.csv echoes
the file bytes; curl 127.0.0.1:3211/storage/list?prefix=test/ lists
sample.csv; a PUT exceeding 256 MiB returns 413 Payload Too Large.
Open question: error journal (Rust ADR-018 append-log pattern) — defer to G2 with multi-bucket federation, or wire it now? Plan says defer; revisit if errors surface during D3-D5.
Day 3 — catalogd: Parquet manifests
Goal: register a dataset, persist to storaged, restart, manifest still visible.
| # | File | What |
|---|---|---|
| 3.1 | internal/catalogd/manifest.go | Parquet read/write using arrow-go/v18/parquet/pqarrow. Schema: dataset_id, name, schema_fingerprint, objects, created_at, updated_at, row_count |
| 3.2 | internal/catalogd/registry.go | In-memory index (map[name]Manifest), rehydrated on startup from primary://_catalog/manifests/*.parquet |
| 3.3 | cmd/catalogd/main.go | wire routes — POST /catalog/register (idempotent by name + fingerprint per Rust ADR-020), GET /catalog/manifest/{name}, GET /catalog/list |
| 3.4 | internal/catalogd/store_client.go | thin HTTP client to cmd/storaged — round-trips manifest Parquets |
Acceptance D3: register a dataset, see it in /catalog/list,
restart catalogd, /catalog/list still shows it. Re-register same
name + same fingerprint → 200, same dataset_id. Different
fingerprint → 409 Conflict.
Day 4 — ingestd: CSV → Parquet → catalog
Goal: POST /ingest with a CSV file produces a Parquet in
storaged + a manifest in catalogd.
| # | File | What |
|---|---|---|
| 4.1 | internal/ingestd/schema.go | infer Arrow schema from CSV header + first-N-row sampling. ADR-010 default-to-string on ambiguity |
| 4.2 | internal/ingestd/csv.go | stream CSV → array.RecordBatch → Parquet (arrow-go pqarrow writer) |
| 4.3 | cmd/ingestd/main.go | route POST /ingest — multipart form file → schema infer → write Parquet → call catalogd to register |
Acceptance D4: curl -F file=@workers_500k.csv :3213/ingest?name=workers_500k
returns 200 with the registered manifest; aws s3 ls (or mc ls)
shows the Parquet under primary://datasets/workers_500k/;
curl :3212/catalog/manifest/workers_500k returns the manifest with
row_count=500000.
Day 5 — queryd: DuckDB SELECT
Goal: SQL queries over Parquet datasets.
| # | File | What |
|---|---|---|
| 5.1 | internal/queryd/db.go | database/sql connection to github.com/duckdb/duckdb-go/v2 (cgo). Ensures DuckDB extensions Parquet + httpfs are loaded; on connection open, executes CREATE SECRET (TYPE S3) populated from internal/secrets/provider.go so read_parquet('s3://...') against MinIO authenticates per session |
| 5.2 | internal/queryd/registrar.go | reads catalogd /catalog/list, registers each dataset as a DuckDB view: CREATE VIEW workers_500k AS SELECT * FROM read_parquet('s3://...') |
| 5.3 | cmd/queryd/main.go | route POST /sql (JSON body {"sql": "..."}). View refresh strategy: cache views with a TTL (default 30s) + invalidate on If-None-Match against catalogd's manifest etag. Don't re-CREATE on every request — Opus review flagged that as the perf cliff during D6 timing capture |
Acceptance D5: after Day 4 ingestion,
curl -X POST -d '{"sql":"SELECT count(*) FROM workers_500k"}' :3214/sql
returns [{"count_star()":500000}]. A SELECT role, count(*) FROM
workers_500k WHERE state='IL' GROUP BY role returns the expected
grouped rows.
Day 6 — Gate day: end-to-end via gateway
Goal: the closing G0 acceptance gate passes.
| # | What |
|---|---|
| 6.1 | Promote cmd/gateway/main.go /v1/ingest + /v1/sql from D1.10 stubs (501) to real reverse proxies. httputil.NewSingleHostReverseProxy preserves the inbound path by default, so the proxy must use a custom Director (or Rewrite) that strips the /v1 prefix before forwarding — otherwise the call lands on ingestd:3213/v1/ingest which doesn't exist (ingestd serves /ingest, queryd serves /sql). Multipart forwarding for /v1/ingest is the riskiest hop — verify form parts pass through with the file body intact |
| 6.2 | Smoke script scripts/g0_smoke.sh: spin up MinIO + 5 services, ingest, query, assert row count |
| 6.3 | Run smoke against workers_500k.csv end-to-end |
| 6.4 | Capture timing — total ingest + query latency, file size, peak memory |
Closing G0 acceptance: scripts/g0_smoke.sh exits 0. Numbers
recorded in docs/G0_BASELINE.md for future regression comparison.
Day 7 — Cleanup + retro
| # | What |
|---|---|
| 7.1 | Update SPEC §4 G0 with what actually shipped vs planned (deviations, surprises) |
| 7.2 | Write docs/G0_BASELINE.md — measured perf numbers + comparison hooks for G1+ |
| 7.3 | Finalize ADRs that were stubbed before their decisions landed — ADR stubs go in at the start of D4 (arrow-go version pin, schema inference policy) and D5 (DuckDB extension load order, S3 secret provisioning, view-refresh TTL) so reviewers can object in-flight; D7 just commits them after running real code against the calls |
| 7.4 | Tag the commit phase-g0-complete |
| 7.5 | Open follow-up issues for anything punted (error journal, multi-bucket, profile system, two-phase-write orphan GC, shared-server.go refactor for cgo-handle services) |
Risks tracked across the week
| Risk | Where | Mitigation |
|---|---|---|
| cgo build fails on the dev box | D5 | D0.3 verifies gcc present; if cgo specifically breaks, fall back to DuckDB external process (SPEC §3.1 option B) |
| arrow-go pqarrow schema mismatch with CSV inference | D4 | Sample 1k rows for type inference, default to String per ADR-010, log when defaulting |
| DuckDB can't read S3 Parquet directly | D5 | Load httpfs extension explicitly; if it fails, copy Parquet to a local temp file before query (slow but correct) |
| /catalog/register race between ingestd writer and catalogd reader | D3-D4 | Same write-lock-across-storage-write pattern as Rust ADR-020 — serialize registers; throughput is OK at low ingest QPS |
| workers_500k.csv schema drifts vs Rust era | D4 | Plan calls for inferring fresh, not porting Rust schema. If staffer-domain features break in G3+, revisit |
Out of scope for G0 (deferred to later phases)
- Vector indexing — Phase G1
- Multi-bucket / federation — Phase G2
- Profile system — Phase G2
- Hot-swap atomicity — Phase G2
- Pathway memory — Phase G3
- Distillation pipeline — Phase G3
- MCP server / observer / auditor — Phase G4
- HTMX UI — Phase G5
- TLS, auth — explicit non-goal until G2 (single-bucket no-auth dev)
Open questions before Day 1
- MinIO instance — reuse the existing one at `localhost:9000` that lakehouse uses (shared dev box) or stand up a fresh one with a separate bucket prefix?
- `/etc/lakehouse/secrets.toml` — share the lakehouse repo's secrets file or create `/etc/golangLAKEHOUSE/secrets.toml`?
- Workers CSV source — derive from `workers_500k.parquet` (round-trip back to CSV) or use `workers_500k_v9.csv` if it exists?
These are ops calls, not architecture. Answer when D0 is being executed.
Self-review — independent pass via gateway overseer
Reviewer: opencode/claude-opus-4-7 via localhost:3100/v1/chat
(the same path the production overseer correction loop uses post-G0
in the Rust era). Run on the original draft before any of the inline
fixes above were applied. Findings dispositioned below.
BLOCK — both real, both fixed inline
| # | Finding | Disposition | Fix location |
|---|---|---|---|
| B1 | apt install build-essential alone won't satisfy the cgo link step for duckdb-go/v2 | Fixed — D0.6 now runs a smoke go install against an empty module on D0 to flush platform issues here, not on D5 (this go install approach was itself corrected by Kimi's K1 in the next pass) | D0.6 |
| B2 | DuckDB session needs S3 credentials (CREATE SECRET) plumbed from SecretsProvider; "load httpfs" alone leaves auth unwired | Fixed — D5.1 now calls CREATE SECRET (TYPE S3, ...) on connection open, populated from internal/secrets/provider.go | D5.1 |
WARN — 3 of 5 fixed inline; 2 deferred (W1 orphan GC, W3 shared-server refactor)
| # | Finding | Disposition | Fix location |
|---|---|---|---|
| W1 | Two-phase write (storaged → catalogd register) leaves orphan Parquets on partial failure; no GC story | Deferred — punted to G2 alongside multi-bucket + error journal; tracked in §Risks and D7.5 follow-up | D7.5 |
| W2 | "Refresh views on each /sql call" will be the D6 perf cliff | Fixed — D5.3 now uses TTL-cached views with etag invalidation against catalogd | D5.3 |
| W3 | Shared internal/shared/server.go factory across heterogeneous binaries (HTTP ingress vs cgo-DB-holder) couples graceful-shutdown semantics that will need unwinding later | Accepted with note — G0 keeps the simple shared factory; refactor explicitly listed as a G1+ follow-up | D7.5 |
| W4 | storaged PUT/GET on a TCP port with no auth + no body cap is a footgun | Fixed — D2.4 now binds 127.0.0.1 only and applies a 2 GiB MaxBytesReader cap (later tightened to 256 MiB by Qwen's Q1) | D2.4 |
| W5 | Gateway reverse-proxy introduced cold on D6 gate day compresses risk into the deadline | Fixed — D1.10 now stubs the routes returning 501; D6.1 just promotes them to real proxies | D1.10 + D6.1 |
INFO — both fixed inline
| # | Finding | Disposition | Fix location |
|---|---|---|---|
| I1 | go mod tidy before any imports is a trivial-true verification | Fixed — D0.6 re-purposed for the cgo smoke; tidy verification moved to D1 acceptance | D0.6 + D1 acceptance |
| I2 | Filing ADRs after the work is done inverts the usual pattern | Fixed — D7.3 reframed: ADR stubs go in at the start of D4/D5 so reviewers can object in-flight; D7.3 just finalizes them | D7.3 |
Net change (Opus pass)
7 of 9 findings produced inline plan edits; 2 deferred to post-G0 follow-up issues (W1 orphan GC, W3 shared-server refactor) with the deferral itself documented. No findings dismissed as confabulation.
Self-review — second pass via Kimi K2.6 (cross-lineage)
Reviewer: opencode/kimi-k2.6 via the same gateway path. Run on the
post-Opus-fix doc to surface what a different model lineage catches.
Per the Rust-era cross-lineage rotation pattern, Kimi tends to ground
on textual specifics where Opus surfaces architectural shape — both
lenses are useful.
Kimi's output was discursive (pre-format thinking rather than the requested BLOCK/WARN/INFO bullets), but two concrete catches landed — both BLOCKs that Opus missed and that the Opus-pass fixes themselves introduced.
BLOCK — both real, both fixed inline
| # | Finding | Disposition | Fix location |
|---|---|---|---|
| K1 | Opus's BLOCK B1 fix used go install github.com/duckdb/duckdb-go/v2@latest to verify cgo, but go install pkg@version requires a main package — duckdb-go/v2 is a library, so the command fails with "not a main package" before the cgo linker chain is exercised. The verification can pass on a broken-cgo box. | Fixed — D0.6 now creates a temporary module + a 5-line main.go that imports duckdb-go/v2 and calls sql.Open("duckdb",""), then go runs it — actually compiles + executes with cgo | D0.6 |
| K2 | Path mismatch: D1.10 stubs /v1/ingest and /v1/sql on the gateway, but D4 has ingestd serving /ingest and D5 has queryd serving /sql. D6.1's httputil.NewSingleHostReverseProxy preserves the inbound path by default, so the proxy would forward to ingestd:3203/v1/ingest which doesn't exist. Smoke would fail on D6 with a 404 on the backend, not the gateway. | Fixed — D6.1 now specifies a custom Director that strips the /v1 prefix before forwarding | D6.1 |
WARN — none extracted (Kimi response truncated at this section)
The remaining response stream considered D2.4 MaxBytesReader semantics
and D3 manifest registry concurrency, but cut off before producing
structured findings. A third pass could be run if more lineage
coverage is wanted — for now, two BLOCKs fixed is the cross-lineage
delta.
Net change (Kimi pass)
2 BLOCKs landed, both fixed inline. Both were introduced by Opus-pass fixes — illustrating exactly why cross-lineage rotation matters: one model's review of the original is not the same as a different model's review of the post-fix version. The Rust auditor's Kimi/Haiku/Opus rotation captures this dynamic; today's two-pass doc review reproduces it on a much smaller scale.
Self-review — third pass via Qwen3-coder:480b (cross-lineage)
Reviewer: qwen3-coder:480b via ollama_cloud (Ollama Pro). Largest
single-model open-weights coder in the fleet. Run on the post-Kimi-fix
doc.
Output style note
Qwen3-coder's response was 80% approval of prior fixes (citing B1/B2/K1/K2/I2 as "NEW BLOCKs/INFOs" — actually a misuse of the NEW label, but useful as triangulation: three independent lineages now agree the prior fixes are correct). Only 2 genuinely new WARNs.
WARN — 1 fixed inline, 1 deferred
| # | Finding | Disposition | Fix location |
|---|---|---|---|
| Q1 | D2.4 MaxBytesReader at 2 GiB still permits memory exhaustion under concurrent uploads — N concurrent × 2 GiB blows the box | Fixed — D2.4 cap reduced to 256 MiB per request, plus a 4-slot semaphore on concurrent in-flight PUTs (503 + Retry-After when full) | D2.4 |
| Q2 | D5.3 view refresh has TTL + etag invalidation but no batching, so bursty queries against many datasets can re-issue catalog reads redundantly | Deferred — minor under G0's single-tenant load; revisit in G2 alongside the multi-bucket / profile work that creates more catalog churn | D7.5 |
Net change (Qwen pass)
1 fixed inline, 1 deferred. The mislabeled "approval" output was net positive — it's the closest thing to a triangulation signal we get without a fourth model. With Opus + Kimi + Qwen all confirming the prior fixes hold, the plan is unusually well-cross-checked for a greenfield kickoff.
Cumulative disposition across all 3 lineages
| Pass | Reviewer | New findings | Fixed | Deferred |
|---|---|---|---|---|
| 1 | opencode/claude-opus-4-7 | 9 | 7 | 2 |
| 2 | opencode/kimi-k2.6 | 2 | 2 | 0 |
| 3 | qwen3-coder:480b (ollama_cloud) | 2 | 1 | 1 |
| 4 | runtime smoke (D1 actual launch) | 1 | 1 | 0 |
| Total | — | 14 | 11 | 3 |
The 4th pass — runtime smoke — caught the only thing three doc-review lineages couldn't: port 3100 was already occupied by the live Rust lakehouse. Documented and dispositioned in the same pattern as the model findings.
Three deferrals, all to G2: orphan GC on two-phase write (W1), shared-server refactor for cgo-handle services (W3), batched view refresh (Q2). Tracked in D7.5.
D1 — actual run results (2026-04-28)
Phase G0 Day 1 executed end-to-end. Output of scripts/d1_smoke.sh:
[d1-smoke] building...
[d1-smoke] launching...
[d1-smoke] /health probes:
✓ gateway (:3110) → {"status":"ok","service":"gateway"}
✓ storaged (:3211) → {"status":"ok","service":"storaged"}
✓ catalogd (:3212) → {"status":"ok","service":"catalogd"}
✓ ingestd (:3213) → {"status":"ok","service":"ingestd"}
✓ queryd (:3214) → {"status":"ok","service":"queryd"}
[d1-smoke] gateway 501 stub probes:
✓ POST /v1/ingest → 501 + X-Lakehouse-Stub: g0
✓ POST /v1/sql → 501 + X-Lakehouse-Stub: g0
[d1-smoke] D1 acceptance gate: PASSED
What landed:
- `internal/shared/server.go` — chi factory, slog JSON logging, `/health`, graceful shutdown via `signal.NotifyContext`
- `internal/shared/config.go` — TOML loader with `DefaultConfig()` and env-override-via-flag pattern (`-config` flag)
- `cmd/{gateway,storaged,catalogd,ingestd,queryd}/main.go` — five binaries, each ~30 lines, all using the shared factory
- `lakehouse.toml` — G0 dev config with the shifted 3110+ ports
- `scripts/d1_smoke.sh` — repeatable smoke test, exits 0 on PASS
- `bin/{gateway,storaged,catalogd,ingestd,queryd}` — built binaries (~9.7 MB each, no debug stripping)
What G0 didn't need but I added anyway (intentional):
- Gateway already has the D1.10 stub routes wired; D6.1 is just swap-the-handler.
- TOML config supports an S3 section even though storaged doesn't consume it until D2 — saves a config schema bump on D2.
Next: D2 — storaged's actual S3 GET/PUT/LIST routes against MinIO.
D1 — code scrum review (3-lineage parallel pass)
After D1 shipped, the actual code (not docs) was reviewed by all three lineages running in parallel — the cross-lineage discipline applied to the implementation, not just the plan.
| Pass | Reviewer | Latency | Findings |
|---|---|---|---|
| 1 | opencode/claude-opus-4-7 | 24.9s | 2 BLOCK + 4 WARN + 2 INFO = 8 |
| 2 | opencode/kimi-k2.6 | 18.3s | discursive output; 1 effective WARN extracted |
| 3 | qwen3-coder:480b (ollama_cloud) | 62.0s | 2 BLOCK + 2 WARN + 1 INFO = 5 |
Convergent findings (caught by ≥2 reviewers — high confidence)
| # | Severity | Finding | Reviewers | Disposition |
|---|---|---|---|---|
| C1 | BLOCK | Run() errCh/select race: fast bind error can be silently dropped | Opus + Qwen | Fixed — net.Listen() is now called synchronously before the goroutine; bind errors return as Run's return value before any select |
| C2 | BLOCK | sleep 0.5 in d1_smoke.sh races bind on cold-start boxes | Opus + Qwen | Fixed — replaced with poll_health() loop, 5s budget per service, 50ms poll interval |
| C3 | WARN | LoadConfig silent fallback when file missing | Opus + Qwen | Fixed — emits slog.Warn with path + hint when path was given but file absent |
Single-reviewer findings (lineage-specific catches)
| # | Severity | Finding | Reviewer | Disposition |
|---|---|---|---|---|
| S1 | WARN | slog.SetDefault mutates global state from a library function | Kimi | Fixed — Run() no longer calls SetDefault; logger is constructed locally and passed to middleware |
| S2 | WARN | os.IsNotExist → errors.Is(err, fs.ErrNotExist) idiom | Opus | Fixed |
| S3 | WARN | secrets in lakehouse.toml — no .gitignore / env path | Opus | Already planned — D2.3 introduces SecretsProvider; D1 fields are empty strings |
| S4 | INFO | 5x near-identical cmd/*/main.go — could share a Main() helper | Opus | Accepted — defer until D2 surfaces the real per-service config consumption pattern |
| S5 | INFO | /health logs at Info — k8s liveness probes will dominate volume | Opus | Accepted — defer post-G0; G0 isn't on k8s yet |
| S6 | WARN | smoke double-curl per route doubles load + creates state window | Opus | Fixed — single curl -i parsed once for both code + headers |
Second-pass review (post-fix code, Opus only)
After applying the first-round fixes, a second-pass Opus review
caught 1 self-downgraded BLOCK + 4 WARN + 2 INFO. Of those, only one
was a real new actionable: head -1 on curl -i is fragile against
1xx interim lines. Fixed — switched to awk '/^HTTP\//{code=$2} END{print code}'
which picks the last status line (robust to 100 Continue etc).
The other 5 second-pass findings were:
- Theoretical (clean-exit-without-Shutdown path that doesn't trigger today)
- Aspirational (defensive `defer ln.Close()` for hypothetical future code)
- Caught downstream (`poll_health` not identity-checking — but the follow-up probe IS identity-checked)
- Stylistic (`config.go`'s `slog.Warn` lands on the global default — but config loads BEFORE the service-level logger exists; accepting)
- Aspirational (`newListener` "for testability" indirection isn't needed yet)
Cumulative D1 disposition
- Pass 1 (3 lineages parallel): 14 distinct findings (3 convergent + 11 single-reviewer; some overlap with prior doc-review findings)
- Pass 2 (Opus second-pass on post-fix code): 7 findings, 1 actionable fixed, 6 dispositioned
- Total fixed in D1 code review: 7 · accepted-with-rationale: 5 · deferred: 0
- Build clean (`go build ./cmd/...` exit 0), `go vet ./...` clean, smoke PASS after every fix round.
D1 ships harder than it would have without the scrum: bind-error handling is now race-free, smoke is deterministic, log volume on /health is acknowledged, secrets handling has a flagged path forward.