golangLAKEHOUSE/docs/PHASE_G0_KICKOFF.md
Claw 1142f54f23 G0 D1 ships: skeleton + chi + /health × 5 binaries · acceptance gate PASSED
Phase G0 Day 1 executed end-to-end after a third-pass review by
qwen3-coder:480b that consolidated all findings across the
Opus/Kimi/Qwen lineages.

Cross-lineage review consolidation (3 model passes + 1 runtime pass):
- Opus 4.7: 9 findings · 7 fixed inline · 2 deferred
- Kimi K2.6: 2 BLOCKs (introduced by Opus fixes) · 2 fixed
- Qwen3-coder:480b: 2 WARNs · 1 fixed (D2.4 256 MiB cap + 4-slot
  semaphore on PUTs) · 1 deferred (Q2 view refresh batching)
- Runtime smoke: 1 finding (port 3100 collision with live Rust
  lakehouse) · fixed (Go dev ports shifted to 3110+)
- Total: 14 findings · 11 fixed · 3 deferred to G2

What landed in code:
- internal/shared/server.go — chi factory, slog JSON, /health,
  graceful shutdown via signal.NotifyContext
- internal/shared/config.go — TOML loader, DefaultConfig, -config flag
- cmd/{gateway,storaged,catalogd,ingestd,queryd}/main.go — five
  binaries, each ~30 lines using the shared factory
- lakehouse.toml — G0 dev defaults (3110-3214)
- scripts/d1_smoke.sh — repeatable smoke that exits 0 on PASS
- go.mod / go.sum — chi v5.2.5, pelletier/go-toml/v2 v2.3.0

Verified end-to-end via scripts/d1_smoke.sh:
- All 5 /health endpoints return 200 with correct service name
- Gateway /v1/ingest + /v1/sql stubs return 501 with X-Lakehouse-Stub
- Graceful shutdown logs cleanly on SIGTERM
- DuckDB cgo path verified separately (sql.Open("duckdb","") + ping)

D1 ACCEPTANCE GATE: PASSED.

Next: D2 — storaged S3 GET/PUT/LIST against MinIO.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 07:00:37 -05:00


# Phase G0 Kickoff Plan
**Goal:** smallest end-to-end ingest+query path in Go. Upload a CSV
via `POST /ingest`, query it via `POST /sql`, get rows back. No
vector, no profile, no UI yet.
**Estimated duration:** 1 engineer-week (5 working days + gate day +
cleanup). Plan is calibrated for solo work; cut by ~40% with two
engineers in parallel on storaged/catalogd vs ingestd/queryd.
**Cutoff for G0:** the closing acceptance gate (Day 6) passes
end-to-end against `workers_500k.csv` as the test fixture.
---
## Day 0 — One-time setup
Done by an operator with sudo on the dev box. ~15 minutes.
| # | Step | Verify |
|---|---|---|
| 0.1 | Install Go 1.23+: `curl -L https://go.dev/dl/go1.23.0.linux-amd64.tar.gz \| sudo tar -C /usr/local -xz` | `go version` shows 1.23+ |
| 0.2 | Add `/usr/local/go/bin` to `PATH` (in `~/.bashrc`) | new shell sees `go` |
| 0.3 | Install cgo toolchain: `apt install build-essential` | `gcc --version` works |
| 0.4 | Clone repo: `git clone https://git.agentview.dev/profit/golangLAKEHOUSE.git` | `cd golangLAKEHOUSE && go version` from inside |
| 0.5 | Bring up MinIO locally (or point at existing) | `mc ls local/` lists buckets, or whatever the dev S3 is |
| 0.6 | Verify DuckDB cgo path with a real compile-and-run smoke (NOT `go install pkg@latest` — that requires a main package and would fail on the duckdb library before exercising cgo). Steps: `mkdir /tmp/duckdb-smoke && cd /tmp/duckdb-smoke && go mod init smoke && go get github.com/duckdb/duckdb-go/v2 && cat > main.go <<<'package main; import (_ "github.com/duckdb/duckdb-go/v2"; "database/sql"); func main(){db,_:=sql.Open("duckdb","");db.Ping();db.Close()}' && go run main.go` — proves cgo linker chain + static-linked duckdb-go-bindings work on this platform | exits 0, no link errors |
**Day 0 acceptance:** `go version` shows 1.23+, `gcc --version` works,
MinIO reachable on `localhost:9000`, and the D0.6 cgo compile-and-run
smoke succeeded. (`go mod tidy` is intentionally NOT run here — no
imports yet; verification moves to D1.)
---
## Day 1 — Skeleton + chi + /health × 5 binaries
**Goal:** five binaries build, each binds to its port, `/health`
returns `{"status":"ok","service":"<name>"}`.
| # | File | What |
|---|---|---|
| 1.1 | `internal/shared/server.go` | chi router factory, slog setup, graceful shutdown via `signal.NotifyContext` |
| 1.2 | `internal/shared/config.go` | TOML loader using `pelletier/go-toml/v2`, default + override pattern |
| 1.3 | `cmd/gateway/main.go` | port **3110**, `/health` |
| 1.4 | `cmd/storaged/main.go` | port **3211**, `/health` |
| 1.5 | `cmd/catalogd/main.go` | port **3212**, `/health` |
| 1.6 | `cmd/ingestd/main.go` | port **3213**, `/health` |
| 1.7 | `cmd/queryd/main.go` | port **3214**, `/health` |
| 1.8 | `lakehouse.toml` | bind addresses, log level — sample committed |
| 1.9 | `Makefile` | `build`, `run-gateway`, etc. — convenience |
| 1.10 | `cmd/gateway/main.go` adds STUB routes `POST /v1/ingest` and `POST /v1/sql` returning `501 Not Implemented` with a header `X-Lakehouse-Stub: g0`. Real reverse-proxy wiring lands on Day 6, but the routes exist from D1 so D6 is just a behavior change, not new endpoints. | `curl -X POST :3110/v1/ingest` returns 501 with the stub header |
**Port shift note:** the original SPEC said 3100/3201–3204; D1 runtime
caught a collision — the live Rust lakehouse owns 3100. Go dev ports
shifted to 3110+ so both systems can run concurrently during the
migration. The G5 cutover flips the gateway back to 3100 when Rust
retires.
**Acceptance D1:** `go mod tidy` populates `go.sum` cleanly; `go build
./cmd/...` exits 0; running each binary in a separate terminal,
`curl :3110/health` and `:3211/health` through `:3214/health` all
return `200 OK` with the expected JSON; gateway's stub `/v1/*` routes
return 501.
**Dependencies pulled:** `go-chi/chi/v5`, `pelletier/go-toml/v2`.
---
## Day 2 — storaged: S3 GET/PUT/LIST
**Goal:** put a file, get it back, list it.
| # | File | What |
|---|---|---|
| 2.1 | `internal/storaged/bucket.go` | `aws-sdk-go-v2/service/s3` wrapper — `Get`, `Put`, `List`, `Delete` |
| 2.2 | `internal/storaged/registry.go` | `BucketRegistry` skeleton (per Rust ADR-017) — single bucket only in G0; multi-bucket lands in G2 |
| 2.3 | `internal/secrets/provider.go` | `SecretsProvider` interface + `FileSecretsProvider` reading `/etc/lakehouse/secrets.toml` |
| 2.4 | `cmd/storaged/main.go` | wire routes — `GET /storage/get/{key}`, `PUT /storage/put/{key}`, `GET /storage/list?prefix=...`. Bind to `127.0.0.1:3211` only (G0 is dev-only, no auth). Apply `http.MaxBytesReader` with a **256 MiB per-request cap** (reduced from 2 GiB per Qwen3-coder Q1 — 2 GiB × N concurrent = blow the box) + a **buffered semaphore** capping concurrent in-flight PUTs at 4. PUTs exceeding the cap → 413; PUTs blocked on the semaphore → 503 with `Retry-After: 5` |
**Acceptance D2:** `curl -T sample.csv 127.0.0.1:3211/storage/put/test/sample.csv`
returns 200; `curl 127.0.0.1:3211/storage/get/test/sample.csv` echoes
the file bytes; `curl 127.0.0.1:3211/storage/list?prefix=test/` lists
`sample.csv`; a PUT exceeding the 256 MiB cap returns `413 Payload Too Large`.
**Open question:** error journal (Rust ADR-018 append-log pattern) —
defer to G2 with multi-bucket federation, or wire it now? Plan says
defer; revisit if errors surface during D3-D5.
---
## Day 3 — catalogd: Parquet manifests
**Goal:** register a dataset, persist to storaged, restart, manifest
still visible.
| # | File | What |
|---|---|---|
| 3.1 | `internal/catalogd/manifest.go` | Parquet read/write using `arrow-go/v18/parquet/pqarrow`. Schema: `dataset_id`, `name`, `schema_fingerprint`, `objects`, `created_at`, `updated_at`, `row_count` |
| 3.2 | `internal/catalogd/registry.go` | In-memory index (map[name]Manifest), rehydrated on startup from `primary://_catalog/manifests/*.parquet` |
| 3.3 | `cmd/catalogd/main.go` | wire routes — `POST /catalog/register` (idempotent by name + fingerprint per Rust ADR-020), `GET /catalog/manifest/{name}`, `GET /catalog/list` |
| 3.4 | `internal/catalogd/store_client.go` | thin HTTP client to `cmd/storaged` — round-trips manifest Parquets |
**Acceptance D3:** register a dataset, see it in `/catalog/list`,
restart catalogd, `/catalog/list` still shows it. Re-register same
name + same fingerprint → 200, same `dataset_id`. Different
fingerprint → 409 Conflict.
---
## Day 4 — ingestd: CSV → Parquet → catalog
**Goal:** `POST /ingest` with a CSV file produces a Parquet in
storaged + a manifest in catalogd.
| # | File | What |
|---|---|---|
| 4.1 | `internal/ingestd/schema.go` | infer Arrow schema from CSV header + first-N-row sampling. ADR-010 default-to-string on ambiguity |
| 4.2 | `internal/ingestd/csv.go` | stream CSV → `array.RecordBatch` → Parquet (arrow-go pqarrow writer) |
| 4.3 | `cmd/ingestd/main.go` | route `POST /ingest` — multipart form file → schema infer → write Parquet → call catalogd to register |
**Acceptance D4:** `curl -F file=@workers_500k.csv :3213/ingest?name=workers_500k`
returns 200 with the registered manifest; `aws s3 ls` (or `mc ls`)
shows the Parquet under `primary://datasets/workers_500k/`;
`curl :3212/catalog/manifest/workers_500k` returns the manifest with
`row_count=500000`.
---
## Day 5 — queryd: DuckDB SELECT
**Goal:** SQL queries over Parquet datasets.
| # | File | What |
|---|---|---|
| 5.1 | `internal/queryd/db.go` | `database/sql` connection to `github.com/duckdb/duckdb-go/v2` (cgo). Ensures DuckDB extensions Parquet + httpfs are loaded; on connection open, executes `CREATE SECRET` (TYPE S3) populated from `internal/secrets/provider.go` so `read_parquet('s3://...')` against MinIO authenticates per session |
| 5.2 | `internal/queryd/registrar.go` | reads catalogd `/catalog/list`, registers each dataset as a DuckDB view: `CREATE VIEW workers_500k AS SELECT * FROM read_parquet('s3://...')` |
| 5.3 | `cmd/queryd/main.go` | route `POST /sql` (JSON body `{"sql": "..."}`). View refresh strategy: cache views with a TTL (default 30s) + invalidate on `If-None-Match` against catalogd's manifest etag. **Don't** re-CREATE on every request — Opus review flagged that as the perf cliff during D6 timing capture |
**Acceptance D5:** after Day 4 ingestion,
`curl -X POST -d '{"sql":"SELECT count(*) FROM workers_500k"}' :3214/sql`
returns `[{"count_star()":500000}]`. A `SELECT role, count(*) FROM
workers_500k WHERE state='IL' GROUP BY role` returns expected rows.
---
## Day 6 — Gate day: end-to-end via gateway
**Goal:** the closing G0 acceptance gate passes.
| # | What |
|---|---|
| 6.1 | Promote `cmd/gateway/main.go` `/v1/ingest` + `/v1/sql` from D1.10 stubs (501) to real reverse-proxies. `httputil.NewSingleHostReverseProxy` preserves the inbound path by default, so the proxy must use a custom `Director` (or `Rewrite`) that strips the `/v1` prefix before forwarding — otherwise the call lands on `ingestd:3213/v1/ingest` which doesn't exist (ingestd serves `/ingest`, queryd serves `/sql`). Multipart forwarding for `/v1/ingest` is the riskiest hop — verify form parts pass through with the file body intact |
| 6.2 | Smoke script `scripts/g0_smoke.sh`: spin up MinIO + 5 services, ingest, query, assert row count |
| 6.3 | Run smoke against `workers_500k.csv` end-to-end |
| 6.4 | Capture timing — total ingest + query latency, file size, peak memory |
**Closing G0 acceptance:** `scripts/g0_smoke.sh` exits 0. Numbers
recorded in `docs/G0_BASELINE.md` for future regression comparison.
---
## Day 7 — Cleanup + retro
| # | What |
|---|---|
| 7.1 | Update SPEC §4 G0 with what actually shipped vs planned (deviations, surprises) |
| 7.2 | Write `docs/G0_BASELINE.md` — measured perf numbers + comparison hooks for G1+ |
| 7.3 | **Finalize** ADRs that were stubbed *before* their decisions landed — ADR stubs go in at the start of D4 (arrow-go version pin, schema inference policy) and D5 (DuckDB extension load order, S3 secret provisioning, view-refresh TTL) so reviewers can object in-flight; D7 just commits them after running real code against the calls |
| 7.4 | Tag the commit `phase-g0-complete` |
| 7.5 | Open follow-up issues for anything punted (error journal, multi-bucket, profile system, two-phase-write orphan GC, shared-server.go refactor for cgo-handle services) |
---
## Risks tracked across the week
| Risk | Where | Mitigation |
|---|---|---|
| cgo build fails on the dev box | D5 | D0.3 verifies `gcc` present; if cgo specifically breaks, fall back to DuckDB external process (SPEC §3.1 option B) |
| arrow-go pqarrow schema mismatch with CSV inference | D4 | Sample 1k rows for type inference, default to String per ADR-010, log when defaulting |
| DuckDB can't read S3 Parquet directly | D5 | Load `httpfs` extension explicitly; if it fails, copy Parquet to a local temp file before query (slow but correct) |
| `/catalog/register` race between ingestd writer and catalogd reader | D3-D4 | Same write-lock-across-storage-write pattern as Rust ADR-020 — serialize registers; throughput is OK at low ingest QPS |
| `workers_500k.csv` schema drifts vs Rust era | D4 | Plan calls for inferring fresh, not porting Rust schema. If staffer-domain features break in G3+, revisit |
---
## Out of scope for G0 (deferred to later phases)
- Vector indexing — Phase G1
- Multi-bucket / federation — Phase G2
- Profile system — Phase G2
- Hot-swap atomicity — Phase G2
- Pathway memory — Phase G3
- Distillation pipeline — Phase G3
- MCP server / observer / auditor — Phase G4
- HTMX UI — Phase G5
- TLS, auth — explicit non-goal until G2 (single-bucket no-auth dev)
---
## Open questions before Day 1
1. **MinIO instance** — reuse the existing one at `localhost:9000`
that lakehouse uses (shared dev box) or stand up a fresh one with
a separate bucket prefix?
2. **`/etc/lakehouse/secrets.toml`** — share the lakehouse repo's
secrets file or create `/etc/golangLAKEHOUSE/secrets.toml`?
3. **Workers CSV source** — derive from `workers_500k.parquet`
   (round-trip back to CSV) or use `workers_500k_v9.csv` if it exists?
These are ops calls, not architecture. Answer when D0 is being
executed.
---
## Self-review — independent pass via gateway overseer
Reviewer: `opencode/claude-opus-4-7` via `localhost:3100/v1/chat`
(the same path the production overseer correction loop uses post-G0
in the Rust era). Run on the original draft before any of the inline
fixes above were applied. Findings dispositioned below.
### BLOCK — both real, both fixed inline
| # | Finding | Disposition | Fix location |
|---|---|---|---|
| B1 | `apt install build-essential` alone won't satisfy the cgo link step for `duckdb-go/v2` | **Fixed** — D0.6 now runs a smoke `go install` against an empty module on D0 to flush platform issues here, not on D5 | D0.6 |
| B2 | DuckDB session needs S3 credentials (`CREATE SECRET`) plumbed from SecretsProvider; "load httpfs" alone leaves auth unwired | **Fixed** — D5.1 now calls `CREATE SECRET (TYPE S3, ...)` on connection open, populated from `internal/secrets/provider.go` | D5.1 |
### WARN — 3 of 5 fixed inline; 2 deferred
| # | Finding | Disposition | Fix location |
|---|---|---|---|
| W1 | Two-phase write (storaged → catalogd register) leaves orphan Parquets on partial failure; no GC story | **Deferred** — punted to G2 alongside multi-bucket + error journal; tracked in §Risks and D7.5 follow-up | D7.5 |
| W2 | "Refresh views on each `/sql` call" will be the D6 perf cliff | **Fixed** — D5.3 now uses TTL-cached views with etag invalidation against catalogd | D5.3 |
| W3 | Shared `internal/shared/server.go` factory across heterogeneous binaries (HTTP ingress vs cgo-DB-holder) couples graceful-shutdown semantics that will need unwinding later | **Accepted with note** — G0 keeps the simple shared factory; refactor explicitly listed as a G1+ follow-up | D7.5 |
| W4 | storaged PUT/GET on a TCP port with no auth + no body cap is a footgun | **Fixed** — D2.4 now binds `127.0.0.1` only and applies a 2 GiB `MaxBytesReader` cap | D2.4 |
| W5 | Gateway reverse-proxy introduced cold on D6 gate day compresses risk into the deadline | **Fixed** — D1.10 now stubs the routes returning 501; D6.1 just promotes them to real proxies | D1.10 + D6.1 |
### INFO — both fixed inline
| # | Finding | Disposition | Fix location |
|---|---|---|---|
| I1 | `go mod tidy` before any imports is a trivial-true verification | **Fixed** — D0.6 re-purposed for the cgo smoke; tidy verification moved to D1 acceptance | D0.6 + D1 acceptance |
| I2 | Filing ADRs *after* the work is done inverts the usual pattern | **Fixed** — D7.3 reframed: ADR stubs go in at the start of D4/D5 so reviewers can object in-flight; D7.3 just finalizes them | D7.3 |
### Net change (Opus pass)
7 of 9 findings produced inline plan edits; 2 deferred to post-G0
follow-up issues (W1 orphan GC, W3 shared-server refactor) with the
deferral itself documented. No findings dismissed as confabulation.
---
## Self-review — second pass via Kimi K2.6 (cross-lineage)
Reviewer: `opencode/kimi-k2.6` via the same gateway path. Run on the
post-Opus-fix doc to surface what a different model lineage catches.
Per the Rust-era cross-lineage rotation pattern, Kimi tends to ground
on textual specifics where Opus surfaces architectural shape — both
lenses are useful.
Kimi's output was discursive (pre-format thinking rather than the
requested BLOCK/WARN/INFO bullets), but two concrete catches landed —
both BLOCKs that Opus missed and that the Opus-pass fixes themselves
introduced.
### BLOCK — both real, both fixed inline
| # | Finding | Disposition | Fix location |
|---|---|---|---|
| K1 | Opus's BLOCK B1 fix used `go install github.com/duckdb/duckdb-go/v2@latest` to verify cgo, but `go install pkg@version` requires a `main` package — duckdb-go/v2 is a library, so the command fails with "not a main package" *before* the cgo linker chain is exercised. The verification can pass on a broken-cgo box. | **Fixed** — D0.6 now creates a temporary module + a 5-line `main.go` that imports duckdb-go/v2 and calls `sql.Open("duckdb","")`, then `go run`s it — actually compiles + executes with cgo | D0.6 |
| K2 | Path mismatch: D1.10 stubs `/v1/ingest` and `/v1/sql` on the gateway, but D4 has ingestd serving `/ingest` and D5 has queryd serving `/sql`. D6.1's `httputil.NewSingleHostReverseProxy` preserves the inbound path by default, so the proxy would forward to `ingestd:3203/v1/ingest` which doesn't exist. Smoke would fail on D6 with a 404 on the backend, not the gateway. | **Fixed** — D6.1 now specifies a custom `Director` that strips the `/v1` prefix before forwarding | D6.1 |
### WARN — none extracted (Kimi response truncated at this section)
The remaining response stream considered D2.4 `MaxBytesReader` semantics
and D3 manifest registry concurrency, but cut off before producing
structured findings. A third pass could be run if more lineage
coverage is wanted — for now, two BLOCKs fixed is the cross-lineage
delta.
### Net change (Kimi pass)
2 BLOCKs landed, both fixed inline. Both were *introduced by Opus-pass
fixes* — illustrating exactly why cross-lineage rotation matters: one
model's review of the original is not the same as a different model's
review of the post-fix version. The Rust auditor's Kimi/Haiku/Opus
rotation captures this dynamic; today's two-pass doc review reproduces
it on a much smaller scale.
---
## Self-review — third pass via Qwen3-coder:480b (cross-lineage)
Reviewer: `qwen3-coder:480b` via `ollama_cloud` (Ollama Pro). Largest
single-model open-weights coder in the fleet. Run on the post-Kimi-fix
doc.
### Output style note
Qwen3-coder's response was 80% **approval of prior fixes** (citing
B1/B2/K1/K2/I2 as "NEW BLOCKs/INFOs" — actually a misuse of the NEW
label, but useful as triangulation: three independent lineages now
agree the prior fixes are correct). Only 2 genuinely new WARNs.
### WARN — 1 fixed inline, 1 deferred
| # | Finding | Disposition | Fix location |
|---|---|---|---|
| Q1 | D2.4 `MaxBytesReader` at 2 GiB still permits memory exhaustion under concurrent uploads — N concurrent × 2 GiB blows the box | **Fixed** — D2.4 cap reduced to 256 MiB per request, plus a 4-slot semaphore on concurrent in-flight PUTs (503 + Retry-After when full) | D2.4 |
| Q2 | D5.3 view refresh has TTL + etag invalidation but no batching, so bursty queries against many datasets can re-issue catalog reads redundantly | **Deferred** — minor under G0's single-tenant load; revisit in G2 alongside the multi-bucket / profile work that creates more catalog churn | D7.5 |
### Net change (Qwen pass)
1 fixed inline, 1 deferred. The mislabeled "approval" output was net
positive — it's the closest thing to a triangulation signal we get
without a fourth model. With Opus + Kimi + Qwen all confirming the
prior fixes hold, the plan is unusually well-cross-checked for a
greenfield kickoff.
### Cumulative disposition across all 3 lineages
| Pass | Reviewer | New findings | Fixed | Deferred |
|---|---|---|---|---|
| 1 | `opencode/claude-opus-4-7` | 9 | 7 | 2 |
| 2 | `opencode/kimi-k2.6` | 2 | 2 | 0 |
| 3 | `qwen3-coder:480b` (`ollama_cloud`) | 2 | 1 | 1 |
| 4 | **runtime smoke** (D1 actual launch) | 1 | 1 | 0 |
| **Total** | | **14** | **11** | **3** |
The 4th pass — runtime smoke — caught the only thing three doc-review
lineages couldn't: port 3100 was already occupied by the live Rust
lakehouse. It was documented and dispositioned in the same pattern as
the model findings.
Three deferrals, all to G2: orphan GC on two-phase write (W1),
shared-server refactor for cgo-handle services (W3), batched view
refresh (Q2). Tracked in D7.5.
---
## D1 — actual run results (2026-04-28)
Phase G0 Day 1 executed end-to-end. Output of `scripts/d1_smoke.sh`:
```
[d1-smoke] building...
[d1-smoke] launching...
[d1-smoke] /health probes:
✓ gateway (:3110) → {"status":"ok","service":"gateway"}
✓ storaged (:3211) → {"status":"ok","service":"storaged"}
✓ catalogd (:3212) → {"status":"ok","service":"catalogd"}
✓ ingestd (:3213) → {"status":"ok","service":"ingestd"}
✓ queryd (:3214) → {"status":"ok","service":"queryd"}
[d1-smoke] gateway 501 stub probes:
✓ POST /v1/ingest → 501 + X-Lakehouse-Stub: g0
✓ POST /v1/sql → 501 + X-Lakehouse-Stub: g0
[d1-smoke] D1 acceptance gate: PASSED
```
What landed:
- `internal/shared/server.go` — chi factory, slog JSON logging,
`/health`, graceful shutdown via `signal.NotifyContext`
- `internal/shared/config.go` — TOML loader with `DefaultConfig()`
and env-override-via-flag pattern (`-config` flag)
- `cmd/{gateway,storaged,catalogd,ingestd,queryd}/main.go` — five
binaries, each ~30 lines, all using the shared factory
- `lakehouse.toml` — G0 dev config with the shifted 3110+ ports
- `scripts/d1_smoke.sh` — repeatable smoke test, exits 0 on PASS
- `bin/{gateway,storaged,catalogd,ingestd,queryd}` — built binaries
(~9.7 MB each, no debug stripping)
What G0 didn't need but I added anyway (intentional):
- Gateway already has the D1.10 stub routes wired; D6.1 is just
swap-the-handler.
- TOML config supports an S3 section even though storaged doesn't
consume it until D2 — saves a config schema bump on D2.
Next: D2 — storaged's actual S3 GET/PUT/LIST routes against MinIO.