# Phase G0 Kickoff Plan

**Goal:** smallest end-to-end ingest+query path in Go. Upload a CSV via `POST /ingest`, query it via `POST /sql`, get rows back. No vector, no profile, no UI yet.

**Estimated duration:** 1 engineer-week (5 working days + gate day + cleanup). Plan is calibrated for solo work; cut by ~40% with two engineers in parallel on storaged/catalogd vs ingestd/queryd.

**Cutoff for G0:** the closing acceptance gate (Day 6) passes end-to-end against `workers_500k.csv` as the test fixture.

---

## Day 0 — One-time setup

Done by an operator with sudo on the dev box. ~15 minutes.

| # | Step | Verify |
|---|---|---|
| 0.1 | Install Go 1.23+: `curl -L https://go.dev/dl/go1.23.linux-amd64.tar.gz \| sudo tar -C /usr/local -xz` | `go version` shows 1.23+ |
| 0.2 | Add `/usr/local/go/bin` to `PATH` (in `~/.bashrc`) | new shell sees `go` |
| 0.3 | Install cgo toolchain: `apt install build-essential` | `gcc --version` works |
| 0.4 | Clone repo: `git clone https://git.agentview.dev/profit/golangLAKEHOUSE.git` | `cd golangLAKEHOUSE && go version` from inside |
| 0.5 | Bring up MinIO locally (or point at an existing instance) | `mc ls local/` lists buckets, or whatever the dev S3 is |
| 0.6 | Verify the DuckDB cgo path with a real compile-and-run smoke (NOT `go install pkg@latest` — that requires a main package and would fail on the duckdb library before exercising cgo). Steps: `mkdir /tmp/duckdb-smoke && cd /tmp/duckdb-smoke && go mod init smoke && go get github.com/duckdb/duckdb-go/v2 && cat > main.go <<<'package main; import (_ "github.com/duckdb/duckdb-go/v2"; "database/sql"); func main(){db,_:=sql.Open("duckdb","");db.Ping();db.Close()}' && go run main.go` — proves the cgo linker chain + static-linked duckdb-go-bindings work on this platform | exits 0, no link errors |

**Day 0 acceptance:** `go version` shows 1.23+, `gcc --version` works, MinIO reachable on `localhost:9000`, the cgo smoke above succeeded.
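Unrolled from the D0.6 one-liner, the smoke is a short program. A sketch, assuming the `duckdb-go/v2` import path from the plan; error handling is added so a broken linker chain surfaces loudly instead of being swallowed:

```go
// main.go — cgo smoke for the DuckDB driver (run inside the throwaway module).
package main

import (
	"database/sql"
	"log"

	// Blank import registers the "duckdb" driver and forces the cgo link step.
	_ "github.com/duckdb/duckdb-go/v2"
)

func main() {
	db, err := sql.Open("duckdb", "") // empty DSN = in-memory database
	if err != nil {
		log.Fatalf("open: %v", err)
	}
	defer db.Close()
	// Ping actually dials the driver, so the linked native library executes.
	if err := db.Ping(); err != nil {
		log.Fatalf("ping: %v", err)
	}
	log.Println("cgo smoke OK")
}
```

`go run main.go` inside the throwaway module is the whole check; any cgo or linker breakage shows up here on D0 instead of on D5.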
(`go mod tidy` is intentionally NOT run here — no imports yet; verification moves to D1.)

---

## Day 1 — Skeleton + chi + /health × 5 binaries

**Goal:** five binaries build, each binds to its port, `/health` returns `{"status":"ok","service":"<name>"}`.

| # | File | What |
|---|---|---|
| 1.1 | `internal/shared/server.go` | chi router factory, slog setup, graceful shutdown via `signal.NotifyContext` |
| 1.2 | `internal/shared/config.go` | TOML loader using `pelletier/go-toml/v2`, default + override pattern |
| 1.3 | `cmd/gateway/main.go` | port **3110**, `/health` |
| 1.4 | `cmd/storaged/main.go` | port **3211**, `/health` |
| 1.5 | `cmd/catalogd/main.go` | port **3212**, `/health` |
| 1.6 | `cmd/ingestd/main.go` | port **3213**, `/health` |
| 1.7 | `cmd/queryd/main.go` | port **3214**, `/health` |

**Port shift note:** the original SPEC said 3100/3201–3204; D1 runtime caught a collision — the live Rust lakehouse owns 3100. Go dev ports shifted to 3110+ so both systems can run concurrently during the migration. G5 cutover flips the gateway back to 3100 when Rust retires.

| # | File | What |
|---|---|---|
| 1.8 | `lakehouse.toml` | bind addresses, log level — sample committed |
| 1.9 | `Makefile` | `build`, `run-gateway`, etc. — convenience |
| 1.10 | `cmd/gateway/main.go` | adds STUB routes `POST /v1/ingest` and `POST /v1/sql` returning `501 Not Implemented` with a header `X-Lakehouse-Stub: g0`. Real reverse-proxy wiring lands on Day 6, but the routes exist from D1 so D6 is just a behavior change, not new endpoints. Verify: `curl -X POST :3110/v1/ingest` returns 501 with the stub header |

**Acceptance D1:** `go mod tidy` populates `go.sum` cleanly; `go build ./cmd/...` exits 0; running each binary in a separate terminal, `curl :3110/health` and `:3211/health` through `:3214/health` all return `200 OK` with the expected JSON; gateway's stub `/v1/*` routes return 501.

**Dependencies pulled:** `go-chi/chi/v5`, `pelletier/go-toml/v2`.

---

## Day 2 — storaged: S3 GET/PUT/LIST

**Goal:** put a file, get it back, list it.
| # | File | What |
|---|---|---|
| 2.1 | `internal/storaged/bucket.go` | `aws-sdk-go-v2/service/s3` wrapper — `Get`, `Put`, `List`, `Delete` |
| 2.2 | `internal/storaged/registry.go` | `BucketRegistry` skeleton (per Rust ADR-017) — single bucket only in G0; multi-bucket lands in G2 |
| 2.3 | `internal/secrets/provider.go` | `SecretsProvider` interface + `FileSecretsProvider` reading `/etc/lakehouse/secrets.toml` |
| 2.4 | `cmd/storaged/main.go` | wire routes — `GET /storage/get/{key}`, `PUT /storage/put/{key}`, `GET /storage/list?prefix=...`. Bind to `127.0.0.1:3211` only (G0 is dev-only, no auth). Apply `http.MaxBytesReader` with a **256 MiB per-request cap** (reduced from 2 GiB per Qwen3-coder Q1 — 2 GiB × N concurrent would blow the box) + a **buffered semaphore** capping concurrent in-flight PUTs at 4. PUTs exceeding the cap → 413; PUTs blocked on the semaphore → 503 with `Retry-After: 5` |

**Acceptance D2:** `curl -T sample.csv 127.0.0.1:3211/storage/put/test/sample.csv` returns 200; `curl 127.0.0.1:3211/storage/get/test/sample.csv` echoes the file bytes; `curl '127.0.0.1:3211/storage/list?prefix=test/'` lists `sample.csv`; a PUT exceeding 256 MiB returns `413 Payload Too Large`.

**Open question:** error journal (Rust ADR-018 append-log pattern) — defer to G2 with multi-bucket federation, or wire it now? Plan says defer; revisit if errors surface during D3–D5.

---

## Day 3 — catalogd: Parquet manifests

**Goal:** register a dataset, persist to storaged, restart, manifest still visible.

| # | File | What |
|---|---|---|
| 3.1 | `internal/catalogd/manifest.go` | Parquet read/write using `arrow-go/v18/parquet/pqarrow`. Schema: `dataset_id`, `name`, `schema_fingerprint`, `objects`, `created_at`, `updated_at`, `row_count` |
| 3.2 | `internal/catalogd/registry.go` | in-memory index (`map[name]Manifest`), rehydrated on startup from `primary://_catalog/manifests/*.parquet` |
| 3.3 | `cmd/catalogd/main.go` | wire routes — `POST /catalog/register` (idempotent by name + fingerprint per Rust ADR-020), `GET /catalog/manifest/{name}`, `GET /catalog/list` |
| 3.4 | `internal/catalogd/store_client.go` | thin HTTP client to `cmd/storaged` — round-trips manifest Parquets |

**Acceptance D3:** register a dataset, see it in `/catalog/list`, restart catalogd, `/catalog/list` still shows it. Re-register same name + same fingerprint → 200, same `dataset_id`. Different fingerprint → 409 Conflict.

---

## Day 4 — ingestd: CSV → Parquet → catalog

**Goal:** `POST /ingest` with a CSV file produces a Parquet in storaged + a manifest in catalogd.

| # | File | What |
|---|---|---|
| 4.1 | `internal/ingestd/schema.go` | infer Arrow schema from CSV header + first-N-row sampling. ADR-010 default-to-string on ambiguity |
| 4.2 | `internal/ingestd/csv.go` | stream CSV → `array.RecordBatch` → Parquet (arrow-go pqarrow writer) |
| 4.3 | `cmd/ingestd/main.go` | route `POST /ingest` — multipart form file → schema infer → write Parquet → call catalogd to register |

**Acceptance D4:** `curl -F file=@workers_500k.csv ':3213/ingest?name=workers_500k'` returns 200 with the registered manifest; `aws s3 ls` (or `mc ls`) shows the Parquet under `primary://datasets/workers_500k/`; `curl :3212/catalog/manifest/workers_500k` returns the manifest with `row_count=500000`.

---

## Day 5 — queryd: DuckDB SELECT

**Goal:** SQL queries over Parquet datasets.

| # | File | What |
|---|---|---|
| 5.1 | `internal/queryd/db.go` | `database/sql` connection to `github.com/duckdb/duckdb-go/v2` (cgo). Ensures the DuckDB Parquet + httpfs extensions are loaded; on connection open, executes `CREATE SECRET` (TYPE S3) populated from `internal/secrets/provider.go` so `read_parquet('s3://...')` against MinIO authenticates per session |
| 5.2 | `internal/queryd/registrar.go` | reads catalogd `/catalog/list`, registers each dataset as a DuckDB view: `CREATE VIEW workers_500k AS SELECT * FROM read_parquet('s3://...')` |
| 5.3 | `cmd/queryd/main.go` | route `POST /sql` (JSON body `{"sql": "..."}`). View refresh strategy: cache views with a TTL (default 30s) + invalidate on `If-None-Match` against catalogd's manifest etag. **Don't** re-CREATE on every request — Opus review flagged that as the perf cliff during D6 timing capture |

**Acceptance D5:** after Day 4 ingestion, `curl -X POST -d '{"sql":"SELECT count(*) FROM workers_500k"}' :3214/sql` returns `[{"count_star()":500000}]`. A `SELECT role, count(*) FROM workers_500k WHERE state='IL' GROUP BY role` returns expected rows.

---

## Day 6 — Gate day: end-to-end via gateway

**Goal:** the closing G0 acceptance gate passes.

| # | What |
|---|---|
| 6.1 | Promote `cmd/gateway/main.go` `/v1/ingest` + `/v1/sql` from D1.10 stubs (501) to real reverse-proxies. `httputil.NewSingleHostReverseProxy` preserves the inbound path by default, so the proxy must use a custom `Director` (or `Rewrite`) that strips the `/v1` prefix before forwarding — otherwise the call lands on `ingestd:3213/v1/ingest`, which doesn't exist (ingestd serves `/ingest`, queryd serves `/sql`). Multipart forwarding for `/v1/ingest` is the riskiest hop — verify form parts pass through with the file body intact |
| 6.2 | Smoke script `scripts/g0_smoke.sh`: spin up MinIO + 5 services, ingest, query, assert row count |
| 6.3 | Run the smoke against `workers_500k.csv` end-to-end |
| 6.4 | Capture timing — total ingest + query latency, file size, peak memory |

**Closing G0 acceptance:** `scripts/g0_smoke.sh` exits 0.
Numbers recorded in `docs/G0_BASELINE.md` for future regression comparison.

---

## Day 7 — Cleanup + retro

| # | What |
|---|---|
| 7.1 | Update SPEC §4 G0 with what actually shipped vs planned (deviations, surprises) |
| 7.2 | Write `docs/G0_BASELINE.md` — measured perf numbers + comparison hooks for G1+ |
| 7.3 | **Finalize** ADRs that were stubbed *before* their decisions landed — ADR stubs go in at the start of D4 (arrow-go version pin, schema inference policy) and D5 (DuckDB extension load order, S3 secret provisioning, view-refresh TTL) so reviewers can object in-flight; D7 just commits them after running real code against the calls |
| 7.4 | Tag the commit `phase-g0-complete` |
| 7.5 | Open follow-up issues for anything punted (error journal, multi-bucket, profile system, two-phase-write orphan GC, shared-server.go refactor for cgo-handle services) |

---

## Risks tracked across the week

| Risk | Where | Mitigation |
|---|---|---|
| cgo build fails on the dev box | D5 | D0.3 verifies `gcc` present; if cgo specifically breaks, fall back to DuckDB external process (SPEC §3.1 option B) |
| arrow-go pqarrow schema mismatch with CSV inference | D4 | sample 1k rows for type inference, default to string per ADR-010, log when defaulting |
| DuckDB can't read S3 Parquet directly | D5 | load the `httpfs` extension explicitly; if it fails, copy the Parquet to a local temp file before query (slow but correct) |
| `/catalog/register` race between ingestd writer and catalogd reader | D3–D4 | same write-lock-across-storage-write pattern as Rust ADR-020 — serialize registers; throughput is OK at low ingest QPS |
| `workers_500k.csv` schema drifts vs the Rust era | D4 | plan calls for inferring fresh, not porting the Rust schema. If staffer-domain features break in G3+, revisit |

---

## Out of scope for G0 (deferred to later phases)

- Vector indexing — Phase G1
- Multi-bucket / federation — Phase G2
- Profile system — Phase G2
- Hot-swap atomicity — Phase G2
- Pathway memory — Phase G3
- Distillation pipeline — Phase G3
- MCP server / observer / auditor — Phase G4
- HTMX UI — Phase G5
- TLS, auth — explicit non-goal until G2 (single-bucket no-auth dev)

---

## Open questions before Day 1

1. **MinIO instance** — reuse the existing one at `localhost:9000` that lakehouse uses (shared dev box), or stand up a fresh one with a separate bucket prefix?
2. **`/etc/lakehouse/secrets.toml`** — share the lakehouse repo's secrets file, or create `/etc/golangLAKEHOUSE/secrets.toml`?
3. **Workers CSV source** — derive from `workers_500k.parquet` (round-trip back to CSV), or use `workers_500k_v9.csv` if it exists?

These are ops calls, not architecture. Answer when D0 is being executed.

---

## Self-review — independent pass via gateway overseer

Reviewer: `opencode/claude-opus-4-7` via `localhost:3100/v1/chat` (the same path the production overseer correction loop uses post-G0 in the Rust era). Run on the original draft before any of the inline fixes above were applied. Findings dispositioned below.
### BLOCK — both real, both fixed inline

| # | Finding | Disposition | Fix location |
|---|---|---|---|
| B1 | `apt install build-essential` alone won't satisfy the cgo link step for `duckdb-go/v2` | **Fixed** — D0.6 now runs a cgo smoke on D0 to flush platform issues here, not on D5 (initially as a `go install`; later corrected to compile-and-run by K1) | D0.6 |
| B2 | DuckDB session needs S3 credentials (`CREATE SECRET`) plumbed from SecretsProvider; "load httpfs" alone leaves auth unwired | **Fixed** — D5.1 now calls `CREATE SECRET (TYPE S3, ...)` on connection open, populated from `internal/secrets/provider.go` | D5.1 |

### WARN — 4 of 5 fixed inline; 1 deferred

| # | Finding | Disposition | Fix location |
|---|---|---|---|
| W1 | Two-phase write (storaged → catalogd register) leaves orphan Parquets on partial failure; no GC story | **Deferred** — punted to G2 alongside multi-bucket + error journal; tracked in §Risks and D7.5 follow-up | D7.5 |
| W2 | "Refresh views on each `/sql` call" will be the D6 perf cliff | **Fixed** — D5.3 now uses TTL-cached views with etag invalidation against catalogd | D5.3 |
| W3 | Shared `internal/shared/server.go` factory across heterogeneous binaries (HTTP ingress vs cgo-DB-holder) couples graceful-shutdown semantics that will need unwinding later | **Accepted with note** — G0 keeps the simple shared factory; refactor explicitly listed as a G1+ follow-up | D7.5 |
| W4 | storaged PUT/GET on a TCP port with no auth + no body cap is a footgun | **Fixed** — D2.4 now binds `127.0.0.1` only and applies a `MaxBytesReader` body cap (initially 2 GiB; later tightened to 256 MiB by Q1) | D2.4 |
| W5 | Gateway reverse-proxy introduced cold on D6 gate day compresses risk into the deadline | **Fixed** — D1.10 now stubs the routes returning 501; D6.1 just promotes them to real proxies | D1.10 + D6.1 |

### INFO — both fixed inline

| # | Finding | Disposition | Fix location |
|---|---|---|---|
| I1 | `go mod tidy` before any imports is a trivially-true verification | **Fixed** — D0.6 re-purposed for the cgo smoke; tidy verification moved to D1 acceptance | D0.6 + D1 acceptance |
| I2 | Filing ADRs *after* the work is done inverts the usual pattern | **Fixed** — D7.3 reframed: ADR stubs go in at the start of D4/D5 so reviewers can object in-flight; D7.3 just finalizes them | D7.3 |

### Net change (Opus pass)

7 of 9 findings produced inline plan edits; 2 deferred to post-G0 follow-up issues (W1 orphan GC, W3 shared-server refactor) with the deferral itself documented. No findings dismissed as confabulation.

---

## Self-review — second pass via Kimi K2.6 (cross-lineage)

Reviewer: `opencode/kimi-k2.6` via the same gateway path. Run on the post-Opus-fix doc to surface what a different model lineage catches. Per the Rust-era cross-lineage rotation pattern, Kimi tends to ground on textual specifics where Opus surfaces architectural shape — both lenses are useful.

Kimi's output was discursive (pre-format thinking rather than the requested BLOCK/WARN/INFO bullets), but two concrete catches landed — both BLOCKs that Opus missed, and both introduced by the Opus-pass fixes themselves.

### BLOCK — both real, both fixed inline

| # | Finding | Disposition | Fix location |
|---|---|---|---|
| K1 | Opus's B1 fix used `go install github.com/duckdb/duckdb-go/v2@latest` to verify cgo, but `go install pkg@version` requires a `main` package — duckdb-go/v2 is a library, so the command fails with "not a main package" *before* the cgo linker chain is exercised. The check never actually verifies cgo, so a broken-cgo box slips through. | **Fixed** — D0.6 now creates a temporary module + a 5-line `main.go` that imports duckdb-go/v2 and calls `sql.Open("duckdb","")`, then `go run`s it — actually compiles + executes with cgo | D0.6 |
| K2 | Path mismatch: D1.10 stubs `/v1/ingest` and `/v1/sql` on the gateway, but D4 has ingestd serving `/ingest` and D5 has queryd serving `/sql`. D6.1's `httputil.NewSingleHostReverseProxy` preserves the inbound path by default, so the proxy would forward to `ingestd:3203/v1/ingest`, which doesn't exist. Smoke would fail on D6 with a 404 on the backend, not the gateway. | **Fixed** — D6.1 now specifies a custom `Director` that strips the `/v1` prefix before forwarding | D6.1 |

### WARN — none extracted (Kimi response truncated at this section)

The remaining response stream considered D2.4 `MaxBytesReader` semantics and D3 manifest-registry concurrency, but cut off before producing structured findings. A third pass could be run if more lineage coverage is wanted — for now, two BLOCKs fixed is the cross-lineage delta.

### Net change (Kimi pass)

2 BLOCKs landed, both fixed inline. Both were *introduced by Opus-pass fixes* — illustrating exactly why cross-lineage rotation matters: one model's review of the original is not the same as a different model's review of the post-fix version. The Rust auditor's Kimi/Haiku/Opus rotation captures this dynamic; today's two-pass doc review reproduces it on a much smaller scale.

---

## Self-review — third pass via Qwen3-coder:480b (cross-lineage)

Reviewer: `qwen3-coder:480b` via `ollama_cloud` (Ollama Pro). Largest single-model open-weights coder in the fleet. Run on the post-Kimi-fix doc.

### Output style note

Qwen3-coder's response was 80% **approval of prior fixes** (citing B1/B2/K1/K2/I2 as "NEW BLOCKs/INFOs" — actually a misuse of the NEW label, but useful as triangulation: three independent lineages now agree the prior fixes are correct). Only 2 genuinely new WARNs.
### WARN — 1 fixed inline, 1 deferred

| # | Finding | Disposition | Fix location |
|---|---|---|---|
| Q1 | D2.4 `MaxBytesReader` at 2 GiB still permits memory exhaustion under concurrent uploads — N concurrent × 2 GiB blows the box | **Fixed** — D2.4 cap reduced to 256 MiB per request, plus a 4-slot semaphore on concurrent in-flight PUTs (503 + `Retry-After` when full) | D2.4 |
| Q2 | D5.3 view refresh has TTL + etag invalidation but no batching, so bursty queries against many datasets can re-issue catalog reads redundantly | **Deferred** — minor under G0's single-tenant load; revisit in G2 alongside the multi-bucket / profile work that creates more catalog churn | D7.5 |

### Net change (Qwen pass)

1 fixed inline, 1 deferred. The mislabeled "approval" output was net positive — it's the closest thing to a triangulation signal we get without a fourth model. With Opus + Kimi + Qwen all confirming the prior fixes hold, the plan is unusually well cross-checked for a greenfield kickoff.

### Cumulative disposition across all four passes

| Pass | Reviewer | New findings | Fixed | Deferred |
|---|---|---|---|---|
| 1 | `opencode/claude-opus-4-7` | 9 | 7 | 2 |
| 2 | `opencode/kimi-k2.6` | 2 | 2 | 0 |
| 3 | `qwen3-coder:480b` (`ollama_cloud`) | 2 | 1 | 1 |
| 4 | **runtime smoke** (D1 actual launch) | 1 | 1 | 0 |
| **Total** | | **14** | **11** | **3** |

The 4th pass — runtime smoke — caught the only thing three doc-review lineages couldn't: port 3100 was already occupied by the live Rust lakehouse. Documented and dispositioned in the same pattern as the model findings. Three deferrals, all to G2: orphan GC on two-phase write (W1), shared-server refactor for cgo-handle services (W3), batched view refresh (Q2). Tracked in D7.5.

---

## D1 — actual run results (2026-04-28)

Phase G0 Day 1 executed end-to-end. Output of `scripts/d1_smoke.sh`:

```
[d1-smoke] building...
[d1-smoke] launching...
[d1-smoke] /health probes:
  ✓ gateway  (:3110) → {"status":"ok","service":"gateway"}
  ✓ storaged (:3211) → {"status":"ok","service":"storaged"}
  ✓ catalogd (:3212) → {"status":"ok","service":"catalogd"}
  ✓ ingestd  (:3213) → {"status":"ok","service":"ingestd"}
  ✓ queryd   (:3214) → {"status":"ok","service":"queryd"}
[d1-smoke] gateway 501 stub probes:
  ✓ POST /v1/ingest → 501 + X-Lakehouse-Stub: g0
  ✓ POST /v1/sql    → 501 + X-Lakehouse-Stub: g0
[d1-smoke] D1 acceptance gate: PASSED
```

What landed:

- `internal/shared/server.go` — chi factory, slog JSON logging, `/health`, graceful shutdown via `signal.NotifyContext`
- `internal/shared/config.go` — TOML loader with `DefaultConfig()` and env-override-via-flag pattern (`-config` flag)
- `cmd/{gateway,storaged,catalogd,ingestd,queryd}/main.go` — five binaries, each ~30 lines, all using the shared factory
- `lakehouse.toml` — G0 dev config with the shifted 3110+ ports
- `scripts/d1_smoke.sh` — repeatable smoke test, exits 0 on PASS
- `bin/{gateway,storaged,catalogd,ingestd,queryd}` — built binaries (~9.7 MB each, no debug stripping)

What G0 didn't need but I added anyway (intentional):

- Gateway already has the D1.10 stub routes wired; D6.1 is just swap-the-handler.
- TOML config supports an S3 section even though storaged doesn't consume it until D2 — saves a config schema bump on D2.

Next: D2 — storaged's actual S3 GET/PUT/LIST routes against MinIO.