Last day of Phase G0. Gateway promotes the D1 stub endpoints into
real reverse proxies on :3110 fronting storaged + catalogd + ingestd
+ queryd. The /v1 prefix lives at the edge — internal services route
on /storage, /catalog, /ingest, /sql, with the prefix stripped by a
custom Director, per Kimi K2's D1-plan finding.
Routes:
/v1/storage/* → storaged
/v1/catalog/* → catalogd
/v1/ingest → ingestd
/v1/sql → queryd
Acceptance smoke 6/6 PASS — every assertion goes through :3110, none
direct to backing services. Full ingest → storage → catalog → query
round-trip verified end-to-end. The smoke's "rows[0].name=Alice"
assertion is the architectural payoff: five binaries, six HTTP
routes, one round-trip through one edge.
Cross-lineage scrum on shipped code:
- Opus 4.7 (opencode): 1 BLOCK + 2 WARN + 2 INFO
- Kimi K2-0905 (openrouter): 1 BLOCK + 3 WARN + 1 INFO (3 false positives, all from one wrong TrimPrefix theory)
- Qwen3-coder (openrouter): 5 completion tokens — "No BLOCKs."
Fixed (2, both found only by Opus):
O-BLOCK: Director path stripping fails if upstream URL has a
non-empty path. The default Director's singleJoiningSlash runs
BEFORE the custom code, so an upstream like http://host/api
produces /api/v1/storage/... after the join — then TrimPrefix("/v1")
is a no-op because the string starts with /api. Fix: strip /v1
BEFORE calling origDirector. New TestProxy_SubPathUpstream regression
locks this in. Today: bare-host URLs only, dormant — but moving
gateway behind a sub-path in prod would have silently 404'd.
O-WARN2: url.Parse is permissive — a scheme-less typo like
"127.0.0.1:3211" can slip through parsing with an empty Host, after
which every request 502s. Fix: mustParseUpstream now fails fast at
startup with a clear message naming the offending config field.
Dismissed (3, all Kimi, same false TrimPrefix theory):
K-BLOCK "TrimPrefix loops forever on //v1storage" — false, single
check-and-trim, no loop
K-WARN "no upper bound on repeated // removal" — same false theory
K-WARN "goroutines leak if upstream parse fails while binaries
running" — confused scope; binaries are separate OS processes
launched by the smoke script
D1 smoke updated (post-D6): the 501 stub probes are gone (gateway no
longer stubs /v1/ingest and /v1/sql). Replaced with proxy probes that
verify gateway forwards malformed requests to ingestd and queryd. Launch
order changed from parallel to dep-ordered (storaged → catalogd →
ingestd → queryd → gateway) since catalogd's rehydrate now needs
storaged, queryd's initial Refresh needs catalogd.
All six G0 smokes (D1 through D6) PASS end-to-end after every fix
round. Phase G0 substrate is complete: 5 binaries, 6 routes, 25 fixes
applied across 6 days from cross-lineage review.
G1+ next: gRPC adapters, Lance/HNSW vector indices, Go MCP SDK port,
distillation rebuild, observer + Langfuse integration.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
scripts/d6_smoke.sh — 169 lines, 5.8 KiB, Bash, executable:
#!/usr/bin/env bash
# D6 smoke — proves the Day 6 acceptance gate end-to-end.
#
# Validates the gateway as a reverse proxy: every assertion goes
# through :3110 (gateway), NOT directly to the backing services.
#   - /v1/health → gateway's own /health (no proxy)
#   - /v1/ingest?name=X with multipart CSV → ingestd → storaged + catalogd
#   - /v1/sql with SELECT count(*) → queryd
#   - /v1/catalog/list → catalogd
#   - /v1/storage/list → storaged
#   - /v1/<unknown> → 404 (not the gateway's job to mediate; chi rejects)
#
# Requires storaged + catalogd + ingestd + queryd + gateway up.
#
# Usage: ./scripts/d6_smoke.sh

set -euo pipefail
cd "$(dirname "$0")/.."

export PATH="$PATH:/usr/local/go/bin"

echo "[d6-smoke] building all 5 binaries..."
go build -o bin/ ./cmd/storaged ./cmd/catalogd ./cmd/ingestd ./cmd/queryd ./cmd/gateway

# Cleanup any prior processes.
pkill -f "bin/storaged" 2>/dev/null || true
pkill -f "bin/catalogd" 2>/dev/null || true
pkill -f "bin/ingestd" 2>/dev/null || true
pkill -f "bin/queryd" 2>/dev/null || true
pkill -f "bin/gateway" 2>/dev/null || true
sleep 0.3

PIDS=()
TMP="$(mktemp -d)"
cleanup() {
  echo "[d6-smoke] cleanup"
  for p in "${PIDS[@]}"; do
    [ -n "$p" ] && kill "$p" 2>/dev/null || true
  done
  rm -rf "$TMP"
}
trap cleanup EXIT INT TERM

poll_health() {
  local port="$1" deadline=$(($(date +%s) + 5))
  while [ "$(date +%s)" -lt "$deadline" ]; do
    if curl -sS --max-time 1 "http://127.0.0.1:$port/health" >/dev/null 2>&1; then return 0; fi
    sleep 0.05
  done
  return 1
}

echo "[d6-smoke] launching storaged → catalogd → ingestd..."
./bin/storaged > /tmp/storaged.log 2>&1 &
PIDS+=($!)
poll_health 3211 || { echo "storaged failed"; tail -10 /tmp/storaged.log; exit 1; }

# Clean any prior smoke artifacts.
NAME="d6_workers"
for k in $(curl -sS "http://127.0.0.1:3211/storage/list?prefix=_catalog/manifests/" | jq -r '.objects[]?.Key // empty' 2>/dev/null); do
  curl -sS -o /dev/null -X DELETE "http://127.0.0.1:3211/storage/delete/$k" || true
done
for k in $(curl -sS "http://127.0.0.1:3211/storage/list?prefix=datasets/$NAME/" | jq -r '.objects[]?.Key // empty' 2>/dev/null); do
  curl -sS -o /dev/null -X DELETE "http://127.0.0.1:3211/storage/delete/$k" || true
done

./bin/catalogd > /tmp/catalogd.log 2>&1 &
PIDS+=($!)
poll_health 3212 || { echo "catalogd failed"; tail -10 /tmp/catalogd.log; exit 1; }

./bin/ingestd > /tmp/ingestd.log 2>&1 &
PIDS+=($!)
poll_health 3213 || { echo "ingestd failed"; tail -10 /tmp/ingestd.log; exit 1; }

# Build the CSV BEFORE launching queryd so its initial Refresh sees
# the dataset (the same trick D5 uses).
cat > "$TMP/workers.csv" <<'EOF'
id,name,salary,active,weight
1,Alice,50000,true,165.5
2,Bob,60000,false,180.0
3,Carol,55000,true,135.2
EOF

echo "[d6-smoke] launching gateway:"
./bin/gateway > /tmp/gateway.log 2>&1 &
PIDS+=($!)
poll_health 3110 || { echo "gateway failed"; tail -10 /tmp/gateway.log; exit 1; }

FAILED=0

echo "[d6-smoke] /v1/ingest?name=$NAME (gateway → ingestd):"
INGEST="$(curl -sS -X POST -F "file=@$TMP/workers.csv" "http://127.0.0.1:3110/v1/ingest?name=$NAME")"
RC="$(echo "$INGEST" | jq -r '.row_count')"
PARQUET_KEY="$(echo "$INGEST" | jq -r '.parquet_key')"
if [ "$RC" = "3" ] && [ "${PARQUET_KEY#datasets/$NAME/}" != "$PARQUET_KEY" ]; then
  echo "  ✓ ingest row_count=3, content-addressed key"
else
  echo "  ✗ ingest → $INGEST"; FAILED=1
fi

# Now launch queryd (after the dataset is registered).
./bin/queryd > /tmp/queryd.log 2>&1 &
PIDS+=($!)
poll_health 3214 || { echo "queryd failed"; tail -20 /tmp/queryd.log; exit 1; }

echo "[d6-smoke] /v1/catalog/list (gateway → catalogd):"
CATALOG="$(curl -sS http://127.0.0.1:3110/v1/catalog/list)"
COUNT="$(echo "$CATALOG" | jq -r '.count')"
if [ "$COUNT" = "1" ]; then
  echo "  ✓ catalog count=1"
else
  echo "  ✗ catalog → $CATALOG"; FAILED=1
fi

echo "[d6-smoke] /v1/storage/list?prefix=datasets/$NAME/ (gateway → storaged):"
STORAGE="$(curl -sS "http://127.0.0.1:3110/v1/storage/list?prefix=datasets/$NAME/")"
OBJ_COUNT="$(echo "$STORAGE" | jq -r '.objects | length')"
if [ "$OBJ_COUNT" -ge "1" ]; then
  echo "  ✓ storage list returned $OBJ_COUNT object(s) under datasets/$NAME/"
else
  echo "  ✗ storage list → $STORAGE"; FAILED=1
fi

echo "[d6-smoke] /v1/sql SELECT count(*) (gateway → queryd):"
SQL_RESP="$(curl -sS -X POST http://127.0.0.1:3110/v1/sql \
  -H 'Content-Type: application/json' \
  -d "{\"sql\":\"SELECT count(*) FROM \\\"$NAME\\\"\"}")"
N="$(echo "$SQL_RESP" | jq -r '.rows[0][0]')"
if [ "$N" = "3" ]; then
  echo "  ✓ count(*)=3"
else
  echo "  ✗ sql → $SQL_RESP"
  echo "  queryd log:"; tail -15 /tmp/queryd.log
  FAILED=1
fi

echo "[d6-smoke] /v1/sql with row data (full round-trip):"
ROWS_RESP="$(curl -sS -X POST http://127.0.0.1:3110/v1/sql \
  -H 'Content-Type: application/json' \
  -d "{\"sql\":\"SELECT id, name FROM \\\"$NAME\\\" ORDER BY id LIMIT 1\"}")"
ROW0_NAME="$(echo "$ROWS_RESP" | jq -r '.rows[0][1]')"
if [ "$ROW0_NAME" = "Alice" ]; then
  echo "  ✓ rows[0].name=Alice (full ingest → storage → catalog → query through gateway)"
else
  echo "  ✗ rows → $ROWS_RESP"; FAILED=1
fi

echo "[d6-smoke] /v1/unknown → 404:"
HTTP="$(curl -sS -o /dev/null -w '%{http_code}' http://127.0.0.1:3110/v1/unknown)"
if [ "$HTTP" = "404" ]; then
  echo "  ✓ unknown route → 404"
else
  echo "  ✗ unknown route → $HTTP"; FAILED=1
fi

# Cleanup smoke artifacts.
for k in $(curl -sS "http://127.0.0.1:3211/storage/list?prefix=datasets/$NAME/" | jq -r '.objects[]?.Key // empty' 2>/dev/null); do
  curl -sS -o /dev/null -X DELETE "http://127.0.0.1:3211/storage/delete/$k" || true
done
curl -sS -o /dev/null -X DELETE "http://127.0.0.1:3211/storage/delete/_catalog/manifests/$NAME.parquet" || true

if [ "$FAILED" -eq 0 ]; then
  echo "[d6-smoke] D6 acceptance gate: PASSED"
  exit 0
else
  echo "[d6-smoke] D6 acceptance gate: FAILED"
  exit 1
fi