S0.3: just verify + pre-push hook gates the smoke chain

Sprint 0 / R-004 / GATE-0.4 — the 9-smoke chain is no longer
documentation only. One command (`just verify`) runs vet + tests +
all 9 smokes; pre-push hook calls it; a regression cannot leave
this machine without explicit --no-verify override.

Recipes:
  just verify          full gate (33s wall on this box)
  just smoke <day>     single smoke (d1..d6, g1, g1p, g2)
  just smoke-all       all 9 smokes only
  just doctor          dep probe with structured output
                       (--json for CI / pre-push)
  just install-hooks   install .git/hooks/pre-push
  just fmt|vet|test|build|clean

scripts/doctor.sh probes Go ≥1.25, gcc, MinIO at :9000 with bucket
lakehouse-go-primary, Ollama at :11434 with nomic-embed-text loaded,
/etc/lakehouse/secrets-go.toml with [s3.primary]. Each missing dep
prints its install fix command. JSON mode emits the same shape for
CI / pre-push consumers.

README updated with the task-runner section + just install-hooks
on cold-start. Hooks live in .git/hooks/ (untracked); install
recipe recreates them on a fresh clone.

PATH note: justfile prepends /usr/local/go/bin so recipes find Go
without depending on the parent shell's PATH (Go lives there per
ADR-001 §1.x).

Verified: just verify exits 0 in 33s wall (vet ~0.1s + test ~0.1s +
9 smokes, deterministic per audit baseline). Pre-push hook is
installed and passes bash -n.

Closes audit risk R-004 (smokes not gated).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
root 2026-04-29 04:56:50 -05:00
parent 91edd43164
commit e31638204d
3 changed files with 270 additions and 2 deletions


@ -53,20 +53,38 @@ scripts/g1p_smoke.sh # vectord state survives kill+restart via storaged
scripts/g2_smoke.sh # embed → vectord add → search round-trip
```
Run them all in any order:
Or run the full gate via the task runner (see below):
```
for s in scripts/{d1,d2,d3,d4,d5,d6,g1,g1p,g2}_smoke.sh; do "$s" || break; done
just verify # vet + tests + 9 smokes; ~33s wall
```
## Task runner
```
just # show available recipes
just verify # full Sprint 0 gate (vet + tests + 9 smokes)
just smoke <day> # single smoke (d1..d6, g1, g1p, g2)
just doctor # check cold-start deps; --json for CI
just install-hooks # install pre-push hook that runs just verify
```
After a fresh clone, run `just install-hooks` once so `git push` is
gated on the same green chain that ran here. Hook lives in
`.git/hooks/pre-push` (not tracked; recreated by the recipe).
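
Because the hook is untracked, it is easy to forget on a new clone. The sketch below shows one way to confirm it is in place. The helper name `hook_status` is hypothetical and not part of this repo; it only checks that the file exists and is executable, not that its contents match what `just install-hooks` writes.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repo): report whether a pre-push hook
# is installed and executable under the given repository root.
hook_status() {
  local root="${1:-.}"
  local hook="$root/.git/hooks/pre-push"
  if [ ! -f "$hook" ]; then
    echo "missing"          # run `just install-hooks`
  elif [ ! -x "$hook" ]; then
    echo "not-executable"   # git skips hooks without the execute bit
  else
    echo "installed"
  fi
}
```

A typical call would be `hook_status "$(git rev-parse --show-toplevel)"`.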
## Cold-start dependencies
- Go 1.25+ at `/usr/local/go/bin` (arrow-go pulled the 1.25 floor)
- `gcc` + `libc-dev` for the DuckDB cgo binding (ADR-001 §1.1)
- `just` task runner (`apt install just` on Debian 13+)
- MinIO running on `:9000` with bucket `lakehouse-go-primary`
- Ollama running on `:11434` with `nomic-embed-text` loaded (G2)
- `/etc/lakehouse/secrets-go.toml` with `[s3.primary]` credentials
(storaged + queryd both read this)
`just doctor` probes all of the above and reports the fix command
for each missing dep. CI / scripts can use `just doctor --json`.
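
For scripts that consume the JSON output, a minimal sketch of listing the deps that need attention follows. The function name `failing_deps` is hypothetical. It assumes the layout `scripts/doctor.sh` prints (one dep object per line, with `name` immediately followed by `status`) and avoids a JSON parser on purpose; a real consumer may prefer a proper parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repo): read doctor's JSON on stdin and
# print "name: status" for every dep whose status is not "ok".
# Assumes one dep object per line, as scripts/doctor.sh emits them.
failing_deps() {
  local line
  local re='"name":"([^"]*)","status":"([^"]*)"'
  while IFS= read -r line; do
    if [[ $line =~ $re ]] && [ "${BASH_REMATCH[2]}" != "ok" ]; then
      echo "${BASH_REMATCH[1]}: ${BASH_REMATCH[2]}"
    fi
  done
}
```

Usage would look like `just doctor --json | failing_deps`; empty output means every probed dep reported `ok`.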
## Layout
```

justfile Normal file

@ -0,0 +1,103 @@
# golangLAKEHOUSE — task runner.
#
# Sprint 0 acceptance gate (R-004): smokes are no longer documentation
# only — `just verify` is the single command that runs vet + tests +
# the 9 smokes. The pre-push hook calls this; CI calls this; reviewers
# call this. One source of truth.
#
# Usage:
# just # alias for `just --list`
# just verify # vet + test + all 9 smokes (full gate)
# just smoke <day> # single smoke (d1..d6, g1, g1p, g2)
# just smoke-all # all 9 smokes only
# just doctor # dependency probe
# just fmt / vet / test / build

# Go lives at /usr/local/go/bin per ADR-001 §1.x; prepend so every
# recipe sees it without depending on the parent shell's PATH.
export PATH := "/usr/local/go/bin:" + env('PATH', '')

# Default recipe shows the menu so `just` alone is a discoverable entry point.
default:
    @just --list

# Full Sprint 0 gate: vet + tests + 9 smokes. Pre-push hook calls this.
verify: vet test smoke-all
    @echo ""
    @echo "[verify] PASS: go vet + go test + 9 smokes all green"

# Static analysis. Runs first so we fail fast on syntax / shape issues.
vet:
    @echo "[vet] go vet ./..."
    @go vet ./...

# Go unit tests, short mode. Excludes hardware-in-the-loop tags.
test:
    @echo "[test] go test -short -count=1 ./..."
    @go test -short -count=1 ./...

# Format Go source. Idempotent; CI can check without writing via `just fmt-check`.
fmt:
    @gofmt -w cmd internal scripts

# Verify formatting without modifying. Non-zero exit means run `just fmt`.
# Uses `gofmt -l` instead of process substitution, which the default `sh`
# recipe shell does not support when sh is dash.
fmt-check:
    @unformatted="$(gofmt -l cmd internal scripts)" && if [ -n "$unformatted" ]; then echo "$unformatted"; exit 1; fi

# Build every binary into bin/. Mirrors what each smoke does internally.
build:
    @echo "[build] go build -o bin/ ./cmd/..."
    @go build -o bin/ ./cmd/...

# Single smoke. Day is the prefix before _smoke.sh: d1, d2, …, g2.
smoke day:
    @bash scripts/{{day}}_smoke.sh

# All 9 smokes in dependency order. Halts on first failure.
smoke-all:
    #!/usr/bin/env bash
    set -euo pipefail
    for day in d1 d2 d3 d4 d5 d6 g1 g1p g2; do
        printf "[smoke-all] %s ... " "$day"
        SECONDS=0
        if bash "scripts/${day}_smoke.sh" >"/tmp/smoke_${day}.log" 2>&1; then
            printf "PASS (%ss)\n" "$SECONDS"
        else
            printf "FAIL (%ss)\n" "$SECONDS"
            echo ""
            echo " last 20 lines of /tmp/smoke_${day}.log:"
            tail -20 "/tmp/smoke_${day}.log" | sed 's/^/ /'
            exit 1
        fi
    done

# Dependency probe. Add --json for machine-readable output.
doctor *args:
    @bash scripts/doctor.sh {{args}}

# Install pre-push hook so `git push` runs `just verify` first.
install-hooks:
    #!/usr/bin/env bash
    set -euo pipefail
    HOOK=".git/hooks/pre-push"
    cat > "$HOOK" <<'HOOK'
    #!/usr/bin/env bash
    # golangLAKEHOUSE pre-push hook (managed by `just install-hooks`).
    # Runs the Sprint 0 gate before letting commits leave this machine.
    set -e
    cd "$(git rev-parse --show-toplevel)"
    echo "[pre-push] running just verify ..."
    if ! just verify; then
        echo ""
        echo "[pre-push] FAIL: push aborted. Fix the gate or use --no-verify (NOT recommended)."
        exit 1
    fi
    HOOK
    chmod +x "$HOOK"
    echo "[install-hooks] $HOOK installed and executable"

# Clean built binaries + smoke logs. Does NOT touch reports/ or data/.
clean:
    @rm -rf bin/
    @rm -f /tmp/smoke_*.log
    @echo "[clean] bin/ removed, smoke logs cleared"

scripts/doctor.sh Executable file

@ -0,0 +1,147 @@
#!/usr/bin/env bash
# Dependency probe for golangLAKEHOUSE.
# Sprint 0 / S0.1 — surfaces every cold-start dep as a structured
# checklist. With --json, emits machine-readable shape for CI.
#
# Exit 0 = all green. Exit 1 = at least one missing dep.
set -uo pipefail

# Mode: text (default) or json
JSON=0
for arg in "$@"; do
  case "$arg" in
    --json) JSON=1 ;;
    -h|--help)
      echo "Usage: $0 [--json]"
      echo " Probes Go, gcc, MinIO, Ollama, secrets-go.toml."
      echo " Default output is human-readable; --json emits structured findings."
      exit 0 ;;
  esac
done

# Findings accumulator. Each entry: <name>|<status>|<detail>|<fix>
# status ∈ {ok, missing, wrong-version, unreachable}
findings=()
probe() {
  findings+=("$1|$2|$3|$4")
}

# 1. Go ≥1.25 (arrow-go pulled the floor up; see ADR-001 §1.x)
if go_path="$(command -v go 2>/dev/null)"; then
  go_ver="$(go version 2>/dev/null | awk '{print $3}' | sed 's/^go//')"
  # Version-aware compare (sort -V, GNU coreutils): 1.25 must sort first
  # or equal, so releases beyond 1.27 are not flagged as wrong-version.
  if [ "$(printf '%s\n' "1.25" "$go_ver" | sort -V | head -n 1)" = "1.25" ]; then
    probe "go" "ok" "$go_ver at $go_path" ""
  else
    probe "go" "wrong-version" "$go_ver at $go_path (need ≥1.25)" \
      "curl -L https://go.dev/dl/go1.25.0.linux-amd64.tar.gz | sudo tar -C /usr/local -xz"
  fi
else
  probe "go" "missing" "not in PATH" \
    "curl -L https://go.dev/dl/go1.25.0.linux-amd64.tar.gz | sudo tar -C /usr/local -xz && export PATH=\$PATH:/usr/local/go/bin"
fi

# 2. gcc (DuckDB cgo binding per ADR-001 §1.1)
if gcc_path="$(command -v gcc 2>/dev/null)"; then
  gcc_ver="$(gcc --version 2>/dev/null | head -1 | awk '{print $NF}')"
  probe "gcc" "ok" "$gcc_ver at $gcc_path" ""
else
  probe "gcc" "missing" "not in PATH" "sudo apt install -y build-essential"
fi

# 3. MinIO at :9000 with bucket lakehouse-go-primary
if curl -sf --max-time 2 http://localhost:9000/minio/health/live >/dev/null 2>&1; then
  # Bucket existence: use mc if available, else fall back to noting it.
  if command -v mc >/dev/null 2>&1; then
    if mc ls local/lakehouse-go-primary >/dev/null 2>&1; then
      probe "minio" "ok" "live at :9000, bucket lakehouse-go-primary present" ""
    else
      probe "minio" "missing" "live at :9000 but bucket lakehouse-go-primary absent" \
        "mc mb local/lakehouse-go-primary"
    fi
  else
    probe "minio" "ok" "live at :9000 (bucket presence not verified; install mc to check)" ""
  fi
else
  probe "minio" "unreachable" "no /minio/health/live response on :9000" \
    "sudo systemctl start minio # or restart"
fi

# 4. Ollama at :11434 with nomic-embed-text loaded (G2 default model)
if ollama_resp="$(curl -sf --max-time 3 http://localhost:11434/api/tags 2>/dev/null)"; then
  if echo "$ollama_resp" | grep -q '"name":"nomic-embed-text:latest"'; then
    probe "ollama" "ok" "live at :11434, nomic-embed-text loaded" ""
  else
    probe "ollama" "missing" "live at :11434 but nomic-embed-text not loaded" \
      "ollama pull nomic-embed-text"
  fi
else
  probe "ollama" "unreachable" "no /api/tags response on :11434" \
    "sudo systemctl start ollama"
fi

# 5. /etc/lakehouse/secrets-go.toml
if [ -f /etc/lakehouse/secrets-go.toml ]; then
  if [ -r /etc/lakehouse/secrets-go.toml ]; then
    # -F matches the section header literally (no regex metacharacters).
    if grep -qF '[s3.primary]' /etc/lakehouse/secrets-go.toml 2>/dev/null; then
      probe "secrets" "ok" "/etc/lakehouse/secrets-go.toml present, contains [s3.primary]" ""
    else
      probe "secrets" "missing" "/etc/lakehouse/secrets-go.toml missing [s3.primary] section" \
        "edit /etc/lakehouse/secrets-go.toml to add [s3.primary] with access_key_id + secret_access_key"
    fi
  else
    probe "secrets" "missing" "/etc/lakehouse/secrets-go.toml exists but unreadable by current user" \
      "sudo chmod 0644 /etc/lakehouse/secrets-go.toml # or run as the user that can read it"
  fi
else
  probe "secrets" "missing" "/etc/lakehouse/secrets-go.toml not present" \
    "sudo install -m 0644 /dev/stdin /etc/lakehouse/secrets-go.toml < secrets-go.toml.example"
fi

# Summarize
exit_code=0
for f in "${findings[@]}"; do
  case "$(echo "$f" | cut -d'|' -f2)" in
    ok) ;;
    *) exit_code=1 ;;
  esac
done

# Escape backslashes first, then double quotes, for embedding in JSON strings.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

if [ "$JSON" -eq 1 ]; then
  printf '{\n'
  printf ' "deps": [\n'
  last=$((${#findings[@]} - 1))
  for i in "${!findings[@]}"; do
    # The fix field may itself contain '|'; read assigns the remainder to it.
    IFS='|' read -r name status detail fix <<< "${findings[$i]}"
    printf ' {"name":"%s","status":"%s","detail":"%s","fix":"%s"}' \
      "$name" "$status" "$(json_escape "$detail")" "$(json_escape "$fix")"
    [ "$i" -lt "$last" ] && printf ','
    printf '\n'
  done
  printf ' ],\n'
  printf ' "ok": %s\n' "$([ "$exit_code" -eq 0 ] && echo true || echo false)"
  printf '}\n'
else
  echo "[doctor] dependency probe:"
  for f in "${findings[@]}"; do
    IFS='|' read -r name status detail fix <<< "$f"
    case "$status" in
      ok) printf " ✓ %-7s %s\n" "$name" "$detail" ;;
      missing) printf " ✗ %-7s %s\n" "$name" "$detail"
        [ -n "$fix" ] && printf " fix: %s\n" "$fix" ;;
      wrong-version) printf " ⚠ %-7s %s\n" "$name" "$detail"
        [ -n "$fix" ] && printf " fix: %s\n" "$fix" ;;
      unreachable) printf " ✗ %-7s %s\n" "$name" "$detail"
        [ -n "$fix" ] && printf " fix: %s\n" "$fix" ;;
    esac
  done
  if [ "$exit_code" -eq 0 ]; then
    echo "[doctor] all dependencies green"
  else
    echo "[doctor] one or more dependencies need attention"
  fi
fi

exit "$exit_code"