golangLAKEHOUSE/justfile
root fb08232f58 Batch 4: embed fixture-mode — partial R-006 closure
Adds cmd/fake_ollama, a minimal Ollama-API-compatible fake that
implements just enough surface for embedd to drive end-to-end
without a real Ollama install:

  GET  /api/tags        — fixed model list including nomic-embed-text
  POST /api/embeddings  — deterministic dim-D vector from sha256(prompt)
  GET  /health          — for the smoke's poll_health helper

Same prompt → bit-identical vector across runs, machines, and CI
nodes. Vectors are NOT semantically meaningful; the fake validates
the embed CONTRACT (dimension echo, response shape, status codes,
deterministic round-trip), not real semantic ranking. Real ranking
still requires real Ollama and lives in scripts/g2_smoke.sh + the
integration tier of the proof harness.
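
A minimal sketch of how a dim-D vector can be derived from
sha256(prompt), assuming a simple scheme that re-hashes the digest
whenever more bytes are needed. The names fakeEmbedding and sameVector
are hypothetical, and the expansion used by cmd/fake_ollama may differ:

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"math"
)

// fakeEmbedding expands sha256(prompt) into dim floats in [-1, 1].
// Hypothetical helper: the real fake may expand the digest differently.
func fakeEmbedding(prompt string, dim int) []float32 {
	vec := make([]float32, dim)
	block := sha256.Sum256([]byte(prompt))
	for i := range vec {
		off := (i % 8) * 4 // eight 4-byte words per 32-byte digest
		if i > 0 && off == 0 {
			// Previous digest is used up: hash it again for 32 fresh bytes.
			block = sha256.Sum256(block[:])
		}
		word := binary.BigEndian.Uint32(block[off : off+4])
		vec[i] = float32(float64(word)/float64(math.MaxUint32)*2 - 1)
	}
	return vec
}

// sameVector reports whether two vectors match bit for bit.
func sameVector(a, b []float32) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if math.Float32bits(a[i]) != math.Float32bits(b[i]) {
			return false
		}
	}
	return true
}

func main() {
	a := fakeEmbedding("hello", 768)
	b := fakeEmbedding("hello", 768)
	c := fakeEmbedding("goodbye", 768)
	fmt.Println(len(a))           // 768
	fmt.Println(sameVector(a, b)) // true
	fmt.Println(sameVector(a, c)) // false
}
```

Because the output depends only on the prompt bytes and dim, the same
text maps to the same vector on every run, which is all the contract
smoke needs. Nearby prompts land on unrelated vectors, so nothing here
says anything about ranking quality.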

scripts/g2_smoke_fixtures.sh — full chain smoke against the fake:
  - Build fake_ollama + embedd + vectord + gateway
  - Start fake on :11435 (distinct from real Ollama at :11434)
  - Generate temp lakehouse.toml with provider_url override
  - Boot embedd/vectord/gateway with --config <override>
  - 4 assertions: dim=768, deterministic same-text, different-text
    divergence, bad-model → 4xx/5xx (fake 404 → embedd 502)
  - Trap-cleanup tears down all 4 binaries + tmp config

Wired into the task runner:
  just smoke-g2-fixtures

Closes R-006 partially:
  - Embed half: ✓ — CI / fresh-clone reviewers without Ollama can
    now run the embed contract smoke
  - Storage half: deferred — mocking S3 protocol is non-trivial
    (multipart, signed URLs, etc.) and MinIO itself is lightweight
    enough to install via Docker in any CI environment. Documented
    as Sprint 0 follow-up if a CI system without Docker shows up.

What this DOESN'T cover:
  - Real semantic similarity (use scripts/g2_smoke.sh + real Ollama)
  - Real Ollama API quirks (timeouts, version-specific shapes,
    /api/embed batch endpoint that newer versions support)

Verified:
  bash scripts/g2_smoke_fixtures.sh — 4/4 assertions PASS, ~3s wall
  just verify                       — vet + test + 9 smokes still green

Doesn't replace the existing g2_smoke.sh (which still requires real
Ollama and exercises the actual embed semantics). Adds an alternate
mode for portability.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 06:22:07 -05:00


# golangLAKEHOUSE — task runner.
#
# Sprint 0 acceptance gate (R-004): smokes are no longer documentation
# only — `just verify` is the single command that runs vet + tests +
# the 9 smokes. The pre-push hook calls this; CI calls this; reviewers
# call this. One source of truth.
#
# Usage:
#   just                  # alias for `just --list`
#   just verify           # vet + test + all 9 smokes (full gate)
#   just smoke <day>      # single smoke (d1..d6, g1, g1p, g2)
#   just smoke-all        # all 9 smokes only
#   just doctor           # dependency probe
#   just fmt / vet / test / build

# Go lives at /usr/local/go/bin per ADR-001 §1.x; prepend so every
# recipe sees it without depending on the parent shell's PATH.
export PATH := "/usr/local/go/bin:" + env('PATH', '')

# Default recipe shows the menu so `just` alone is a discoverable entry point.
default:
    @just --list

# Full Sprint 0 gate: vet + tests + 9 smokes. Pre-push hook calls this.
verify: vet test smoke-all
    @echo ""
    @echo "[verify] PASS — go vet + go test + 9 smokes all green"

# Static analysis. Runs first so we fail fast on syntax / shape issues.
vet:
    @echo "[vet] go vet ./..."
    @go vet ./...

# Go unit tests, short mode. Excludes hardware-in-the-loop tags.
test:
    @echo "[test] go test -short -count=1 ./..."
    @go test -short -count=1 ./...

# Format Go source. Idempotent; CI can run with --check via `just fmt-check`.
fmt:
    @gofmt -w cmd internal scripts

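# Verify formatting without modifying. Non-zero exit means run `just fmt`.
# Shebang recipe: process substitution needs bash, and just's default shell is sh.
fmt-check:
    #!/usr/bin/env bash
    set -euo pipefail
    diff -u <(echo -n) <(gofmt -d cmd internal scripts)
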
# Build every binary into bin/. Mirrors what each smoke does internally.
build:
    @echo "[build] go build -o bin/ ./cmd/..."
    @go build -o bin/ ./cmd/...

# Single smoke. Day is the suffix before _smoke.sh — d1, d2, …, g2.
smoke day:
    @bash scripts/{{day}}_smoke.sh

# Fixture-mode G2 smoke — runs against fake Ollama instead of real,
# so CI / fresh-clone reviewers without Ollama can verify the embed
# contract. Closes R-006 partial (embed half; storage half deferred).
smoke-g2-fixtures:
    @bash scripts/g2_smoke_fixtures.sh

# All 9 smokes in dependency order. Halts on first failure.
smoke-all:
    #!/usr/bin/env bash
    set -euo pipefail
    for day in d1 d2 d3 d4 d5 d6 g1 g1p g2; do
        printf "[smoke-all] %s ... " "$day"
        SECONDS=0
        if bash "scripts/${day}_smoke.sh" >/tmp/smoke_${day}.log 2>&1; then
            printf "PASS (%ss)\n" "$SECONDS"
        else
            printf "FAIL (%ss)\n" "$SECONDS"
            echo ""
            echo " last 20 lines of /tmp/smoke_${day}.log:"
            tail -20 "/tmp/smoke_${day}.log" | sed 's/^/ /'
            exit 1
        fi
    done

# Dependency probe. Add --json for machine-readable output.
doctor *args:
    @bash scripts/doctor.sh {{args}}

# Proof harness — claims-verification tier above the smoke chain.
# See tests/proof/README.md and docs/TEST_PROOF_SCOPE.md.
#   just proof contract       fast: APIs + status codes + dim/nonempty
#   just proof integration    full: CSV→Parquet→SQL, text→vector→search
#   just proof performance    measurements; runs only after contract+integration
proof mode *flags:
    @bash tests/proof/run_proof.sh --mode {{mode}} {{flags}}

# Install pre-push hook so `git push` runs `just verify` first.
install-hooks:
    #!/usr/bin/env bash
    set -euo pipefail
    HOOK=".git/hooks/pre-push"
    cat > "$HOOK" <<'HOOK'
    #!/usr/bin/env bash
    # golangLAKEHOUSE pre-push hook (managed by `just install-hooks`).
    # Runs the Sprint 0 gate before letting commits leave this machine.
    set -e
    cd "$(git rev-parse --show-toplevel)"
    echo "[pre-push] running just verify ..."
    if ! just verify; then
        echo ""
        echo "[pre-push] FAIL — push aborted. Fix the gate or use --no-verify (NOT recommended)."
        exit 1
    fi
    HOOK
    chmod +x "$HOOK"
    echo "[install-hooks] $HOOK installed and executable"

# Clean built binaries + smoke logs. Does NOT touch reports/ or data/.
clean:
    @rm -rf bin/
    @rm -f /tmp/smoke_*.log
    @echo "[clean] bin/ removed, smoke logs cleared"