Implements PROMPT.md / docs/REVIEW_PIPELINE.md Phase 2:
- internal/llm/ollama.go — real Ollama provider:
- HealthCheck probes /api/tags + a 1-token completion + a JSON-mode
probe ({"ok": true} round-trip), populating the model-doctor.json
schema documented in docs/LOCAL_MODEL_SETUP.md
- Complete + CompleteJSON via /api/chat with stream=false
- think=false set for ALL completions (qwen3.5:latest is reasoning-
capable but the inner-loop hot path wants direct answers, not
reasoning traces consuming the token budget — same finding as
the Lakehouse-Go chatd 2026-04-30 wave)
- internal/llm/review.go — Reviewer wrapper:
- 2-attempt flow: prompt → parse → repair-prompt → parse
- Strict JSON shape enforced; markdown fences stripped before parse
- Severity normalized to enum; out-of-range confidence clamped
- Per-file chunking (file-level for v0; function-level Phase D+)
- Bounded by review-profile max_file_bytes + max_llm_chunk_chars
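The parse side of that repair loop can be sketched as below — the finding shape and function names are hypothetical stand-ins for the real review.go API, not copies of it:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

type finding struct {
	Severity   string  `json:"severity"`
	Confidence float64 `json:"confidence"`
	Message    string  `json:"message"`
}

var validSeverity = map[string]bool{
	"info": true, "low": true, "medium": true, "high": true, "critical": true,
}

// stripFences removes a wrapping markdown code fence, which local models
// often emit around JSON output despite instructions not to.
func stripFences(s string) string {
	s = strings.TrimSpace(s)
	if strings.HasPrefix(s, "```") {
		if i := strings.Index(s, "\n"); i >= 0 { // drop the ```json line
			s = s[i+1:]
		}
		s = strings.TrimSuffix(strings.TrimSpace(s), "```")
	}
	return strings.TrimSpace(s)
}

// parseFindings is attempt 1 of the 2-attempt flow; on error the caller
// would issue the repair prompt and parse once more.
func parseFindings(raw string) ([]finding, error) {
	var fs []finding
	if err := json.Unmarshal([]byte(stripFences(raw)), &fs); err != nil {
		return nil, err
	}
	for i := range fs {
		sev := strings.ToLower(fs[i].Severity)
		if !validSeverity[sev] {
			sev = "info" // unknown severities normalize down, never up
		}
		fs[i].Severity = sev
		if fs[i].Confidence < 0 { // clamp out-of-range confidence to [0, 1]
			fs[i].Confidence = 0
		} else if fs[i].Confidence > 1 {
			fs[i].Confidence = 1
		}
	}
	return fs, nil
}

func main() {
	raw := "```json\n[{\"severity\":\"CRITICAL\",\"confidence\":1.4,\"message\":\"SQLi\"}]\n```"
	fs, err := parseFindings(raw)
	fmt.Println(fs, err)
}
```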
- pipeline.go — Phase 2 wired between static scan + report gen:
- --enable-llm flag opts in (off by default — static-only is
cheaper and faster)
- Raw output ALWAYS saved to llm-findings.raw.json (forensics)
- Normalized findings → llm-findings.normalized.json
- LLM findings merged into the report findings list (sourced
"llm" so consumers can filter)
- Receipts honestly mark phase status: "ok" | "degraded" | "skipped"
- cli model doctor — real probes replace the Phase A stub.
Verified:
- model doctor: status="ok" with qwen3.5:latest + qwen3:latest both
loaded, basic_prompt_ok=true, json_mode_ok=true
- insecure-repo with --enable-llm: 9 LLM findings; qwen3.5 correctly
flagged SQLi, RCE, hardcoded credentials as critical with verbatim
evidence; 27s wall for 3 chunks
- clean-repo with --enable-llm: 0 LLM findings, 4 parsed chunks, 2.8s
- self-review with --enable-llm: 77 LLM findings + 83 static; 3 of
~30 chunks needed retry (PROMPT.md, REPORT_SCHEMA.md,
SCRUM_TEST_TEMPLATE.md — all eventually parsed); 5min wall
go vet + go test -short clean. Fixture stray.go is now `package fixture`
so Go tooling doesn't choke on the orphan file.
Phase D (validator cross-check) + Phase E (memory + diff/rules
subcommands) remain.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# local-review-harness
Local-first code review harness. Walks a repository, runs evidence-bearing static checks, generates Scrum-style reports. No cloud dependencies. LLM review is local-Ollama-only (Phase C, not yet shipped).
Per FIRST_COMMAND_FOR_CLAUDE_CODE.md + PROMPT.md — "AI may suggest. Code validates. Reports must show evidence." Findings without grep-able evidence get rejected; the validator phase rejects model claims that cite missing files.
## Status
Phase A + Phase B (MVP) shipped. What works today:
- `review-harness repo <path>` — Phase 0 intake + Phase 1 static scan
- `review-harness scrum <path>` — same pipeline + full Scrum report bundle
- `review-harness model doctor` — stub (real Ollama probe in Phase C)
- 12 static analyzers covering hardcoded paths, shell exec, raw SQL, wildcard CORS, secret patterns, large files, TODO/FIXME, missing tests, committed `.env`, unsafe file I/O, exposed mutation endpoints, hardcoded private-network IPs
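Every analyzer hit must carry grep-able evidence. A toy illustration of that contract (the `evidence` type, check ID, and regex are hypothetical, not the harness's actual analyzer API):

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// evidence pairs a check ID with the exact line that triggered it,
// so a reviewer can grep the repository and confirm the finding.
type evidence struct {
	CheckID string
	Line    int
	Text    string
}

// secretPattern is a deliberately simple stand-in for the real
// secret-pattern rules: key-like identifier assigned a quoted literal.
var secretPattern = regexp.MustCompile(`(?i)(api[_-]?key|secret|password)\s*[:=]\s*['"][^'"]+['"]`)

func scanSecrets(src string) []evidence {
	var hits []evidence
	for i, line := range strings.Split(src, "\n") {
		if secretPattern.MatchString(line) {
			hits = append(hits, evidence{CheckID: "secret-pattern", Line: i + 1, Text: line})
		}
	}
	return hits
}

func main() {
	src := "host = \"db\"\npassword = \"hunter2\"\n"
	for _, h := range scanSecrets(src) {
		fmt.Printf("%s L%d: %s\n", h.CheckID, h.Line, h.Text)
	}
}
```

A finding is just the matched line plus its location — nothing the model (or the regex) claims is reported without text an operator can re-find.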
Phases C–E pending: real LLM review, validation cross-check, append-only memory, diff/rules subcommands.
## Build
Single static binary, no cgo:

```sh
go build -o review-harness ./cmd/review-harness
```
Requires Go 1.22+.
## Run
```sh
# Full repo review (Phase 0 + Phase 1 + Phase 4)
./review-harness repo /path/to/target/repo

# Same + Scrum bundle (scrum-test.md, risk-register.md, sprint-backlog.md, acceptance-gates.md)
./review-harness scrum /path/to/target/repo

# Model doctor stub
./review-harness model doctor
```
Reports land in <target>/reports/latest/ by default; override with --output-dir.
Optional config files:
```sh
./review-harness scrum /path --review-profile configs/review-profile.example.yaml \
  --model-profile configs/model-profile.example.yaml
```
## Self-review
The harness reviews itself as a sanity gate (PROMPT.md "Final Deliverable"):
```sh
./review-harness scrum .
cat reports/latest/scrum-test.md
```
The fixture-planted secrets in tests/fixtures/insecure-repo/ are intentional — they prove the secret-pattern analyzer fires. Operators reviewing the self-report should expect those critical-severity hits and dismiss them as fixture content.
## Test fixtures
Three synthetic repos under tests/fixtures/:
| Fixture | Purpose | Expected outcome |
|---|---|---|
| `clean-repo/` | sterile reference | 0 confirmed findings |
| `insecure-repo/` | every static check fires | ≥8 distinct check IDs |
| `degraded-repo/` | no git, no manifests | `repo_intake` phase marked degraded |
Run them all to validate after a regex change:
```sh
for f in clean-repo insecure-repo degraded-repo; do
  ./review-harness scrum "tests/fixtures/$f" > /dev/null
  echo "$f: $(jq '.summary.total' "tests/fixtures/$f/reports/latest/static-findings.json") findings"
done
```
## Exit codes
- `0` — clean run, no degraded phases
- `64` — usage error
- `65` — runtime error (config parse fail, target path missing, etc.)
- `66` — degraded mode (one or more phases skipped or stubbed; reports still produced)

`66` is the expected exit code in the MVP because the LLM phase is hardcoded degraded until Phase C lands.
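A CI wrapper will usually want to treat 66 as pass-with-warnings rather than failure. A minimal sketch of that mapping, assuming only the exit codes documented above (`exitClass` and the invocation are hypothetical, not part of the harness):

```go
package main

import (
	"fmt"
	"os/exec"
)

// exitClass maps review-harness exit codes to a CI-friendly label.
func exitClass(code int) string {
	switch code {
	case 0:
		return "clean"
	case 64:
		return "usage-error"
	case 65:
		return "runtime-error"
	case 66:
		return "degraded" // reports were still produced; pass with warnings
	default:
		return "unknown"
	}
}

func main() {
	cmd := exec.Command("./review-harness", "scrum", ".")
	if err := cmd.Run(); err != nil && cmd.ProcessState == nil {
		// Binary not found or not started at all — no exit code to classify.
		fmt.Println("could not start review-harness:", err)
		return
	}
	fmt.Println(exitClass(cmd.ProcessState.ExitCode()))
}
```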