Claude (review-harness setup) f3ee4722a8 Phase A + B (MVP) — local review harness
Implements the MVP cutline from the planning artifact:
- Phase A: skeleton + CLI dispatch + provider interface + stub model doctor
- Phase B: scanner + git probe + 12 static analyzers + reporters + pipeline
- Phase B fixtures: clean-repo, insecure-repo, degraded-repo

12 static analyzers per PROMPT.md "Suggested Static Checks For MVP":
hardcoded_paths, shell_execution, raw_sql_interpolation, broad_cors,
secret_patterns, large_files, todo_comments, missing_tests,
env_file_committed, unsafe_file_io, exposed_mutation_endpoint,
hardcoded_local_ip.

Acceptance gates passing:
- B1 (intake produces accurate counts) ✓
- B2 (insecure fixture fires ≥8 distinct check_ids — actually 11/12) ✓
- B3 (clean fixture produces 0 confirmed findings — no false positives) ✓
- B4 (scrum mode produces all 6 required markdown + JSON reports) ✓
- B5 (receipts.json marks degraded phases honestly) ✓
- F  (self-review on this repo runs without crashing) ✓ — exit 66 (degraded
  because Phase C LLM review is hardcoded skipped)

Phases C (LLM review), D (validation cross-check), E (memory + diff +
rules subcommands) deferred per the cutline. The MVP delivers the
evidence-first path; LLM is purely additive.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 00:56:02 -05:00


local-review-harness

Local-first code review harness. Walks a repository, runs evidence-bearing static checks, generates Scrum-style reports. No cloud dependencies. LLM review is local-Ollama-only (Phase C, not yet shipped).

Per FIRST_COMMAND_FOR_CLAUDE_CODE.md + PROMPT.md — "AI may suggest. Code validates. Reports must show evidence." Findings without grep-able evidence get rejected; the validator phase rejects model claims that cite missing files.
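As a sketch of what "evidence-bearing" means in practice, a confirmed finding might carry a record like the following (field names are illustrative, not the harness's actual schema):

```json
{
  "check_id": "secret_patterns",
  "severity": "critical",
  "file": "tests/fixtures/insecure-repo/config.py",
  "line": 3,
  "evidence": "AWS_SECRET = \"AKIA...\"",
  "status": "confirmed"
}
```

A validator can then re-open `file` at `line` and grep for `evidence`; if the citation does not reproduce, the finding is rejected rather than reported.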

Status

Phase A + Phase B (MVP) shipped. What works today:

  • review-harness repo <path> — Phase 0 intake + Phase 1 static scan
  • review-harness scrum <path> — same pipeline + full Scrum report bundle
  • review-harness model doctor — stub (real Ollama probe in Phase C)
  • 12 static analyzers covering hardcoded paths, shell exec, raw SQL, wildcard CORS, secret patterns, large files, TODO/FIXME, missing tests, committed .env, unsafe file I/O, exposed mutation endpoints, hardcoded private-network IPs
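Two of the analyzers above can be roughly approximated with grep, which conveys the flavor of the checks (the real implementations are Go code with severity and evidence metadata; these patterns are sketches, not the shipped regexes):

```shell
# Synthetic input with one private IP and one AWS-style key
printf 'addr = "192.168.1.10"\nkey = "AKIAABCDEFGHIJKLMNOP"\n' > /tmp/sample.txt

# hardcoded_local_ip: RFC 1918 private address prefixes
grep -En '(^|[^0-9.])(10\.|192\.168\.|172\.(1[6-9]|2[0-9]|3[01])\.)' /tmp/sample.txt

# secret_patterns: AWS-style access key ID (AKIA + 16 uppercase alphanumerics)
grep -En 'AKIA[0-9A-Z]{16}' /tmp/sample.txt
```

Each grep prints the matching line with its line number, which is exactly the grep-able evidence the reporting pipeline demands.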

Phases C–E pending: real LLM review, validation cross-check, append-only memory, diff/rules subcommands.

Build

Single static binary, no cgo:

go build -o review-harness ./cmd/review-harness

Requires Go 1.22+.

Run

# Full repo review (Phase 0 + Phase 1 + Phase 4)
./review-harness repo /path/to/target/repo

# Same + Scrum bundle (scrum-test.md, risk-register.md, sprint-backlog.md, acceptance-gates.md)
./review-harness scrum /path/to/target/repo

# Model doctor stub
./review-harness model doctor

Reports land in <target>/reports/latest/ by default; override with --output-dir.

Optional config files:

./review-harness scrum /path --review-profile configs/review-profile.example.yaml \
                              --model-profile  configs/model-profile.example.yaml

Self-review

The harness reviews itself as a sanity gate (PROMPT.md "Final Deliverable"):

./review-harness scrum .
cat reports/latest/scrum-test.md

The fixture-planted secrets in tests/fixtures/insecure-repo/ are intentional — they prove the secret-pattern analyzer fires. Operators reviewing the self-report should expect those critical-severity hits and dismiss them as fixture content.
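One quick way to separate that fixture noise from real hits is a path filter over the report — a sketch, assuming report lines cite each finding's file path (the demo data below, including the `internal/scan/exec.go` path, is synthetic):

```shell
# Drop findings that point into the fixture tree; anything left deserves
# a real look. (Report line format is assumed, not guaranteed.)
filter_fixture_noise() {
  grep -v 'tests/fixtures/' "$1" || true
}

# Demo on a synthetic report excerpt:
cat > /tmp/report-excerpt.txt <<'EOF'
CRITICAL secret_patterns tests/fixtures/insecure-repo/config.py:3
HIGH shell_execution internal/scan/exec.go:41
EOF
filter_fixture_noise /tmp/report-excerpt.txt
```

Only the second line survives the filter, so the planted secrets never clutter a manual triage pass.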

Test fixtures

Three synthetic repos under tests/fixtures/:

Fixture          Purpose                     Expected outcome
clean-repo/      sterile reference           0 confirmed findings
insecure-repo/   every static check fires    ≥8 distinct check IDs
degraded-repo/   no git, no manifests        repo_intake phase marked degraded

Run them all to validate after a regex change:

for f in clean-repo insecure-repo degraded-repo; do
  ./review-harness scrum "tests/fixtures/$f" > /dev/null
  echo "$f: $(jq '.summary.total' "tests/fixtures/$f/reports/latest/static-findings.json") findings"
done

Exit codes

  • 0 — clean run, no degraded phases
  • 64 — usage error
  • 65 — runtime error (config parse fail, target path missing, etc.)
  • 66 — degraded mode (one or more phases skipped or stubbed; reports still produced)

66 is the expected exit code in MVP because the LLM phase is hardcoded degraded until Phase C lands.
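For CI wiring, the codes above can be mapped to a verdict in a small wrapper — a sketch that tolerates 66 until Phase C ships (the function name and verdict strings are illustrative, not part of the harness):

```shell
# Map review-harness exit codes to a CI verdict; 66 is accepted in MVP
# because the LLM phase is hardcoded degraded.
classify_exit() {
  case "$1" in
    0)  echo "pass" ;;
    66) echo "pass-degraded" ;;
    64) echo "fail-usage" ;;
    65) echo "fail-runtime" ;;
    *)  echo "fail-unknown" ;;
  esac
}

# Usage: ./review-harness scrum . ; classify_exit "$?"
classify_exit 66   # prints: pass-degraded
```

Once Phase C lands, the `66` branch should be demoted to a failure so a silently degraded run can no longer pass CI.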