local-review-harness

Author	SHA1	Message	Date
Claude (review-harness setup)	a75e14716b	Phase E: append-only memory + diff subcommand (PROMPT.md complete) Closes the harness's feature set per PROMPT.md mode 2 (Diff Review) and Phase 5 (Memory). Rules subcommand still pending (it needs operator-authored .review-rules.md content first; documented as Phase E follow-up). internal/memory/ (append-only writer): - AppendKnownRisks: one JSONL line per confirmed finding per run. O_APPEND only; never O_TRUNC. Empty findings list is a no-op (doesn't even create the file, which keeps clean runs from polluting .memory/). - AppendRunHistory: one JSONL line per run. Run summary stats + receipts hash for cross-link. - WriteProjectProfile: the ONLY non-versioned memory file; snapshot semantics, overwrites are explicit + documented. - 4 unit tests including TestAppendKnownRisks_NeverTruncates, which is the audit's "no silent overwrite" gate: write twice, assert both writes' content survives. Pipeline phase 5 wires it. Confirmed findings only: suspected findings might still be wrong, and excluding them keeps .memory/ authoritative. Disabled if review-profile.memory.enabled = false. internal/git/git.go (ChangedFiles helper): - Probes unstaged + staged + branch diff against main/master. - Deduplicated, stable order. Empty result on clean tree. - Graceful failure: returns an error if the git binary is missing or the target isn't a git repo. cli/repo.go (Diff subcommand): - `review-harness diff <path>` runs the same pipeline as scrum but scoped to changed files only. Pipeline.Inputs gains DiffOnlyFiles filter applied post-Walk. - Empty diff (clean tree, no commits ahead of base) → exit 0 with message; doesn't generate empty reports. - LLM toggleable via --enable-llm same as scrum. scanner/walk.go: added .memory to SkipDirs (universal: it is the harness's own audit trail, and scanning it would surface planted-secret evidence as new findings; same class as B5 self-skip). .gitignore tightened: /.memory/ → **/.memory/ to keep test-fixture .memory dirs from leaking into version control (same fix as reports/latest pattern). Verified end-to-end: - 4 memory unit tests PASS - Append-only proven: insecure-repo run 1 → 16 known-risks lines; run 2 → 44 lines (16 + 28 from new run); run-history grew 1 → 2. - Diff subcommand against this repo (5 uncommitted Phase E files staged) → exit 0, all reports produced, scoped to those 5 files only (0 findings on the diff-scoped scan vs 129 on full repo; the changed files don't contain analyzer-flaggable patterns). Phase A through E shipped today. Rules subcommand + tests for internal/{config,scanner,git,llm,reporters,pipeline} remain. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 02:19:12 -05:00
Claude (review-harness setup)	2fc047487f	Scanner respects .gitignore: full Lakehouse Rust scan now possible Single biggest unblock for using the harness on real targets. The lakehouse Rust repo has a 67GB data/ directory holding parquet, JSONL pathway memory, headshots, and other runtime data, all gitignored. Pre-fix, the scanner walked it all (and stalled). Post-fix, the full Rust scan completes in 15s. internal/scanner/gitignore.go is a minimal Matcher that handles the patterns real .gitignore files use ~99% of the time: - basename match anywhere (`pattern`) - dir-only match (`pattern/`) - root-anchored (`/pattern`) - path-anchored (`pattern/sub`, interior slash) - extension globs (`*.ext`) - path + extension (`path/*.ext`) - comments + blank lines ignored Negations (!pattern) are intentionally NOT supported in v0; the matcher records HasNegations() so callers can surface a warning if any are encountered. internal/scanner/gitignore_test.go: 14 cases against a synthetic .gitignore covering all 6 pattern shapes, plus missing-file and negation-recording tests. walk.go integration: gitignore loaded once at scan start; checked in the dir-skip branch (SkipDir cascades) and the file-emit branch. Skip layers in order: universal-noise basenames → .gitignore → path-scoped self-skip → dotfile filter. Verified end-to-end: - lakehouse Rust full repo: 15s scan, 1031 findings, 0 critical (no committed secrets in source; independently confirms what scrum2 + the Rust auditor said) - the 529 hardcoded-path findings are the Sprint 4 gap the audit kept naming; the harness just put a number on it This was Opus's WARN B5 from the cross-lineage scrum, plus the "harness stalls on real repos" gap exposed when running it against the actual Lakehouse repos. Both addressed in one wave. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 02:14:10 -05:00
Claude (review-harness setup)	ab550a7c5a	Apply B5 from 2026-04-30 scrum — scanner skip-list scoped to harness self Opus-only BLOCK from the cross-lineage scrum: pre-fix SkipDirs basename-matched bin/build/dist/target/reports for ANY repo, silently excluding legitimate source dirs on real targets. The lakehouse Rust repo has reports/ holding markdown; some Java/ Python/Go projects use bin/ as a source dir; target/ is project- specific. Skipping them globally produced silent false-negative scans the operator would never know about. Fix: trim SkipDirs to dirs that are universally not source code — .git, .hg, .svn (VCS metadata); node_modules, vendor (dep caches); __pycache__, .venv, venv (Python envs); .idea, .vscode (editor state). Removed: bin, build, dist, target, reports. For the harness's own self-skip (it shouldn't scan its own bin/ or reports/), added path-scoped skip via selfSkipsFor — detects "this is the harness repo" by the presence of BOTH cmd/review-harness/ AND internal/analyzers/ subdirs (combination unique to this codebase), then skips the absolute paths bin/ and reports/ for that scan only. Two regression tests: - TestWalk_DoesNotSkipBinReportsInTargetRepo plants files under bin/, reports/, build/, dist/, target/ in a synthetic target repo; asserts all 5 appear in scan, while .git/ + node_modules/ + vendor/ are still skipped. - TestWalk_SelfSkipsBinReportsInHarnessRepo plants the harness's marker dirs (cmd/review-harness/, internal/analyzers/) plus bin/ + reports/ + ordinary src/; asserts self-skip fires on bin/+reports/ but real src/ scans normally. Compiled artifacts inside bin/ are filtered by the analyzers' isTextLike extension check (.exe / .dll / .so), so target repos with bin/ holding compiled output don't waste cycles decoding it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 01:34:45 -05:00
Claude (review-harness setup)	f3ee4722a8	Phase A + B (MVP) — local review harness Implements the MVP cutline from the planning artifact: - Phase A: skeleton + CLI dispatch + provider interface + stub model doctor - Phase B: scanner + git probe + 12 static analyzers + reporters + pipeline - Phase B fixtures: clean-repo, insecure-repo, degraded-repo 12 static analyzers per PROMPT.md "Suggested Static Checks For MVP": hardcoded_paths, shell_execution, raw_sql_interpolation, broad_cors, secret_patterns, large_files, todo_comments, missing_tests, env_file_committed, unsafe_file_io, exposed_mutation_endpoint, hardcoded_local_ip. Acceptance gates passing: - B1 (intake produces accurate counts) ✓ - B2 (insecure fixture fires ≥8 distinct check_ids — actually 11/12) ✓ - B3 (clean fixture produces 0 confirmed findings — no false positives) ✓ - B4 (scrum mode produces all 6 required markdown + JSON reports) ✓ - B5 (receipts.json marks degraded phases honestly) ✓ - F (self-review on this repo runs without crashing) ✓ — exit 66 (degraded because Phase C LLM review is hardcoded skipped) Phases C (LLM review), D (validation cross-check), E (memory + diff + rules subcommands) deferred per the cutline. The MVP delivers the evidence-first path; LLM is purely additive. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 00:56:02 -05:00
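
The pattern shapes listed in commit 2fc047487f (basename, dir-only, root-anchored, path-anchored, extension glob, path + extension, with negations recorded but not applied) can be illustrated with a minimal sketch. This is not the code in internal/scanner/gitignore.go: the `rule` type, the `Parse` function, and the `Match(rel, isDir)` signature are assumptions made for illustration, and only `HasNegations` is a name taken from the commit message. It assumes the walker calls `Match` on each directory before descending, so a matched directory hides its whole subtree.

```go
package main

import (
	"bufio"
	"fmt"
	"path"
	"strings"
)

// rule is a hypothetical parsed .gitignore line.
type rule struct {
	pattern  string // glob without leading or trailing slash
	dirOnly  bool   // original pattern ended with "/"
	anchored bool   // pattern contained a slash: match from the repo root
}

// Matcher holds parsed rules and remembers whether negations were seen.
type Matcher struct {
	rules        []rule
	hasNegations bool
}

// Parse reads .gitignore content. Comments and blank lines are ignored;
// "!pattern" lines are recorded but not applied.
func Parse(content string) *Matcher {
	m := &Matcher{}
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if strings.HasPrefix(line, "!") {
			m.hasNegations = true
			continue
		}
		r := rule{}
		if strings.HasSuffix(line, "/") {
			r.dirOnly = true
			line = strings.TrimSuffix(line, "/")
		}
		if strings.Contains(line, "/") {
			r.anchored = true
			line = strings.TrimPrefix(line, "/")
		}
		if line == "" {
			continue
		}
		r.pattern = line
		m.rules = append(m.rules, r)
	}
	return m
}

// HasNegations reports whether any "!pattern" line was skipped.
func (m *Matcher) HasNegations() bool { return m.hasNegations }

// Match reports whether rel (slash-separated, relative to the repo root)
// is ignored. Unanchored rules compare against the basename only.
func (m *Matcher) Match(rel string, isDir bool) bool {
	for _, r := range m.rules {
		if r.dirOnly && !isDir {
			continue
		}
		target := path.Base(rel)
		if r.anchored {
			target = rel
		}
		if ok, err := path.Match(r.pattern, target); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	m := Parse("# runtime data\n\ndata/\n/build\n*.log\nlogs/*.tmp\n!keep.log\n")
	cases := []struct {
		rel   string
		isDir bool
	}{
		{"src/data", true},
		{"src/build", false},
		{"src/app.log", false},
		{"logs/a.tmp", false},
	}
	for _, c := range cases {
		fmt.Printf("%-12s dir=%-5v ignored=%v\n", c.rel, c.isDir, m.Match(c.rel, c.isDir))
	}
	fmt.Println("negations seen:", m.HasNegations())
}
```

Treating any pattern with an interior or leading slash as root-anchored keeps the matcher to a single `path.Match` call per rule, at the cost of not supporting `**` or negation semantics.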

4 Commits