local-review-harness/docs/REVIEW_PIPELINE.md
Claude (review-harness setup) f3ee4722a8 Phase A + B (MVP) — local review harness
Implements the MVP cutline from the planning artifact:
- Phase A: skeleton + CLI dispatch + provider interface + stub model doctor
- Phase B: scanner + git probe + 12 static analyzers + reporters + pipeline
- Phase B fixtures: clean-repo, insecure-repo, degraded-repo

12 static analyzers per PROMPT.md "Suggested Static Checks For MVP":
hardcoded_paths, shell_execution, raw_sql_interpolation, broad_cors,
secret_patterns, large_files, todo_comments, missing_tests,
env_file_committed, unsafe_file_io, exposed_mutation_endpoint,
hardcoded_local_ip.

Acceptance gates passing:
- B1 (intake produces accurate counts) ✓
- B2 (insecure fixture fires ≥8 distinct check_ids — actually 11/12) ✓
- B3 (clean fixture produces 0 confirmed findings — no false positives) ✓
- B4 (scrum mode produces all 6 required markdown + JSON reports) ✓
- B5 (receipts.json marks degraded phases honestly) ✓
- F  (self-review on this repo runs without crashing) ✓ — exit 66 (degraded
  because Phase C LLM review is hardcoded skipped)

Phases C (LLM review), D (validation cross-check), E (memory + diff +
rules subcommands) deferred per the cutline. The MVP delivers the
evidence-first path; LLM is purely additive.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 00:56:02 -05:00


# Review Pipeline Specification
## Purpose
This document defines the local review harness pipeline.
The pipeline exists to inspect a repository, collect evidence, identify risks, validate model claims, and generate operational reports without relying on cloud services.
## Pipeline Overview
```text
Repo Intake
-> Static Scan
-> Optional LLM Review
-> Validation
-> Report Generation
-> Memory Update
```
## Phase 0: Repo Intake
### Goal
Build a factual profile of the target repository.
### Inputs
- repository path
- git metadata
- filesystem metadata
- dependency manifests
- build files
- test files
### Required Output
```text
reports/latest/repo-intake.json
```
### Required Fields
```json
{
"repo_path": "",
"current_branch": "",
"latest_commit": "",
"git_status": "",
"file_count": 0,
"language_breakdown": {},
"largest_files": [],
"dependency_manifests": [],
"test_manifests": [],
"generated_at": ""
}
```
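As a rough illustration, the intake record above could be assembled as follows. This is a minimal sketch, not the harness's actual implementation: `build_intake`, `write_intake`, `MANIFEST_NAMES`, and the test-file heuristic are hypothetical names and assumptions, and `language_breakdown` is approximated here by counting file extensions.

```python
import json
import subprocess
from collections import Counter
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical manifest names; the real harness may recognize a different set.
MANIFEST_NAMES = {"package.json", "pyproject.toml", "requirements.txt", "Cargo.toml", "go.mod"}


def _git(repo: Path, *args: str) -> str:
    """Run a git command in repo; return "" when git or its metadata is unavailable."""
    try:
        result = subprocess.run(
            ["git", "-C", str(repo), *args],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return ""


def _looks_like_test(path: Path) -> bool:
    # Assumed heuristic for "test files"; adjust to the project's conventions.
    return path.name.startswith("test_") or ".test." in path.name or ".spec." in path.name


def build_intake(repo_path: str, top_n: int = 5) -> dict:
    """Collect the required intake fields for one repository."""
    repo = Path(repo_path).resolve()
    files = [
        p for p in repo.rglob("*")
        if p.is_file() and ".git" not in p.relative_to(repo).parts
    ]
    by_size = sorted(files, key=lambda p: p.stat().st_size, reverse=True)
    rel = lambda p: str(p.relative_to(repo))
    return {
        "repo_path": str(repo),
        "current_branch": _git(repo, "rev-parse", "--abbrev-ref", "HEAD"),
        "latest_commit": _git(repo, "rev-parse", "HEAD"),
        "git_status": _git(repo, "status", "--porcelain"),
        "file_count": len(files),
        "language_breakdown": dict(Counter(p.suffix or "(none)" for p in files)),
        "largest_files": [rel(p) for p in by_size[:top_n]],
        "dependency_manifests": sorted(rel(p) for p in files if p.name in MANIFEST_NAMES),
        "test_manifests": sorted(rel(p) for p in files if _looks_like_test(p)),
        "generated_at": datetime.now(timezone.utc).isoformat(),
    }


def write_intake(repo_path: str, out_path: str) -> None:
    """Write the intake record as JSON, creating the report directory if needed."""
    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(build_intake(repo_path), indent=2))
```

A call such as `write_intake(".", "reports/latest/repo-intake.json")` would produce the required output. The git fields fall back to empty strings when the repository has no git metadata, which a real run would also need to record as a degraded phase.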
## Phase 1: Static Scan
### Goal
Find evidence-backed problems without using an LLM.
### Detection Targets
- hardcoded absolute paths
- unsafe shell execution
- raw SQL interpolation
- exposed mutation endpoints
- broad CORS
- unchecked file reads and writes
- suspicious secret patterns
- large files
- TODO, FIXME, HACK comments
- missing tests near critical modules
### Required Output
```text
reports/latest/static-findings.json
```
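To make the idea of an evidence-backed check concrete, here is a minimal sketch of two line-based checks. The `check_id` values `todo_comments` and `hardcoded_paths` come from the MVP analyzer list; the `Finding` fields, the regular expressions, and the `scan_text` function are assumptions made for illustration, not the harness's actual schema or patterns.

```python
import re
from dataclasses import dataclass


@dataclass
class Finding:
    # Hypothetical finding shape: every finding carries the exact line as evidence.
    check_id: str
    file: str
    line: int
    evidence: str


# Illustrative patterns only; real analyzers would likely be stricter.
CHECKS = {
    "todo_comments": re.compile(r"\b(TODO|FIXME|HACK)\b"),
    "hardcoded_paths": re.compile(r"""["'](/home/|/Users/|[A-Za-z]:\\)"""),
}


def scan_text(file: str, text: str) -> list[Finding]:
    """Run every check against every line and record the matching line as evidence."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for check_id, pattern in CHECKS.items():
            if pattern.search(line):
                findings.append(Finding(check_id, file, lineno, line.strip()))
    return findings
```

Because each finding stores the file, line number, and matched text, a later phase can re-verify it against the repository without rerunning the scan.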
## Phase 2: LLM Review
### Goal
Use a local model to perform higher-level reasoning over bounded evidence chunks.
### Rules
- Do not send the entire repository blindly.
- Chunk inputs by file, function, or diff boundary.
- Require strict JSON output.
- Retry invalid JSON once.
- Save degraded output if parsing fails.
- Never trust model claims without validation.
### Required Output
```text
reports/latest/llm-findings.raw.json
reports/latest/llm-findings.normalized.json
```
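The JSON rules above (strict output, one retry, degraded output on failure) could be sketched as below. `review_chunk` and the `call_model` callback are hypothetical; the callback stands in for whatever local model client the harness uses, and the expected `{"findings": [...]}` shape is an assumption.

```python
import json
from typing import Callable


def review_chunk(chunk: str, call_model: Callable[[str], str], max_attempts: int = 2) -> dict:
    """Ask the model about one bounded chunk; retry once if the reply is not valid JSON."""
    raw_outputs = []
    for _ in range(max_attempts):
        raw = call_model(chunk)
        raw_outputs.append(raw)  # keep every raw reply for the raw findings file
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            continue  # invalid JSON: use the remaining attempt, if any
        # Assumed schema: a JSON object with a "findings" list.
        if isinstance(parsed, dict) and isinstance(parsed.get("findings"), list):
            return {"status": "ok", "findings": parsed["findings"], "raw": raw_outputs}
    # Parsing never succeeded: return a degraded result instead of raising.
    return {"status": "degraded", "findings": [], "raw": raw_outputs}
```

In this sketch the `raw` list would feed `llm-findings.raw.json`, while `findings` would feed the normalized file. Findings returned here are still unvalidated model claims.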
## Phase 3: Validation
### Goal
Validate every LLM-generated finding against real repository evidence.
### Reject A Finding If
- the file does not exist
- the cited evidence does not exist
- the line hint is impossible (for example, past the end of the file)
- the claim is unsupported
- the suggested fix targets unrelated code
- the model invents tests, commands, or files
### Required Output
```text
reports/latest/validated-findings.json
```
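The first three rejection rules are mechanical and can be checked directly against the filesystem. The sketch below assumes a finding is a dict with `file`, `line`, and `evidence` keys, which is an illustrative shape rather than the harness's confirmed schema; `validate_finding` is a hypothetical name.

```python
from pathlib import Path


def validate_finding(repo: Path, finding: dict) -> tuple[bool, str]:
    """Return (accepted, reason) for one LLM finding checked against files on disk."""
    root = repo.resolve()
    path = (root / str(finding.get("file", ""))).resolve()
    # Reject paths that escape the repository or do not point at a real file.
    if root not in path.parents or not path.is_file():
        return False, "file does not exist"
    text = path.read_text(encoding="utf-8", errors="replace")
    line = finding.get("line")
    line_count = len(text.splitlines())
    # A line hint is optional, but when present it must fall inside the file.
    if line is not None and not (isinstance(line, int) and 1 <= line <= line_count):
        return False, "line hint is impossible"
    evidence = str(finding.get("evidence", "")).strip()
    # The quoted evidence must appear verbatim in the cited file.
    if not evidence or evidence not in text:
        return False, "cited evidence does not exist"
    return True, "ok"
```

The remaining rules (unsupported claims, fixes aimed at unrelated code, invented tests or commands) need more context than a string match and are not covered by this sketch.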
## Phase 4: Report Generation
### Goal
Produce human-readable and machine-readable reports.
### Required Markdown Reports
```text
reports/latest/scrum-test.md
reports/latest/risk-register.md
reports/latest/claim-coverage-table.md
reports/latest/sprint-backlog.md
reports/latest/acceptance-gates.md
```
### Required JSON Receipt
```text
reports/latest/receipts.json
```
## Phase 5: Memory
### Goal
Persist durable review knowledge for future runs.
### Required Memory Files
```text
.memory/review-rules.md
.memory/known-risks.json
.memory/fixed-patterns.json
.memory/project-profile.json
```
### Memory Rules
- append-only by default
- version every update
- never silently overwrite
- record source run ID
- record evidence file
- record confidence level
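One way to satisfy these rules for the JSON memory files is to store a list of versioned records and only ever append to it. The following is a sketch under that assumption; `append_memory` and the record field names are hypothetical, not the harness's actual format.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_memory(path: Path, entry: dict, run_id: str, evidence_file: str, confidence: str) -> dict:
    """Append one versioned record to a JSON memory file without touching earlier records."""
    records = json.loads(path.read_text()) if path.exists() else []
    record = {
        "version": len(records) + 1,      # version every update
        "run_id": run_id,                 # source run ID
        "evidence_file": evidence_file,   # where the supporting evidence lives
        "confidence": confidence,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "entry": entry,
    }
    records.append(record)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file first so a crash cannot leave a truncated memory file.
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(json.dumps(records, indent=2))
    tmp.replace(path)
    return record
```

Superseding an old entry would then mean appending a new record that refers to the earlier version, rather than editing it in place.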
## Degraded Mode
A phase is degraded when it cannot fully run but the pipeline can continue.
Examples:
- Ollama unavailable
- model returns invalid JSON
- repository has no git metadata
- dependency manager unavailable
- large dataset missing
Degraded mode must be explicit in reports.
No silent success.
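A minimal sketch of how a runner might keep going after a degraded phase while recording it explicitly. `PhaseDegraded`, `run_phases`, and the receipt layout are hypothetical and only illustrate the rule; the real `receipts.json` schema may differ.

```python
from typing import Callable


class PhaseDegraded(Exception):
    """Raised by a phase that cannot fully run but should not stop the pipeline."""


def run_phases(phases: dict[str, Callable[[], None]]) -> dict:
    """Run phases in order and record an explicit status for each one."""
    receipt = {"phases": {}, "degraded": False}
    for name, phase in phases.items():
        try:
            phase()
            receipt["phases"][name] = {"status": "ok"}
        except PhaseDegraded as exc:
            # Continue with the next phase, but never report this one as a success.
            receipt["phases"][name] = {"status": "degraded", "reason": str(exc)}
            receipt["degraded"] = True
    return receipt
```

Any other exception still propagates in this sketch, so genuine failures stop the run instead of being recorded as degraded. The top-level `degraded` flag gives callers a single value to map to a non-zero exit code.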