Claude (review-harness setup) f3ee4722a8 Phase A + B (MVP) — local review harness
Implements the MVP cutline from the planning artifact:
- Phase A: skeleton + CLI dispatch + provider interface + stub model doctor
- Phase B: scanner + git probe + 12 static analyzers + reporters + pipeline
- Phase B fixtures: clean-repo, insecure-repo, degraded-repo

12 static analyzers per PROMPT.md "Suggested Static Checks For MVP":
hardcoded_paths, shell_execution, raw_sql_interpolation, broad_cors,
secret_patterns, large_files, todo_comments, missing_tests,
env_file_committed, unsafe_file_io, exposed_mutation_endpoint,
hardcoded_local_ip.

Acceptance gates passing:
- B1 (intake produces accurate counts) ✓
- B2 (insecure fixture fires ≥8 distinct check_ids — actually 11/12) ✓
- B3 (clean fixture produces 0 confirmed findings — no false positives) ✓
- B4 (scrum mode produces all 6 required markdown + JSON reports) ✓
- B5 (receipts.json marks degraded phases honestly) ✓
- F  (self-review on this repo runs without crashing) ✓ — exit 66 (degraded
  because the Phase C LLM review is hardcoded as skipped)

Phases C (LLM review), D (validation cross-check), E (memory + diff +
rules subcommands) deferred per the cutline. The MVP delivers the
evidence-first path; LLM is purely additive.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 00:56:02 -05:00


# Claude Code Prompt: Build Local AI Code Review Harness
## Mission
Create a local-first autonomous code review harness inspired by PR-Agent, Gito, OpenReview, Kodus, and Sourcery, but built around our own tools, local models, and a validation-first workflow.
This is not a SaaS PR bot.
This is a local DevOps review system that can inspect a repository, summarize risk, identify architectural drift, detect unsafe code patterns, produce Scrum-style backlog reports, and optionally route review tasks through local LLMs using Ollama or another local model endpoint.
## Core Principle
AI may suggest.
Code validates.
Reports must show evidence.
Nothing is trusted because a model said it.
## Target Use Case
Given a repository path, the system should run a review pipeline that produces:
- architecture overview
- code health report
- security and trust-boundary report
- test coverage gap report
- refactor recommendations
- Scrum sprint backlog
- acceptance gates
- machine-readable JSON receipts
## Inspired Features To Extract
### From PR-Agent
Implement:
- PR and diff-style review mode
- summary of changed files
- risk-ranked findings
- suggested review comments
- checklist output
- confidence score per finding
Do not copy implementation. Recreate the concept locally.
### From Gito
Implement:
- local model compatibility
- full-repo review mode
- model-provider abstraction
- ability to run without GitHub or SaaS
- config-driven review profiles
### From OpenReview
Implement:
- webhook-ready design later
- clean separation between:
  - repo scanner
  - diff analyzer
  - LLM reviewer
  - report generator
  - validation layer
For now, local CLI first.
### From Kodus
Implement:
- plain-language project rules
- repo-specific review policy file
- ability to enforce local conventions
- persistent team memory rules
Example file:
```text
.review-rules.md
```
### From Sourcery
Implement:
- low-level refactor suggestions
- duplicated logic detection
- complexity hotspots
- dead code suspicion
- long-file warnings
- unsafe error handling warnings
## Architecture
Create a modular system with this shape:
```text
local-review-harness/
  configs/
    review-profile.example.yaml
    model-profile.example.yaml
  docs/
    REVIEW_PIPELINE.md
    LOCAL_MODEL_SETUP.md
    REPORT_SCHEMA.md
  src/
    cli/
    scanner/
    git/
    analyzers/
    llm/
    validators/
    reporters/
    memory/
  reports/
    latest/
  tests/
    fixtures/
```
## Required Modes
### 1. Full Repo Review
Command:
```bash
review-harness repo /path/to/repo
```
Should inspect:
- file tree
- language mix
- build files
- test files
- scripts
- docs
- dependency manifests
- large files
- suspicious hardcoded paths
- TODO, FIXME, and security comments
### 2. Diff Review
Command:
```bash
review-harness diff /path/to/repo
```
Should inspect:
- unstaged changes
- staged changes
- branch diff against main or master
- changed functions where possible
- risk introduced by change
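Identifying changed files is the first step of diff review. A minimal sketch in Python (the prompt does not mandate an implementation language; the function name is hypothetical), assuming the `git diff` text has already been captured:

```python
import re

def changed_files(diff_text: str) -> list[str]:
    """Extract post-change file paths from unified diff output.

    Parses the `diff --git a/... b/...` headers that `git diff`
    emits. Note: `\\S+` does not handle paths containing spaces;
    a real implementation would parse quoted paths too.
    """
    paths = []
    for line in diff_text.splitlines():
        m = re.match(r"^diff --git a/(\S+) b/(\S+)$", line)
        if m:
            paths.append(m.group(2))
    return paths
```

Changed-function detection would then run per file on the hunk bodies; this sketch covers only the file-level pass.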
### 3. Scrum Test
Command:
```bash
review-harness scrum /path/to/repo
```
Should produce:
```text
reports/latest/scrum-test.md
reports/latest/risk-register.md
reports/latest/claim-coverage-table.md
reports/latest/sprint-backlog.md
reports/latest/acceptance-gates.md
reports/latest/receipts.json
```
### 4. Rules Audit
Command:
```bash
review-harness rules /path/to/repo
```
Reads:
```text
.review-rules.md
.review-profile.yaml
```
Then checks whether the repository violates local project rules.
### 5. Local Model Probe
Command:
```bash
review-harness model doctor
```
Should test:
- Ollama availability
- configured model exists
- context limit estimate
- small prompt response
- JSON-mode reliability if available
- timeout behavior
- fallback model behavior
## Local Model Requirements
Support a model endpoint abstraction.
Initial provider:
```yaml
provider: ollama
base_url: http://localhost:11434
model: qwen2.5-coder
fallback_model: llama3.1
timeout_seconds: 120
temperature: 0.1
```
Do not hardcode Ollama everywhere. Use a provider interface so OpenAI-compatible local endpoints can be added later.
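One way to keep Ollama out of the call sites is a registry-backed provider interface. A Python sketch under the same language assumption; class and function names here are illustrative, not prescribed by this prompt:

```python
from abc import ABC, abstractmethod

class ModelProvider(ABC):
    """Provider abstraction so Ollama is one backend among several."""
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

PROVIDERS: dict[str, type] = {}

def register(name: str):
    """Decorator that maps a config `provider:` value to a class."""
    def deco(cls):
        PROVIDERS[name] = cls
        return cls
    return deco

@register("ollama")
class OllamaProvider(ModelProvider):
    def __init__(self, base_url: str, model: str):
        self.base_url, self.model = base_url, model
    def complete(self, prompt: str) -> str:
        raise NotImplementedError("HTTP call elided in this sketch")

def provider_from_config(cfg: dict) -> ModelProvider:
    """Build a provider from the YAML profile shown above."""
    cls = PROVIDERS[cfg["provider"]]
    return cls(base_url=cfg["base_url"], model=cfg["model"])
```

An OpenAI-compatible endpoint then becomes a second `@register("openai_compat")` class with no changes at the call sites.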
## Review Pipeline
Pipeline should run in phases.
### Phase 0: Repo Intake
Collect:
- repo path
- git status
- current branch
- latest commit
- language breakdown
- file count
- largest files
- dependency manifests
- test manifests
Output:
```text
repo_intake.json
```
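The filesystem half of intake (file count, language mix, largest files) can be sketched as a single walk; git fields (branch, latest commit, status) would come from `git` subprocess calls and are omitted here. Python sketch, names hypothetical:

```python
import os
from collections import Counter

def repo_intake(root: str, top_n: int = 5) -> dict:
    """Collect filesystem facts for repo_intake.json.

    Language breakdown is approximated by file extension; a real
    implementation might use a proper language detector.
    """
    sizes: dict[str, int] = {}
    exts: Counter = Counter()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d != ".git"]  # skip git internals
        for name in filenames:
            path = os.path.join(dirpath, name)
            sizes[os.path.relpath(path, root)] = os.path.getsize(path)
            exts[os.path.splitext(name)[1] or "(none)"] += 1
    largest = sorted(sizes.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    return {"file_count": len(sizes),
            "language_breakdown": dict(exts),
            "largest_files": [p for p, _ in largest]}
```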
### Phase 1: Static Scan
Detect:
- hardcoded absolute paths
- raw SQL interpolation
- shell command execution
- unsafe environment handling
- broad CORS
- exposed mutation endpoints
- suspicious secret patterns
- unchecked file reads and writes
- missing error handling
- excessive file size
- missing tests near critical code
Output:
```text
static_findings.json
```
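The static scan reduces to running a table of patterns line-by-line and attaching the evidence each finding must carry. A Python sketch with two illustrative checks (check ids follow the commit's analyzer names; the structure, not the pattern set, is the point):

```python
import re

# Two illustrative patterns; the full MVP set is listed later in this prompt.
CHECKS = {
    "hardcoded_paths": re.compile(r"/(?:home|root)/\w+"),
    "shell_execution": re.compile(r"\bexec\(|\bspawn\("),
}

def scan_file(path: str, text: str) -> list[dict]:
    """Run every static check per line, recording the evidence
    snippet and 1-based line number for each hit."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for check_id, pattern in CHECKS.items():
            if pattern.search(line):
                findings.append({
                    "check_id": check_id,
                    "file": path,
                    "line": lineno,
                    "evidence": line.strip(),
                })
    return findings
```

Because each finding carries its own evidence snippet, the validation phase can later re-verify it against the file without re-running the scan.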
### Phase 2: LLM Review
Send bounded chunks to the local model.
The model must return strict JSON:
```json
{
  "findings": [
    {
      "title": "",
      "severity": "low|medium|high|critical",
      "file": "",
      "line_hint": "",
      "evidence": "",
      "reason": "",
      "suggested_fix": "",
      "confidence": 0.0
    }
  ]
}
```
If model output is invalid JSON, retry once with a repair prompt.
If the output is still invalid, save raw output and mark the model phase degraded.
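The retry-then-degrade rule can be sketched as a small wrapper. Python sketch; `repair` is a hypothetical callable that re-prompts the model with a JSON-repair instruction:

```python
import json

def parse_llm_json(raw: str, repair) -> dict:
    """Parse model output with exactly one repair retry.

    On a second failure the phase is marked degraded and the raw
    text is preserved, never discarded.
    """
    try:
        return {"ok": True, "data": json.loads(raw)}
    except json.JSONDecodeError:
        pass
    repaired = repair(raw)  # one retry with a repair prompt
    try:
        return {"ok": True, "data": json.loads(repaired)}
    except json.JSONDecodeError:
        # still invalid: keep the raw output and mark the phase degraded
        return {"ok": False, "degraded": True, "raw_output": raw}
```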
### Phase 3: Validation
Every LLM finding must be validated against actual files.
Reject findings that:
- point to missing files
- cite text that does not exist
- make unsupported claims
- recommend unrelated rewrites
- lack evidence
Output:
```text
validated_findings.json
```
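The first two rejection rules (missing files, nonexistent cited text) are mechanical checks. A Python sketch of that gate, assuming findings in the JSON shape above; the function name is hypothetical:

```python
import os

def validate_finding(finding: dict, repo_root: str) -> dict:
    """Phase 3 gate: a finding survives only if its file exists and
    its evidence snippet actually appears in that file."""
    path = os.path.join(repo_root, finding.get("file", ""))
    if not os.path.isfile(path):
        return {**finding, "status": "rejected", "reason": "file not found"}
    with open(path, encoding="utf-8", errors="replace") as fh:
        contents = fh.read()
    if finding.get("evidence") and finding["evidence"] in contents:
        return {**finding, "status": "confirmed"}
    return {**finding, "status": "rejected", "reason": "evidence not in file"}
```

The remaining rules (unsupported claims, unrelated rewrites) need semantic judgment and stay in the reviewer's court; this gate only guarantees the cited evidence is real.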
### Phase 4: Report Generation
Generate Markdown reports:
- executive summary
- risk register
- sprint backlog
- acceptance gates
- test gaps
- architecture drift
- suggested next commands
### Phase 5: Memory
Create local memory files:
```text
.memory/review-rules.md
.memory/known-risks.json
.memory/fixed-patterns.json
.memory/project-profile.json
```
Memory should be append-only by default.
Never silently overwrite prior memory. Version it.
## Validation Rules
Hard rules:
1. No hallucinated files.
2. No invented tests.
3. No fake command success.
4. No "appears to work" language without evidence.
5. Every finding must include:
- file path
- evidence snippet
- risk
- suggested next action
6. Reports must distinguish:
- confirmed issue
- suspected issue
- missing evidence
- blocked by unavailable dependency
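Rules 5 and 6 can be encoded directly so nothing incomplete reaches a report. Python sketch; the enum values mirror the four categories above, and the field names are assumptions based on rule 5:

```python
from enum import Enum

class FindingStatus(Enum):
    """The four categories every reported finding must be binned into."""
    CONFIRMED = "confirmed"                  # evidence verified against the file
    SUSPECTED = "suspected"                  # plausible but unverified
    MISSING_EVIDENCE = "missing_evidence"    # claim made, nothing to back it
    BLOCKED = "blocked"                      # dependency unavailable, check not run

REQUIRED_FIELDS = ("file", "evidence", "risk", "suggested_next_action")

def is_reportable(finding: dict) -> bool:
    """Hard rule 5: a finding missing any required field never
    reaches a report."""
    return all(finding.get(k) for k in REQUIRED_FIELDS)
```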
## First Implementation Target
Do not build everything at once.
Implement MVP:
```text
Phase 0 repo intake
Phase 1 static scan
Phase 4 report generation
Basic Ollama model doctor
```
Then add LLM review after the static evidence pipeline is stable.
## MVP Acceptance Criteria
The MVP passes when:
```bash
review-harness repo .
review-harness scrum .
review-harness model doctor
```
produce usable output without crashing.
Required files:
```text
reports/latest/repo-intake.json
reports/latest/static-findings.json
reports/latest/scrum-test.md
reports/latest/risk-register.md
reports/latest/sprint-backlog.md
reports/latest/receipts.json
```
## Suggested Static Checks For MVP
Implement these first:
- hardcoded `/home/`
- hardcoded `/root/`
- hardcoded local IP addresses
- `exec(`
- `spawn(`
- `Command::new`
- raw SQL patterns:
  - `format!("SELECT`
  - string interpolation near SQL keywords
  - template literals containing `SELECT`
  - template literals containing `INSERT`
  - template literals containing `UPDATE`
  - template literals containing `DELETE`
- `Access-Control-Allow-Origin: *`
- committed `.env` files
- private key patterns
- files over 800 lines
- TODO, FIXME, and HACK count
- missing test directory
- package or build files without corresponding test command
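Most of the list above is pure pattern matching, so it can live in one table keyed by check id. A Python sketch; the regexes are rough first cuts (the local-IP pattern, for instance, covers only common private prefixes), and the checks needing filesystem logic (missing test directory, build-file/test-command pairing) are omitted:

```python
import re

# Check ids follow the analyzer names from the commit message.
MVP_PATTERNS = {
    "hardcoded_paths": re.compile(r"/(?:home|root)/"),
    "hardcoded_local_ip": re.compile(
        r"\b(?:192\.168|10\.0|127\.0)\.\d{1,3}\.\d{1,3}\b"),
    "shell_execution": re.compile(r"\bexec\(|\bspawn\(|Command::new"),
    "raw_sql_interpolation": re.compile(
        r'format!\("(?:SELECT|INSERT|UPDATE|DELETE)'
        r'|`[^`]*\b(?:SELECT|INSERT|UPDATE|DELETE)\b'),
    "broad_cors": re.compile(r"Access-Control-Allow-Origin:\s*\*"),
    "secret_patterns": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    "todo_comments": re.compile(r"\b(?:TODO|FIXME|HACK)\b"),
}
```

Keeping the patterns as data rather than code makes each check independently testable and lets a `.review-profile.yaml` enable or disable them by id.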
## Output Style
Reports should be blunt and operational.
No motivational filler.
Use sections:
```text
Verdict
Evidence
Confirmed Risks
Suspected Risks
Blocked Checks
Sprint Backlog
Acceptance Gates
Next Commands
```
## Final Deliverable
After implementation, produce:
```text
docs/REVIEW_PIPELINE.md
docs/LOCAL_MODEL_SETUP.md
docs/REPORT_SCHEMA.md
reports/latest/*
```
Then run the harness against this repository itself and include the self-review report.
## Do Not
- Do not require GitHub.
- Do not require cloud LLMs.
- Do not pretend local model output is authoritative.
- Do not rewrite the target repository.
- Do not make destructive changes.
- Do not auto-commit.
- Do not hide degraded model failures.
## Strategic Goal
This should become the local review node for a larger autonomous development system.
Eventually it should plug into:
- OpenClaw
- MCP tools
- local lakehouse memory
- playbook sealing
- CI verification
- observer review loop
But first: make the local review harness reliable, inspectable, and evidence-driven.