local-review-harness/docs/LOCAL_MODEL_SETUP.md

# Local Model Setup
## Purpose
The review harness should use local models first.
The first supported provider is Ollama.
The design must allow OpenAI-compatible local endpoints to be added later.
## Default Ollama Profile
```yaml
provider: ollama
base_url: http://localhost:11434
model: qwen2.5-coder
fallback_model: llama3.1
timeout_seconds: 120
temperature: 0.1
```
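A profile like this can be loaded into a typed structure before any provider is constructed. The sketch below is illustrative only: it assumes a Python implementation with PyYAML, and the `config/model.yaml` path and `ModelProfile` name are placeholders, not part of this spec.
```python
# Minimal sketch, assuming a Python implementation and PyYAML.
# The config path and the ModelProfile name are placeholders.
from dataclasses import dataclass, fields
import yaml

@dataclass
class ModelProfile:
    provider: str = "ollama"
    base_url: str = "http://localhost:11434"
    model: str = "qwen2.5-coder"
    fallback_model: str = "llama3.1"
    timeout_seconds: int = 120
    temperature: float = 0.1

def load_profile(path: str = "config/model.yaml") -> ModelProfile:
    """Read the YAML profile; missing keys fall back to the defaults above."""
    try:
        with open(path) as fh:
            data = yaml.safe_load(fh) or {}
    except FileNotFoundError:
        data = {}
    known = {f.name for f in fields(ModelProfile)}
    return ModelProfile(**{k: v for k, v in data.items() if k in known})
```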
## Model Doctor Command
The harness must provide:
```bash
review-harness model doctor
```
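One possible way to wire the nested subcommand, assuming an argparse-based Python CLI (the real dispatch layer may differ):
```python
# Illustrative CLI wiring only; the actual harness may use another framework.
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="review-harness")
    sub = parser.add_subparsers(dest="command", required=True)
    model = sub.add_parser("model", help="model management commands")
    model_sub = model.add_subparsers(dest="model_command", required=True)
    model_sub.add_parser("doctor", help="probe the configured local model setup")
    return parser

def main(argv=None) -> int:
    args = build_parser().parse_args(argv)
    if args.command == "model" and args.model_command == "doctor":
        # Hand off to the doctor checks described below.
        print("running model doctor")
    return 0
```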
## Doctor Checks
The doctor command should test:
- Ollama server availability
- configured model availability
- fallback model availability
- basic prompt response
- JSON response reliability
- timeout behavior
- degraded-mode behavior
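A rough sketch of the first three probes against Ollama's HTTP API (`GET /api/tags` for installed models, `POST /api/generate` for a prompt round-trip). It assumes the `requests` library; the remaining checks and error reporting are omitted.
```python
# Sketch of a few doctor probes, assuming the requests library and Ollama's
# /api/tags and /api/generate endpoints. Not the harness's actual code.
import requests

def check_server(base_url: str, timeout: int) -> bool:
    """Server availability: an HTTP 2xx from /api/tags counts as 'up'."""
    try:
        return requests.get(f"{base_url}/api/tags", timeout=timeout).ok
    except requests.RequestException:
        return False

def check_model_available(base_url: str, model: str, timeout: int) -> bool:
    """Model availability: the name appears in the installed-model list."""
    try:
        tags = requests.get(f"{base_url}/api/tags", timeout=timeout).json()
    except (requests.RequestException, ValueError):
        return False
    names = [m.get("name", "") for m in tags.get("models", [])]
    # Ollama reports names with a tag suffix, e.g. "qwen2.5-coder:latest".
    return any(name == model or name.split(":")[0] == model for name in names)

def check_basic_prompt(base_url: str, model: str, timeout: int) -> bool:
    """Basic prompt response: a short non-streaming generation succeeds."""
    try:
        resp = requests.post(
            f"{base_url}/api/generate",
            json={"model": model, "prompt": "Reply with the word ok.", "stream": False},
            timeout=timeout,
        )
        return resp.ok and bool(resp.json().get("response", "").strip())
    except (requests.RequestException, ValueError):
        return False
```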
## Required Doctor Output
The doctor command must write its report to:
```text
reports/latest/model-doctor.json
```
## Required JSON Fields
The report must include at least these fields:
```json
{
  "provider": "ollama",
  "base_url": "http://localhost:11434",
  "primary_model": "",
  "fallback_model": "",
  "server_available": false,
  "primary_model_available": false,
  "fallback_model_available": false,
  "basic_prompt_ok": false,
  "json_mode_ok": false,
  "timeout_seconds": 120,
  "status": "ok|degraded|failed",
  "errors": []
}
```
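One plausible way to roll the individual checks up into the single `status` value and write the report. The degradation rules shown here are an assumption; this document does not define them.
```python
# Illustrative status roll-up and report writer; the exact rules for
# "degraded" are an assumption, not defined by this document.
import json
import os

def summarize(results: dict) -> str:
    """Map the check booleans to ok / degraded / failed."""
    if not results["server_available"]:
        return "failed"
    core_ok = (results["primary_model_available"]
               and results["basic_prompt_ok"]
               and results["json_mode_ok"])
    return "ok" if core_ok else "degraded"

def write_report(results: dict, out_dir: str = "reports/latest") -> str:
    """Write model-doctor.json and return its path."""
    os.makedirs(out_dir, exist_ok=True)
    results["status"] = summarize(results)
    path = os.path.join(out_dir, "model-doctor.json")
    with open(path, "w") as fh:
        json.dump(results, fh, indent=2)
    return path
```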
## Provider Interface
Do not hardcode Ollama into the core logic.
Use a provider interface with these operations:
```text
list_models()
complete(prompt, options)
complete_json(prompt, schema, options)
health_check()
```
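Sketched below as a Python Protocol (an assumption; the document does not fix an implementation language). An Ollama provider and a later OpenAI-compatible provider would both satisfy the same interface.
```python
# Minimal sketch of the provider interface; everything beyond the four
# listed operations is illustrative.
from typing import Any, Protocol

class ModelProvider(Protocol):
    def list_models(self) -> list[str]:
        """Return the model names the backend reports as installed."""
        ...

    def complete(self, prompt: str, options: dict[str, Any]) -> str:
        """Return a plain-text completion."""
        ...

    def complete_json(self, prompt: str, schema: dict[str, Any],
                      options: dict[str, Any]) -> dict[str, Any]:
        """Return a completion parsed and checked against a JSON schema."""
        ...

    def health_check(self) -> bool:
        """Return True when the backend is reachable and responsive."""
        ...
```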
## Local Model Rules
- temperature should default to a low value for review tasks
- prompts should request strict JSON where possible
- raw model output must be saved for failed parse attempts
- invalid model output must never be silently accepted
- fallback model usage must be recorded
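A sketch of how a thin wrapper might enforce the last three rules: save raw output on a failed parse, refuse to accept invalid output silently, and record fallback usage. File locations and the receipt structure are illustrative, not part of this spec.
```python
# Illustrative enforcement of the JSON rules; paths and receipt fields are
# assumptions, not part of the documented contract.
import json
import os
import time

class ModelJsonError(RuntimeError):
    """Raised instead of silently accepting unparseable model output."""

def complete_json_strict(provider, prompt, model, raw_dir="reports/latest/raw"):
    # Rule: prompts should request strict JSON where possible.
    raw = provider.complete(prompt + "\nRespond with strict JSON only.",
                            {"model": model, "temperature": 0.1})
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        # Rule: raw model output must be saved for failed parse attempts.
        os.makedirs(raw_dir, exist_ok=True)
        raw_path = os.path.join(raw_dir, f"failed-parse-{int(time.time())}.txt")
        with open(raw_path, "w") as fh:
            fh.write(raw)
        # Rule: invalid model output must never be silently accepted.
        raise ModelJsonError(f"unparseable output saved to {raw_path}") from exc

def complete_json_with_fallback(provider, prompt, profile, receipts: list):
    """Try the primary model; on failure, use the fallback and record it."""
    try:
        return complete_json_strict(provider, prompt, profile.model)
    except ModelJsonError:
        # Rule: fallback model usage must be recorded.
        receipts.append({"event": "fallback_used", "model": profile.fallback_model})
        return complete_json_strict(provider, prompt, profile.fallback_model)
```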