Claude (review-harness setup) e346b54e0f Phase C — local-Ollama LLM review wired end-to-end
Implements PROMPT.md / docs/REVIEW_PIPELINE.md Phase 2:
- internal/llm/ollama.go — real Ollama provider:
  - HealthCheck probes /api/tags + a 1-token completion + a JSON-mode
    probe ({"ok": true} round-trip), populating the model-doctor.json
    schema documented in docs/LOCAL_MODEL_SETUP.md
  - Complete + CompleteJSON via /api/chat with stream=false
  - think=false set for ALL completions (qwen3.5:latest is reasoning-
    capable but the inner-loop hot path wants direct answers, not
    reasoning traces consuming the token budget — same finding as
    the Lakehouse-Go chatd 2026-04-30 wave)
- internal/llm/review.go — Reviewer wrapper:
  - 2-attempt flow: prompt → parse → repair-prompt → parse
  - Strict JSON shape enforced; markdown fences stripped before parse
  - Severity normalized to enum; out-of-range confidence clamped
  - Per-file chunking (file-level for v0; function-level planned for Phase D+)
  - Bounded by review-profile max_file_bytes + max_llm_chunk_chars
- pipeline.go — Phase 2 wired between static scan + report gen:
  - --enable-llm flag opts in (off by default — static-only is
    cheaper and faster)
  - Raw output ALWAYS saved to llm-findings.raw.json (forensics)
  - Normalized findings → llm-findings.normalized.json
  - LLM findings merged into the report findings list (sourced
    "llm" so consumers can filter)
  - Receipts honestly mark phase status: "ok" | "degraded" | "skipped"
- cli model doctor — real probes replace the Phase A stub.
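
The /api/chat settings listed above (stream=false, think=false, JSON mode for CompleteJSON) can be sketched as a request body. This is an illustrative sketch, not the code in internal/llm/ollama.go: the names chatMessage, chatRequest and buildChatRequest are hypothetical, and it assumes an Ollama version whose /api/chat accepts the top-level `think` and `format` fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// chatMessage and chatRequest mirror only the /api/chat fields this commit
// mentions. The Go type names are hypothetical, not taken from the repo.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
	Stream   bool          `json:"stream"`           // false: one complete response
	Think    bool          `json:"think"`            // false: no reasoning trace
	Format   string        `json:"format,omitempty"` // "json" for JSON mode
}

// buildChatRequest returns the JSON body for a single non-streaming call.
func buildChatRequest(model, prompt string, jsonMode bool) ([]byte, error) {
	req := chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
		Stream:   false,
		Think:    false,
	}
	if jsonMode {
		req.Format = "json"
	}
	return json.Marshal(req)
}

func main() {
	body, err := buildChatRequest("qwen3.5:latest", "Reply with {\"ok\": true}", true)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Stream and Think carry no omitempty tag, so both are always sent as explicit false values and the request does not rely on server defaults.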

Verified:
- model doctor: status="ok" with qwen3.5:latest + qwen3:latest both
  loaded, basic_prompt_ok=true, json_mode_ok=true
- insecure-repo with --enable-llm: 9 LLM findings; qwen3.5 correctly
  flagged SQLi, RCE, and hardcoded credentials as critical with verbatim
  evidence; 27s wall-clock for 3 chunks
- clean-repo with --enable-llm: 0 LLM findings, 4 parsed chunks, 2.8s
- self-review with --enable-llm: 77 LLM findings + 83 static; 3 of
  ~30 chunks needed retry (PROMPT.md, REPORT_SCHEMA.md,
  SCRUM_TEST_TEMPLATE.md — all eventually parsed); 5min wall

go vet + go test -short clean. Fixture stray.go now `package fixture`
so go-tooling doesn't choke on the orphan.

Phase D (validator cross-check) + Phase E (memory + diff/rules
subcommands) remain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 01:13:39 -05:00


[
{
"chunk_id": "src/handler.go",
"findings": [
{
"id": "",
"title": "Hardcoded file path for secrets",
"severity": "high",
"status": "suspected",
"file": "src/handler.go",
"line_hint": "10",
"evidence": "const HARDCODED_PATH = \"/home/profit/secrets/key.pem\"",
"reason": "Hardcoding a file path for a private key exposes secrets and prevents proper secret management.",
"suggested_fix": "Move the path to an environment variable or a configuration file outside the source code.",
"source": "llm",
"confidence": 1,
"check_id": "llm.review"
},
{
"id": "",
"title": "Hardcoded server IP address",
"severity": "medium",
"status": "suspected",
"file": "src/handler.go",
"line_hint": "11",
"evidence": "const SERVER_IP = \"192.168.1.176\"",
"reason": "Hardcoding an IP address reduces portability and may leak internal network topology.",
"suggested_fix": "Read the server IP from an environment variable.",
"source": "llm",
"confidence": 1,
"check_id": "llm.review"
},
{
"id": "",
"title": "SQL Injection vulnerability",
"severity": "critical",
"status": "suspected",
"file": "src/handler.go",
"line_hint": "15-16",
"evidence": "q := fmt.Sprintf(\"SELECT * FROM users WHERE name = '%s'\", name)\ndb.Query(q)",
"reason": "Using string formatting to construct SQL queries directly exposes the application to SQL injection attacks.",
"suggested_fix": "Use parameterized queries with placeholders instead of string formatting.",
"source": "llm",
"confidence": 1,
"check_id": "llm.review"
},
{
"id": "",
"title": "Unsafe shell command execution",
"severity": "critical",
"status": "suspected",
"file": "src/handler.go",
"line_hint": "19-20",
"evidence": "exec.Command(\"bash\", \"-c\", cmd).Run()",
"reason": "Executing arbitrary shell commands without validation allows for remote code execution.",
"suggested_fix": "Validate and sanitize the input command strictly, or avoid using shell execution entirely.",
"source": "llm",
"confidence": 1,
"check_id": "llm.review"
},
{
"id": "",
"title": "Hardcoded API key",
"severity": "critical",
"status": "suspected",
"file": "src/handler.go",
"line_hint": "23",
"evidence": "const API_KEY = \"sk-1234567890abcdefABCDEFGHIJKLMNOPQRSTUV\"",
"reason": "Hardcoding an API key in source code exposes sensitive credentials to anyone with access to the repository.",
"suggested_fix": "Store the API key in a secure environment variable or secrets manager.",
"source": "llm",
"confidence": 1,
"check_id": "llm.review"
}
],
"raw_content": "{\n \"findings\": [\n {\n \"title\": \"Hardcoded file path for secrets\",\n \"severity\": \"high\",\n \"file\": \"src/handler.go\",\n \"line_hint\": \"10\",\n \"evidence\": \"const HARDCODED_PATH = \\\"/home/profit/secrets/key.pem\\\"\",\n \"reason\": \"Hardcoding a file path for a private key exposes secrets and prevents proper secret management.\",\n \"suggested_fix\": \"Move the path to an environment variable or a configuration file outside the source code.\",\n \"confidence\": 1.0\n },\n {\n \"title\": \"Hardcoded server IP address\",\n \"severity\": \"medium\",\n \"file\": \"src/handler.go\",\n \"line_hint\": \"11\",\n \"evidence\": \"const SERVER_IP = \\\"192.168.1.176\\\"\",\n \"reason\": \"Hardcoding an IP address reduces portability and may leak internal network topology.\",\n \"suggested_fix\": \"Read the server IP from an environment variable.\",\n \"confidence\": 1.0\n },\n {\n \"title\": \"SQL Injection vulnerability\",\n \"severity\": \"critical\",\n \"file\": \"src/handler.go\",\n \"line_hint\": \"15-16\",\n \"evidence\": \"q := fmt.Sprintf(\\\"SELECT * FROM users WHERE name = '%s'\\\", name)\\ndb.Query(q)\",\n \"reason\": \"Using string formatting to construct SQL queries directly exposes the application to SQL injection attacks.\",\n \"suggested_fix\": \"Use parameterized queries with placeholders instead of string formatting.\",\n \"confidence\": 1.0\n },\n {\n \"title\": \"Unsafe shell command execution\",\n \"severity\": \"critical\",\n \"file\": \"src/handler.go\",\n \"line_hint\": \"19-20\",\n \"evidence\": \"exec.Command(\\\"bash\\\", \\\"-c\\\", cmd).Run()\",\n \"reason\": \"Executing arbitrary shell commands without validation allows for remote code execution.\",\n \"suggested_fix\": \"Validate and sanitize the input command strictly, or avoid using shell execution entirely.\",\n \"confidence\": 1.0\n },\n {\n \"title\": \"Hardcoded API key\",\n \"severity\": \"critical\",\n \"file\": \"src/handler.go\",\n \"line_hint\": 
\"23\",\n \"evidence\": \"const API_KEY = \\\"sk-1234567890abcdefABCDEFGHIJKLMNOPQRSTUV\\\"\",\n \"reason\": \"Hardcoding an API key in source code exposes sensitive credentials to anyone with access to the repository.\",\n \"suggested_fix\": \"Store the API key in a secure environment variable or secrets manager.\",\n \"confidence\": 1.0\n }\n ]\n}",
"parsed": true,
"retried": false
},
{
"chunk_id": "src/huge.go",
"findings": [],
"raw_content": "```json\n{\n \"error\": \"No valid content found. The input appears to be a list of generated line markers without any actual text or data to process.\",\n \"status\": \"empty_input\"\n}\n```",
"parsed": true,
"retried": false
},
{
"chunk_id": "src/server.js",
"findings": [
{
"id": "",
"title": "CORS misconfiguration allows cross-origin attacks",
"severity": "high",
"status": "suspected",
"file": "src/server.js",
"line_hint": "2",
"evidence": "res.setHeader(\"Access-Control-Allow-Origin\", \"*\");",
"reason": "Allowing all origins (*) exposes the API to cross-site request forgery and data theft from any website.",
"suggested_fix": "Restrict Access-Control-Allow-Origin to specific trusted domains or use credentials with a specific origin.",
"source": "llm",
"confidence": 1,
"check_id": "llm.review"
},
{
"id": "",
"title": "Hardcoded AWS access key in source code",
"severity": "critical",
"status": "suspected",
"file": "src/server.js",
"line_hint": "5",
"evidence": "const AWS_KEY = \"AKIAIOSFODNN7EXAMPLE\";",
"reason": "Hardcoded credentials in source code pose a severe security risk as they can be easily leaked and misused.",
"suggested_fix": "Use environment variables or a secure secrets manager to store AWS credentials.",
"source": "llm",
"confidence": 1,
"check_id": "llm.review"
},
{
"id": "",
"title": "Missing authentication on user creation endpoint",
"severity": "high",
"status": "suspected",
"file": "src/server.js",
"line_hint": "7",
"evidence": "app.post(\"/api/users\", function(req, res) { /* no auth */ });",
"reason": "The /api/users endpoint lacks authentication, allowing anyone to create or modify user accounts.",
"suggested_fix": "Implement authentication middleware to verify user identity before allowing POST requests.",
"source": "llm",
"confidence": 1,
"check_id": "llm.review"
},
{
"id": "",
"title": "Missing authentication on admin deletion endpoint",
"severity": "critical",
"status": "suspected",
"file": "src/server.js",
"line_hint": "8",
"evidence": "app.delete(\"/api/admin\", function(req, res) { /* no auth */ });",
"reason": "The /api/admin endpoint lacks authentication, allowing unauthenticated users to delete administrative resources.",
"suggested_fix": "Implement strict authentication and authorization checks for all admin endpoints.",
"source": "llm",
"confidence": 1,
"check_id": "llm.review"
}
],
"raw_content": "{\n \"findings\": [\n {\n \"title\": \"CORS misconfiguration allows cross-origin attacks\",\n \"severity\": \"high\",\n \"file\": \"src/server.js\",\n \"line_hint\": \"2\",\n \"evidence\": \"res.setHeader(\\\"Access-Control-Allow-Origin\\\", \\\"*\\\");\",\n \"reason\": \"Allowing all origins (*) exposes the API to cross-site request forgery and data theft from any website.\",\n \"suggested_fix\": \"Restrict Access-Control-Allow-Origin to specific trusted domains or use credentials with a specific origin.\",\n \"confidence\": 1.0\n },\n {\n \"title\": \"Hardcoded AWS access key in source code\",\n \"severity\": \"critical\",\n \"file\": \"src/server.js\",\n \"line_hint\": \"5\",\n \"evidence\": \"const AWS_KEY = \\\"AKIAIOSFODNN7EXAMPLE\\\";\",\n \"reason\": \"Hardcoded credentials in source code pose a severe security risk as they can be easily leaked and misused.\",\n \"suggested_fix\": \"Use environment variables or a secure secrets manager to store AWS credentials.\",\n \"confidence\": 1.0\n },\n {\n \"title\": \"Missing authentication on user creation endpoint\",\n \"severity\": \"high\",\n \"file\": \"src/server.js\",\n \"line_hint\": \"7\",\n \"evidence\": \"app.post(\\\"/api/users\\\", function(req, res) { /* no auth */ });\",\n \"reason\": \"The /api/users endpoint lacks authentication, allowing anyone to create or modify user accounts.\",\n \"suggested_fix\": \"Implement authentication middleware to verify user identity before allowing POST requests.\",\n \"confidence\": 1.0\n },\n {\n \"title\": \"Missing authentication on admin deletion endpoint\",\n \"severity\": \"critical\",\n \"file\": \"src/server.js\",\n \"line_hint\": \"8\",\n \"evidence\": \"app.delete(\\\"/api/admin\\\", function(req, res) { /* no auth */ });\",\n \"reason\": \"The /api/admin endpoint lacks authentication, allowing unauthenticated users to delete administrative resources.\",\n \"suggested_fix\": \"Implement strict authentication and authorization checks for all 
admin endpoints.\",\n \"confidence\": 1.0\n }\n ]\n}",
"parsed": true,
"retried": false
}
]
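
Each record above pairs a chunk's `raw_content` with its normalized `findings`. The step between the two (strip markdown fences, parse, normalize severity, clamp confidence, retry once with a repair prompt) could look roughly like the sketch below. It is an assumption-laden illustration, not the code in internal/llm/review.go: every identifier is hypothetical, the severity enum beyond the values seen above ("low", "info") is assumed, the fallback to "info" for unknown severities is assumed, and the repair-prompt wording is invented.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// finding holds a subset of the fields the model returns in raw_content.
type finding struct {
	Title      string  `json:"title"`
	Severity   string  `json:"severity"`
	File       string  `json:"file"`
	Confidence float64 `json:"confidence"`
}

type reply struct {
	Findings []finding `json:"findings"`
}

// fence is three backticks, built at runtime so the listing stays fence-safe.
var fence = strings.Repeat("`", 3)

// stripFences drops a leading fence line (with optional language tag)
// and a trailing fence, if the reply is wrapped in one.
func stripFences(s string) string {
	s = strings.TrimSpace(s)
	if !strings.HasPrefix(s, fence) {
		return s
	}
	if i := strings.Index(s, "\n"); i >= 0 {
		s = s[i+1:]
	} else {
		s = strings.TrimPrefix(s, fence)
	}
	s = strings.TrimSuffix(strings.TrimSpace(s), fence)
	return strings.TrimSpace(s)
}

// normalizeSeverity lowercases known values; unknown ones map to "info" (assumption).
func normalizeSeverity(s string) string {
	switch v := strings.ToLower(strings.TrimSpace(s)); v {
	case "critical", "high", "medium", "low", "info":
		return v
	default:
		return "info"
	}
}

// clampConfidence forces the value into [0, 1].
func clampConfidence(c float64) float64 {
	if c < 0 {
		return 0
	}
	if c > 1 {
		return 1
	}
	return c
}

// parseReply turns one raw model reply into normalized findings.
func parseReply(raw string) ([]finding, error) {
	var r reply
	if err := json.Unmarshal([]byte(stripFences(raw)), &r); err != nil {
		return nil, err
	}
	for i := range r.Findings {
		r.Findings[i].Severity = normalizeSeverity(r.Findings[i].Severity)
		r.Findings[i].Confidence = clampConfidence(r.Findings[i].Confidence)
	}
	return r.Findings, nil
}

// reviewChunk asks once and, if the reply does not parse, asks once more
// with a repair prompt. The bool reports whether the retry was used.
func reviewChunk(ask func(prompt string) (string, error), prompt string) ([]finding, bool, error) {
	raw, err := ask(prompt)
	if err != nil {
		return nil, false, err
	}
	if fs, perr := parseReply(raw); perr == nil {
		return fs, false, nil
	}
	raw, err = ask("Return only a valid JSON object with a \"findings\" array. Previous reply:\n" + raw)
	if err != nil {
		return nil, true, err
	}
	fs, err := parseReply(raw)
	return fs, true, err
}

func main() {
	raw := fence + "json\n{\"findings\": [{\"title\": \"SQL Injection vulnerability\", \"severity\": \"Critical\", \"file\": \"src/handler.go\", \"confidence\": 1.5}]}\n" + fence
	ask := func(string) (string, error) { return raw, nil }
	findings, retried, err := reviewChunk(ask, "review src/handler.go")
	fmt.Println(findings, retried, err)
}
```

Under this sketch, a fenced reply that is valid JSON but has no `findings` key, like the one recorded for src/huge.go, parses without error and yields zero findings. That matches the `"parsed": true` and empty `findings` shown in that record.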