The orchestrator J described: pulls git repo source + PRD +
suggested-changes doc, chunks them, hands each code piece through
the proven escalation ladder with learning context, collects
per-file suggestions in a consolidated handoff report.
Composes ONLY already-shipped primitives — no new core code:
- chunker with 800-char / 120-overlap windows
- sidecar /embed for real nomic-embed-text embeddings
- in-memory cosine retrieval for top-5 PRD + top-5 proposal
chunks per target file
- escalation ladder (qwen3.5 → qwen3 → gpt-oss:20b → gpt-oss:120b
→ devstral-2:123b → mistral-large-3:675b)
- per-attempt learning-context injection (prior failures as
"do not repeat" block)
- acceptance rubric (length ≥ 200 chars + structured form)
Live-run (tests/real-world/runs/scrum_moatqkee/):
targets: 3 files
- crates/vectord/src/playbook_memory.rs (920 lines)
- crates/vectord/src/doc_drift.rs (163 lines)
- auditor/audit.ts (170 lines)
resolved: 3/3 on attempt 1 by qwen3.5:latest local 7B
total duration: 111.7s
output: scrum_report.md + per-file JSON
Sample from scrum_report.md (playbook_memory.rs review):
- Alignment score: 9/10 vs PRD Phase 19
- 4 concrete change suggestions naming specific lines + PLAN/PRD
chunk offsets
- 3 gap analyses with PRD-reference citations
Honest findings from this run:
1. Local 7B handled review-style tasks first-try. The escalation
ladder infrastructure is live but didn't fire — review is an
easier task shape than strict code-generation (see hard_task
test which needed devstral-2 specialist).
2. 6KB file-truncation caused one false positive: model claimed
playbook_memory.rs lacks a `doc_refs` field, but that field
exists past the 6KB cutoff. Trade-off between context-size
and review-depth needs tuning per file.
3. Chunk-offset citations are real: model output includes
`[PRD @27880]` and `[PLAN @16320]` which map to the actual
byte offsets of retrieved context chunks. Auditor pattern could
adopt this for traceable claims.
This is the scrum-master-handoff shape J asked for:
repo + PRD + proposal → chunk → retrieve → escalate → consolidate
→ human-reviewable markdown report
Not shipping: per-PR diff analysis, open-PR integration, Gitea
posting of suggestions. Those compose the same primitives
differently — this proves the core pattern.
Env override: LH_SCRUM_FILES=path1,path2,... to target a different
file set. Default 3 files keeps runtime ~2min.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>