root 8ad221b41f Add self-improving pipeline: auto-scoring, analytics, reactive refine, routing intelligence
Phase 1 — Run Quality Scoring:
- Auto-score every run in background via qwen2.5 judge (1-10)
- Thumbs up/down vote buttons on output cards
- POST /api/runs/<id>/score for user feedback
- run_saved SSE event enables vote buttons after run completes
- User votes override auto-scores (race-condition safe)
- DB: quality_score, score_method, score_metadata on team_runs
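
The "user votes override auto-scores" rule above can be sketched as a guarded UPDATE. This is only an illustration, not the code in this commit: the three column names come from the list above, while the helper names, the 'auto'/'user' method values, and the thumbs-to-score mapping (10 and 1) are assumptions.

```python
import json
import sqlite3


def init_db(conn):
    # Minimal stand-in for team_runs; the real table has more columns.
    conn.execute(
        "CREATE TABLE team_runs ("
        "id INTEGER PRIMARY KEY, quality_score REAL, "
        "score_method TEXT, score_metadata TEXT)"
    )


def save_auto_score(conn, run_id, score, metadata=None):
    """Store a judge score unless a user vote is already recorded.

    The guard lives in the WHERE clause, so a background scorer that
    finishes after the user has voted cannot overwrite the vote.
    """
    cur = conn.execute(
        "UPDATE team_runs SET quality_score = ?, score_method = 'auto', "
        "score_metadata = ? "
        "WHERE id = ? AND (score_method IS NULL OR score_method != 'user')",
        (score, json.dumps(metadata or {}), run_id),
    )
    conn.commit()
    return cur.rowcount == 1


def save_user_vote(conn, run_id, thumbs_up):
    """Store a user vote unconditionally (assumed mapping: up=10, down=1)."""
    score = 10 if thumbs_up else 1
    cur = conn.execute(
        "UPDATE team_runs SET quality_score = ?, score_method = 'user', "
        "score_metadata = ? WHERE id = ?",
        (score, json.dumps({"vote": "up" if thumbs_up else "down"}), run_id),
    )
    conn.commit()
    return cur.rowcount == 1
```

Keeping the check inside the UPDATE statement avoids a separate read-then-write step, which is where a race between the vote and the background judge would otherwise appear.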

Phase 1 — Analytics Dashboard:
- GET /api/admin/analytics: score-by-mode, score-by-model, heatmap, trend
- New Analytics tab on Admin page with bar charts, heatmap table, trend sparkline
- Scoring coverage tracker (scored vs total runs)
- Model × Mode heatmap with color-coded cells
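
As a rough illustration of the heatmap and coverage numbers listed above, the aggregation could look like the following. The dict keys "model", "mode", and "quality_score" and the function name are assumptions; the actual endpoint may compute this in SQL instead.

```python
from collections import defaultdict


def summarize_runs(runs):
    """Average score per (model, mode) cell plus scored-vs-total coverage."""
    sums = defaultdict(lambda: [0.0, 0])  # (model, mode) -> [total, count]
    scored = 0
    for run in runs:
        score = run.get("quality_score")
        if score is None:
            continue  # unscored runs count toward total only
        scored += 1
        cell = sums[(run["model"], run["mode"])]
        cell[0] += score
        cell[1] += 1
    heatmap = {key: total / count for key, (total, count) in sums.items()}
    return {
        "heatmap": heatmap,
        "coverage": {"scored": scored, "total": len(runs)},
    }
```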

Phase 2 — Reactive Pipeline:
- _assess_stage(): orchestrator evaluates each stage's output mid-run
- _reactive_decide(): can insert/skip stages based on assessment
- Dynamic stage loop replaces fixed iteration in run_refine()
- Budget tracking prevents infinite loops (max_stages hard cap)
- Reactive decisions render as dashed notification bars between cards
- Pipeline adjusts in real time: "Inserting VALIDATE — high severity gaps found"
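
A minimal sketch of a dynamic stage loop with a hard cap, in the spirit of the bullets above. It is not the body of run_refine(): the callables run_stage, assess, and decide are hypothetical stand-ins for the real stage runner, _assess_stage(), and _reactive_decide(), and the (kind, target, reason) action tuple is an assumed shape.

```python
from collections import deque


def run_reactive(stages, run_stage, assess, decide, max_stages=8):
    """Run stages from a queue, letting decide() insert or skip stages.

    max_stages is a hard cap on executed stages, so a decide() that keeps
    inserting work cannot loop forever.
    """
    queue = deque(stages)
    executed = []
    notes = []  # human-readable decisions, e.g. for notification bars
    while queue and len(executed) < max_stages:
        stage = queue.popleft()
        output = run_stage(stage)
        executed.append(stage)
        action = decide(assess(stage, output), list(queue))
        if action is None:
            continue
        kind, target, reason = action
        if kind == "insert":
            queue.appendleft(target)  # run the inserted stage next
            notes.append(f"Inserting {target}: {reason}")
        elif kind == "skip" and target in queue:
            queue.remove(target)  # drops the first pending occurrence
            notes.append(f"Skipping {target}: {reason}")
    return executed, notes
```

Counting executed stages rather than loop iterations keeps the budget meaningful even when most iterations insert new work.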

Phase 3 — Cross-Run Learning:
- _build_routing_table(): queries historical scores for model×mode performance
- Best stage sequences per content_type from pipeline_runs
- Routing table cached with 30-min TTL
- Auto-Refine strategist prompt augmented with historical data
- GET /api/suggest-models?mode=X returns top 3 models for that mode
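
The 30-minute cache and the top-3 lookup above might be sketched like this. The class and function names, the injectable clock, and the routing-table shape (mode -> {model: average score}) are assumptions for illustration, not the layout _build_routing_table() actually returns.

```python
import time


class TTLCache:
    """Cache one built value and rebuild it once the TTL has elapsed."""

    def __init__(self, builder, ttl_seconds=30 * 60, clock=time.monotonic):
        self._builder = builder
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests need not sleep
        self._value = None
        self._built_at = None  # None means "never built"

    def get(self):
        now = self._clock()
        if self._built_at is None or now - self._built_at >= self._ttl:
            self._value = self._builder()
            self._built_at = now
        return self._value


def top_models(routing_table, mode, limit=3):
    """Return up to `limit` model names for a mode, best average first."""
    scores = routing_table.get(mode, {})
    return sorted(scores, key=scores.get, reverse=True)[:limit]
```

An unknown mode yields an empty list here; whether the real endpoint falls back to a default model in that case is not stated in this commit.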

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 06:18:32 -05:00
Description
LLM Team UI - Full-stack local AI orchestration platform