Templates section below the experiment list:

- **BASIC — Better Summaries** (3 eval cases): Optimizes summarization quality across biology, history, and technical content. Demonstrates the simplest Lab workflow.
- **INTERMEDIATE — Code Explainer** (4 eval cases): Finds the best prompt+model combination for explaining code to non-programmers. Tests loops, recursion, error handling, and comprehensions. Demonstrates how the ratchet evolves system prompts.
- **ADVANCED — Security Analyst Persona** (5 eval cases): Evolves a cybersecurity AI across threat classification, executive summaries, developer education, incident response, and forensics. Tests multi-audience adaptation and domain expertise.

Clicking any template auto-fills the create form with the template's name, objective, metric, and all of its eval cases, and selects every available model; the user can modify anything before creating the experiment. Each template card shows a level badge (green/amber/red), the template name, the eval case count, and a description explaining what the experiment does and why it matters.
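To make the template-to-form flow concrete, here is a minimal sketch of how a template could be modeled and applied. The `ExperimentTemplate` and `EvalCase` shapes, their field names, and the `apply_template` helper are illustrative assumptions, not the project's actual schema.

```python
# Illustrative sketch only; the real template schema in the codebase may differ.
from dataclasses import dataclass, field


@dataclass
class EvalCase:
    """One evaluation case shipped with a template (hypothetical shape)."""
    input_text: str
    expected_qualities: str


@dataclass
class ExperimentTemplate:
    """One clickable template card; all field names are assumptions."""
    level: str            # "BASIC" | "INTERMEDIATE" | "ADVANCED"
    name: str
    objective: str
    metric: str
    eval_cases: list[EvalCase] = field(default_factory=list)

    @property
    def badge_color(self) -> str:
        # Level badge colors as described above: green/amber/red.
        return {"BASIC": "green", "INTERMEDIATE": "amber", "ADVANCED": "red"}[self.level]


def apply_template(template: ExperimentTemplate, available_models: list[str]) -> dict:
    """Auto-fill the create form: copy every template field and select all models."""
    return {
        "name": template.name,
        "objective": template.objective,
        "metric": template.metric,
        "eval_cases": template.eval_cases,
        "models": available_models,  # user can still deselect before creating
    }
```

A click handler would then call something like `apply_template(tpl, models)` and render the returned dict into the create form's state, with every field left editable.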
LLM Team UI - Full-stack local AI orchestration platform