llm-team-ui

Author	SHA1	Message	Date
root	7b9b7f6641	Add optimization history, reconnect, and duplicate prevention History detail panel now shows optimization results: - If a run has been optimized, shows results section with best score, original score, and link to view the winning variation - Fetches full optimization history via GET /api/optimize-history/<id> - Shows count of optimizations run and child variation count - Button changes to "Re-Optimize" for already-optimized runs Reconnect to active optimizations: - If optimization is already running, returns job_id in error response - Frontend detects this and reconnects to the SSE stream - No more losing progress when navigating away and coming back - Refactored startOptimize() into startOptimize() + _showOptimizeStream() New endpoint: GET /api/optimize-history/<run_id> - Returns all pipeline_runs where pipeline='optimize' for that parent - Returns all child team_runs created by optimization - Includes scores, strategies, rankings Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-29 07:20:01 -05:00
root	bc2ad7c1a9	Fix Lab UX: visual selection, auto-navigate, live status, stuck detection Lab experiment selection: - Selected experiment now highlighted with accent border + glow - Clicking auto-navigates to relevant tab (config if idle, monitor if running) - No more silent toast-only feedback Live status display: - SSE "status" events now rendered in monitor (were silently dropped before) - Shows real-time: "Proposing change... (trial 3/50)" during execution - Error messages displayed inline instead of just toast Stuck experiment fix: - On app startup, reset all "running" experiments to "paused" - Prevents ghost "running" status after service restart - Fixed experiments 2, 3, 4 that showed running but had dead threads Trial cap fix: - Changed from lifetime cap (trial_num < 50) to per-run cap (trials_this_run < 50) - Prevents runaway experiments like #1 that accumulated 3762 trials - Shows trial progress in status: "trial 3/50" Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-29 07:14:12 -05:00
root	3b4fa449f1	Add Auto-Optimize: AI agent for history-driven prompt improvement When viewing any past run in History, click "Optimize" to trigger an automated workflow that: 1. Analyzes the original prompt + responses + score 2. Identifies improvement strategies (clarity, depth, specificity, etc.) 3. Generates 3-5 improved prompt variations 4. Tests each variation across original mode + brainstorm 5. Auto-scores all results via background judge 6. Ranks results and highlights the winner 7. "Use This" button loads winning prompt into composer Architecture: - _run_optimize(job_id, run_id): background thread, 5-phase engine - POST /api/runs/<id>/optimize: starts optimization job - GET /api/optimize/<job_id>/stream: SSE for live progress - Budget-capped at 15 model calls per optimization - Child runs saved as real team_runs (source: "optimize") - Auto-scored → feeds into analytics + routing table automatically - Results saved to pipeline_runs (pipeline: "optimize") Frontend: - "Optimize" button in history detail panel (accent-colored) - startOptimize(runId): replaces detail view with live optimization stream - Phase cards: Analysis → Variations → Testing → Ranked Results - Score bars with color coding (green/amber/red) - Winner row highlighted with star + "Use This" button Closes the learning loop: system studies its own history → generates better prompts → tests them → scores results → routing table improves. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-29 07:03:27 -05:00
root	8ad221b41f	Add self-improving pipeline: auto-scoring, analytics, reactive refine, routing intelligence Phase 1 — Run Quality Scoring: - Auto-score every run in background via qwen2.5 judge (1-10) - Thumbs up/down vote buttons on output cards - POST /api/runs/<id>/score for user feedback - run_saved SSE event enables vote buttons after run completes - User votes override auto-scores (race-condition safe) - DB: quality_score, score_method, score_metadata on team_runs Phase 1 — Analytics Dashboard: - GET /api/admin/analytics: score-by-mode, score-by-model, heatmap, trend - New Analytics tab on Admin page with bar charts, heatmap table, trend sparkline - Scoring coverage tracker (scored vs total runs) - Model × Mode heatmap with color-coded cells Phase 2 — Reactive Pipeline: - _assess_stage(): orchestrator evaluates each stage's output mid-run - _reactive_decide(): can insert/skip stages based on assessment - Dynamic stage loop replaces fixed iteration in run_refine() - Budget tracking prevents infinite loops (max_stages hard cap) - Reactive decisions render as dashed notification bars between cards - Pipeline adjusts in real-time: "Inserting VALIDATE — high severity gaps found" Phase 3 — Cross-Run Learning: - _build_routing_table(): queries historical scores for model×mode performance - Best stage sequences per content_type from pipeline_runs - Routing table cached with 30-min TTL - Auto-Refine strategist prompt augmented with historical data - GET /api/suggest-models?mode=X returns top 3 models for that mode Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-29 06:18:32 -05:00
root	c2cc211f21	Expand sample prompts to 5 per tier across all 21 modes (315 total) Each mode now has {basic: [...], mid: [...], advanced: [...]} with 5 prompts per difficulty level. Renderer picks one random prompt from each tier on every mode switch, so users see fresh examples each time. 315 hand-crafted prompts designed to highlight each mode's strengths: - brainstorm: creative problem-solving at increasing scale - pipeline: multi-step transformations from simple to complex - debate: ethical dilemmas with escalating nuance - validator: common myths to complex historical misconceptions - roundrobin: writing tasks that benefit from iterative refinement - redteam: security vulnerabilities from obvious to systemic - consensus: opinion questions from clear to deeply contested - codereview: coding tasks from functions to distributed systems - ladder: concepts that scale from kindergarten to PhD - tournament: creative competitions from one-liners to algorithms - evolution: optimization targets from names to city infrastructure - blindassembly: decomposable projects from explanations to systems - staircase: progressive constraints from party planning to treaties - drift: factual claims from simple dates to complex event sequences - mesh: stakeholder analysis from office policies to life-or-death - hallucination: fact-checkable claims from simple to obscure - timeloop: cascading failures from restaurants to civilization - research: deep dives from single topics to geopolitical analysis - eval: benchmark prompts from trivia to formal proofs - extract: structured extraction from sentences to legal documents - refine: documents from product blurbs to architecture specs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-29 05:22:35 -05:00
root	0d09bb5293	Add Auto-Refine mode, composer UX, select dropdown fixes Auto-Refine mode (21st mode): - AI strategist analyzes content type and quality - Selects 3-5 optimal refinement stages from 8 available (validate, critique, expand, structure, stakeholder, clarity, edge_cases, align) - Executes stages sequentially with output chaining - Final synthesis produces polished version - Stages are content-aware — PRD gets different pipeline than essay - Saved to pipeline_runs DB Composer UX overhaul: - Initial state: full-screen centered composer overlay - Mode grid + models + prompt front-and-center for new users - On Run: composer closes, output takes full screen width - "New Prompt" button in header nav bar (not floating) - Close button (×) on composer overlay - Works across all 4 themes + mobile Dropdown fixes: - Dark theme: select options get solid #1a1d23 bg - Modern theme: select options get solid #18181b bg - Light/Reddit: select options get white bg with dark text - Native <option> elements now readable in all themes Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-29 05:12:35 -05:00
root	713f18a65f	Add 4-theme system, fix enrichment panel layout, enable Docker on boot Theme system (Dark/Light/Reddit/Modern): - Injectable CSS/JS via after_request — zero template changes - Dark: original gold accent on black - Light: warm off-white with indigo accent, readable buttons - Reddit: bluish-gray bg, orange accent, pill buttons, 8px corners - Modern: glassmorphism dark, blue accent, frosted cards, 16px corners - Toggle cycles all 4 themes, persists via localStorage - Button injected into every page header automatically Enrichment panel fix: - threat-card changed from display:flex to display:grid - enrich-panel now spans full width via grid-column:1/-1 - Added .enrich-section/.enrich-title/.enrich-grid CSS classes - Sections (Geo, Deep Scan, AI) visually separated with dividers Iterate/repipe modal themed for all modes: - Light themes get white modal bg, proper contrast - Reddit gets rounded corners + orange accent - Modern gets glassmorphism modal with blue glow Scrollbar styling across all themes: - Rounded, properly sized (6-8px), theme-colored thumbs - macOS-style inset look via background-clip Layout improvements: - Output area min-height 400px, padding-bottom 40px - Empty state centered with more breathing room - Docker + containerd enabled at boot for web-check survival Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-29 04:31:01 -05:00
root	411040f206	Fix IP banning: nginx deny list + connection kill for instant enforcement fail2ban was using nftables action while UFW uses iptables-nft, so bans were recorded but never enforced. Added three-layer ban enforcement: 1. nginx deny list (/etc/nginx/banned_ips.conf) for instant 403 2. ss -K to kill existing TCP connections on ban 3. Auto-sync nginx deny file on ban/unban (manual, mass, AI sentinel) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 13:05:49 -05:00
root	eea8ff46db	Three-tier access: Off → Demo → Showcase Off: login required for everything Demo: public gets Team UI + run modes + admin page (browse only) Blocked: /logs, /admin/monitor, /history, threat intel APIs, sentinel, wall-of-shame, meta-pipelines, self-reports, vectors Showcase: public gets full read-only access to ALL pages Allowed: admin, monitor, logs, threat intel, enrichment, lab, history, self-analysis, meta-pipelines Blocked: config changes, bans, deletes, bulk operations Admin (logged in): full access to everything always SHOWCASE_ONLY_ROUTES set defines which pages/APIs are blocked in basic demo but allowed in showcase mode. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 05:29:39 -05:00
root	ffd5e43709	Fix demo/showcase toggle: separate buttons, distinct modes Problem: plain toggle set showcase=true, so demo always became showcase. No way to enable basic demo mode separately. Fix: - Three explicit buttons: [Demo] [Showcase] [Off] - Demo mode: active=true, showcase=false (team UI only) - Showcase mode: active=true, showcase=true (full read-only admin) - Off: both false - Plain toggle cycles demo on/off without touching showcase - Clear status text shows which mode and what it means Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 05:26:48 -05:00
root	732f29d836	Fix showcase toggle: remove /api/demo/toggle from blocked POSTs The demo toggle route was in DEMO_BLOCKED_POSTS, so once showcase was enabled, the before_request handler blocked the toggle POST even for admins (the before_request check ran before the route's own admin check could verify the session). Fix: removed /api/demo/toggle from blocked list. The route already has its own admin-only check (line 460). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 05:24:45 -05:00
root	f0cf69b4bd	Fix NameError: ADMIN_WRITE_ROUTES renamed to DEMO_BLOCKED_POSTS before_request handler still referenced old variable name. Updated to use DEMO_BLOCKED_POSTS with simpler path-in-set check. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 05:23:01 -05:00
root	9f48a050c8	Showcase Mode: full read-only admin access for client demos New mode: Showcase (replaces basic demo mode for client demos) - Visitors see EVERYTHING: Admin, Monitor, Logs, Threat Intel, Lab, History, Meta-Pipelines — all without logging in - Read-only: all GET requests allowed on all routes - Allowed POSTs: team runs, self-analysis, IP enrichment (read-like operations that don't modify system config) - Blocked POSTs: config changes, bans, deletes, bulk archive Admin UI (Security tab): - "Enable Showcase" button (magenta) — one click to activate - "Turn Off" button appears when active - Clear description of what visitors can and can't do - Status shows "SHOWCASE MODE" with magenta styling Banner: - Magenta gradient banner on all pages when showcase is active - Shows: "Showcase Mode — Full Read-Only Access — Admin · Monitor · Logs · Lab · History" - Demo button in nav shows "Showcase" in magenta Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 05:19:41 -05:00
root	dfab02f114	Fix meta-pipeline detail panel collapsing on auto-refresh Auto-refresh now skips when any detail panel is open (checks for meta-detail-* elements). Panel stays stable while reading results. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 05:07:47 -05:00
root	c9901dbc94	Meta-pipeline UI: add Stop/Restart/Results controls per pipeline Each pipeline card now shows: - Status dot + name + status tag + best score - Stop button (red) when running - Restart button (green) when stopped/completed - Results button (magenta) to drill into iterations - Live progress text when running - Stages and iteration count on info line Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 05:06:24 -05:00
root	28df789745	Fix runaway experiments: cap at 50 trials, fix DB permissions Bugs fixed: - Ratchet loop had no trial cap — experiment #1 ran 3762 trials unchecked. Now capped at max_trials=50 per start cycle. - meta_pipelines, meta_runs, self_reports tables had no GRANT for kbuser — fixed permissions for all tables and sequences. All 4 running experiments auto-paused on restart. Stress test confirms all tables accessible, all models responding, meta-pipeline creation working, self-report save/retrieve working. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 04:56:37 -05:00
root	4dc561af12	Meta-Pipeline: self-improving multi-mode chains on real system data Engine: - Chains modes in sequence: extract → research → validate → debate → synthesize - Each stage feeds its output to the next as input - Runs same pipeline with different model sets (one model per iteration) - Auto-scores final output using judge model (1-10) - Keeps best result across all iterations - All stage results + final outputs saved to meta_runs table 4 preset pipelines: 1. Security Deep Dive — security logs through 5-stage analysis 2. Run History Insights — team run data through 4-stage extraction 3. Threat Intel Enrichment — profiled IPs through 5-stage analysis 4. Cross-Report Synthesis — past self-reports through 4-stage debate Database: - meta_pipelines: name, source, stages, status, best_score, iterations - meta_runs: per-iteration stage results, final output, score, models API: - POST /api/meta-pipeline — create pipeline from preset - POST /api/meta-pipeline/:id/start — run in background - POST /api/meta-pipeline/:id/stop — halt execution - GET /api/meta-pipelines — list all with live status - GET /api/meta-pipeline/:id — full detail with all iteration results UI (Lab page): - Magenta-bordered Meta-Pipeline card with 4 clickable presets - Click preset → creates + auto-starts pipeline - Pipeline list with live status dots, progress, scores - Click pipeline → drill-down with per-iteration results - Each stage expandable (click to show output) - Best output highlighted in green border - Auto-refreshes every 5 seconds during runs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 04:54:08 -05:00
root	804898b658	Auto-save self-analysis reports to DB with browsable history Database: - self_reports table: report_type, model, report text, data_size, timestamp - Reports auto-saved on generation (no extra step needed) API: - GET /api/self-reports — list all past reports (id, type, model, size, date) - GET /api/self-reports/:id — full report text UI: - "✓ Saved as report #N" indicator after generation - "Past Reports (N)" section below self-analysis buttons - Click any past report → expands inline (toggle on/off) - Shows: type, model, timestamp for each saved report - Reports persist across page refreshes and restarts Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 04:49:17 -05:00
root	28e641f939	Self-Analysis: AI reports from system's own data + Lab experiments API 4 one-click self-analysis reports in Lab: 1. Threat Intelligence Report — security logs → attack taxonomy, attacker profiling, predictive analysis, recommendations 2. Model Performance Analysis — 96 team runs → usage patterns, model workload, response efficiency, optimization opportunities 3. Usage Analytics — nginx access logs → traffic patterns, feature usage, user journey mapping, UX recommendations 4. Security Posture Assessment — combined audit of security logs, sentinel verdicts, fail2ban, threat intel DB → risk rating API: POST /api/self-analyze - type: threat_intel|model_performance|access_patterns|security_posture - model: which local model to use (default qwen2.5) - Returns structured report from real system data Lab UI: - Green-bordered Self-Analysis card above experiment templates - Click any report → runs analysis in background → result panel expands inline with full report (scrollable, closeable) - Loading state shows "Analyzing..." during generation Each report analyzes REAL data: actual security logs, actual run history, actual nginx access patterns — not synthetic test data. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 04:42:07 -05:00
root	ca660cbd10	Lab: add 3 experiment templates with auto-fill Templates section below experiment list: BASIC — Better Summaries (3 eval cases) Optimize summarization quality. Tests across biology, history, and technical content. Shows the simplest Lab workflow. INTERMEDIATE — Code Explainer (4 eval cases) Find the best prompt+model to explain code to non-programmers. Tests loops, recursion, error handling, comprehensions. Shows how the ratchet evolves system prompts. ADVANCED — Security Analyst Persona (5 eval cases) Evolve a cybersecurity AI across threat classification, executive summaries, developer education, incident response, and forensics. Tests multi-audience adaptation and domain expertise. Click any template → auto-fills the create form with name, objective, metric, all eval cases, and selects all available models. User can modify before creating. Each template card shows: level badge (green/amber/red), name, eval case count, and a description explaining what the experiment does and why it matters. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 04:32:39 -05:00
root	f34e05168b	Retheme Lab page: retro-brutalist matching all other pages - Full theme swap: amber accents, JetBrains Mono, 2px borders - Animated dot-grid background + scanlines - Backdrop-filter blur on cards - Status pills: square with borders (was rounded) - Model chips: square with 2px borders - Chart wraps: dark background with 2px borders - Trial items: monospace numbers and scores - Best config box: monospace with green border - Nav bar with links to Team, History, Admin, Logs - Toast: monospace with fade-out animation - Config textarea: monospace font with dark background - Responsive: tabs compact on mobile Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 04:28:29 -05:00
root	efa547bb68	Full history page with tags, notes, vector API, and bulk ops New /history page (replaces slide-out panel): - Full-page data table: ID, Mode, Prompt, Models, Tags, Date - Active/Archived/All view toggle - Filter by mode, tag, or search text - Checkbox select for bulk archive/restore - Click any row → detail panel with full responses Per-run detail: - Inline tag editor: add tags (Enter), remove tags (click ✕) - Notes textarea with auto-save (1s debounce) - Archive/Restore/Delete buttons - Collapsible response cards (click header to expand) Database: - tags TEXT[] column with GIN index for fast tag queries - notes TEXT column for freeform annotations APIs: - POST /api/runs/:id/tags — update tags and/or notes - GET /api/runs/tags — list all unique tags in use - GET /api/runs/vectors — structured text documents for AI/embedding Returns: mode, prompt, models, date, tags, notes + all response text Filters: ?mode=, ?tag=, ?limit= Each doc includes token estimate for embedding planning Main UI: History button now links to /history page Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 04:19:22 -05:00
root	aeab1f0194	Archive/restore history: soft-delete with toggle and bulk ops Database: - Added 'archived' boolean column to team_runs (indexed) - Active runs filtered by archived=false by default API: - GET /api/runs?show=active|archived|all - POST /api/runs/:id/archive — archive single run - POST /api/runs/:id/restore — restore single run - POST /api/runs/bulk-archive — archive/restore by IDs or date History panel UI: - Active/Archived toggle tabs at top - Per-run Archive button (magenta) in detail view - Per-run Restore button (green) in detail view for archived runs - "Archive All" bulk button when viewing active runs - "Restore All" bulk button when viewing archived runs - Archived runs hidden from active view, accessible anytime Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 04:13:33 -05:00
root	7948089f04	Fix sentinel countdown: sync to actual scan schedule, not page load - Sentinel thread sets next_scan_ts = time.time() + interval BEFORE sleeping - API returns next_scan_in derived from real next_scan_ts, not estimated - Frontend calculates server clock offset and counts down to the actual target timestamp — refresh shows the same remaining time, not a reset - Shows ✓ in green when scan fires, resumes countdown on next poll Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 04:07:15 -05:00
root	357918013d	Compact sentinel card: single-line with mini ring + collapsible verdicts - Entire sentinel status fits in one header row now - Mini 28px countdown ring (was 64px) inline with title - Scans/bans counts inline as text, not grid boxes - Verdicts collapsed by default — click to expand - Card padding reduced (8px vs 14px) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 04:03:57 -05:00
root	3cdfc01835	Sentinel countdown ring timer with live stats - SVG progress ring shows time until next scan (magenta arc) - Countdown ticks every second: "245s → 244s → ... → scanning..." - Ring fills as time progresses, resets on scan - Turns green and shows "scanning..." when timer hits 0 - Stats grid: Scans count, AI Bans count, Last Run time, Interval - Backend API returns elapsed_since_scan and next_scan_in Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 04:01:09 -05:00
root	418da99fa7	Wall of Shame: persistent threat intel database with drill-down table Database: - threat_intel table with full enrichment data per IP - UPSERT on IP — re-enriching updates existing record - Stores: geo, AI analysis, web-check results, indicators, raw JSON - Indexed on IP (unique), threat_level, enriched_at Auto-save: - Every enrichment auto-saves to DB (step 5 in enrichment pipeline) - "Saved to Wall of Shame database" indicator in enrichment panel - No duplicate scans — re-enrich updates the existing record Wall of Shame tab (/logs): - Stats bar: Total Profiled, Critical, High, Proxies, Automated - Sortable table: IP, Threat, Type, Summary, Country, Ports - Click any row to expand full detail: ISP, Org, ASN, City, Proxy/Hosting flags, Confidence, Blocklist count, Pattern, Recommendation, Indicators - All data persists across restarts — no re-scanning needed API: - /api/admin/wall-of-shame — list all enriched IPs with sorting/filtering Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 03:52:34 -05:00
root	e7f12a6d93	Tighten AI security prompts — aggressive stance for private server Enrichment AI prompt: - Explicitly states this is a PRIVATE application - Strict threat level rules: 10+ blocklists = always critical, exploit scans = always critical, SSH-only = suspicious - Added "compromised_host" classification option - Recommendation options: ban permanently, ban 24h, monitor, ignore Sentinel batch prompt: - "Err on the side of banning" directive - .env.production/.env.local probing = targeted recon, instant ban - When in doubt, BAN — private server has no public scanning excuse - Tighter rules for automated UA detection Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 03:49:17 -05:00
root	3c4846d52c	Expand web-check enrichment: traceroute, headers, status, full rendering Now queries 6 web-check endpoints per IP: - ports — open port scan - dns — reverse DNS / PTR records - block-lists — DNS blocklist check (AdGuard, CloudFlare, etc.) - trace-route — full network path with per-hop latency - headers — HTTP response headers (server, powered-by, etc.) - status — HTTP status code and response time Frontend rendering: - Traceroute displayed as hop chips with latency: IP (45ms) → IP (56ms) - HTTP status with response time - Server headers inline - Errors silently skipped (many endpoints fail on raw IPs) AI analysis now includes: - Blocklist count and names in prompt - Traceroute hops in prompt for network path analysis Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 03:46:43 -05:00
root	51ffd2b82c	Fix enrichment: run web-check before AI analysis so data is available Web-check (ports, DNS, blocklists) now runs as step 3, AI analysis as step 4. AI prompt includes open ports and blocklist status for richer threat verdicts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 03:42:23 -05:00
root	e816e81820	Integrate web-check Docker for deep IP enrichment Setup: - lissy93/web-check running in Docker on port 3000 - Queries ports, DNS, and blocklist endpoints per IP Enrichment now includes 4 layers: 1. Geolocation (ip-api.com) — country, ISP, proxy/hosting flags 2. Web-Check deep scan — open ports, DNS/PTR, blocklist status 3. Security log aggregation — all activity for that IP 4. AI analysis (qwen2.5) — gets ALL above data as context Frontend rendering: - Open ports displayed in red (security risk indicators) - Blocklist status: "3/8 blocked (AdGuard, AdGuard Family, ...)" - Reverse DNS (PTR records) - All data feeds into AI analysis prompt for richer verdicts Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 03:39:42 -05:00
root	472a5d0917	IP threat intel: sorting, mass ban, enrichment with geo + AI analysis Sorting: - Sort by: hits, threat level, recent activity, banned status - Active sort button highlighted in amber Mass operations: - Checkbox per IP for multi-select - "Ban Selected" / "Unban Selected" buttons with confirmation - /api/admin/security/mass-ban endpoint handles batch operations - Selection counter shows "N selected" IP Enrichment (click "Enrich" button per IP): - Geolocation via ip-api.com (country, city, ISP, org, AS number) - Proxy/hosting/mobile detection flags (red for proxy/hosting) - AI threat analysis via local qwen2.5: - Threat level, classification, confidence score - Attack pattern description - Specific indicators list - Automated detection flag - Actionable recommendation - Enrichment panel expands inline below the IP card (toggle) Per-IP drill-down: - Expandable raw log lines per IP (click to show/hide) - User agent listing with count - First seen / last seen timestamps - HTTP method breakdown (GET:5 POST:2) - AI sentinel verdicts shown inline - Jail information for banned IPs Enhanced backend: - Security API returns per-IP log lines, first_seen, methods, event_types - AI verdicts attached to IP records - Multiple UA detection (fingerprint: rotating scanner) - Sort parameter support (?sort=threat|hits|recent|banned) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 03:24:32 -05:00
root	de4ca533dd	AI Security Sentinel: local LLM scans logs every 5 minutes Background thread runs qwen2.5 to analyze new security log entries: - Aggregates new entries by IP since last scan - Sends batch to local LLM with security analysis prompt - LLM classifies each IP: threat level, action, attack type, reason - Auto-bans IPs the AI recommends banning (via fail2ban) - Logs all verdicts and bans to /var/log/llm-team-sentinel.log - Logs AI bans to security log as AI_BAN events API: - /api/admin/sentinel — sentinel status, stats, recent verdicts Threat Intel tab enhancement: - Sentinel status card with magenta accent (distinct from threat cards) - Shows: model, scan count, ban count, last run, interval - Recent AI verdicts table: action, IP, attack type, reason - Errors displayed inline Security prompt tuning: - Explicit rules for common attack patterns - Low temperature (0.1) for consistent classification - JSON-only response format for reliable parsing Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 03:08:02 -05:00
root	f1bb2a92e7	Interactive threat intelligence dashboard with one-click ban Security API: - /api/admin/security — aggregates security log into per-IP threat intel (hit count, exploit scans, login fails, paths probed, threat level) - /api/admin/security/ban — manual ban/unban via fail2ban (logs MANUAL_BAN/MANUAL_UNBAN to security log) Threat Intel tab in /logs: - Summary stats: Critical IPs, High Threat, Currently Banned - Per-IP cards showing: threat level, hit count, scan count, paths probed - Critical IPs have red border, high threat amber - One-click "Ban 24h" button per IP (calls fail2ban-client banip) - One-click "Unban" for currently banned IPs - Banned IPs shown at reduced opacity - LAN IPs (192.168.*) filtered out fail2ban tuning: - llm-team-exploit findtime: 600s → 3600s (catch slow scanners) - llm-team-exploit maxretry: 3 → 2 (more aggressive) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 03:05:01 -05:00
root	21c8c2a3e5	Monitor: drill-down pipeline view with step timeline Highlander pattern — one view at each level, clean transitions: Level 1 - Run List: - Active runs (live, with progress bars) - Recent runs (in-memory session runs) - History from DB (all saved runs, click to drill down) Level 2 - Pipeline Detail (click any DB run): - Breadcrumb nav: Monitor → mode #id - Header card with mode, models, timestamp, full prompt - Step timeline with dot indicators on a vertical line - Each step shows: model, role tag, character count, token estimate - Green dots for completed, red for errors Level 3 - Response Text (click any step): - Accordion expand/collapse on click - Full response text in monospace scrollable container - Smooth max-height transition Architecture ready for Level 4 (future AI comparison): - Responses are individually addressable by step index - Role-based grouping visible in timeline - Side-by-side view can be added per-step Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 02:58:34 -05:00
root	9af071df6c	Retheme admin page, improve save feedback, add monitor nav link Admin UI: - Full retro-brutalist theme matching main UI - JetBrains Mono headings, amber accent, 2px borders - Animated dot-grid background + scanlines - Square toggles (was rounded) - Backdrop-filter blur on cards - Nav bar with links to Team, Lab, Logs, Monitor Save feedback: - Every save now verifies the API response (checks d.ok) - Toast shows what was saved: "ollama provider saved / Enabled" - Toast shows details: "Cloud models saved / 3 models configured" - Toast shows timeout details: "Timeouts saved / Global: 300s, 2 overrides" - Failed saves show red toast with error message - Toast fade-out animation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 02:44:58 -05:00
root	344e11f4b2	Replace GoAccess with built-in log viewer, clickable error links New /logs page with 5 tabs: - App Log (journalctl for llm-team-ui service) - Run History (all completed runs with errors inline) - Nginx Errors (with red highlighting) - Nginx Access (with color-coded status codes) - Security Log (fail2ban/exploit detection) Features: - Live text filter (grep-style) - Configurable line limit (50-500) - Auto-refresh every 10s - Run history shows mode, user, duration, response count, errors - Error lines highlighted red, warnings amber - Status codes color-coded (2xx green, 3xx blue, 4xx amber, 5xx red) Error linking: - Stream errors in main UI link to /admin/monitor - Error response cards have "View error details" link - Error cards styled with red border and monospace body Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 02:35:17 -05:00
root	59379c624d	Fix Ollama timeout: set num_ctx dynamically, truncate oversized prompts Root cause: query_ollama() sent no num_ctx option, so Ollama defaulted to 2048 tokens. Research mode with 15 questions builds prompts that exceed model context windows, causing Ollama to hang until the 300s timeout. Fix: - Calculate num_ctx from prompt size + 1024 token response buffer - Cap at model's actual context limit - Truncate prompts that exceed context window minus 512 response tokens - Uses smart_truncate() to preserve start + end of prompt - Updated MODEL_CONTEXT map with accurate limits for all local models Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 02:29:11 -05:00
root	1ac7a436e6	Add live metrics dashboard to progress panel 8 real-time metrics in the progress panel: - Elapsed time (updates every 500ms) - Models active/total (tracks unique models as they respond) - Responses received (count) - Estimated tokens (~chars/4) - Data received (formatted KB) - SSE events (total protocol events) - Errors (turns red if > 0) - Heartbeats (keepalive count) Metrics update every 500ms during run. On completion, all metric values turn green. Magenta/purple theme for metric values, micro labels underneath. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 01:55:29 -05:00
root	c507ba1016	Progress bar: magenta→cyan gradient with green completion - Border: magenta (#d946ef) with purple glow - Fill: gradient from magenta → purple → cyan - Shimmer animation sweeps across the fill - Step indicators: magenta active pulse with glow - Completed steps: magenta→green gradient - Phase labels: bright green with gradient fade line - Completion: green→cyan gradient with green glow - 8px height track (was 6px) for better visibility - All text in progress panel uses purple/pink tones - Clearly distinct from the amber UI elements Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 01:35:00 -05:00
root	9eaac813df	Sticky progress bar, phase labels, auto-scroll - Progress panel is now position:sticky at top of output — always visible - Phase labels (─── scouting ───, ─── researching ───, etc.) appear between response cards when the pipeline role changes - Auto-scroll to latest response card as they arrive - Completion state shows response count and fades after 5s - Clear previous errors: all 'input stream' errors were caused by service restarts during in-flight runs, not code bugs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 01:30:53 -05:00
root	c124b01681	Fix SSE stream reliability: threaded server, async keepalive, streaming responses - Enable Flask threaded=True for concurrent request handling - Refactor generate() to use producer-consumer queue pattern: - Runner executes in background thread, pushes events to queue - Heartbeat thread sends keepalive every 10s independently - Generator reads from queue — stream never goes silent - Brainstorm mode: stream responses as they arrive (was waiting for all) - Prevents nginx/browser timeout during long model queries Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 01:27:42 -05:00
root	242dec7509	Add progress tracking, admin monitor, SSE keepalive, research hardening Backend: - Active run tracking with step/substep/error state - SSE keepalive heartbeat every 15s to prevent nginx timeout - Run log (last 100 completed runs with timing/errors) - Research mode: per-question progress, context caps, graceful failures - Hard cap on research questions (15), answer truncation (8K chars) Frontend: - Real progress bar with step segments, elapsed time, event counter - Progress shimmer animation, step completion indicators - Improved error display with timing context - Green completion state with fade Admin: - /admin/monitor — live process dashboard - Stats: active runs, completed, errors, avg duration - Active run cards with live progress, substep detail, errors - Recent run history with error traces - Auto-polls every 3 seconds - Full retro-brutalist theme matching main UI Nginx: - proxy_read_timeout 600s, proxy_send_timeout 600s - proxy_buffering off for SSE streaming Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 01:22:36 -05:00
root	8cbc2bec84	Redesign UI: neo-brutalist retro-futuristic aesthetic - New color palette: amber/gold accent, deep black backgrounds - JetBrains Mono for headings, labels, and system text - 2px borders, 2px border-radius (brutalist) - Animated dot-grid background canvas with random scanline artifacts - CRT scanline overlay + vignette effect - Backdrop-filter blur on panels for glass depth - Pulsing status dot, amber glow effects - Login page: full retro treatment with sys-tag footer - All functional elements preserved Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 01:09:40 -05:00
root	d651c52a59	Add sample prompt chips for all 20 modes Three demo prompts per mode (basic/mid/advanced) that showcase each orchestration pattern's unique value. Clickable chips below the prompt textarea auto-fill on click with green flash feedback. Prompts swap dynamically when switching modes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 00:56:55 -05:00
root	a0ee901f66	Add security hardening: logging, email alerts, exploit detection - Security logging to /var/log/llm-team-security.log for fail2ban - Email alerts for security events via SMTP - Exploit pattern detection (scanner probes, SQL injection, path traversal) - Use X-Real-IP header for accurate client IP behind nginx Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 00:46:25 -05:00
root	2bb910b72c	Add triage, backup, and disaster recovery system - brain-backup: daily borg + pg_dump, 7d/4w/3m retention, cron at 3AM - brain-triage: full system health check (services, ports, firewall, headers, kernel, app, DB, disk, backups, security scan) - brain-recover: restore from backup (full/db/configs/app) + emergency lockdown mode that blocks all external access except LAN SSH All accessible via /usr/local/bin/brain-{backup,triage,recover} Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 04:52:48 -05:00
root	6ea457d01d	Add server security configs and setup script - Nginx configs with security headers (X-Frame-Options, CSP, etc.) - fail2ban jails for nginx (botsearch, bad-request, forbidden) - Kernel hardening via sysctl (rp_filter, no redirects, log martians) - SSH hardening (no root, max 3 attempts, no X11) - UFW rules export - Idempotent setup.sh to restore all configs on fresh install - Flask bound to 127.0.0.1 (nginx-only access) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 04:47:54 -05:00
root	0d00ced622	Mobile-optimized layout: output-first, collapsible mode selector - Output panel renders first on mobile (CSS order swap) - Prompt + Run button immediately below output - Mode/config hidden behind "Mode: Brainstorm" collapsible toggle - Tapping toggle expands full mode grid + model config - Compact header nav with smaller text - 3-column mode grid on mobile (was 4) - Larger run button (16px font, 14px padding) for touch - Full-width repipe modal and history panel on mobile - Desktop layout unchanged (toggle hidden, collapse always open) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 04:01:36 -05:00
root	e3207b9c8e	Make /logs strictly admin-only, never accessible in demo mode Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 03:50:49 -05:00
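
Note on commit 59379c624d (Ollama timeout fix): the message describes sizing num_ctx from the prompt plus a 1024-token response buffer, capping it at the model's limit, and truncating prompts that exceed the context window minus 512 tokens. The following is a minimal sketch of that arithmetic, not the repository's code. The ~4 characters per token estimate is borrowed from the metrics commit (1ac7a436e6); plan_context(), estimate_tokens(), the truncation marker, and this body of smart_truncate() are hypothetical stand-ins, since the log only names smart_truncate() and says it preserves the start and end of the prompt.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate, assuming ~4 characters per token."""
    return len(text) // 4


def smart_truncate(text: str, max_chars: int,
                   marker: str = "\n[... truncated ...]\n") -> str:
    """Keep the start and end of text, dropping the middle (hypothetical body)."""
    if len(text) <= max_chars:
        return text
    keep = max_chars - len(marker)
    if keep <= 0:
        # No room for the marker: fall back to a plain head cut.
        return text[:max_chars]
    head = (keep + 1) // 2
    tail = keep - head
    return text[:head] + marker + (text[-tail:] if tail else "")


def plan_context(prompt: str, model_limit: int,
                 response_buffer: int = 1024,
                 min_response: int = 512) -> tuple[str, int]:
    """Return (possibly truncated prompt, num_ctx) for one request.

    model_limit is the model's context window in tokens and is assumed
    to be larger than min_response.
    """
    max_prompt_tokens = model_limit - min_response
    if estimate_tokens(prompt) > max_prompt_tokens:
        # Leave at least min_response tokens of room for the answer.
        prompt = smart_truncate(prompt, max_prompt_tokens * 4)
    # Prompt size plus response buffer, never above the model's limit.
    num_ctx = min(estimate_tokens(prompt) + response_buffer, model_limit)
    return prompt, num_ctx
```

With a 4096-token model, a 400-character prompt would get num_ctx = 1124 under these assumptions, while a 40,000-character prompt would be cut to 14,336 characters and get num_ctx = 4096.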


55 Commits
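
Note on commit c124b01681 (SSE stream reliability): the message describes a producer-consumer queue in which the runner pushes events from a background thread, a separate heartbeat thread sends a keepalive every 10s, and the generator only reads from the queue. The sketch below illustrates that pattern with the standard library alone. It is an assumption about the shape of the code, not the project's generate(): stream_events(), the runner callable, and the exact event framing are hypothetical.

```python
import queue
import threading

_DONE = object()  # sentinel that tells the consumer the run has ended


def stream_events(runner, heartbeat_interval: float = 10.0):
    """Yield SSE-framed strings from runner() plus periodic keepalives.

    runner is any callable returning an iterable of single-line event
    payloads (a hypothetical interface for this sketch).
    """
    events: queue.Queue = queue.Queue()
    finished = threading.Event()

    def run() -> None:
        # Producer: execute the pipeline and push each event to the queue.
        try:
            for event in runner():
                events.put(f"data: {event}\n\n")
        except Exception as exc:
            events.put(f"event: error\ndata: {exc}\n\n")
        finally:
            finished.set()
            events.put(_DONE)

    def heartbeat() -> None:
        # Event.wait() returns False on timeout and True once finished is
        # set, so this loop emits one keepalive per interval until the end.
        while not finished.wait(heartbeat_interval):
            events.put(": keepalive\n\n")  # SSE comment line, ignored by clients

    threading.Thread(target=run, daemon=True).start()
    threading.Thread(target=heartbeat, daemon=True).start()

    # Consumer: block on the queue so the stream never depends on the runner
    # producing output within any particular time window.
    while True:
        item = events.get()
        if item is _DONE:
            break
        yield item
```

In a Flask view, a generator like this could be returned as Response(stream_events(my_runner), mimetype="text/event-stream"), where my_runner is a placeholder for the pipeline function. Because the heartbeat runs on its own thread, a slow model query does not leave the connection silent long enough for nginx or the browser to time out.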