root 59379c624d Fix Ollama timeout: set num_ctx dynamically, truncate oversized prompts
Root cause: query_ollama() sent no num_ctx option, so Ollama fell back
to its 2048-token default context window. Research mode with 15
questions builds prompts that exceed the models' context windows,
causing Ollama to hang until the 300s timeout.
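
For reference, Ollama's /api/generate endpoint only enlarges the
context window when num_ctx is passed in the options field. A minimal
sketch of the failing shape of request (model name and prompt are
illustrative, not the repo's exact code):

    import requests

    long_prompt = "question\n" * 4000  # well past 2048 tokens

    # No "options" field, so Ollama applies its 2048-token default
    # num_ctx; input beyond that window is clipped, and oversized
    # prompts can stall until the client-side timeout fires.
    r = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3.1:8b", "prompt": long_prompt,
              "stream": False},
        timeout=300,
    )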

Fix (sketched in the code below):
- Calculate num_ctx from the prompt size plus a 1024-token response buffer
- Cap num_ctx at the model's actual context limit
- Truncate prompts that exceed the context window minus 512 response tokens
- Use smart_truncate() to preserve the start and end of the prompt
- Update the MODEL_CONTEXT map with accurate limits for all local models

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 02:29:11 -05:00
Description: LLM Team UI - Full-stack local AI orchestration platform