scrum: cloud-default models — swap mistral:latest for ollama_cloud::gpt-oss:120b #1

Open
profit wants to merge 1 commit from scrum/cloud-default-models into main
Owner

Summary

Surfaced by the lakehouse scrum-master pipeline (run 2026-04-24) pointed at this repo's source. The scrum found three hardcoded mistral:latest defaults in the meta-pipeline orchestrator paths. Per feedback_no_mistral.md (lakehouse project memory), mistral 7B exhibits decoder-level JSON malformation (0/5 fill rate on the structured-output A/B test) and is unreliable in any path that consumes structured output.

Change: swap to ollama_cloud::gpt-oss:120b (Phase 20 T3 cloud tier, proven workhorse).

Patch locations (3)

  • llm_team_ui.py:9959 — model_sets default for meta-pipeline stages
  • llm_team_ui.py:10084 — fallback when Ollama /api/tags probe throws
  • llm_team_ui.py:11835 — default workers list for orchestrator mode

All three are DEFAULTS — callers passing explicit config.model_sets / config.models are unaffected.
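A minimal sketch of the default-fallback pattern described above, assuming a plain dict config; `resolve_models` and `DEFAULT_MODEL` are illustrative names, not the repo's actual identifiers:

```python
# Hypothetical sketch of the patched default pattern: explicit config wins,
# the hardcoded default is only used when the key is absent or empty.
DEFAULT_MODEL = "ollama_cloud::gpt-oss:120b"  # was "mistral:latest"

def resolve_models(config: dict) -> list[str]:
    # Callers passing explicit config.models never see the default.
    return config.get("models") or [DEFAULT_MODEL]

# resolve_models({})                        -> ["ollama_cloud::gpt-oss:120b"]
# resolve_models({"models": ["llama3:8b"]}) -> ["llama3:8b"]
```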

⚠ This PR carries 4 additional pre-existing local commits

Base main (local) was 4 commits ahead of origin/main before this branch was cut — those commits were already committed locally, just unpushed. They are included here as prerequisites; in particular, fa6ccff (Ollama Cloud provider + model browser + OpenRouter key fix) provides the ollama_cloud:: routing prefix this PR relies on:

  • fa6ccff Ollama Cloud provider + model browser + OpenRouter key fix
  • 98bda6e OpenRouter: show all 343 models (free + paid) with pricing and filter
  • 34ee12e Fix adaptive mode: model list + synthesizer dropdown were never populated
  • 205eff6 Deep Analysis mode + token tracking for all runs (including public)
  • 12ab391 scrum: swap mistral:latest defaults to ollama_cloud::gpt-oss:120b ← this PR

Reviewer options: (a) merge as-is, (b) reject and first push main to origin, then rebase this onto updated origin/main (will leave only commit 12ab391), or (c) cherry-pick just 12ab391 onto a smaller branch that explicitly pulls in fa6ccff only.

Runtime activation

Ollama Cloud requires OLLAMA_CLOUD_API_KEY env var or a key saved via the Admin UI (providers.ollama_cloud.api_key). This PR does not change credential behavior, only the default model list. Without a key configured, the orchestrator will fail on the cloud default and the user should set explicit config.model_sets with local models.

Not in this PR (other scrum findings, not acted on)

Scrum also surfaced: (F2) no 3-tier access middleware, (F3) no sentinel loop, (F4) no cloud-determinism consensus, (F5) saveProvider has no bearer validation. After grep verification, F2 and F3 turned out to be false positives from tree-split distillation signal loss (rate_limited, is_allowlisted, and the sentinel loop are all implemented; the reviewer couldn't see them in the 209-shard scratchpad). F4 and F5 are partial/inconclusive and would need a forensic-preamble rerun before being actionable.

profit added 5 commits 2026-04-24 11:10:24 +00:00
New provider: Ollama Cloud (ollama.com)
- Native Ollama chat API with bearer token auth
- Provider card in Admin → Providers tab
- "Ollama Cloud" tab with Pull Models button (fetches 36 models)
- Search/filter models, one-click Add
- Models route as ollama_cloud::modelname through query_ollama_cloud()
- Test button verifies connection

OpenRouter fix:
- Cleared bad API key from config (was dd00bea4... not sk-or-)
- Real key from /home/profit/.env now used (sk-or-v1-579...)
- Fixed OpenAI provider that had wrong base_url (ollama.com→api.openai.com)
- Bumped OR timeout to 180s for free model rate limits

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Endpoint now returns all models, not just free ones
- Each model includes: name, context_length, free flag, prompt/completion cost
- UI shows pricing: "128K ctx · $2.50/M tok" for paid, "128K ctx · free" for free
- Filter dropdown: All Models / Free Only / Paid Only
- Search still works alongside the filter
- 29 free + 314 paid models available (GPT-5.4, Grok 4.20, Gemini 3.1, etc)
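The pricing strings quoted above ("128K ctx · $2.50/M tok", "128K ctx · free") suggest a small formatting helper; this sketch assumes the context length arrives in tokens and cost in dollars per million tokens — the real field names in the OpenRouter payload may differ:

```python
def pricing_label(context_length: int, free: bool, prompt_cost: float) -> str:
    """Format model metadata like the UI strings quoted above (assumed shape)."""
    ctx = f"{context_length // 1024}K ctx"
    if free:
        return f"{ctx} · free"
    return f"{ctx} · ${prompt_cost:.2f}/M tok"

# pricing_label(131072, False, 2.5) -> "128K ctx · $2.50/M tok"
# pricing_label(131072, True, 0.0)  -> "128K ctx · free"
```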

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two bugs preventing adaptive mode from working:
- ml-adaptive was missing from ML_IDS array (line 3160) — model checkboxes
  never rendered, so no models were selected
- adaptive-synthesizer was missing from populateAllSelects() ids array
  (line 3733) — synthesizer/judge dropdown was always empty

Both are single-line fixes. The backend pipeline (run_adaptive) was
complete and correct — self-eval, RAG retrieval, escalation, quality
scoring, KB storage all work. The UI just never wired the config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New mode: Deep Analysis — 6-phase autonomous pipeline:
1. Research: all selected models answer in parallel
2. Debate: models challenge each other's findings
3. Consensus: merge research with critiques, identify strong/weak points
4. Self-Eval: structured scoring (accuracy, completeness, actionability, nuance)
5. Final Synthesis: strongest model produces definitive answer
6. Knowledge Base: result stored for future RAG retrieval
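The six phases above can be sketched as a pipeline skeleton; every callable here (`query`, `synthesize`, `score`, `kb_store`) is an assumption standing in for the real model and knowledge-base plumbing, not the actual signatures in the repo:

```python
def deep_analysis(question, models, query, synthesize, score, kb_store):
    """Hypothetical skeleton of the 6-phase Deep Analysis pipeline."""
    research = {m: query(m, question) for m in models}                  # 1. Research (parallel in the real mode)
    critiques = {m: query(m, f"Critique these findings: {research}")    # 2. Debate
                 for m in models}
    consensus = synthesize(research, critiques)                         # 3. Consensus
    evaluation = score(consensus)   # accuracy, completeness, ...       # 4. Self-Eval
    final = synthesize({"consensus": consensus}, {"eval": evaluation})  # 5. Final Synthesis
    kb_store(question, final)                                           # 6. Knowledge Base
    return final
```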

Designed for cloud models (Ollama Cloud, OpenRouter). Every successful
run trains the local knowledge base so future adaptive runs benefit.
Purple accent in mode selector to distinguish from standard modes.

Token tracking fix:
- Added est_tokens, input_chars, output_chars columns to team_runs
- save_run() now calculates and stores token estimates for ALL runs
- Both logged-in and public/demo/showcase runs track tokens
- Enables accurate usage analytics across all users
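Since the commit says tokens are *estimated* from character counts, the new columns suggest something like the following; the ~4 chars/token heuristic and the function name are assumptions, not the repo's actual code:

```python
def estimate_run_tokens(input_text: str, output_text: str) -> dict:
    """Hypothetical char-based token estimate (~4 chars per token),
    matching the est_tokens / input_chars / output_chars columns."""
    input_chars = len(input_text)
    output_chars = len(output_text)
    return {
        "input_chars": input_chars,
        "output_chars": output_chars,
        "est_tokens": (input_chars + output_chars) // 4,
    }
```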

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three default model lists hardcoded mistral:latest as the fallback
when config.get("model_sets" / "models") returns nothing. Per
feedback_no_mistral.md, mistral 7B has decoder-level JSON malformation
issues (0/5 fill rate on A/B) and is a liability in any path that
depends on structured output from the model.

Swapping to ollama_cloud::gpt-oss:120b (Phase 20 T3 cloud tier)
keeps the defaults reliable for the meta-pipeline orchestrator
(line 9959), the fallback model list for empty Ollama (10084), and
the worker pool default (11835). All three are DEFAULTS — any caller
passing explicit config.model_sets / config.models is unaffected.

Routing works because query_model's "::" provider prefix already
resolves ollama_cloud via commit fa6ccff. Activation requires
OLLAMA_CLOUD_API_KEY or a key saved via the Admin UI; this PR does
not change credential behavior, only the default model list.
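The "::" provider-prefix convention described above can be sketched as a simple dispatch split; this illustrates the behavior (why "::" rather than ":" — model names like gpt-oss:120b already contain a colon), not the real routing table in query_model:

```python
def route_model(model: str) -> tuple[str, str]:
    """Sketch of the '::' provider-prefix convention (names illustrative).
    'ollama_cloud::gpt-oss:120b' -> ('ollama_cloud', 'gpt-oss:120b');
    no prefix falls through to local Ollama."""
    if "::" in model:
        provider, name = model.split("::", 1)
        return provider, name
    return "ollama", model
```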

Surfaced by lakehouse scrum-master pipeline run 2026-04-24, findings
confirmed by grep verification against the live code.
This pull request can be merged automatically.
This branch is out-of-date with the base branch

Checkout

From your project repository, check out a new branch and test the changes.
git fetch -u origin scrum/cloud-default-models:scrum/cloud-default-models
git checkout scrum/cloud-default-models
Reference: profit/llm-team-ui#1