lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
Setup for the corpus-tightening experiment sweep (J 2026-04-26 — "now
is the only cheap window before the corpus gets large and refactoring
costs go up").
Override params on /v1/mode/execute (additive — old callers unaffected):
force_matrix_corpus — Pass 2: try alternate corpora per call
force_relevance_threshold — Pass 2: sweep filter strictness
force_temperature — Pass 3: variance test
New native mode `staffing_inference_lakehouse` (Pass 4):
- Same composer architecture as codereview_lakehouse
- Staffing framing: coordinator producing fillable|contingent|
unfillable verdict + ranked candidate list with playbook citations
- matrix_corpus = workers_500k_v8
- Validates that modes-as-prompt-molders generalizes beyond code
- Framing explicitly says "do NOT fabricate workers" — the staffing
analog of the lakehouse mode's symbol-grounding requirement
Three sweep harnesses:
scripts/mode_pass2_corpus_sweep.ts — 4 corpora × 4 thresholds × 5 files
scripts/mode_pass3_variance.ts — 3 files × 3 temps × 5 reps
scripts/mode_pass4_staffing.ts — 5 fill requests through staffing mode
Each appends per-call rows to data/_kb/mode_experiments.jsonl which
mode_compare.ts already aggregates with grounding column.
Pass 1 (10 files × 5 modes broad sweep) currently running via the
existing scripts/mode_experiment.ts — gateway restart deferred until
it completes so the new override knobs aren't enabled mid-experiment.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
J's framing (2026-04-26): "Modes are how you ask ONCE and get BETTER
information — they mold the data, hyperfocus the prompt on this
codebase's needs, so the model gets it right the first time without
the cascading retry ladder."
Built the first concrete native enrichment runner (codereview_lakehouse)
that composes every context primitive the gateway exposes:
1. Focus file content (read from disk OR caller-supplied)
2. Pathway memory bug_fingerprints for this file area (ADR-021
preamble — "📚 BUGS PREVIOUSLY FOUND IN THIS FILE AREA")
3. Matrix corpus search via the task_class's matrix_corpus
4. Relevance filter (observer /relevance) drops adjacency pollution
5. Assembles ONE precise prompt with system framing
6. Single call to /v1/chat with the recommended model
POST /v1/mode/execute dispatches. Native mode → runs the composer.
Non-native mode → 501 NOT_IMPLEMENTED with hint (proxy to LLM Team
/api/run is queued).
Provider hint logic auto-routes by model name shape:
- vendor/model[:tag] → openrouter
- kimi-*/qwen3-coder*/deepseek-v*/mistral-large* → ollama_cloud
- everything else → local ollama
Live test against crates/queryd/src/delta.rs (10593 bytes, 10
historical bug fingerprints, 2 matrix chunks dropped by relevance):
- enriched_chars: 12876
- response_chars: 16346 (14 findings with confidence percentages)
- Model literally cited the pathway memory preamble in finding #7
- One call to free-tier gpt-oss:120b produced what previously
required the 9-rung escalation ladder
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
HANDOVER §queued (2026-04-25): "Mode router — port LLM Team multi-model
patterns. Pick the right TOOL/MODE for each task class via the matrix,
not cascade through models."
Two-stage architecture:
1. Decision (POST /v1/mode) — pure recommendation, no execution.
Returns {mode, model, decision: {source, fallbacks, matrix_corpus,
notes}} so callers see WHY this mode was picked.
2. Execution (future POST /v1/mode/execute) — proxy to LLM Team
/api/run for modes not yet ported to native Rust runners. Not
wired in this phase.
Splitting decision from execution lets us A/B-test the routing logic
without committing to running every recommendation. The decision
function is pure enough for exhaustive unit tests (3 added).
config/modes.toml — initial map for 5 task_classes (scrum_review,
contract_analysis, staffing_inference, fact_extract, doc_drift_check)
+ a default. matrix_corpus per task is reserved for the future
matrix-informed routing pass.
VALID_MODES list (24 modes) is kept in sync manually with LLM Team's
/api/run handler at /root/llm_team_ui.py:10581. Adding a mode here
without adding it upstream returns 400 from a future proxy.
GET /v1/mode/list — operator introspection so a UI can render the
registry table without re-parsing TOML.
Live-tested: 5 task classes match, unknown classes fall through to
default, force_mode override works + validates, bogus modes return
400 with the valid_modes list.
Updates reference_llm_team_modes.md memory — earlier note claiming
"only extract is registered" was wrong (all 25 are registered).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>