lakehouse

profit/lakehouse

Fork 0

Commit Graph

Author	SHA1	Message	Date
root	56bf30cfd8	v1/mode: override knobs + staffing native runner + pass 2/3/4 harnesses Some checks failed lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts Setup for the corpus-tightening experiment sweep (J 2026-04-26 — "now is the only cheap window before the corpus gets large and refactoring costs go up"). Override params on /v1/mode/execute (additive — old callers unaffected): force_matrix_corpus — Pass 2: try alternate corpora per call force_relevance_threshold — Pass 2: sweep filter strictness force_temperature — Pass 3: variance test New native mode `staffing_inference_lakehouse` (Pass 4): - Same composer architecture as codereview_lakehouse - Staffing framing: coordinator producing fillable\|contingent\| unfillable verdict + ranked candidate list with playbook citations - matrix_corpus = workers_500k_v8 - Validates that modes-as-prompt-molders generalizes beyond code - Framing explicitly says "do NOT fabricate workers" — the staffing analog of the lakehouse mode's symbol-grounding requirement Three sweep harnesses: scripts/mode_pass2_corpus_sweep.ts — 4 corpora × 4 thresholds × 5 files scripts/mode_pass3_variance.ts — 3 files × 3 temps × 5 reps scripts/mode_pass4_staffing.ts — 5 fill requests through staffing mode Each appends per-call rows to data/_kb/mode_experiments.jsonl which mode_compare.ts already aggregates with grounding column. Pass 1 (10 files × 5 modes broad sweep) currently running via the existing scripts/mode_experiment.ts — gateway restart deferred until it completes so the new override knobs aren't enabled mid-experiment. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 01:55:12 -05:00
root	86f63a083d	v1/mode: codereview_lakehouse native runner — modes are prompt-molders Some checks failed lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts J's framing (2026-04-26): "Modes are how you ask ONCE and get BETTER information — they mold the data, hyperfocus the prompt on this codebase's needs, so the model gets it right the first time without the cascading retry ladder." Built the first concrete native enrichment runner (codereview_lakehouse) that composes every context primitive the gateway exposes: 1. Focus file content (read from disk OR caller-supplied) 2. Pathway memory bug_fingerprints for this file area (ADR-021 preamble — "📚 BUGS PREVIOUSLY FOUND IN THIS FILE AREA") 3. Matrix corpus search via the task_class's matrix_corpus 4. Relevance filter (observer /relevance) drops adjacency pollution 5. Assembles ONE precise prompt with system framing 6. Single call to /v1/chat with the recommended model POST /v1/mode/execute dispatches. Native mode → runs the composer. Non-native mode → 501 NOT_IMPLEMENTED with hint (proxy to LLM Team /api/run is queued). Provider hint logic auto-routes by model name shape: - vendor/model[:tag] → openrouter - kimi-/qwen3-coder/deepseek-v/mistral-large → ollama_cloud - everything else → local ollama Live test against crates/queryd/src/delta.rs (10593 bytes, 10 historical bug fingerprints, 2 matrix chunks dropped by relevance): - enriched_chars: 12876 - response_chars: 16346 (14 findings with confidence percentages) - Model literally cited the pathway memory preamble in finding #7 - One call to free-tier gpt-oss:120b produced what previously required the 9-rung escalation ladder Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 00:28:46 -05:00
root	d277efbfd2	v1/mode: task_class → mode/model router (decision-only, phase 1) Some checks failed lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts HANDOVER §queued (2026-04-25): "Mode router — port LLM Team multi-model patterns. Pick the right TOOL/MODE for each task class via the matrix, not cascade through models." Two-stage architecture: 1. Decision (POST /v1/mode) — pure recommendation, no execution. Returns {mode, model, decision: {source, fallbacks, matrix_corpus, notes}} so callers see WHY this mode was picked. 2. Execution (future POST /v1/mode/execute) — proxy to LLM Team /api/run for modes not yet ported to native Rust runners. Not wired in this phase. Splitting decision from execution lets us A/B-test the routing logic without committing to running every recommendation. The decision function is pure enough for exhaustive unit tests (3 added). config/modes.toml — initial map for 5 task_classes (scrum_review, contract_analysis, staffing_inference, fact_extract, doc_drift_check) + a default. matrix_corpus per task is reserved for the future matrix-informed routing pass. VALID_MODES list (24 modes) is kept in sync manually with LLM Team's /api/run handler at /root/llm_team_ui.py:10581. Adding a mode here without adding it upstream returns 400 from a future proxy. GET /v1/mode/list — operator introspection so a UI can render the registry table without re-parsing TOML. Live-tested: 5 task classes match, unknown classes fall through to default, force_mode override works + validates, bogus modes return 400 with the valid_modes list. Updates reference_llm_team_modes.md memory — earlier note claiming "only extract is registered" was wrong (all 25 are registered). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 00:16:32 -05:00

Author

SHA1

Message

Date

root

56bf30cfd8

v1/mode: override knobs + staffing native runner + pass 2/3/4 harnesses

lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts

Setup for the corpus-tightening experiment sweep (J 2026-04-26 — "now
is the only cheap window before the corpus gets large and refactoring
costs go up").

Override params on /v1/mode/execute (additive — old callers unaffected):
  force_matrix_corpus      — Pass 2: try alternate corpora per call
  force_relevance_threshold — Pass 2: sweep filter strictness
  force_temperature         — Pass 3: variance test

New native mode `staffing_inference_lakehouse` (Pass 4):
  - Same composer architecture as codereview_lakehouse
  - Staffing framing: coordinator producing fillable|contingent|
    unfillable verdict + ranked candidate list with playbook citations
  - matrix_corpus = workers_500k_v8
  - Validates that modes-as-prompt-molders generalizes beyond code
  - Framing explicitly says "do NOT fabricate workers" — the staffing
    analog of the lakehouse mode's symbol-grounding requirement

Three sweep harnesses:
  scripts/mode_pass2_corpus_sweep.ts — 4 corpora × 4 thresholds × 5 files
  scripts/mode_pass3_variance.ts     — 3 files × 3 temps × 5 reps
  scripts/mode_pass4_staffing.ts     — 5 fill requests through staffing mode

Each appends per-call rows to data/_kb/mode_experiments.jsonl which
mode_compare.ts already aggregates with grounding column.

Pass 1 (10 files × 5 modes broad sweep) currently running via the
existing scripts/mode_experiment.ts — gateway restart deferred until
it completes so the new override knobs aren't enabled mid-experiment.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-26 01:55:12 -05:00

root

86f63a083d

v1/mode: codereview_lakehouse native runner — modes are prompt-molders

lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts

J's framing (2026-04-26): "Modes are how you ask ONCE and get BETTER
information — they mold the data, hyperfocus the prompt on this
codebase's needs, so the model gets it right the first time without
the cascading retry ladder."

Built the first concrete native enrichment runner (codereview_lakehouse)
that composes every context primitive the gateway exposes:

  1. Focus file content (read from disk OR caller-supplied)
  2. Pathway memory bug_fingerprints for this file area (ADR-021
     preamble — "📚 BUGS PREVIOUSLY FOUND IN THIS FILE AREA")
  3. Matrix corpus search via the task_class's matrix_corpus
  4. Relevance filter (observer /relevance) drops adjacency pollution
  5. Assembles ONE precise prompt with system framing
  6. Single call to /v1/chat with the recommended model

POST /v1/mode/execute dispatches. Native mode → runs the composer.
Non-native mode → 501 NOT_IMPLEMENTED with hint (proxy to LLM Team
/api/run is queued).

Provider hint logic auto-routes by model name shape:
  - vendor/model[:tag] → openrouter
  - kimi-*/qwen3-coder*/deepseek-v*/mistral-large* → ollama_cloud
  - everything else → local ollama

Live test against crates/queryd/src/delta.rs (10593 bytes, 10
historical bug fingerprints, 2 matrix chunks dropped by relevance):
  - enriched_chars: 12876
  - response_chars: 16346 (14 findings with confidence percentages)
  - Model literally cited the pathway memory preamble in finding #7
  - One call to free-tier gpt-oss:120b produced what previously
    required the 9-rung escalation ladder

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-26 00:28:46 -05:00

root

d277efbfd2

v1/mode: task_class → mode/model router (decision-only, phase 1)

lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts

HANDOVER §queued (2026-04-25): "Mode router — port LLM Team multi-model
patterns. Pick the right TOOL/MODE for each task class via the matrix,
not cascade through models."

Two-stage architecture:
  1. Decision (POST /v1/mode) — pure recommendation, no execution.
     Returns {mode, model, decision: {source, fallbacks, matrix_corpus,
     notes}} so callers see WHY this mode was picked.
  2. Execution (future POST /v1/mode/execute) — proxy to LLM Team
     /api/run for modes not yet ported to native Rust runners. Not
     wired in this phase.

Splitting decision from execution lets us A/B-test the routing logic
without committing to running every recommendation. The decision
function is pure enough for exhaustive unit tests (3 added).

config/modes.toml — initial map for 5 task_classes (scrum_review,
contract_analysis, staffing_inference, fact_extract, doc_drift_check)
+ a default. matrix_corpus per task is reserved for the future
matrix-informed routing pass.

VALID_MODES list (24 modes) is kept in sync manually with LLM Team's
/api/run handler at /root/llm_team_ui.py:10581. Adding a mode here
without adding it upstream returns 400 from a future proxy.

GET /v1/mode/list — operator introspection so a UI can render the
registry table without re-parsing TOML.

Live-tested: 5 task classes match, unknown classes fall through to
default, force_mode override works + validates, bogus modes return
400 with the valid_modes list.

Updates reference_llm_team_modes.md memory — earlier note claiming
"only extract is registered" was wrong (all 25 are registered).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-26 00:16:32 -05:00

3 Commits