lakehouse/docs/SCRUM_MASTER_SPEC.md
root f753e11157
Some checks failed
lakehouse/auditor 9 blocking issues: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
docs: SCRUM_MASTER_SPEC timeline — productization wave + verified live state
Splits the existing 04-25/26 section into two waves:
- experiment wave (mode-runner build-out, pre-productization)
- productization wave (OpenAI-compat, Archon, answers corpus,
  staffing native runner, multi-corpus + downgrade gate, observer
  paid escalation, /v1/chat → observer event wiring)

Adds verified-live block at the end with the numbers a fresh session
needs to anchor on: pathway memory 88 traces / 11 successful replays
at 100% (probation gate crossed), strong-model auto-downgrade firing
on grok-4.1-fast, and the auditor blind spot at static.ts:117 (now
fixed in 107a682).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 20:50:05 -05:00

336 lines
23 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Scrum Master Pipeline — Spec + Current State
**Status:** Active iteration on branch `scrum/auto-apply-19814` → PR #11 at git.agentview.dev/profit/lakehouse
**Branch commit head:** see `git log --oneline -1 scrum/auto-apply-19814` (auto-stale; check it)
> **2026-04-25 — see also `docs/MATRIX_AGENT_HANDOVER.md`** for the
> standalone `matrix-agent-validated` repo split + the Ansible playbook
> that deploys it. Note: VPS at 192.168.1.145 is a TEST VENV ONLY
> (partial deploy); the real destination is the `matrix-test` Incus
> container at 10.111.129.50.
This doc is the single handoff artifact for the scrum-master + auto-apply + pathway-memory loop. A fresh Claude Code session reading this + `docs/DECISIONS.md` (ADR-020 and ADR-021) + `docs/MATRIX_AGENT_HANDOVER.md` + `MEMORY.md` should have the same context as the session that wrote it.
## ▶ Refactor timeline (read in order)
The pipeline has been refactored substantially since the 2026-04-24
baseline below. Read the changes top-down to understand current shape:
### 2026-04-23 → 24 (foundation, captured in §1-§12 below)
- 9-rung cloud ladder + tree-split + adversarial prompt
- Pathway memory base + ADR-021 semantic-correctness layer
- Hardened auto-applier (5 gates: confidence/size/cargo/warnings/rationale)
- Hand-review wire (commit `3f166a5`) — judgment moved out of inner loop
- Anchor-grounding post-verifier (commit `9cc0ceb` / `9ecc584`)
- Single-model retry with enrichment (commit `d187bcd`) — stop cascading on quality
- Unified matrix retriever pulling from ALL KB corpora (commit `a496ced`)
- Paid OpenRouter ladder + Kimi K2.6 + Gemini 2.5 (commit `4ac5656`)
- Goal-driven autonomous loop harness (commit `e79e51e`)
### 2026-04-25 → 26 morning (mode-runner experiment wave)
- **Observer health-probe TypeConfusion fix** (`54689d5`) — `r.json()` on text/plain `/health` was crash-looping the observer; sealed in pathway_memory as `TypeConfusion:fetch-health-json`.
- **Adjacency-pollution relevance filter** (`0115a60`) — observer `/relevance` endpoint + scrum wiring (`LH_RELEVANCE_FILTER` / `LH_RELEVANCE_THRESHOLD`). Drops chunks about symbols the focus file IMPORTS but doesn't define.
- **Audit-consensus → retire wire** (`626f18d`) — when observer rejects a hot-swap-recommended attempt, immediately call `/vectors/pathway/retire` on the trace. `HotSwapCandidate` gained `trace_uid` for single-trace precision. Confidence ≥0.7 gate avoids retiring on heuristic-fallback verdicts.
- **`/v1/mode` router phase 1** (`d277efb`) — task_class → mode/model decision endpoint with `config/modes.toml`. Decision-only; doesn't execute.
- **Native enrichment runner** (`86f63a0`) — `codereview_lakehouse` mode that COMPOSES every primitive (focus file + bug fingerprints + relevance-filtered matrix + adversarial framing) into ONE prompt for one-shot success. `POST /v1/mode/execute`. Modes-as-prompt-molders, not model-pickers — see ★ Insight from session 2026-04-26.
- **Parameterized runner + 5 experiment modes** (`7c47734`) — `codereview_lakehouse|null|isolation|matrix_only|playbook_only`. Each isolates one architectural axis. `scripts/mode_experiment.ts` sweeps files × modes; `scripts/mode_compare.ts` aggregates with grounding check (catches confabulation by comparing cited symbols to real file content).
- **Scrum mode-runner fast path** (`7c47734`) — gated by `LH_USE_MODE_RUNNER=1`, scrum tries `/v1/mode/execute` BEFORE the 9-rung ladder. Falls through to ladder if response < `LH_MODE_MIN_CHARS` or anything errors. Off by default until A/B-validated.
- **Mode-compare grounding column** (`52bb216`) emoji-tolerant section regex + control-flag tagging. Caught `playbook_only` confabulation that hand-grading also found.
### 2026-04-26 evening (productization wave)
- **Override knobs + staffing native runner** (`56bf30c`) pass 2/3/4 harnesses, mode runner now serves `staffing.fill` task class natively, not just code review.
- **Multi-corpus runner + variance harness + strong-model downgrade gate** (`2dbc8db`) three corpora (arch / findings / symbols) selectable per mode. Paid models auto-downgrade: skip matrix corpus, isolation framing only. Driven by `feedback_composed_corpora_anti_additive.md` (composed corpora LOST 5/5 vs isolation on grok-4.1-fast, p=0.031).
- **OpenAI-compat alias + smart provider routing** (`3a0b37e`) gateway is now a drop-in middleware for any OpenAI SDK consumer. Three routing flavors verified via `/tmp/archon-test/sdk-test.ts`: `openai/gpt-4o-mini`, bare `gpt-4o-mini`, `x-ai/grok-4.1-fast`.
- **OpenAI multimodal content shape** (`540a9a2`) accepts `content: [...]` array-of-parts.
- **`/v1/chat` fires observer event** (`d1d97a0`) every chat call now lands both Langfuse trace AND observer `/event` (was Langfuse only).
- **Archon workflow** (`69919d9`) `.archon/workflows/lakehouse-architect-review.yaml`. 3 Pi nodes (shape weakness improvement) using `openrouter/x-ai/grok-4.1-fast` through the gateway.
- **Observer KB enrichment preamble** (`d9bd4c9`) observer prepends KB context to escalation prompts (was raw failure cluster).
- **Observer escalation paid OpenRouter** (`340fca2`) `deepseek-v3.1-terminus` instead of free-tier rescue. Verified: diagnoses cite architectural patterns (circuit breaker, adapter files) instead of generic timeouts.
- **Gold-standard answer corpus** (`0844206`) `scripts/build_answers_corpus.ts` indexes `lakehouse_answers_v1` from `scrum_reviews.jsonl + observer_escalations.jsonl`. Doc ID prefixes (`review:` vs `escalation:`) let consumers same-file-gate or broaden. Auto-rebuilds from scrum epilogue (`LH_SCRUM_SKIP_ANSWERS_REBUILD=1` to disable). Observer `buildKbPreamble` now blends three sources (pathway + arch + answers); preamble grew 416 727 chars.
### Verified live state (2026-04-26 ~23:30)
- Pathway memory: **88 traces, 11/11 successful replays = 100%** hot-swap probation gate crossed; live recommendations firing.
- Strong-model auto-downgrade verified: scrum on grok-4.1-fast matrix corpus dropped, isolation mode auto-selected, 3 files accepted on attempt 1, ~27s each.
- Auditor verdict on PR #11 head `0844206`: **block** on 8 false positives `auditor/checks/static.ts:117` "field added but never read" check doesn't follow serde derives. Fix is in the auditor, not the code.
### Verified architectural insights (2026-04-26 experiment)
- `codereview_lakehouse` produces 100% grounded findings, beats every challenger.
- `codereview_playbook_only` (pathway-only, no file content) confabulates ~50% of findings keep as control, NEVER as recommendation.
- `codereview_null` (no enrichment, generic prompt) produces 0 ranked findings adversarial framing is load-bearing.
- Matrix corpus contributes ~2 grounded findings vs isolation. Small but real.
### Where to read what
- **Loop architecture (this doc, §1-§12):** original 2026-04-24 design.
- **Modes-as-enrichment vision:** `crates/gateway/src/v1/mode.rs` doc comment + `config/modes.toml`.
- **Mode experiment results:** `data/_kb/mode_experiments.jsonl` + `bun run scripts/mode_compare.ts`.
- **Pathway memory mechanics:** `crates/vectord/src/pathway_memory.rs` + ADR-021 in `docs/DECISIONS.md`.
- **Handover to fresh box:** `docs/MATRIX_AGENT_HANDOVER.md`.
## 1. What the loop is
An autonomous review-and-commit pipeline that:
1. **Scrum master** (`tests/real-world/scrum_master_pipeline.ts`) walks a target-file list, asks a 9-rung escalation ladder of cloud models to produce a forensic audit against PRD + a change proposal doc, retries with learning context until acceptance, emits a structured review row.
2. **Pathway memory** (`crates/vectord/src/pathway_memory.rs`) stores the full backtrack context of each review (attempts, KB chunks, flags, bug fingerprints) indexed by a narrow fingerprint (`task_class + file_prefix + signal_class`). On every new review, it prepends historical bug patterns as a preamble so the reviewer preempts recurrences. Retired pathways auto-exclude themselves from hot-swap eligibility.
3. **Auto-applier** (`tests/real-world/scrum_applier.ts`) filters schema_v4 review rows by gradient_tier + confidence, asks `qwen3-coder:480b` for concrete `old_string/new_string` patches, runs `cargo check --workspace`, commits on green OR reverts on red/warning-count-up/rationale-mismatch.
4. **Observer** (`mcp-server/observer.ts`) receives per-file `/event` emissions, escalates failure clusters to LLM Team via `/v1/chat` with `qwen3-coder:480b`.
5. **Auditor** (`auditor/audit.ts`) external N=3 consensus re-check of scrum findings; writes to `data/_kb/audit_facts.jsonl`.
The guiding principle: **every KB write has a reader, every PR claim is diff-verifiable.**
## 2. The 9-rung ladder (cloud-first, strongest-model-first)
Defined in `tests/real-world/scrum_master_pipeline.ts` at `const LADDER`:
| # | Provider | Model | Role |
|---|---|---|---|
| 1 | ollama_cloud | `kimi-k2:1t` | flagship, 1T params |
| 2 | ollama_cloud | `qwen3-coder:480b` | coding specialist, 480B |
| 3 | ollama_cloud | `deepseek-v3.1:671b` | reasoning, 671B |
| 4 | ollama_cloud | `mistral-large-3:675b` | deep analysis, 675B |
| 5 | ollama_cloud | `gpt-oss:120b` | reliable workhorse |
| 6 | ollama_cloud | `qwen3.5:397b` | dense 397B, final thinker |
| 7 | openrouter | `openai/gpt-oss-120b:free` | free-tier rescue |
| 8 | openrouter | `google/gemma-3-27b-it:free` | fastest rescue |
| 9 | ollama | `qwen3.5:latest` | last-resort local |
**Each attempt is evaluated by `isAcceptable()`** (chars 3800 AND not a malformed JSON-only dump). On reject, the next rung sees a learning preamble with the prior rejection reason.
## 3. Tree-split reducer
Files larger than `FILE_TREE_SPLIT_THRESHOLD = 6000` bytes get chunked into `FILE_SHARD_SIZE = 3500`-byte shards. Each shard gets summarized via a fast rung, summaries are concatenated with internal `§N§` markers, then fed as a SCRATCHPAD to the reviewer. The `§N§` markers are stripped before the reviewer sees the merged context so it cannot claim "(shard 3)" in titles.
Bug regime this fixed: pre-tree-split iters had reviewers claim fields were "missing" because the field was past the 6KB context cutoff, not actually absent.
## 4. Schema v4 KB rows
`data/_kb/scrum_reviews.jsonl` one row per accepted review. Fields:
```json
{
"file": "crates/queryd/src/service.rs",
"reviewed_at": "2026-04-24T11:06:56Z",
"accepted_model": "ollama_cloud/kimi-k2:1t",
"accepted_on_attempt": 1,
"attempts_made": 1,
"tree_split_fired": true,
"suggestions_preview": "<truncated-2000-char>",
"confidences_per_finding": [92, 90, 88, 85, 75],
"confidence_avg": 86,
"confidence_min": 75,
"findings_count": 5,
"gradient_tier": "dry_run", // auto ≥90 / dry_run ≥70 / simulation ≥50 / block <50
"gradient_tier_avg": "dry_run",
"alignment_score": 3, // 1-10 self-rated
"output_format": "forensic_json",
"verdict": "fail", // pass | needs_patch | fail
"critical_failures_count": 3,
"pseudocode_flags_count": 0,
"prd_mismatches_count": 4,
"missing_components_count": 6,
"verified_components_count": 2,
"risk_points_count": 3,
"schema_version": 4,
"scrum_master_reviewed": true,
// ADR-021 fields on pathway trace (NOT this row, see pathway_memory state.json)
"pathway_hot_swap_hit": false,
"pathway_id": null,
"pathway_similarity": null,
"pathway_success_rate": null,
"rungs_saved": 0
}
```
## 5. Applier hardened gates (landed 5e8d87b)
`tests/real-world/scrum_applier.ts` has **5 gates** between emitter output and commit. A patch must pass ALL:
1. **Confidence gate** emitter's self-reported `confidence >= MIN_CONF` (default 90; 85 with relaxed env). Rejected patches log reason `confidence NN < MM`.
2. **Size gate** max 6 lines changed per patch. Prevents cascading cross-file refactors.
3. **Cargo-green gate** `cargo check --workspace` must pass. Red build `git checkout -- file`.
4. **Warning-count gate** workspace baseline warning count measured at start; after patch, new count must be `≤ baseline`. Catches unused-import additions and dead_code-after-allow-removed patterns. **THIS GATE CAUGHT 96b46cd's HashSet noise.**
5. **Rationale-diff token alignment** rationale text must share at least one non-stopword token with the patch's new_string. Catches "Add destructive SQL filter" claims on `use tracing;` diffs.
Plus:
- **Deny-list**: `config/`, `ops/`, `auditor/`, `docs/`, `data/`, `mcp-server/`, `ui/`, `sidecar/`, `scripts/` can't be auto-applied (human review required).
- **Branch guard**: refuses to run on `main`.
- **Dry-run workspace revert**: in `COMMIT=0` mode, file is reverted after check regardless of outcome no state pollution between runs.
Every decision logs to `data/_kb/auto_apply.jsonl` (action: `committed` / `build_red_reverted` / `warnings_increased_reverted` / `rationale_mismatch_reverted` / `all_rejected` / `no_patches` / `dry_run_would_commit`).
## 6. Pathway memory (ADR-021)
**Full spec: `docs/DECISIONS.md` ADR-021. Code: `crates/vectord/src/pathway_memory.rs`.**
Three-layer matrix index for compounding semantic-correctness signal:
### Fingerprint (narrow)
`pathway_id = SHA256(task_class + "|" + file_prefix + "|" + signal_class)` first 2 path segments (`crates/queryd`) so related files in the same crate share pathways.
### Embedding (similarity vector)
32-bucket L2-normalized token hash. Tokens include: task_class, file_path, signal_class, per-attempt model+rung+accepted flag, KB chunk source_docs, observer class, bridge libraries, sub-pipeline calls, **semantic_flags**, and **bug_fingerprints (flag+pattern_key)**.
**TS and Rust implementations byte-match** verified by smoke test showing cosine=1.0 on same input tokens. This is load-bearing for the TS-written traces to be searchable against the Rust-indexed space.
### Hot-swap gate (5-factor AND)
```
narrow_fingerprint_matches
AND audit_consensus.pass != false (null OK during bootstrap)
AND replay_count >= 3 (probation)
AND success_rate >= 0.80
AND NOT retired
AND similarity(query_vec, stored.pathway_vec) >= 0.90
```
Replay bookkeeping: on hot-swap, `replay_count++`; if the recommended model succeeded, `replays_succeeded++`; if `replay_count >= 3 AND success_rate < 0.80` `retired = true` (sticky prevents oscillation on noise).
### Semantic-correctness layer (ADR-021)
Each `PathwayTrace` carries:
- `semantic_flags: Vec<SemanticFlag>` one of 9 variants: `UnitMismatch`, `TypeConfusion`, `NullableConfusion`, `OffByOne`, `StaleReference`, `PseudoImpl`, `DeadCode`, `WarningNoise`, `BoundaryViolation`
- `bug_fingerprints: Vec<BugFingerprint>` `{flag, pattern_key, example, occurrences}` where `pattern_key = "{Flag}:{sorted-top-3-identifiers-joined-by-hyphen}"`. Stable across prose variation.
- `type_hints_used: Vec<TypeHint>` `{source, symbol, type_repr}`. Phase E (not yet populated).
**Pre-review enrichment**: scrum calls `POST /vectors/pathway/bug_fingerprints` with `{task_class, file_path, signal_class, limit}` returns aggregated fingerprints sorted by occurrences descending. If any, a `📚 PATHWAY MEMORY` preamble is prepended to the reviewer prompt with "this file area had these patterns before check for recurrences."
**Post-review extractor** (Phase D, `scrum_master_pipeline.ts`): walks reviewer markdown line-by-line, finds lines containing a `SemanticFlag` variant, extracts identifier-shaped backtick-quoted tokens, filters out flag names + Rust keywords (self/mut/async/etc), sorts and takes top 3, builds `pattern_key = "{Flag}:{tokens}"`.
### HTTP surface (on gateway port 3100)
| Endpoint | Purpose |
|---|---|
| `POST /vectors/pathway/insert` | write a full PathwayTrace |
| `POST /vectors/pathway/query` | hot-swap candidate check (returns `{candidate: null}` or `{candidate: {...}}`) |
| `POST /vectors/pathway/record_replay` | update replay_count + success_rate after hot-swap |
| `GET /vectors/pathway/stats` | totals + reuse_rate + replay_success_rate |
| `POST /vectors/pathway/bug_fingerprints` | aggregated fingerprints by narrow fingerprint (for pre-review preamble) |
### State persistence
`data/_pathway_memory/state.json` JSON dump of all buckets. Loaded at gateway boot (`crates/gateway/src/main.rs` has `pwm.load_from_storage().await`).
## 7. Current state (2026-04-24 end of session)
### Commits on branch `scrum/auto-apply-19814` since iter-5 baseline
| # | SHA | Subject |
|---|---|---|
| 1 | `25ea3de` | observer fix route LLM Team escalation to `/v1/chat` qwen3-coder |
| 2 | `8b77d67` | OpenRouter rescue ladder + tree-split reducer + first auto-applier |
| 3 | `96b46cd` | first auto-applied commit (later found misleading) |
| 4 | `5e8d87b` | cleanup + applier hardening (warning + rationale + dry-run gates) |
| 5 | `9cc0ceb` | P42-002 truth gate into queryd `/sql` + `/paged` paths |
| 6 | `2f8b347` | pathway_memory base (PathwayTrace, hot-swap, 18 tests) |
| 7 | `86901f8` | queryd/delta.rs 6-line unit-mismatch fix |
| 8 | `92df0e9` | ADR-021 spec |
| 9 | `0a0843b` | ADR-021 Phases A+B+C (semantic_flags, prompt tags, preamble endpoint) |
| 10 | `ee31424` | ADR-021 Phase D (fingerprint extractor) |
| 11 | `f4cff66` | Phase D fix strip flag names + Rust keywords from pattern_keys |
### Matrix index state
- **12 pathway traces** in `data/_pathway_memory/state.json`
- **11 distinct bug fingerprints** across 4 Flag categories on `crates/queryd` narrow fingerprint (1 manually seeded + 10 extracted)
- **0 hot-swaps fired** (probation requires 3 replays per pathway; none reached yet)
### Active in-flight
- Iter 9 complete; next iter 10+ will use cleaner fingerprint extractor (`f4cff66`)
- 4 "noisy" pattern_keys from iter-9-file-1 pre-fix run (e.g., `DeadCode:DeadCode`) dormant, won't match future output, acceptable dead entries
### Queued (not yet implemented)
- **Phase E** `type_hints_used` population from `catalogd` column types, Arrow `RecordBatch.schema()`, Rust struct field types. Feeds typed context to reviewer prompt.
- **Auditor pathway audit_consensus wire** activates the strict-audit gate (currently lenient: null bootstraps, only explicit `false` blocks).
- **VCP UI cards** for "top bug fingerprints in last N iters" + "new patterns learned this iter"
## 8. How to run a new iteration
```bash
# Default 3 files (playbook_memory.rs, doc_drift.rs, auditor/audit.ts)
LH_SCRUM_FORENSIC=/home/profit/lakehouse/docs/SCRUM_FORENSIC_PROMPT.md \
LH_SCRUM_PROPOSAL=/home/profit/lakehouse/docs/SCRUM_FIX_WAVE.md \
bun run tests/real-world/scrum_master_pipeline.ts
# Targeted files:
LH_SCRUM_FILES="/home/profit/lakehouse/crates/queryd/src/delta.rs,/home/profit/lakehouse/crates/queryd/src/service.rs" \
LH_SCRUM_FORENSIC=... LH_SCRUM_PROPOSAL=... \
bun run tests/real-world/scrum_master_pipeline.ts
# Dry-run auto-applier against the latest scrum output:
LH_APPLIER_MIN_CONF=85 LH_APPLIER_MAX_FILES=10 \
LH_APPLIER_MODEL=qwen3-coder:480b \
LH_APPLIER_BRANCH=scrum/auto-apply-19814 \
bun run tests/real-world/scrum_applier.ts
# Actually commit (ONLY after dry-run looks clean):
LH_APPLIER_COMMIT=1 LH_APPLIER_MIN_CONF=85 LH_APPLIER_MAX_FILES=10 \
LH_APPLIER_MODEL=qwen3-coder:480b \
LH_APPLIER_BRANCH=scrum/auto-apply-19814 \
bun run tests/real-world/scrum_applier.ts
```
## 9. Verify services before running
```bash
# Gateway (port 3100) — must be up; pathway endpoints are here
curl -s http://localhost:3100/health # "lakehouse ok"
curl -s http://localhost:3100/vectors/pathway/stats # pathway memory totals
# UI (port 3950) — VCP dashboard + /data/pathway_stats aggregation
curl -s http://localhost:3950/data/pathway_stats
# Observer (port 3800) — event receiver + LLM Team escalation
curl -s http://localhost:3800/health 2>/dev/null || true
# Sidecar (port 3200) — Python embed
curl -s http://localhost:3200/health 2>/dev/null || true
# LLM Team (port 5000) — /api/run?mode=extract ONLY registered mode
# (others like code_review/patch/refactor return "Unknown mode")
curl -s http://localhost:5000/health 2>/dev/null || true
```
If gateway missing new routes after code change: `cargo build --release -p gateway && sudo systemctl restart lakehouse.service`.
If UI missing new routes: kill old `bun run ui/server.ts` and restart (not a systemd service right now).
## 10. Where things live (code pointers)
| Concern | File |
|---|---|
| Scrum orchestrator | `tests/real-world/scrum_master_pipeline.ts` |
| Scrum ladder constant | same file, `const LADDER` line ~92 |
| Tree-split reducer | same file, `async function treeSplitFile` |
| Forensic prompt preamble (loaded via env) | `docs/SCRUM_FORENSIC_PROMPT.md` |
| Fix-wave proposal preamble | `docs/SCRUM_FIX_WAVE.md` |
| Scrum iter notes | `docs/SCRUM_LOOP_NOTES.md` |
| Auto-applier | `tests/real-world/scrum_applier.ts` |
| Applier audit trail | `data/_kb/auto_apply.jsonl` |
| Scrum reviews KB | `data/_kb/scrum_reviews.jsonl` |
| Model trust journal | `data/_kb/model_trust.jsonl` |
| Pathway memory module | `crates/vectord/src/pathway_memory.rs` |
| Pathway HTTP handlers | `crates/vectord/src/service.rs` (bottom) |
| Pathway state on disk | `data/_pathway_memory/state.json` |
| VCP UI server | `ui/server.ts` |
| VCP UI client | `ui/ui.js` + `ui/ui.css` + `ui/index.html` |
| Observer | `mcp-server/observer.ts` |
| Auditor | `auditor/audit.ts` |
| LLM Team extract client | `auditor/fact_extractor.ts` |
| ADR-021 spec | `docs/DECISIONS.md` ADR-021 |
## 11. Key memory files a fresh session should read
From `/root/.claude/projects/-home-profit/memory/`:
- `project_scrum_pipeline.md` updated state of the scrum iterations
- `project_first_auto_apply.md` 96b46cd story + cleanup + hardening evidence from iter 7
- `feedback_semantic_correctness_via_matrix.md` J's insight on compounding, the ADR-021 rule
- `feedback_endpoint_probe_discipline.md` GET 405 is not endpoint validation
- `reference_llm_team_modes.md` only `extract` is registered on port 5000
- `feedback_scrum_cloud_first.md` scrum/audit/enrich pipelines use cloud first
- `feedback_cloud_determinism.md` cloud N=3 consensus + qwen3-coder tie-breaker
## 12. Known gotchas
- **Gateway restart needed after Rust route additions.** `sudo systemctl restart lakehouse.service` the service is systemd-managed.
- **UI server needs manual restart** after `ui/server.ts` changes (no systemd unit). Kill old `bun` pid, restart with `bun run ui/server.ts &`.
- **LLM Team mode `code_review` doesn't exist** only `extract` is registered in `/root/llm_team_ui.py`. Don't wire new features to "Unknown mode" endpoints. See `reference_llm_team_modes.md`.
- **OpenRouter free-tier 429s during consensus probes** are normal (rate-limited upstream). In the production ladder they hit as last-resort rescue with seconds-to-minutes gap; different traffic pattern than rapid-fire consensus runs.
- **Openrouter minimax-m2.5:free has a 45s timeout** not in ladder, only for one-off probes.
- **Probation period is 3 replays** before hot-swap can fire. On a fresh install, no hot-swap fires until a pathway has been re-visited 3 times.