Splits the existing 04-25/26 section into two waves: - experiment wave (mode-runner build-out, pre-productization) - productization wave (OpenAI-compat, Archon, answers corpus, staffing native runner, multi-corpus + downgrade gate, observer paid escalation, /v1/chat → observer event wiring) Adds verified-live block at the end with the numbers a fresh session needs to anchor on: pathway memory 88 traces / 11 successful replays at 100% (probation gate crossed), strong-model auto-downgrade firing on grok-4.1-fast, and the auditor blind spot at static.ts:117 (now fixed in 107a682). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Scrum Master Pipeline — Spec + Current State

**Status:** Active iteration on branch `scrum/auto-apply-19814` → PR #11 at git.agentview.dev/profit/lakehouse
**Branch commit head:** see `git log --oneline -1 scrum/auto-apply-19814` (auto-stale; check it)

> **2026-04-25 — see also `docs/MATRIX_AGENT_HANDOVER.md`** for the
> standalone `matrix-agent-validated` repo split + the Ansible playbook
> that deploys it. Note: VPS at 192.168.1.145 is a TEST VENV ONLY
> (partial deploy); the real destination is the `matrix-test` Incus
> container at 10.111.129.50.

This doc is the single handoff artifact for the scrum-master + auto-apply + pathway-memory loop. A fresh Claude Code session reading this + `docs/DECISIONS.md` (ADR-020 and ADR-021) + `docs/MATRIX_AGENT_HANDOVER.md` + `MEMORY.md` should have the same context as the session that wrote it.

## ▶ Refactor timeline (read in order)

The pipeline has been refactored substantially since the 2026-04-24
baseline below. Read the changes top-down to understand current shape:

### 2026-04-23 → 24 (foundation, captured in §1-§12 below)
- 9-rung cloud ladder + tree-split + adversarial prompt
- Pathway memory base + ADR-021 semantic-correctness layer
- Hardened auto-applier (5 gates: confidence/size/cargo/warnings/rationale)
- Hand-review wire (commit `3f166a5`) — judgment moved out of inner loop
- Anchor-grounding post-verifier (commit `9cc0ceb` / `9ecc584`)
- Single-model retry with enrichment (commit `d187bcd`) — stop cascading on quality
- Unified matrix retriever pulling from ALL KB corpora (commit `a496ced`)
- Paid OpenRouter ladder + Kimi K2.6 + Gemini 2.5 (commit `4ac5656`)
- Goal-driven autonomous loop harness (commit `e79e51e`)

### 2026-04-25 → 26 morning (mode-runner experiment wave)
- **Observer health-probe TypeConfusion fix** (`54689d5`) — `r.json()` on text/plain `/health` was crash-looping the observer; sealed in pathway_memory as `TypeConfusion:fetch-health-json`.
- **Adjacency-pollution relevance filter** (`0115a60`) — observer `/relevance` endpoint + scrum wiring (`LH_RELEVANCE_FILTER` / `LH_RELEVANCE_THRESHOLD`). Drops chunks about symbols the focus file IMPORTS but doesn't define.
- **Audit-consensus → retire wire** (`626f18d`) — when observer rejects a hot-swap-recommended attempt, immediately call `/vectors/pathway/retire` on the trace. `HotSwapCandidate` gained `trace_uid` for single-trace precision. Confidence ≥0.7 gate avoids retiring on heuristic-fallback verdicts.
- **`/v1/mode` router phase 1** (`d277efb`) — task_class → mode/model decision endpoint with `config/modes.toml`. Decision-only; doesn't execute.
- **Native enrichment runner** (`86f63a0`) — `codereview_lakehouse` mode that COMPOSES every primitive (focus file + bug fingerprints + relevance-filtered matrix + adversarial framing) into ONE prompt for one-shot success. `POST /v1/mode/execute`. Modes-as-prompt-molders, not model-pickers — see ★ Insight from session 2026-04-26.
- **Parameterized runner + 5 experiment modes** (`7c47734`) — `codereview_lakehouse|null|isolation|matrix_only|playbook_only`. Each isolates one architectural axis. `scripts/mode_experiment.ts` sweeps files × modes; `scripts/mode_compare.ts` aggregates with grounding check (catches confabulation by comparing cited symbols to real file content).
- **Scrum mode-runner fast path** (`7c47734`) — gated by `LH_USE_MODE_RUNNER=1`, scrum tries `/v1/mode/execute` BEFORE the 9-rung ladder. Falls through to ladder if response < `LH_MODE_MIN_CHARS` or anything errors. Off by default until A/B-validated.
- **Mode-compare grounding column** (`52bb216`) — emoji-tolerant section regex + control-flag tagging. Caught `playbook_only` confabulation that hand-grading also found.
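
The grounding check mentioned above (comparing cited symbols to real file content) can be illustrated with a minimal sketch. The function name `groundingRatio` and the "backtick-quoted identifier" heuristic are assumptions made for illustration; the actual logic in `scripts/mode_compare.ts` is not reproduced in this doc and may differ.

```typescript
// Hypothetical sketch: share of backtick-quoted identifiers cited in a review
// that literally occur in the focus file. A low ratio suggests confabulation.
function groundingRatio(reviewMarkdown: string, fileContent: string): number {
  const cited: string[] = [];
  const re = /`([A-Za-z_][A-Za-z0-9_]*)`/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(reviewMarkdown)) !== null) {
    if (cited.indexOf(m[1]) === -1) cited.push(m[1]); // de-duplicate symbols
  }
  if (cited.length === 0) return 0; // nothing cited, nothing grounded
  const grounded = cited.filter((sym) => fileContent.indexOf(sym) !== -1);
  return grounded.length / cited.length;
}
```

A review that cites one real symbol and one invented symbol scores 0.5 under this heuristic, which is roughly the regime reported for `playbook_only`.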

### 2026-04-26 evening (productization wave)
- **Override knobs + staffing native runner** (`56bf30c`) — pass 2/3/4 harnesses, mode runner now serves `staffing.fill` task class natively, not just code review.
- **Multi-corpus runner + variance harness + strong-model downgrade gate** (`2dbc8db`) — three corpora (arch / findings / symbols) selectable per mode. Paid models auto-downgrade: skip matrix corpus, isolation framing only. Driven by `feedback_composed_corpora_anti_additive.md` (composed corpora LOST 5/5 vs isolation on grok-4.1-fast, p=0.031).
- **OpenAI-compat alias + smart provider routing** (`3a0b37e`) — gateway is now a drop-in middleware for any OpenAI SDK consumer. Three routing flavors verified via `/tmp/archon-test/sdk-test.ts`: `openai/gpt-4o-mini`, bare `gpt-4o-mini`, `x-ai/grok-4.1-fast`.
- **OpenAI multimodal content shape** (`540a9a2`) — accepts `content: [...]` array-of-parts.
- **`/v1/chat` fires observer event** (`d1d97a0`) — every chat call now lands both Langfuse trace AND observer `/event` (was Langfuse only).
- **Archon workflow** (`69919d9`) — `.archon/workflows/lakehouse-architect-review.yaml`. 3 Pi nodes (shape → weakness → improvement) using `openrouter/x-ai/grok-4.1-fast` through the gateway.
- **Observer KB enrichment preamble** (`d9bd4c9`) — observer prepends KB context to escalation prompts (was raw failure cluster).
- **Observer escalation → paid OpenRouter** (`340fca2`) — `deepseek-v3.1-terminus` instead of free-tier rescue. Verified: diagnoses cite architectural patterns (circuit breaker, adapter files) instead of generic timeouts.
- **Gold-standard answer corpus** (`0844206`) — `scripts/build_answers_corpus.ts` indexes `lakehouse_answers_v1` from `scrum_reviews.jsonl + observer_escalations.jsonl`. Doc ID prefixes (`review:` vs `escalation:`) let consumers same-file-gate or broaden. Auto-rebuilds from scrum epilogue (`LH_SCRUM_SKIP_ANSWERS_REBUILD=1` to disable). Observer `buildKbPreamble` now blends three sources (pathway + arch + answers); preamble grew 416 → 727 chars.
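
As an illustration of how a consumer might use the `review:` / `escalation:` doc ID prefixes to same-file-gate or broaden, here is a small sketch. Only the two prefixes come from this doc; the `AnswerHit` shape, the assumption that the file path follows the prefix, and the name `selectAnswers` are hypothetical.

```typescript
// Hypothetical hit shape; only the `review:` / `escalation:` prefixes are
// taken from the doc. Assumed ID layout: "<prefix><file path>".
interface AnswerHit {
  doc_id: string;
  text: string;
}

function selectAnswers(hits: AnswerHit[], focusFile: string, sameFileOnly: boolean): AnswerHit[] {
  const prefix = "review:";
  return hits.filter((hit) => {
    if (!sameFileOnly) return true; // broaden: keep reviews and escalations
    if (hit.doc_id.indexOf(prefix) !== 0) return false; // gate: reviews only
    return hit.doc_id.slice(prefix.length) === focusFile; // and same file only
  });
}
```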

### Verified live state (2026-04-26 ~23:30)
- Pathway memory: **88 traces, 11/11 successful replays = 100%** — hot-swap probation gate crossed; live recommendations firing.
- Strong-model auto-downgrade verified: scrum on grok-4.1-fast → matrix corpus dropped, isolation mode auto-selected, 3 files accepted on attempt 1, ~27s each.
- Auditor verdict on PR #11 head `0844206`: **block** on 8 false positives — `auditor/checks/static.ts:117` "field added but never read" check doesn't follow serde derives. Fix is in the auditor, not the code.

### Verified architectural insights (2026-04-26 experiment)
- `codereview_lakehouse` produces 100% grounded findings, beats every challenger.
- `codereview_playbook_only` (pathway-only, no file content) confabulates ~50% of findings — keep as control, NEVER as recommendation.
- `codereview_null` (no enrichment, generic prompt) produces 0 ranked findings — adversarial framing is load-bearing.
- Matrix corpus contributes ~2 grounded findings vs isolation. Small but real.

### Where to read what
- **Loop architecture (this doc, §1-§12):** original 2026-04-24 design.
- **Modes-as-enrichment vision:** `crates/gateway/src/v1/mode.rs` doc comment + `config/modes.toml`.
- **Mode experiment results:** `data/_kb/mode_experiments.jsonl` + `bun run scripts/mode_compare.ts`.
- **Pathway memory mechanics:** `crates/vectord/src/pathway_memory.rs` + ADR-021 in `docs/DECISIONS.md`.
- **Handover to fresh box:** `docs/MATRIX_AGENT_HANDOVER.md`.

## 1. What the loop is

An autonomous review-and-commit pipeline with five components:

1. **Scrum master** (`tests/real-world/scrum_master_pipeline.ts`) — walks a target-file list, asks a 9-rung escalation ladder of cloud models to produce a forensic audit against PRD + a change proposal doc, retries with learning context until acceptance, emits a structured review row.
2. **Pathway memory** (`crates/vectord/src/pathway_memory.rs`) — stores the full backtrack context of each review (attempts, KB chunks, flags, bug fingerprints) indexed by a narrow fingerprint (`task_class + file_prefix + signal_class`). On every new review, it prepends historical bug patterns as a preamble so the reviewer preempts recurrences. Retired pathways auto-exclude themselves from hot-swap eligibility.
3. **Auto-applier** (`tests/real-world/scrum_applier.ts`) — filters schema_v4 review rows by gradient_tier + confidence, asks `qwen3-coder:480b` for concrete `old_string/new_string` patches, runs `cargo check --workspace`, commits on green OR reverts on red/warning-count-up/rationale-mismatch.
4. **Observer** (`mcp-server/observer.ts`) — receives per-file `/event` emissions, escalates failure clusters to LLM Team via `/v1/chat` with `qwen3-coder:480b`.
5. **Auditor** (`auditor/audit.ts`) — external N=3 consensus re-check of scrum findings; writes to `data/_kb/audit_facts.jsonl`.

The guiding principle: **every KB write has a reader, every PR claim is diff-verifiable.**

## 2. The 9-rung ladder (cloud-first, strongest-model-first)

Defined in `tests/real-world/scrum_master_pipeline.ts` at `const LADDER`:

| # | Provider | Model | Role |
|---|---|---|---|
| 1 | ollama_cloud | `kimi-k2:1t` | flagship, 1T params |
| 2 | ollama_cloud | `qwen3-coder:480b` | coding specialist, 480B |
| 3 | ollama_cloud | `deepseek-v3.1:671b` | reasoning, 671B |
| 4 | ollama_cloud | `mistral-large-3:675b` | deep analysis, 675B |
| 5 | ollama_cloud | `gpt-oss:120b` | reliable workhorse |
| 6 | ollama_cloud | `qwen3.5:397b` | dense 397B, final thinker |
| 7 | openrouter | `openai/gpt-oss-120b:free` | free-tier rescue |
| 8 | openrouter | `google/gemma-3-27b-it:free` | fastest rescue |
| 9 | ollama | `qwen3.5:latest` | last-resort local |

**Each attempt is evaluated by `isAcceptable()`** (chars ≥ 3800 AND not a malformed JSON-only dump). On reject, the next rung sees a learning preamble with the prior rejection reason.
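
The acceptance rule and the escalate-with-preamble behaviour can be sketched as follows. This is not the code from `scrum_master_pipeline.ts`: the reading of "malformed JSON-only dump" (a reply that looks like a bare JSON value but does not parse), the preamble wording, and the synchronous `call` parameter (the real model request is asynchronous) are all assumptions made to keep the example small.

```typescript
const MIN_CHARS = 3800; // threshold quoted in the text above

interface Verdict { ok: boolean; reason?: string }

function isAcceptable(output: string): Verdict {
  if (output.length < MIN_CHARS) {
    return { ok: false, reason: `too short: ${output.length} < ${MIN_CHARS} chars` };
  }
  const trimmed = output.trim();
  // Assumed interpretation: the whole reply looks like JSON but fails to parse.
  if (trimmed.charAt(0) === "{" || trimmed.charAt(0) === "[") {
    try {
      JSON.parse(trimmed);
    } catch {
      return { ok: false, reason: "malformed JSON-only dump" };
    }
  }
  return { ok: true };
}

interface Rung { provider: string; model: string }

// `call` is a stand-in for the real model request.
function runLadder(ladder: Rung[], call: (rung: Rung, preamble: string) => string) {
  let preamble = "";
  for (let i = 0; i < ladder.length; i++) {
    const output = call(ladder[i], preamble);
    const verdict = isAcceptable(output);
    if (verdict.ok) return { acceptedOnAttempt: i + 1, model: ladder[i].model, output };
    // The next rung learns why the previous attempt was rejected.
    preamble = `Previous attempt (${ladder[i].model}) was rejected: ${verdict.reason}`;
  }
  return null; // every rung was rejected
}
```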

## 3. Tree-split reducer

Files larger than `FILE_TREE_SPLIT_THRESHOLD = 6000` bytes get chunked into `FILE_SHARD_SIZE = 3500`-byte shards. Each shard gets summarized via a fast rung, summaries are concatenated with internal `§N§` markers, then fed as a SCRATCHPAD to the reviewer. The `§N§` markers are stripped before the reviewer sees the merged context so it cannot claim "(shard 3)" in titles.

Bug regime this fixed: pre-tree-split iters had reviewers claim fields were "missing" because the field was past the 6KB context cutoff, not actually absent.
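
A minimal sketch of the split / merge / strip steps described above. The two constants are the ones named in the text; the helper names are hypothetical, and for simplicity the sketch measures string length rather than bytes, which is only equivalent for ASCII source. The per-shard summarization call is left out.

```typescript
const FILE_TREE_SPLIT_THRESHOLD = 6000;
const FILE_SHARD_SIZE = 3500;

// Split only when the file exceeds the threshold (length used as a byte proxy).
function splitIntoShards(content: string): string[] {
  if (content.length <= FILE_TREE_SPLIT_THRESHOLD) return [content];
  const shards: string[] = [];
  for (let i = 0; i < content.length; i += FILE_SHARD_SIZE) {
    shards.push(content.slice(i, i + FILE_SHARD_SIZE));
  }
  return shards;
}

// Concatenate per-shard summaries with internal §N§ markers.
function mergeSummaries(summaries: string[]): string {
  return summaries.map((s, i) => `§${i + 1}§ ${s}`).join("\n");
}

// Remove the markers so the reviewer cannot cite "(shard 3)".
function stripShardMarkers(merged: string): string {
  return merged.replace(/§\d+§\s*/g, "");
}
```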

## 4. Schema v4 KB rows

`data/_kb/scrum_reviews.jsonl` — one row per accepted review. Fields:

```jsonc
{
  "file": "crates/queryd/src/service.rs",
  "reviewed_at": "2026-04-24T11:06:56Z",
  "accepted_model": "ollama_cloud/kimi-k2:1t",
  "accepted_on_attempt": 1,
  "attempts_made": 1,
  "tree_split_fired": true,
  "suggestions_preview": "<truncated-2000-char>",
  "confidences_per_finding": [92, 90, 88, 85, 75],
  "confidence_avg": 86,
  "confidence_min": 75,
  "findings_count": 5,
  "gradient_tier": "dry_run",        // auto ≥90 / dry_run ≥70 / simulation ≥50 / block <50
  "gradient_tier_avg": "dry_run",
  "alignment_score": 3,              // 1-10 self-rated
  "output_format": "forensic_json",
  "verdict": "fail",                 // pass | needs_patch | fail
  "critical_failures_count": 3,
  "pseudocode_flags_count": 0,
  "prd_mismatches_count": 4,
  "missing_components_count": 6,
  "verified_components_count": 2,
  "risk_points_count": 3,
  "schema_version": 4,
  "scrum_master_reviewed": true,
  // ADR-021 fields on pathway trace (NOT this row, see pathway_memory state.json)
  "pathway_hot_swap_hit": false,
  "pathway_id": null,
  "pathway_similarity": null,
  "pathway_success_rate": null,
  "rungs_saved": 0
}
```
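
The tier thresholds in the comment above can be written as a small function. The thresholds come from the row comment; the assumption that `gradient_tier` is derived from `confidence_min` and `gradient_tier_avg` from the rounded average is an inference from the sample row, not something the doc states, and `summarizeConfidences` is a hypothetical name.

```typescript
type GradientTier = "auto" | "dry_run" | "simulation" | "block";

// Thresholds as documented: auto >= 90, dry_run >= 70, simulation >= 50, else block.
function tierFor(confidence: number): GradientTier {
  if (confidence >= 90) return "auto";
  if (confidence >= 70) return "dry_run";
  if (confidence >= 50) return "simulation";
  return "block";
}

// Assumption: the row tier follows the weakest finding, the avg tier the mean.
function summarizeConfidences(confidences: number[]) {
  if (confidences.length === 0) return null;
  const min = Math.min(...confidences);
  const avg = Math.round(confidences.reduce((sum, c) => sum + c, 0) / confidences.length);
  return {
    confidence_avg: avg,
    confidence_min: min,
    gradient_tier: tierFor(min),
    gradient_tier_avg: tierFor(avg),
  };
}
```

Applied to the sample row's `[92, 90, 88, 85, 75]`, this yields an average of 86, a minimum of 75, and `dry_run` for both tiers, matching the values shown.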

## 5. Applier hardened gates (landed 5e8d87b)

`tests/real-world/scrum_applier.ts` has **5 gates** between emitter output and commit. A patch must pass ALL:

1. **Confidence gate** — emitter's self-reported `confidence >= MIN_CONF` (default 90; 85 with relaxed env). Rejected patches log reason `confidence NN < MM`.
2. **Size gate** — max 6 lines changed per patch. Prevents cascading cross-file refactors.
3. **Cargo-green gate** — `cargo check --workspace` must pass. Red build → `git checkout -- file`.
4. **Warning-count gate** — workspace baseline warning count measured at start; after patch, new count must be `≤ baseline`. Catches unused-import additions and dead_code-after-allow-removed patterns. **THIS GATE CAUGHT 96b46cd's HashSet noise.**
5. **Rationale-diff token alignment** — rationale text must share at least one non-stopword token with the patch's new_string. Catches "Add destructive SQL filter" claims on `use tracing;` diffs.

Plus:
- **Deny-list**: `config/`, `ops/`, `auditor/`, `docs/`, `data/`, `mcp-server/`, `ui/`, `sidecar/`, `scripts/` can't be auto-applied (human review required).
- **Branch guard**: refuses to run on `main`.
- **Dry-run workspace revert**: in `COMMIT=0` mode, file is reverted after check regardless of outcome — no state pollution between runs.

Every decision logs to `data/_kb/auto_apply.jsonl` (action: `committed` / `build_red_reverted` / `warnings_increased_reverted` / `rationale_mismatch_reverted` / `all_rejected` / `no_patches` / `dry_run_would_commit`).
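
Gate 5 (rationale-diff token alignment) is the least obvious of the five, so here is a rough sketch of the idea. The stopword list, the tokenization regex, and the function names are assumptions for illustration; the applier's actual list and tokenizer are not shown in this doc.

```typescript
// Hypothetical stopword list; the real applier's list is not documented here.
const STOPWORDS = new Set([
  "a", "an", "the", "to", "of", "for", "in", "on", "and", "or", "with", "from", "is", "this", "that",
]);

// Lowercased identifier-like tokens, minus stopwords.
function contentTokens(text: string): string[] {
  const words = text.toLowerCase().match(/[a-z0-9_]+/g) || [];
  return words.filter((w) => !STOPWORDS.has(w));
}

// Passes when the rationale shares at least one non-stopword token with new_string.
function rationaleMatchesPatch(rationale: string, newString: string): boolean {
  const patchTokens = new Set(contentTokens(newString));
  return contentTokens(rationale).some((token) => patchTokens.has(token));
}
```

With this sketch, the doc's example of an "Add destructive SQL filter" rationale on a `use tracing;` diff shares no token and would be rejected, while a rationale that actually mentions `tracing` would pass.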

## 6. Pathway memory (ADR-021)

**Full spec: `docs/DECISIONS.md` ADR-021. Code: `crates/vectord/src/pathway_memory.rs`.**

Three-layer matrix index for compounding semantic-correctness signal:

### Fingerprint (narrow)
`pathway_id = SHA256(task_class + "|" + file_prefix + "|" + signal_class)` — first 2 path segments (`crates/queryd`) so related files in the same crate share pathways.
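
A minimal sketch of the fingerprint formula, using Node's built-in `crypto` module (also available under Bun). The hex output encoding, the helper names, and the example `task_class` / `signal_class` values used when calling it are assumptions; the Rust implementation in `pathway_memory.rs` is authoritative.

```typescript
import { createHash } from "node:crypto";

// First two path segments, e.g. "crates/queryd/src/delta.rs" -> "crates/queryd".
function filePrefix(filePath: string): string {
  return filePath.split("/").slice(0, 2).join("/");
}

// SHA256(task_class + "|" + file_prefix + "|" + signal_class), hex-encoded (assumed).
function pathwayId(taskClass: string, filePath: string, signalClass: string): string {
  const input = `${taskClass}|${filePrefix(filePath)}|${signalClass}`;
  return createHash("sha256").update(input).digest("hex");
}
```

Because only the prefix is hashed, `crates/queryd/src/delta.rs` and `crates/queryd/src/service.rs` map to the same `pathway_id` for a given task class and signal class.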

### Embedding (similarity vector)
32-bucket L2-normalized token hash. Tokens include: task_class, file_path, signal_class, per-attempt model+rung+accepted flag, KB chunk source_docs, observer class, bridge libraries, sub-pipeline calls, **semantic_flags**, and **bug_fingerprints (flag+pattern_key)**.

**TS and Rust implementations byte-match** — verified by smoke test showing cosine=1.0 on same input tokens. This is load-bearing for the TS-written traces to be searchable against the Rust-indexed space.
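
To make "32-bucket L2-normalized token hash" concrete, here is a sketch of the general technique. The token hash used here is 32-bit FNV-1a, chosen purely as a stand-in: the doc does not say which hash the TS and Rust implementations share, so this sketch will not byte-match the real index.

```typescript
const BUCKETS = 32;

// 32-bit FNV-1a over UTF-16 code units (stand-in hash; an assumption).
function fnv1a(token: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < token.length; i++) {
    h ^= token.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

// Count tokens per bucket, then scale the vector to unit L2 length.
function embed(tokens: string[]): number[] {
  const v: number[] = new Array<number>(BUCKETS).fill(0);
  for (const t of tokens) v[fnv1a(t) % BUCKETS] += 1;
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? v : v.map((x) => x / norm);
}

// For unit-length vectors the dot product is the cosine similarity.
function cosine(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}
```

The same token multiset always produces the same vector regardless of order, which is why identical inputs give cosine 1.0 (up to floating-point rounding).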

### Hot-swap gate (narrow-fingerprint match + 5-factor AND)
```
narrow_fingerprint_matches
AND audit_consensus.pass != false        (null OK during bootstrap)
AND replay_count >= 3                    (probation)
AND success_rate >= 0.80
AND NOT retired
AND similarity(query_vec, stored.pathway_vec) >= 0.90
```

Replay bookkeeping: on hot-swap, `replay_count++`; if the recommended model succeeded, `replays_succeeded++`; if `replay_count >= 3 AND success_rate < 0.80` → `retired = true` (sticky — prevents oscillation on noise).
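
The gate and the replay bookkeeping above translate fairly directly into code. The sketch below uses hypothetical camelCase field names and derives `success_rate` from the two counters; the real `PathwayTrace` struct in Rust may store these differently.

```typescript
// Hypothetical in-memory shape for one stored pathway.
interface StoredPathway {
  auditConsensusPass: boolean | null; // null is allowed during bootstrap
  replayCount: number;
  replaysSucceeded: number;
  retired: boolean;
}

function successRate(p: StoredPathway): number {
  return p.replayCount === 0 ? 0 : p.replaysSucceeded / p.replayCount;
}

// All conditions from the gate listing, ANDed together.
function hotSwapEligible(p: StoredPathway, fingerprintMatches: boolean, similarity: number): boolean {
  return (
    fingerprintMatches &&
    p.auditConsensusPass !== false &&
    p.replayCount >= 3 &&
    successRate(p) >= 0.8 &&
    !p.retired &&
    similarity >= 0.9
  );
}

// Bookkeeping after a replay; retirement is sticky once set.
function recordReplay(p: StoredPathway, succeeded: boolean): StoredPathway {
  const next: StoredPathway = {
    ...p,
    replayCount: p.replayCount + 1,
    replaysSucceeded: p.replaysSucceeded + (succeeded ? 1 : 0),
  };
  if (next.replayCount >= 3 && successRate(next) < 0.8) next.retired = true;
  return next;
}
```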

### Semantic-correctness layer (ADR-021)
Each `PathwayTrace` carries:
- `semantic_flags: Vec<SemanticFlag>` — one of 9 variants: `UnitMismatch`, `TypeConfusion`, `NullableConfusion`, `OffByOne`, `StaleReference`, `PseudoImpl`, `DeadCode`, `WarningNoise`, `BoundaryViolation`
- `bug_fingerprints: Vec<BugFingerprint>` — `{flag, pattern_key, example, occurrences}` where `pattern_key = "{Flag}:{sorted-top-3-identifiers-joined-by-hyphen}"`. Stable across prose variation.
- `type_hints_used: Vec<TypeHint>` — `{source, symbol, type_repr}`. Phase E (not yet populated).

**Pre-review enrichment**: scrum calls `POST /vectors/pathway/bug_fingerprints` with `{task_class, file_path, signal_class, limit}` — returns aggregated fingerprints sorted by occurrences descending. If any, a `📚 PATHWAY MEMORY` preamble is prepended to the reviewer prompt with "this file area had these patterns before — check for recurrences."

**Post-review extractor** (Phase D, `scrum_master_pipeline.ts`): walks reviewer markdown line-by-line, finds lines containing a `SemanticFlag` variant, extracts identifier-shaped backtick-quoted tokens, filters out flag names + Rust keywords (self/mut/async/etc), sorts and takes top 3, builds `pattern_key = "{Flag}:{tokens}"`.
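
The post-review extractor steps can be sketched as below. This is an approximation of the described behaviour, not the Phase D code: the Rust keyword list is partial, "sorts and takes top 3" is read as an alphabetical sort followed by taking the first three, and lines with a flag but no usable identifiers are skipped.

```typescript
const SEMANTIC_FLAGS = [
  "UnitMismatch", "TypeConfusion", "NullableConfusion", "OffByOne", "StaleReference",
  "PseudoImpl", "DeadCode", "WarningNoise", "BoundaryViolation",
];

// Partial keyword list for illustration only.
const RUST_KEYWORDS = ["self", "mut", "async", "await", "fn", "let", "pub", "impl", "use"];

function extractPatternKeys(markdown: string): string[] {
  const keys: string[] = [];
  for (const line of markdown.split("\n")) {
    const flag = SEMANTIC_FLAGS.find((f) => line.indexOf(f) !== -1);
    if (!flag) continue;
    const ids: string[] = [];
    const re = /`([A-Za-z_][A-Za-z0-9_]*)`/g; // identifier-shaped backtick tokens
    let m: RegExpExecArray | null;
    while ((m = re.exec(line)) !== null) {
      const id = m[1];
      const excluded = SEMANTIC_FLAGS.indexOf(id) !== -1 || RUST_KEYWORDS.indexOf(id) !== -1;
      if (!excluded && ids.indexOf(id) === -1) ids.push(id);
    }
    if (ids.length === 0) continue; // avoids degenerate keys such as "DeadCode:DeadCode"
    keys.push(`${flag}:${ids.sort().slice(0, 3).join("-")}`);
  }
  return keys;
}
```

Under these assumptions, a line flagging `TypeConfusion` that mentions `json`, `health`, and `fetch` produces `TypeConfusion:fetch-health-json`, the key quoted in the refactor timeline.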

### HTTP surface (on gateway port 3100)
| Endpoint | Purpose |
|---|---|
| `POST /vectors/pathway/insert` | write a full PathwayTrace |
| `POST /vectors/pathway/query` | hot-swap candidate check (returns `{candidate: null}` or `{candidate: {...}}`) |
| `POST /vectors/pathway/record_replay` | update replay_count + success_rate after hot-swap |
| `GET /vectors/pathway/stats` | totals + reuse_rate + replay_success_rate |
| `POST /vectors/pathway/bug_fingerprints` | aggregated fingerprints by narrow fingerprint (for pre-review preamble) |
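
As a usage sketch for the `bug_fingerprints` endpoint, the helpers below build the request and turn a list of fingerprints into a preamble. The request fields and the `BugFingerprint` fields are the ones named in this doc; the helper names, the exact preamble layout, and the assumption that the response body can be mapped to a plain array of fingerprints are hypothetical.

```typescript
interface BugFingerprint {
  flag: string;
  pattern_key: string;
  example: string;
  occurrences: number;
}

// Builds the URL and init object for the POST; pass both to fetch().
function bugFingerprintRequest(taskClass: string, filePath: string, signalClass: string, limit: number) {
  return {
    url: "http://localhost:3100/vectors/pathway/bug_fingerprints",
    init: {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({
        task_class: taskClass,
        file_path: filePath,
        signal_class: signalClass,
        limit,
      }),
    },
  };
}

// Hypothetical preamble layout; an empty list yields no preamble at all.
function buildPathwayPreamble(fingerprints: BugFingerprint[]): string {
  if (fingerprints.length === 0) return "";
  const lines = fingerprints.map(
    (fp) => `- ${fp.pattern_key} (seen ${fp.occurrences}x): ${fp.example}`,
  );
  return [
    "📚 PATHWAY MEMORY",
    "This file area had these patterns before; check for recurrences.",
    ...lines,
  ].join("\n");
}
```

In a Bun or Node 18+ script the request would be sent with `const req = bugFingerprintRequest(...); await fetch(req.url, req.init)`, after which the parsed fingerprints are handed to `buildPathwayPreamble`.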

### State persistence
`data/_pathway_memory/state.json` — JSON dump of all buckets. Loaded at gateway boot (`crates/gateway/src/main.rs` has `pwm.load_from_storage().await`).

## 7. Current state (2026-04-24 end of session)

### Commits on branch `scrum/auto-apply-19814` since iter-5 baseline

| # | SHA | Subject |
|---|---|---|
| 1 | `25ea3de` | observer fix — route LLM Team escalation to `/v1/chat` qwen3-coder |
| 2 | `8b77d67` | OpenRouter rescue ladder + tree-split reducer + first auto-applier |
| 3 | `96b46cd` | first auto-applied commit (later found misleading) |
| 4 | `5e8d87b` | cleanup + applier hardening (warning + rationale + dry-run gates) |
| 5 | `9cc0ceb` | P42-002 — truth gate into queryd `/sql` + `/paged` paths |
| 6 | `2f8b347` | pathway_memory base (PathwayTrace, hot-swap, 18 tests) |
| 7 | `86901f8` | queryd/delta.rs 6-line unit-mismatch fix |
| 8 | `92df0e9` | ADR-021 spec |
| 9 | `0a0843b` | ADR-021 Phases A+B+C (semantic_flags, prompt tags, preamble endpoint) |
| 10 | `ee31424` | ADR-021 Phase D (fingerprint extractor) |
| 11 | `f4cff66` | Phase D fix — strip flag names + Rust keywords from pattern_keys |

### Matrix index state
- **12 pathway traces** in `data/_pathway_memory/state.json`
- **11 distinct bug fingerprints** across 4 Flag categories on `crates/queryd` narrow fingerprint (1 manually seeded + 10 extracted)
- **0 hot-swaps fired** (probation requires ≥3 replays per pathway; none reached yet)

### Active in-flight
- Iter 9 complete; next iter 10+ will use cleaner fingerprint extractor (`f4cff66`)
- 4 "noisy" pattern_keys from iter-9-file-1 pre-fix run (e.g., `DeadCode:DeadCode`) — dormant, won't match future output, acceptable dead entries

### Queued (not yet implemented)
- **Phase E** — `type_hints_used` population from `catalogd` column types, Arrow `RecordBatch.schema()`, Rust struct field types. Feeds typed context to reviewer prompt.
- **Auditor → pathway audit_consensus wire** — activates the strict-audit gate (currently lenient: null bootstraps, only explicit `false` blocks).
- **VCP UI cards** for "top bug fingerprints in last N iters" + "new patterns learned this iter"

## 8. How to run a new iteration

```bash
# Default 3 files (playbook_memory.rs, doc_drift.rs, auditor/audit.ts)
LH_SCRUM_FORENSIC=/home/profit/lakehouse/docs/SCRUM_FORENSIC_PROMPT.md \
LH_SCRUM_PROPOSAL=/home/profit/lakehouse/docs/SCRUM_FIX_WAVE.md \
bun run tests/real-world/scrum_master_pipeline.ts

# Targeted files:
LH_SCRUM_FILES="/home/profit/lakehouse/crates/queryd/src/delta.rs,/home/profit/lakehouse/crates/queryd/src/service.rs" \
LH_SCRUM_FORENSIC=... LH_SCRUM_PROPOSAL=... \
bun run tests/real-world/scrum_master_pipeline.ts

# Dry-run auto-applier against the latest scrum output:
LH_APPLIER_MIN_CONF=85 LH_APPLIER_MAX_FILES=10 \
LH_APPLIER_MODEL=qwen3-coder:480b \
LH_APPLIER_BRANCH=scrum/auto-apply-19814 \
bun run tests/real-world/scrum_applier.ts

# Actually commit (ONLY after dry-run looks clean):
LH_APPLIER_COMMIT=1 LH_APPLIER_MIN_CONF=85 LH_APPLIER_MAX_FILES=10 \
LH_APPLIER_MODEL=qwen3-coder:480b \
LH_APPLIER_BRANCH=scrum/auto-apply-19814 \
bun run tests/real-world/scrum_applier.ts
```

## 9. Verify services before running

```bash
# Gateway (port 3100) — must be up; pathway endpoints are here
curl -s http://localhost:3100/health    # "lakehouse ok"
curl -s http://localhost:3100/vectors/pathway/stats    # pathway memory totals

# UI (port 3950) — VCP dashboard + /data/pathway_stats aggregation
curl -s http://localhost:3950/data/pathway_stats

# Observer (port 3800) — event receiver + LLM Team escalation
curl -s http://localhost:3800/health 2>/dev/null || true

# Sidecar (port 3200) — Python embed
curl -s http://localhost:3200/health 2>/dev/null || true

# LLM Team (port 5000) — /api/run?mode=extract is the ONLY registered mode
# (others like code_review/patch/refactor return "Unknown mode")
curl -s http://localhost:5000/health 2>/dev/null || true
```

If the gateway is missing new routes after a code change: `cargo build --release -p gateway && sudo systemctl restart lakehouse.service`.

If the UI is missing new routes: kill the old `bun run ui/server.ts` process and restart it (not a systemd service right now).

## 10. Where things live (code pointers)

| Concern | File |
|---|---|
| Scrum orchestrator | `tests/real-world/scrum_master_pipeline.ts` |
| Scrum ladder constant | same file, `const LADDER` line ~92 |
| Tree-split reducer | same file, `async function treeSplitFile` |
| Forensic prompt preamble (loaded via env) | `docs/SCRUM_FORENSIC_PROMPT.md` |
| Fix-wave proposal preamble | `docs/SCRUM_FIX_WAVE.md` |
| Scrum iter notes | `docs/SCRUM_LOOP_NOTES.md` |
| Auto-applier | `tests/real-world/scrum_applier.ts` |
| Applier audit trail | `data/_kb/auto_apply.jsonl` |
| Scrum reviews KB | `data/_kb/scrum_reviews.jsonl` |
| Model trust journal | `data/_kb/model_trust.jsonl` |
| Pathway memory module | `crates/vectord/src/pathway_memory.rs` |
| Pathway HTTP handlers | `crates/vectord/src/service.rs` (bottom) |
| Pathway state on disk | `data/_pathway_memory/state.json` |
| VCP UI server | `ui/server.ts` |
| VCP UI client | `ui/ui.js` + `ui/ui.css` + `ui/index.html` |
| Observer | `mcp-server/observer.ts` |
| Auditor | `auditor/audit.ts` |
| LLM Team extract client | `auditor/fact_extractor.ts` |
| ADR-021 spec | `docs/DECISIONS.md` ADR-021 |

## 11. Key memory files a fresh session should read

From `/root/.claude/projects/-home-profit/memory/`:

- `project_scrum_pipeline.md` — updated state of the scrum iterations
- `project_first_auto_apply.md` — 96b46cd story + cleanup + hardening evidence from iter 7
- `feedback_semantic_correctness_via_matrix.md` — J's insight on compounding, the ADR-021 rule
- `feedback_endpoint_probe_discipline.md` — GET 405 is not endpoint validation
- `reference_llm_team_modes.md` — only `extract` is registered on port 5000
- `feedback_scrum_cloud_first.md` — scrum/audit/enrich pipelines use cloud first
- `feedback_cloud_determinism.md` — cloud N=3 consensus + qwen3-coder tie-breaker

## 12. Known gotchas

- **Gateway restart needed after Rust route additions.** `sudo systemctl restart lakehouse.service` — the service is systemd-managed.
- **UI server needs manual restart** after `ui/server.ts` changes (no systemd unit). Kill old `bun` pid, restart with `bun run ui/server.ts &`.
- **LLM Team mode `code_review` doesn't exist** — only `extract` is registered in `/root/llm_team_ui.py`. Don't wire new features to "Unknown mode" endpoints. See `reference_llm_team_modes.md`.
- **OpenRouter free-tier 429s during consensus probes** are normal (rate-limited upstream). In the production ladder the free-tier rungs are only hit as last-resort rescue, with seconds-to-minutes gaps between calls; that is a different traffic pattern from rapid-fire consensus runs.
- **OpenRouter minimax-m2.5:free has a 45s timeout** — not in ladder, only for one-off probes.
- **Probation period is 3 replays** before hot-swap can fire. On a fresh install, no hot-swap fires until a pathway has been re-visited ≥3 times.