271 Commits

Author SHA1 Message Date
root
d9bd4c9bdf observer: KB enrichment preamble before failure-cluster escalation
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
escalateFailureClusterToLLMTeam now calls a new buildKbPreamble()
that mirrors what scrum_master_pipeline does on every per-file review:
queries /vectors/pathway/bug_fingerprints + /vectors/search against the
lakehouse_arch_v1 corpus, then asks local qwen3.5:latest (provider=ollama)
to synthesize a tight briefing. The synthesized preamble prepends the
existing escalation prompt so the cloud reviewer sees historical
context the same way scrum reviewers do.
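
A minimal TypeScript sketch of that flow (endpoint paths, corpus, and model
name from this message; request/response field names are assumptions):

  async function buildKbPreamble(query: string, filePath: string) {
    const post = (url: string, body: unknown) =>
      fetch(url, { method: "POST",
                   headers: { "content-type": "application/json" },
                   body: JSON.stringify(body) })
        .then(r => (r.ok ? r.json() : null)).catch(() => null);
    // The same two KB primitives scrum_master_pipeline queries per file.
    const fp = await post("http://localhost:3100/vectors/pathway/bug_fingerprints",
                          { file_path: filePath });
    const hits = await post("http://localhost:3100/vectors/search",
                            { index: "lakehouse_arch_v1", query, top_k: 6 });
    // Both empty → empty string, escalation prompt unchanged.
    if (!fp?.fingerprints?.length && !hits?.results?.length) return "";
    // Local-model synthesis: compress the raw bundle into a tight briefing.
    const synth = await post("http://localhost:3100/v1/chat/completions", {
      model: "qwen3.5:latest", // provider=ollama via the gateway router
      messages: [{ role: "user",
                   content: "Synthesize a short reviewer briefing from:\n"
                            + JSON.stringify({ fp, hits }) }],
    });
    return synth?.choices?.[0]?.message?.content ?? "";
  }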

Reuses existing KB primitives — no new corpora, no new endpoints, no
new abstractions. Same code path scrum already exercises 3+ times per
review; observer joins the same compounding loop.

Audit row gains kb_preamble_chars so we can later track enrichment
yield per escalation. Empty preamble (both fingerprints + matrix
return nothing) → empty string, prompt unchanged.

Verified: qwen3.5:latest synthesis fires for every escalation with
non-empty matrix hits (gateway log: 445→72 tokens, 3.1s). Matrix
retrieval correctly surfaces PRD Phase 40/44 chunks for chat_completion
clusters. Pathway memory stays consistent with scrum (84→87 traces);
the chat_completion task_class has no fingerprints yet — handled gracefully.

Local-model synthesis was J's explicit ask: compress the raw bundle
before the cloud call so the briefing is actionable, not a dump.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 18:36:19 -05:00
root
69919d9d57 .archon: add lakehouse-architect-review workflow
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
Demonstrates Archon → Pi → our gateway → OpenRouter end-to-end on
Lakehouse code. Three Pi nodes (shape, weakness, improvement) — each
fires a /v1/chat/completions call that lands a Langfuse trace AND an
observer event for KB-consolidation parity with scrum.

Run:
  ARCHON_SUPPRESS_NESTED_CLAUDE_WARNING=1 \
  PI_OPENROUTER_BASE_URL=http://localhost:3100/v1 \
  OPENROUTER_API_KEY=sk-anything \
  archon workflow run lakehouse-architect-review --no-worktree

Verified 2026-04-26 — 3 grok-4.1-fast calls, 14.6s total, observer ring
delta=3, three Langfuse traces. Read-only (allowed_tools: [read]) so
running it doesn't mutate the repo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 18:05:43 -05:00
root
d1d97a045b v1: fire observer /event from /v1/chat alongside Langfuse trace
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
Observer at :3800 already collects scrum + scenario events into a ring
buffer that pathway-memory + KB consolidation read from. /v1/chat now
posts a lightweight {endpoint, source:"v1.chat", input_summary,
output_summary, success, duration_ms} event there too — fire-and-forget
tokio::spawn, observer-down doesn't block the chat response.
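
The same pattern, sketched as a TypeScript client (the gateway does this in
Rust via tokio::spawn; event fields and the observer port follow this message):

  const event = {
    endpoint: "/v1/chat", source: "v1.chat",
    input_summary: "…", output_summary: "…",
    success: true, duration_ms: 1620,
  };
  // Fire-and-forget: never awaited on the response path, errors swallowed,
  // so an observer outage cannot block or fail the chat completion.
  void fetch("http://localhost:3800/event", {
    method: "POST", headers: { "content-type": "application/json" },
    body: JSON.stringify(event),
  }).catch(() => {});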

Now any tool routed through our gateway (Pi CLI, Archon, openai SDK
clients, langchain-js) shows up in the same ring buffer the scrum loop
reads, ready for the same KB-consolidation analysis. Independent of the
existing langfuse-bridge that polls Langfuse — this path is immediate.

Verified: GET /stats shows {by_source: {v1.chat: N}} growing by 1 per
chat call, both for direct curl and for Pi CLI invocations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 18:01:52 -05:00
root
540a9a27ee v1: accept OpenAI multimodal content shape (array-of-parts)
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
Modern OpenAI clients (pi-ai, openai SDK 6.x, langchain-js, the official
agents) send `messages[].content` as an array of content parts:
`[{type:"text", text:"..."}, {type:"image_url", ...}]`. Our gateway
typed `content` as plain `String` and 422'd those calls.

Fix: `Message.content` is now `serde_json::Value` so requests
deserialize regardless of shape. `Message::text()` flattens
content-parts arrays (concat'd `text` fields, non-text parts skipped)
for places that need a plain string — Ollama prompt assembly, char
counts, the assistant's own response synthesis. `Message::new_text()`
constructs string-content messages without writing the wrapper at
each call site. Forwarders (openrouter) clone content through
verbatim so providers see exactly what the client sent.
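
A TypeScript sketch of the flattening rule (the gateway implements it in
Rust over serde_json::Value):

  type ContentPart = { type: string; text?: string };

  // Mirrors Message::text(): strings pass through; content-parts arrays
  // concat their text fields; non-text parts (image_url, …) are skipped.
  function flattenContent(content: string | ContentPart[]): string {
    if (typeof content === "string") return content;
    return content
      .filter(p => p.type === "text" && typeof p.text === "string")
      .map(p => p.text).join("");
  }

  flattenContent([{ type: "text", text: "hi " }, { type: "image_url" },
                  { type: "text", text: "there" }]); // → "hi there"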

Verified end-to-end: Pi CLI (`pi --print --provider openrouter`)
landed a clean 1902-token request through `/v1/chat/completions`,
routed to OpenRouter as `openai/gpt-oss-120b:free`, response in
1.62s, Langfuse trace `v1.chat:openrouter` recorded with provider
tag. Same path that any tool using the official openai SDK takes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 17:56:46 -05:00
root
3a0b37ed93 v1: OpenAI-compat alias + smart provider routing — gateway is now drop-in middleware
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
/v1/chat/completions route alias (same handler as /chat) lets any tool
using the official `openai` SDK adopt the gateway via OPENAI_BASE_URL
alone — no custom provider field needed.

resolve_provider() extended:
- bare `vendor/model` (slash) → openrouter (catches x-ai/grok-4.1-fast,
  moonshotai/kimi-k2, deepseek/deepseek-v4-flash, openai/gpt-oss-120b:free)
- bare vendor model names (no slash, no colon) get auto-prefixed:
  gpt-* / o1-* / o3-* / o4-* → openai/<name>  (OpenRouter form)
  claude-* → anthropic/<name>
  grok-* → x-ai/<name>
  Then routed to openrouter. Ollama models (with colon, no slash) keep
  default routing. Tools like pi-ai validate against an OpenAI-style
  catalog and send bare names — this lets them flow through cleanly.
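
A TypeScript sketch of those rules (the real resolve_provider() is in the
Rust gateway):

  function resolveProvider(model: string) {
    if (model.includes("/")) return { provider: "openrouter", model }; // vendor/model
    if (!model.includes(":")) { // bare vendor names → auto-prefix, OpenRouter form
      if (/^(gpt-|o1-|o3-|o4-)/.test(model))
        return { provider: "openrouter", model: `openai/${model}` };
      if (model.startsWith("claude-"))
        return { provider: "openrouter", model: `anthropic/${model}` };
      if (model.startsWith("grok-"))
        return { provider: "openrouter", model: `x-ai/${model}` };
    }
    return { provider: "ollama", model }; // colon-tagged names keep default routing
  }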

Verified end-to-end:
- curl POST /v1/chat/completions {model: "gpt-4o-mini", ...} → 200,
  routed to openrouter as openai/gpt-4o-mini
- openai SDK with baseURL=http://localhost:3100/v1 → 3 model variants all
  succeed (openai/gpt-4o-mini, gpt-4o-mini, x-ai/grok-4.1-fast)
- Langfuse traces fire automatically on every call
  (v1.chat:openrouter, provider tagged in metadata)

scripts/mode_pass5_variance_paid.ts gains LH_CONDITIONS env so subset
runs (e.g. just isolation vs composed) take half the latency.

Archon-on-Lakehouse integration: gateway side is done. Pi-ai's
openai-responses backend uses /v1/responses (not /chat/completions) and
its openrouter backend appears to bail in client-side validation before
sending. Patching Pi locally to override baseUrl works for Archon, but
the harness still rejects — needs more work in a follow-up. The direct
openai SDK path (langchain-js / agents / patched Pi) works today.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 17:49:37 -05:00
root
2dbc8dbc83 v1/mode: model-aware enrichment downgrade + 3 corpora + variance harness
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
Pass 5 (5 reps × 4 conditions × 1 file on grok-4.1-fast) showed composing
matrix corpora is anti-additive on strong models — composed lakehouse_arch
+ symbols LOST 5/5 head-to-head vs codereview_isolation (Δ −1.8 grounded
findings, p=0.031). Default flips to isolation; the matrix path now
auto-downgrades when the resolved model is strong.

Mode runner:
- matrix_corpus is Vec<String> (string OR array via deserialize_string_or_vec)
- top_k=6 from each corpus, merge by score, take top 8 globally
- chunk tag prefers doc_id over source so the reviewer sees [adr:009] vs [lakehouse_arch]
- is_weak_model() gate auto-downgrades codereview_lakehouse → codereview_isolation
  for strong models (default-strong; weak = :free suffix or local last-resort)
- LH_FORCE_FULL_ENRICHMENT=1 bypasses for diagnostic runs
- EnrichmentSources.downgraded_from records when the gate fires
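
The merge step, sketched in TypeScript (hit-shape field names assumed):

  type Hit = { score: number; doc_id?: string; source: string; text: string };

  // top_k=6 from each corpus, merged by score, top 8 globally.
  const mergeHits = (perCorpus: Hit[][], take = 8): Hit[] =>
    perCorpus.flat().sort((a, b) => b.score - a.score).slice(0, take);

  // Chunk tag prefers doc_id over source: [adr:009] beats [lakehouse_arch].
  const chunkTag = (h: Hit) => `[${h.doc_id ?? h.source}]`;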

Three corpora indexed via /vectors/index (5849 chunks total):
- lakehouse_arch_v1 — ADRs + phases + PRD + scrum spec (93 docs, 2119 chunks)
- scrum_findings_v1 — past scrum_reviews.jsonl (168 docs, 1260 chunks; EXCLUDED
  from defaults — 24% out-of-bounds line citations from cross-file drift)
- lakehouse_symbols_v1 — regex-extracted pub items + /// docs (656 docs, 2470 chunks)

Experiment infra:
- scripts/build_*_corpus.ts — re-runnable when source content changes
- scripts/mode_pass5_variance_paid.ts — N reps × M conditions on one file
- scripts/mode_pass5_summarize.ts — mean ± σ + head-to-head, parser handles
  numbered + path-with-line + path-with-symbol finding tables
- scripts/mode_compare.ts — groups by mode|corpus when sweeps span corpora
- scripts/mode_experiment.ts — default model bumped to x-ai/grok-4.1-fast,
  --corpus flag for per-call override

Decisions + open follow-ups: docs/MODE_RUNNER_TUNING_PLAN.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 17:29:17 -05:00
root
56bf30cfd8 v1/mode: override knobs + staffing native runner + pass 2/3/4 harnesses
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
Setup for the corpus-tightening experiment sweep (J 2026-04-26 — "now
is the only cheap window before the corpus gets large and refactoring
costs go up").

Override params on /v1/mode/execute (additive — old callers unaffected):
  force_matrix_corpus      — Pass 2: try alternate corpora per call
  force_relevance_threshold — Pass 2: sweep filter strictness
  force_temperature         — Pass 3: variance test
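
A sketch of a sweep call using the knobs (override names from the list
above; the other payload fields are assumptions):

  await fetch("http://localhost:3100/v1/mode/execute", {
    method: "POST", headers: { "content-type": "application/json" },
    body: JSON.stringify({
      mode: "codereview_lakehouse",            // assumed field name
      file_path: "crates/queryd/src/delta.rs", // assumed field name
      force_matrix_corpus: "lakehouse_symbols_v1",
      force_relevance_threshold: 0.4,
      force_temperature: 0.2,
    }),
  });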

New native mode `staffing_inference_lakehouse` (Pass 4):
  - Same composer architecture as codereview_lakehouse
  - Staffing framing: coordinator producing fillable|contingent|
    unfillable verdict + ranked candidate list with playbook citations
  - matrix_corpus = workers_500k_v8
  - Validates that modes-as-prompt-molders generalizes beyond code
  - Framing explicitly says "do NOT fabricate workers" — the staffing
    analog of the lakehouse mode's symbol-grounding requirement

Three sweep harnesses:
  scripts/mode_pass2_corpus_sweep.ts — 4 corpora × 4 thresholds × 5 files
  scripts/mode_pass3_variance.ts     — 3 files × 3 temps × 5 reps
  scripts/mode_pass4_staffing.ts     — 5 fill requests through staffing mode

Each appends per-call rows to data/_kb/mode_experiments.jsonl which
mode_compare.ts already aggregates with grounding column.

Pass 1 (10 files × 5 modes broad sweep) currently running via the
existing scripts/mode_experiment.ts — gateway restart deferred until
it completes so the new override knobs aren't enabled mid-experiment.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 01:55:12 -05:00
root
52bb216c2d mode_compare: grounding check + control flag + emoji-tolerant section detection
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
Three fixes after the playbook_only confabulation surfaced in the
2026-04-26 experiment (8 'findings' on a 333-line file, all citing
lines 378-945 — fully fabricated from pathway-memory pattern names).

(1) Aggregator regex bug — section detection failed on emoji-prefixed
markdown headers like `## 🔎 Ranked Findings`. The original regex
required word chars right after #{1,3}\s+, so the patches table
header `## 🛠️ Concrete Patch Suggestions` was never recognized as
a stop boundary, double-counting every finding. Fix: allow
non-letter chars (emoji/space) between # and the keyword.

(2) Grounding check — for each finding row in the response, extract
backtick-quoted symbols + cited line numbers; verify symbols exist
in the actual focus file and lines fall within EOF. Computes
grounded/total ratio per mode. Surfaces 'OOB' (out-of-bounds) count
explicitly so confabulation is visible at a glance. Confirms what
hand-grading found: codereview_playbook_only's 8 findings on
service.rs were 1/8 grounded with 7 OOB.

(3) Control mode tagging — codereview_null and codereview_playbook_only
are designed as falsifiers (baseline / lossy ceiling) and their
numerical wins should never be read as recommendations. Output
marks them with ⚗ glyph + warning footer.
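
A TypeScript sketch of the grounding check from (2) (the extraction
regexes are assumptions):

  function groundFinding(finding: string, source: string, eofLine: number) {
    const symbols = [...finding.matchAll(/`([^`]+)`/g)].map(m => m[1]);
    const cited = [...finding.matchAll(/\b[Ll]ines?\s+(\d+)/g)].map(m => +m[1]);
    const oob = cited.filter(n => n > eofLine).length;          // out-of-bounds
    const grounded = oob === 0 && symbols.every(s => source.includes(s));
    return { grounded, oob };
  }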

Per-mode aggregate is now sorted by groundedness, not raw count.
Per-mode-vs-lakehouse comparison uses grounded findings, not raw —
so confabulation can no longer score a "win".

Updated SCRUM_MASTER_SPEC.md with refactor timeline pointing at
the 2026-04-25/26 commits (observer fix, relevance filter, retire
wire, mode router, enrichment runner, parameterized experiment).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 01:44:21 -05:00
root
7c47734287 v1/mode: parameterized runner + 5 enrichment-experiment modes
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
J's directive (2026-04-26): "Create different modes so we can really
dial in the architecture before it gets further along — pinpoint the
failures and strengths equally so I know what direction to go in.
Loop theater happens when we don't pinpoint the most accurate path."

Refactored execute() to switch on mode name → EnrichmentFlags preset.
Five native modes designed as deliberate experiments — each isolates
one architectural axis so the comparison matrix reads off what's
doing work vs what's adding latency for nothing:

  codereview_lakehouse     — all enrichment on (ceiling)
  codereview_null          — raw file + generic prompt (baseline)
  codereview_isolation     — file + pathway only (no matrix)
  codereview_matrix_only   — file + matrix only (no pathway)
  codereview_playbook_only — pathway only, NO file content (lossy ceiling)

Each call appends a row to data/_kb/mode_experiments.jsonl with full
sources + response. LH_MODE_LOG_OFF=1 to suppress.

scripts/mode_experiment.ts — sweeps files × modes serially, prints
live progress with per-call enrichment stats. Defaults to OpenRouter
free model so cloud quota doesn't gate experiments.

scripts/mode_compare.ts — reads the JSONL, outputs per-file matrix
+ per-mode aggregate + mode-vs-baseline win/loss with avg finding
delta. Heuristic finding-count from markdown table rows; pathway
citation count from preamble references.

scrum_master_pipeline.ts gets a mode-runner fast path gated by
LH_USE_MODE_RUNNER=1: try /v1/mode/execute first, fall through to
the existing ladder if response < LH_MODE_MIN_CHARS (default 2000)
or anything errors. Off by default until A/B-validated.
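
A sketch of the gated fast path (env names and the min-chars fallback from
this message; payload fields assumed):

  async function tryModeRunner(filePath: string): Promise<string | null> {
    if (process.env.LH_USE_MODE_RUNNER !== "1") return null; // off until A/B-validated
    const minChars = Number(process.env.LH_MODE_MIN_CHARS ?? "2000");
    try {
      const res = await fetch("http://localhost:3100/v1/mode/execute", {
        method: "POST", headers: { "content-type": "application/json" },
        body: JSON.stringify({ mode: "codereview_lakehouse", file_path: filePath }),
      }).then(r => r.json());
      const text: string = res.response ?? "";
      if (text.length >= minChars) return text;
    } catch { /* fall through */ }
    return null; // caller continues down the existing ladder
  }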

First experiment results (2 files × 5 modes via gpt-oss-120b:free):
  - codereview_null produces 12.6KB response with ZERO findings
    (proves adversarial framing is load-bearing)
  - codereview_playbook_only produces MORE findings than lakehouse
    on average (12 vs 9) at 73% the latency — pathway memory is
    the dominant signal driver
  - codereview_matrix_only underperforms isolation by ~0.5 findings
    while costing the same latency — matrix corpus likely
    underperforming for scrum_review task class

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 01:36:42 -05:00
root
86f63a083d v1/mode: codereview_lakehouse native runner — modes are prompt-molders
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
J's framing (2026-04-26): "Modes are how you ask ONCE and get BETTER
information — they mold the data, hyperfocus the prompt on this
codebase's needs, so the model gets it right the first time without
the cascading retry ladder."

Built the first concrete native enrichment runner (codereview_lakehouse)
that composes every context primitive the gateway exposes:

  1. Focus file content (read from disk OR caller-supplied)
  2. Pathway memory bug_fingerprints for this file area (ADR-021
     preamble — "📚 BUGS PREVIOUSLY FOUND IN THIS FILE AREA")
  3. Matrix corpus search via the task_class's matrix_corpus
  4. Relevance filter (observer /relevance) drops adjacency pollution
  5. Assembles ONE precise prompt with system framing
  6. Single call to /v1/chat with the recommended model
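
Condensed into a TypeScript sketch (step numbers from the list above;
payload and response fields are assumptions, and the real runner is Rust):

  const GW = "http://localhost:3100", OBS = "http://localhost:3800";
  const post = (base: string, path: string, body: unknown) =>
    fetch(base + path, { method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(body) }).then(r => r.json());

  async function composePrompt(filePath: string, fileContent: string) {
    const fp = await post(GW, "/vectors/pathway/bug_fingerprints",
                          { file_path: filePath });                    // step 2
    const hits = await post(GW, "/vectors/search",
      { index: "lakehouse_arch_v1",
        query: fileContent.slice(0, 2000), top_k: 8 });                // step 3
    const kept = await post(OBS, "/relevance",
      { chunks: hits.results ?? [], focus_path: filePath });           // step 4
    return ["You are reviewing this codebase adversarially.",         // framing
            "📚 BUGS PREVIOUSLY FOUND IN THIS FILE AREA\n" + JSON.stringify(fp),
            "MATRIX CONTEXT\n" + JSON.stringify(kept),
            fileContent].join("\n\n");                                 // step 5
  }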

POST /v1/mode/execute dispatches. Native mode → runs the composer.
Non-native mode → 501 NOT_IMPLEMENTED with hint (proxy to LLM Team
/api/run is queued).

Provider hint logic auto-routes by model name shape:
  - vendor/model[:tag] → openrouter
  - kimi-*/qwen3-coder*/deepseek-v*/mistral-large* → ollama_cloud
  - everything else → local ollama

Live test against crates/queryd/src/delta.rs (10593 bytes, 10
historical bug fingerprints, 2 matrix chunks dropped by relevance):
  - enriched_chars: 12876
  - response_chars: 16346 (14 findings with confidence percentages)
  - Model literally cited the pathway memory preamble in finding #7
  - One call to free-tier gpt-oss:120b produced what previously
    required the 9-rung escalation ladder

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 00:28:46 -05:00
root
d277efbfd2 v1/mode: task_class → mode/model router (decision-only, phase 1)
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
HANDOVER §queued (2026-04-25): "Mode router — port LLM Team multi-model
patterns. Pick the right TOOL/MODE for each task class via the matrix,
not cascade through models."

Two-stage architecture:
  1. Decision (POST /v1/mode) — pure recommendation, no execution.
     Returns {mode, model, decision: {source, fallbacks, matrix_corpus,
     notes}} so callers see WHY this mode was picked.
  2. Execution (future POST /v1/mode/execute) — proxy to LLM Team
     /api/run for modes not yet ported to native Rust runners. Not
     wired in this phase.

Splitting decision from execution lets us A/B-test the routing logic
without committing to running every recommendation. The decision
function is pure enough for exhaustive unit tests (3 added).
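
A decision-only probe, sketched in TypeScript (response shape from this
message):

  const d = await fetch("http://localhost:3100/v1/mode", {
    method: "POST", headers: { "content-type": "application/json" },
    body: JSON.stringify({ task_class: "scrum_review" }),
  }).then(r => r.json());
  // → { mode, model, decision: { source, fallbacks, matrix_corpus, notes } }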

config/modes.toml — initial map for 5 task_classes (scrum_review,
contract_analysis, staffing_inference, fact_extract, doc_drift_check)
+ a default. matrix_corpus per task is reserved for the future
matrix-informed routing pass.

VALID_MODES list (24 modes) is kept in sync manually with LLM Team's
/api/run handler at /root/llm_team_ui.py:10581. Adding a mode here
without adding it upstream returns 400 from a future proxy.

GET /v1/mode/list — operator introspection so a UI can render the
registry table without re-parsing TOML.

Live-tested: 5 task classes match, unknown classes fall through to
default, force_mode override works + validates, bogus modes return
400 with the valid_modes list.

Updates reference_llm_team_modes.md memory — earlier note claiming
"only extract is registered" was wrong (all 25 are registered).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 00:16:32 -05:00
root
626f18d491 pathway_memory: audit-consensus → retire wire
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
When observer's hand-review explicitly rejects the output of a
hot-swap-recommended model, the matrix's recommendation was wrong
for this context. Auto-retire the trace so future agents don't
get the same poisoned recommendation in their preamble.

crates/vectord/src/pathway_memory.rs — add `trace_uid` to
HotSwapCandidate response and populate from the matched trace.
This gives consumers single-trace precision for /pathway/retire.

tests/real-world/scrum_master_pipeline.ts:
  - HotSwapCandidate interface gains trace_uid
  - new retirePathwayTrace() helper (fire-and-forget, fall-open)
  - in the obsVerdict reject branch: if hotSwap was active AND
    the rejected model is the hot-swap-recommended one AND
    observer confidence ≥0.7, fire retire and null hotSwap so
    post-loop replay bookkeeping doesn't double-process.
  - hotSwap declared `let` (was const) so it can be nulled
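
The reject-branch gate, sketched in TypeScript (conditions from the list
above; retire payload fields assumed):

  async function maybeRetire(hotSwap: { model: string; trace_uid: string } | null,
                             rejectedModel: string, confidence: number) {
    if (!hotSwap || hotSwap.model !== rejectedModel || confidence < 0.7) return;
    // Fire-and-forget, fall-open: a failed retire never blocks the loop.
    void fetch("http://localhost:3100/vectors/pathway/retire", {
      method: "POST", headers: { "content-type": "application/json" },
      body: JSON.stringify({ trace_uid: hotSwap.trace_uid,
                             reason: "observer rejected hot-swap recommendation" }),
    }).catch(() => {});
  }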

Cycle verdicts ("needs different angle") don't trigger retire —
only outright rejects do. Confidence gate avoids retiring on
heuristic-fallback verdicts that come back without a confidence
number. Closes the "audit-consensus → retire" item from
HANDOVER.md.

Live-tested: insert synthetic trace → /pathway/retire by trace_uid
→ retired counter 1 → 2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 00:01:20 -05:00
root
e1ef185868 docs: add MATRIX_AGENT_HANDOVER notes + cross-link from SCRUM_MASTER_SPEC
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
The matrix-agent-validated repo split + Ansible deploy pipeline are
otherwise undocumented in this repo. This new doc explains:
- the relationship between lakehouse and matrix-agent-validated
- where the playbook lives and what it provisions
- the critical distinction: matrix-test (10.111.129.50 Incus container)
  is the REAL destination, while 192.168.1.145 is a smoke-test VPS only
  (partial deploy, no services, do not treat as production)
- what landed today (observer fix, HANDOVER.md render, relevance filter)

Added a cross-link block at the top of SCRUM_MASTER_SPEC.md so the
canonical scrum handoff doc points at the new MATRIX_AGENT_HANDOVER doc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 23:54:42 -05:00
root
0115a60072 observer: add /relevance heuristic filter for adjacency pollution
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
Matrix retrieval often surfaces high-cosine chunks that are about
symbols the focus file IMPORTS but doesn't define. The reviewer LLM
then hallucinates those imported-crate internals as in-file content
("I see main.rs does X" when X lives in queryd::context).

mcp-server/relevance.ts — pure scorer with five signals:
  path_match      +1.0  chunk source/doc_id encodes focus path
  defined_match   +0.6  chunk text mentions focus.defined_symbols
  token_overlap   +0.4  jaccard of non-stopword tokens
  prefix_match    +0.3  shared first-2-segment prefix
  import_only     -0.5  mentions only imported symbols (pollution)

Default threshold 0.3 — tuned empirically on the gateway/main.rs case.
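
The scorer's shape in TypeScript (weights from the table above; the real
implementation is mcp-server/relevance.ts):

  function scoreChunk(sig: { pathMatch: boolean; definedMatch: boolean;
                             tokenOverlap: number; prefixMatch: boolean;
                             importOnly: boolean }): number {
    let s = 0;
    if (sig.pathMatch) s += 1.0;     // source/doc_id encodes focus path
    if (sig.definedMatch) s += 0.6;  // mentions focus.defined_symbols
    s += 0.4 * sig.tokenOverlap;     // jaccard ∈ [0,1]
    if (sig.prefixMatch) s += 0.3;   // shared first-2-segment prefix
    if (sig.importOnly) s -= 0.5;    // imported-symbols-only → pollution
    return s;
  }
  const keep = (chunkScore: number, threshold = 0.3) => chunkScore >= threshold;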

Also fixes a regex bug in the import extractor: the character class
was lowercase-only, so `use catalogd::Registry;` silently never
matched (regex backed off when it hit the uppercase R). Caught by
the test suite.

observer.ts — POST /relevance endpoint wraps filterChunks().
scrum_master_pipeline.ts — fetchMatrixContext gains optional
focusContent param; calls /relevance after collecting allHits and
before sort+top. Opt-out via LH_RELEVANCE_FILTER=0; threshold via
LH_RELEVANCE_THRESHOLD. Fall-open on observer failure.

9 unit tests, all green. Live probe on real shape correctly drops
a 0.7-cosine adjacency-pollution chunk while keeping in-focus hits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 23:51:45 -05:00
root
54689d523c observer: fix gateway health probe — text/plain not JSON
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
Same bug as matrix-agent-validated 5db0c58. observer.ts:645 did
fetch().then(r => r.json()) against /health which returns text/plain
"lakehouse ok". r.json() throws on non-JSON, .catch swallows to null,
observer exits assuming gateway down. With systemd Restart=on-failure
this crash-loops every 5s — confirmed live on the matrix-test box today.

Fix: r.ok ? r.text() : null. Same shape, accepts the actual content
type. Sealed in pathway_memory as TypeConfusion:fetch-health-json.
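
In context (URL assumed; the ternary is the fix from this message):

  // Before: .json() threw on text/plain "lakehouse ok", .catch → null → "down".
  const health = await fetch("http://127.0.0.1:3100/health")
    .then(r => (r.ok ? r.text() : null)) // after: accept the actual content type
    .catch(() => null);                  // null only on a real network failure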

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 23:37:52 -05:00
root
779158a09b scripts: chicago analyzer field-name fixes + vectorize sanitizer hardening
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
Two small fixes surfaced during smoke testing:

analyze_chicago_contracts.ts: permit field is contact_1_name not
contact_1; reported_cost is integer-string. Fixed filter (was rejecting
all 2853 permits) and contractor extraction (was empty).

vectorize_raw_corpus.ts: sanitize() expanded to strip control chars +
ALL backslashes (kills incomplete \uXXXX escapes) + UTF-16 surrogates
(unpaired surrogates from emoji split by the truncate boundary). The
llm_team response cache had docs with all three pollution shapes.
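
A TypeScript sketch of the hardened sanitize() (the three strips from this
message; the exact regexes are assumptions):

  function sanitize(s: string): string {
    return s
      .replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F]/g, "") // control chars
      .replace(/\\/g, "")                  // ALL backslashes → no broken \uXXXX
      .replace(/[\uD800-\uDBFF](?![\uDC00-\uDFFF])/g, "")  // lone high surrogate
      .replace(/(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/g, ""); // lone low surrogate
  }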

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 19:34:45 -05:00
root
6ac7f61819 pathway_memory: Mem0 versioning + deletion (upsert/revise/retire/history)
Per J 2026-04-25: pathway_memory was append-only — every agent run added
a new trace, bad/failed runs polluted the matrix forever, no notion of
"this is the canonical evolved playbook." Ported playbook_memory's
Phase 25/27 patterns into pathway_memory so the agent loop's matrix
converges on best-known approaches per task class instead of bloating.

Fields added to PathwayTrace (all #[serde(default)] for back-compat):
- trace_uid: stable UUID per individual trace within a bucket
- version: u32 default 1
- parent_trace_uid, superseded_at, superseded_by_trace_uid
- retirement_reason (paired with existing retired:bool)

Methods added to PathwayMemory:
- upsert(trace) → PathwayUpsertOutcome {Added|Updated|Noop}
  Workflow-fingerprint dedup: ladder_attempts + final_verdict hash.
  Identical workflow → bumps existing replay_count instead of duplicating.
- revise(parent_uid, new_trace) → PathwayReviseOutcome
  Chains versions; rejects retired or already-superseded parents.
- retire(trace_uid, reason) → bool
  Marks specific trace retired with reason. Idempotent.
- history(trace_uid) → Vec<PathwayTrace>
  Walks parent_trace_uid back to root, then superseded_by forward to tip.
  Cycle-safe via visited set.

Retrieval gates updated:
- query_hot_swap skips superseded_at.is_some()
- bug_fingerprints_for skips both retired AND superseded

HTTP endpoints in service.rs:
- POST /vectors/pathway/upsert
- POST /vectors/pathway/retire
- POST /vectors/pathway/revise
- GET  /vectors/pathway/history/{trace_uid}
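
A round-trip over the four endpoints, sketched in TypeScript (payload
fields assumed; trace body abbreviated):

  const base = "http://localhost:3100/vectors/pathway";
  const post = (p: string, b: unknown) => fetch(base + p, { method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(b) }).then(r => r.json());

  const trace = { task_class: "scrum_review", ladder_attempts: 1,
                  final_verdict: "accept" };                 // minimal, fields assumed
  const up = await post("/upsert", { trace });               // Added | Updated | Noop
  const v2 = await post("/revise", { parent_trace_uid: up.trace_uid, trace });
  const chain = await fetch(`${base}/history/${v2.trace_uid}`).then(r => r.json());
  await post("/retire", { trace_uid: v2.trace_uid, reason: "superseded in smoke test" });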

scripts/seal_agent_playbook.ts switched insert→upsert + accepts SESSION_DIR
arg so it can seal any archived session, not just iter4.

Verified live (4/4 ops):
- UPSERT first run: Added trace_uid 542ae53f
- UPSERT identical: Updated, replay_count bumped 0→1 (no duplicate)
- REVISE 542ae53f→87a70a61: parent stamped superseded_at, v2 created
- HISTORY of v2: chain_len=2, v1 superseded, v2 tip
- RETIRE iter-6 broken trace: retired=true, retirement_reason preserved
- pathway_memory.stats: total=79, retired=1, reuse_rate=0.0127

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 19:31:44 -05:00
root
ed83754f20 raw-corpus dump + vectorization + chicago contract inference pipeline
Three new pieces, executed in order:

scripts/dump_raw_corpus.sh
- One-shot bash that creates MinIO bucket `raw` and uploads all
  testing corpora as a persistent immutable test set. 365 MB total
  across 5 prefixes (chicago, entities, sec, staffing, llm_team)
  + MANIFEST.json. Sources: workers_500k.parquet (309 MB),
  resumes.parquet, entities.jsonl, sec_company_tickers.json,
  Chicago permits last 30d (2,853 records, 5.4 MB), 9 LLM Team
  Postgres tables dumped via row_to_json.

scripts/vectorize_raw_corpus.ts
- Bun script that fetches each raw-bucket source via mc, runs a
  source-specific extractor into {id, text} docs, posts to
  /vectors/index, polls job to completion. Verified results:
    chicago_permits_v1: 3,420 chunks
    entity_brief_v1:    634 chunks
    sec_tickers_v1:    10,341 chunks (after extractor fix for
                        wrapped {rows: {...}} JSON shape)
    llm_team_runs_v1:  in flight, 19K+ chunks
    llm_team_response_cache_v1: queued

scripts/analyze_chicago_contracts.ts
- Real inference pipeline that picks N high-cost permits with
  named contractors from the raw bucket, queries all 6 contract-
  analysis corpora in parallel via /vectors/search, builds a
  MATRIX CONTEXT preamble, calls Grok 4.1 fast for structured
  staffing analysis, hand-reviews each via observer /review,
  appends to data/_kb/contract_analyses.jsonl.

tests/real-world/scrum_master_pipeline.ts
- MATRIX_CORPORA_FOR_TASK extended with two new task classes:
  contract_analysis (chicago + entity_brief + sec + llm_team_runs
    + llm_team_response_cache + distilled_procedural)
  staffing_inference (workers_500k_v8 + entity_brief + chicago
    + llm_team_runs + distilled_procedural)
  scrum_review unchanged.

This is the first time the matrix architecture operates on real
ingested data instead of code-review smoke tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 18:44:27 -05:00
root
a496ced848 scrum: unified matrix retriever — pull from ALL relevant KB corpora, not just pathway memory
Per J 2026-04-25 architectural correction: matrix index is the vector
indexing layer for the WHOLE knowledge base (distilled facts, procedures,
config hints, team runs, playbooks, pathway successes), not a single
narrow store. Built fetchMatrixContext(query, taskClass, filePath) that:

- Queries multiple persistent vector indexes in parallel via /vectors/search
- Collects hits per corpus + score + doc_id + 400-char excerpt
- Pulls pathway successes via existing helper, mapped to MatrixHit shape
- Sorts by score across corpora, returns top-N (default 8)
- Reports per-corpus hit counts + errors for transparency
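
A TypeScript sketch matching that description (response field names
assumed):

  type MatrixHit = { corpus: string; score: number; doc_id: string; excerpt: string };

  async function fetchMatrixContext(query: string, corpora: string[],
                                    topN = 8): Promise<MatrixHit[]> {
    const perCorpus = await Promise.all(corpora.map(async corpus => {
      const r = await fetch("http://localhost:3100/vectors/search", {
        method: "POST", headers: { "content-type": "application/json" },
        body: JSON.stringify({ index: corpus, query, top_k: 6 }),
      }).then(res => res.json()).catch(() => ({ results: [] }));
      return (r.results ?? []).map((h: any) => ({ corpus, score: h.score,
        doc_id: h.doc_id, excerpt: String(h.text ?? "").slice(0, 400) }));
    }));
    return perCorpus.flat().sort((a, b) => b.score - a.score).slice(0, topN);
  }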

Per-task-class corpus list (MATRIX_CORPORA_FOR_TASK):
  scrum_review → distilled_factual, distilled_procedural,
                 distilled_config_hint, kb_team_runs_v1
  (staffing data deliberately excluded — not relevant to code review)

Probed live: distilled_config_hint top hit = 0.52, distilled_procedural
top = 0.49, kb_team_runs top = 0.59. Real signal across corpora.

Replaces the narrow proven-approaches preamble with a unified
MATRIX-INDEXED CONTEXT preamble tagged with source_corpus per chunk
so the model knows what kind of context it's seeing.

LH_SCRUM_MATRIX_RETRIEVE=0 still disables for A/B testing.

Future: promote to a Rust /v1/matrix endpoint once corpora list and
ranking logic stabilize. For now TS lets us iterate fast against the
live matrix without gateway restarts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 18:29:08 -05:00
root
d187bcd8ac scrum: stop cascading models on quality issues — single-model retry with enrichment
Architectural correction (J 2026-04-25):

The 9-rung ladder was treating cascade as the strategy. That's wrong.
ONE model handles the work, with same-model retries using enriched
context. Cycle to a different model ONLY on PROVIDER errors (network
/ auth / 5xx) — never on quality issues, because quality issues mean
the context needs more enrichment, not a different model.

Changes:
- LADDER shrunk from 11 entries to 3 (Grok 4.1 fast primary, DeepSeek
  V4 flash + Qwen3-235B as provider-error fallbacks). Removed Kimi
  K2.6, Gemini 2.5 flash, all Ollama Cloud rungs, OR free-tier rungs,
  local qwen3.5 — none were doing the work, all wasted attempts. They
  remain available as routable tools for the future mode router.
- Loop restructured: separate `modelIdx` from attempt counter.
  Provider error → modelIdx++ (advance fallback). Observer reject /
  cycle / thin response → retry SAME model with rejection notes
  feeding into the `learning` preamble; advance fallback only after
  MAX_QUALITY_RETRIES (default 2) exhausted on the current model.
- LH_SCRUM_MAX_QUALITY_RETRIES env to tune the per-model retry cap.
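
The control flow, sketched in TypeScript (outcome names paraphrase this
list):

  const MAX_QUALITY_RETRIES =
    Number(process.env.LH_SCRUM_MAX_QUALITY_RETRIES ?? "2");
  let modelIdx = 0, qualityRetries = 0, learning = "";

  function onOutcome(outcome: "provider_error" | "reject" | "cycle" | "thin",
                     notes: string) {
    if (outcome === "provider_error") {      // ONLY provider errors switch models
      modelIdx++; qualityRetries = 0; return;
    }
    learning += "\n" + notes;                // rejection notes enrich the retry
    if (++qualityRetries > MAX_QUALITY_RETRIES) {
      modelIdx++; qualityRetries = 0;        // fallback only after cap exhausted
    }
  }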

What this preserves:
- Tree-split (treeSplitFile) is still the ONE legitimate model-switch
  trigger for context-overflow, but even it just re-runs the same
  model against smaller chunks.
- Pathway memory preamble still fires.
- Hot-swap reorder still applies — when a recommended model maps to
  the new shorter ladder.

Future direction (J 2026-04-25 note): the LLM Team multi-model modes
in /root/llm_team_ui.py are a REFERENCE PATTERN for a mode router we
will build INSIDE this gateway. Mimic the patterns, don't modify the
LLM Team UI itself. The mode router will pick the right approach for
each task class via the matrix index, not cascade through models.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 18:08:31 -05:00
root
6432465e2c autonomous_loop: stop clobbering applier model/provider defaults
Found by actually running the loop: it was setting
LH_APPLIER_MODEL=qwen3-coder:480b explicitly via env, which clobbered
the applier's NEW default of x-ai/grok-4.1-fast on openrouter. Result:
the applier kept hitting the throttled ollama_cloud account and
produced zero patches every iter.

Now LOOP_APPLIER_MODEL and LOOP_APPLIER_PROVIDER are optional overrides;
when unset, scrum_applier.ts uses its own defaults.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 17:54:54 -05:00
root
4ac56564c0 scrum + applier + observer: switch to paid OpenRouter ladder, add Kimi K2.6 + Gemini 2.5
Ollama Cloud was throttled across all 6 cloud rungs in iters 1-9, which
forced the loop into 0-review iterations even though the architecture
was sound. Swapping to paid OpenRouter unblocks the test path.

Ladder changes (top-of-ladder paid models, all under $0.85/M either side):
- moonshotai/kimi-k2.6     ($0.74/$4.66, 256K) — capped at 25/hr
- x-ai/grok-4.1-fast       ($0.20/$0.50, 2M)   — primary general
- google/gemini-2.5-flash  ($0.30/$2.50, 1M)   — Google reasoning
- deepseek/deepseek-v4-flash ($0.14/$0.28, 1M) — cheap workhorse
- qwen/qwen3-235b-a22b-2507  ($0.07/$0.10, 262K) — cheapest big
Existing rungs (Ollama Cloud + free OR + local qwen3.5) kept as fallback.

Per-model rate limiter (MODEL_RATE_LIMITS in scrum_master_pipeline.ts):
- Persists call timestamps to data/_kb/rate_limit_calls.jsonl so caps
  survive process restarts (autonomous loop spawns a fresh subprocess
  per iteration; without persistence each iter would reset)
- O(1) writes, prune-on-read for the rolling 1h window
- Capped models log "SKIP (rate-limited: cap N/hr reached)" and the
  ladder cycles to the next rung
- J directive 2026-04-25: 25/hr on Kimi K2.6 to bound output cost
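
A TypeScript sketch of the persisted limiter (file path and behaviors from
this list; the record shape is an assumption):

  import { appendFileSync, existsSync, readFileSync } from "node:fs";

  const FILE = "data/_kb/rate_limit_calls.jsonl", HOUR_MS = 3_600_000;

  // Prune-on-read over the rolling 1h window; survives process restarts.
  function underCap(model: string, capPerHr: number): boolean {
    if (!existsSync(FILE)) return true;
    const cutoff = Date.now() - HOUR_MS;
    const recent = readFileSync(FILE, "utf8").split("\n").filter(Boolean)
      .map(l => JSON.parse(l) as { model: string; ts: number })
      .filter(c => c.model === model && c.ts > cutoff);
    return recent.length < capPerHr;
  }

  const recordCall = (model: string) =>                      // O(1) append
    appendFileSync(FILE, JSON.stringify({ model, ts: Date.now() }) + "\n");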

Observer hand-review cloud tier swapped from ollama_cloud/qwen3-coder:480b
to openrouter/x-ai/grok-4.1-fast — proven to emit precise semantic
verdicts (named "AccessControl::can_access() doesn't exist" specifically
in 2026-04-25 tests instead of the heuristic fallback).

Applier patch emitter swapped from ollama_cloud/qwen3-coder:480b to
openrouter/x-ai/grok-4.1-fast (default; LH_APPLIER_MODEL +
LH_APPLIER_PROVIDER override). This was the third LLM call we missed —
without it, observer accepts a review but applier never produces patches
because its emitter was still hitting the throttled account.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 17:49:02 -05:00
root
e79e51ed70 tests: autonomous_loop.ts — goal-driven scrum + applier retry harness
Wraps tests/real-world/scrum_master_pipeline.ts and scrum_applier.ts in
a single autonomous loop that runs scrum → applier --commit → optional
git push, observes per-iteration outcomes via observer /event, journals
to data/_kb/autonomous_loops.jsonl. Stops when 2 consecutive iters land
zero commits OR LOOP_MAX_ITERS reached.

Env knobs:
  LOOP_TARGETS — comma-sep paths, default 3 high-traffic Lakehouse files
  LOOP_MAX_ITERS — default 3
  LOOP_PUSH=1 — push branch after each commit-landing iter
  LOOP_BRANCH — default scrum/auto-apply-19814 (refuses to run elsewhere)
  LOOP_MIN_CONF — applier min confidence (default 85)
  LOOP_APPLIER_MODEL — default qwen3-coder:480b

Causality preserved: targets pass through to LH_APPLIER_FILES so applier
patches what scrum just reviewed (vs picking from global review history).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 17:32:15 -05:00
root
3f166a5558 scrum + observer: hand-review wire — judgment moved out of the inner loop
Pre-2026-04-25 the scrum_master applied a hardcoded grounding-rate gate
inline. That baked policy into the wrong layer — semantic judgment about
whether a review is grounded belongs in the observer (which has Langfuse
traces, sees every response across the system, and can call cloud LLMs
for real evaluation). Scrum should report DATA, observer DECIDES.

What landed:
- scrum_master_pipeline.ts: removed the inline grounding-pct threshold;
  every accepted candidate now POSTs to observer's /review endpoint with
  {response, source_content, grounding_stats, model, attempt}. Observer
  returns {verdict: accept|reject|cycle, confidence, notes}. On observer
  failure, scrum falls open to accept (observer is policy, not blocker).
- mcp-server/observer.ts: new POST /review endpoint with two-tier
  evaluator. Tier 1: cloud LLM (qwen3-coder:480b at temp=0) hand-reviews
  with full context — response + source excerpt + grounding stats — and
  emits structured verdict JSON. Tier 2: deterministic heuristic over
  grounding pct + total quotes when cloud throttles, marked source:
  "heuristic" so consumers can tune it later by comparing against cloud.
- Every verdict persists to data/_kb/observer_reviews.jsonl with full
  input snapshot so cloud vs heuristic can be A/B compared once cloud
  quota refreshes.
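
The wire, sketched from the scrum side in TypeScript (request and verdict
fields from this message):

  async function handReview(p: { response: string; source_content: string;
                                 grounding_stats: unknown; model: string;
                                 attempt: number }) {
    return fetch("http://localhost:3800/review", {
      method: "POST", headers: { "content-type": "application/json" },
      body: JSON.stringify(p),
    }).then(r => r.json())    // → { verdict: accept|reject|cycle, confidence, notes }
      .catch(() => ({ verdict: "accept", source: "fall-open" })); // policy, not blocker
  }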

Verified end-to-end: smoke loop iter 1 — observer returned `cycle` on
21% grounding (cycled to next rung), `reject` on 17% (gave up). Iter 2
— `reject` on 12% and 14%. Both UNRESOLVED with honest signal instead
of polluting pathway memory with hallucinated patterns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 17:32:04 -05:00
root
c90a509f49 applier: LH_APPLIER_FILES env to constrain to current-iter targets
Without this, the applier loaded the latest 34 reviews and patched the
highest-confidence file from history — which is meaningless when called
from the autonomous loop where the intent is "review file X this iter,
patch file X this iter." Now the loop passes its targets through and the
applier filters eligible reviews accordingly.

Causality is restored: scrum reviews file X → applier patches file X.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 17:31:49 -05:00
root
9ecc5848fa scrum: blind-response guard + anchor-grounding post-verifier
Two signal-quality fixes for the scrum loop:

1. isBlindResponse() — detects models that emit structurally-valid
   review JSON containing "no source code visible / cannot verify"
   even when source WAS supplied. Rejects so the ladder cycles to
   the next rung instead of accepting the blind hallucination.

2. verifyAnchorGrounding() + appendGroundingFooter() — post-process
   verifier that extracts every backtick-quoted snippet from the
   review and checks it against the original source content.
   Appends a grounding footer reporting grounded vs ungrounded
   counts so humans can audit hallucination rate at a glance.
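
Both guards, sketched in TypeScript (detection phrases and the extraction
regex are assumptions):

  function isBlindResponse(review: string): boolean {
    return /no source code (?:is )?(?:visible|provided)|cannot verify/i.test(review);
  }

  function verifyAnchorGrounding(review: string, source: string) {
    const quotes = [...review.matchAll(/`([^`]+)`/g)].map(m => m[1]);
    const grounded = quotes.filter(q => source.includes(q)).length;
    return { grounded, ungrounded: quotes.length - grounded }; // footer numbers
  }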

Born from the iter where llm_team_ui.py review came back with 6/10
findings hallucinated (invented render_template_string calls,
fabricated logger.exception sites, made-up SHA-256 hashing).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 17:07:30 -05:00
root
b843a23433 mcp: contractor entity-brief drill-down + mobile UX pass
Adds /contractor page route plus /intelligence/contractor_profile
endpoint that fans out across OSHA, ticker, history, parent_link,
federal contracts, debarment, NLRB, ILSOS, news, diversity certs,
BLS macro — single per-contractor portfolio view across every
wired source.

search.html: mobile responsive layout, fixed bottom dock with
horizontal scroll-snap, legacy bridge row stacking, viewport
overflow guards.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 17:07:23 -05:00
root
a6c83b03e5 chore: sync Cargo.lock — toml dep for phase-42 rule loader
Pairs retroactively with de8fb10 (truth/ TOML rule loader).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 17:07:16 -05:00
root
858954975b Staffing Co-Pilot UI — architecture-first enrichments + shift clock
Some checks failed
lakehouse/auditor 2 blocking issues: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
J's direction: the dashboard was explanatory but not *actionable* as
a staffing-matrix console. Refactor so the architecture claims from
docs/PRD.md surface as operational signals on every contract card.

Backend (mcp-server/index.ts):

  + GET|POST /intelligence/arch_signals — probes live substrate health
    so the dashboard shows instant-search latency, index shape,
    playbook-memory entries, and pathway-memory (ADR-021) trace count.
    Fires one fresh /vectors/hybrid probe against workers_500k_v1 so
    the "instant search" number on screen is live, not cached.

  * /intelligence/permit_contracts now times every hybrid call per
    contract and returns search_latency_ms, so the card can display
    the per-query latency pill ( 342ms).

  + Per-contract computed fields returned from the backend:
      search_latency_ms      — real /vectors/hybrid duration
      fill_probability       — base_pct (by pool_size×count ratio)
                               + curve [d0, d3, d7, d14, d21, d30]
                               with cumulative fill% per bucket
      economics              — avg_pay_rate, gross_revenue,
                               gross_margin, margin_pct,
                               payout_window_days [30, 45],
                               over_bill_count,
                               over_bill_pool_margin_at_risk
      shifts_needed          — 1st/2nd/3rd/4th inferred from
                               permit work_type + description regex

  * Pre-existing dangling-brace bug in api() fixed (the `activeTrace`
    logging block had been misplaced at module scope, referencing
    variables that only existed inside the function). Restart was
    failing with "Unexpected }" at line 76. Moved tracing inside the
    try block where parsed/path/body/ms are in scope.
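
The per-contract payload from the computed-fields item above, as a
TypeScript type (names from this message; the exact nesting is assumed):

  type ContractEnrichment = {
    search_latency_ms: number;              // real /vectors/hybrid duration
    fill_probability: {
      base_pct: number;                     // by pool_size×count ratio
      curve: number[];                      // cumulative fill% at d0,d3,d7,d14,d21,d30
    };
    economics: {
      avg_pay_rate: number; gross_revenue: number;
      gross_margin: number; margin_pct: number;
      payout_window_days: [number, number]; // [30, 45]
      over_bill_count: number; over_bill_pool_margin_at_risk: number;
    };
    shifts_needed: number[];                // 1st-4th, from work_type + description regex
  };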

Frontend (mcp-server/search.html):

  + Top "Substrate Signals" section — 4 live tiles (instant search,
    index, playbook memory, pathway matrix). Color-codes latency
    (green <100ms, amber <500ms, red otherwise).
  + "24/7 Shift Coverage" section — SVG 24-hour clock with 4 colored
    shift arcs (1st/2nd/3rd/4th), current-time needle, center label
    showing the live shift, per-shift contract count tiles beside.
    4th shift assumes weekend/split; handles 3rd-shift wrap across
    midnight by splitting into two arcs.
  + Per-card architecture pills: instant-search latency, SQL-filter
    pool-size with k=200 boost note, shift requirements.
  + Per-card fill-probability horizontal stacked bar with day
    markers (d0/d3/d7/d14/d21/d30) and per-bucket segment shading
    (green → amber → orange → red as time decays).
  + Per-card economics 4-tile grid: Est. Revenue, Est. Margin (with
    % colored by health), Payout Window (30–45d standard), Over-Bill
    Pool count + margin at risk.

Architecture smoke test (tests/architecture_smoke.ts, earlier commit)
still green: 11/11 pass including the new /intelligence/arch_signals
+ permit_contracts enrichments.

J specifically wanted: "shoot for the stars · hyperfocus · our
architecture is better because it self-regulates, uses hot-swap,
pulls from real data, and shows instant searches from clever
indexing." Every one of those is now a specific visible signal on
the page, not prose in the README.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:24:11 -05:00
root
4a94da2d41 tests/architecture_smoke — PRD-invariant probe against 500k workers
J's reset: I'd been iterating on pipeline internals without a
driver. The PRD says staffing is the REFERENCE consumer, not the
domain driver — the architecture is the thing. This test makes
that explicit.

8 sections exercise the PRD §Shared requirements against live
production-shaped data (500k workers parquet, 50k-chunk vector
index, 768d nomic embeddings):

  1. preconditions       — gateway + sidecar alive
  2. catalog lookup      — workers_500k resolves to 500000 rows
  3. SQL at scale        — count(*) + geo filter on 500k rows
  4. vector search       — /vectors/search returns top-k
  5. hybrid SQL+vector   — /vectors/hybrid with sql_filter
  6. playbook_memory     — /vectors/playbook_memory/stats
  7. pathway_memory      — ADR-021 stats + bug_fingerprints
  8. truth gate          — DROP TABLE blocked with 403

No cloud calls. Completes in ~5 seconds. Exits non-zero on any
failure; failure messages print "these are the next things to fix."

First-run measurements against current code:
  - 500k COUNT(*) = 22ms, OH-filtered = 20ms (invariant met)
  - vector search p=368ms on 10-NN
  - hybrid p=4662ms, returned 0 Toledo-OH hits (two signals worth
    investigating: the latency AND the empty result)
  - playbook_memory = 0 entries (rebuild never fired since boot)

The 11/11 pass means the substrate's contract is intact. The
measurements tell us WHERE to look next, not what to speculate.

Going forward: this script is the canary. Run it after every
substantive change. If a section flips from pass to fail, that IS
the regression; roll back or fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:12:14 -05:00
root
4087dde780 execution_loop: update stale test assertion to match current prompt format
Some checks failed
lakehouse/auditor 2 blocking issues: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
Pre-existing failure I've been noting across this session —
`executor_prompt_includes_surfaced_candidates` expected the substring
"W-1 Alice Smith" but the prompt format was intentionally changed
(probably in a Phase 38/39 commit) to separate doc_id from name so
the executor doesn't conflate `doc_id` (vector-index key) with
`workers_500k.worker_id` (integer PK).

Current prompt format (line 1178 in build_executor_prompt):
  - name="Alice Smith"  city="Toledo"  state="OH"  (vector doc_id=W-1)

The prompt body explicitly instructs the model NOT to conflate the
two IDs — the format separation is the mechanism enforcing that
instruction. The OLD test assertion predated that separation.

Assertion now checks the semantic contract (both tokens present,
any order) instead of the exact old concatenation.

Workspace test result after this commit: 343 passed, 0 failed, 0
warnings (both lib + tests).

This is the last stale-test hole from the phase-audit sweep — it
popped up during the 41-commit push but I had been leaving it as
pre-existing/unrelated. J called it: a test sitting broken for hours
is worse than a one-line assertion update.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:06:24 -05:00
root
951c6014ec gateway: boot-time probe of truth/ file-backed rules
Phase 42 PRD deliverable de8fb10 landed the file loader + 2 rule
files. This commit wires the loader into gateway startup so the
rules actually get READ at boot — catches parse errors and
duplicate-ID collisions before the first request hits, rather than
"silently 0 rules loaded."

Scope is deliberately narrow — a probe, not full plumbing:

  - Reads LAKEHOUSE_TRUTH_DIR env override, defaults to
    /home/profit/lakehouse/truth
  - Skips silently with a debug log if the dir is absent
  - Loads rules on top of default_truth_store() into a throwaway
    store, logs the count (or the error)
  - Does NOT yet replace the per-request default_truth_store() in
    execution_loop or v1/chat. That plumbing needs a V1State.truth
    field + passing it through the request context, which is a
    separate scope.

Why the separation matters: this commit gives ops + me a visible
boot-time signal ("truth: loaded 3 file-backed rule(s)") that the
loader + files work end-to-end. The next commit can confidently
swap per-request stores without wondering whether the parsing even
succeeds.

Workspace warnings still at 0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:03:17 -05:00
root
fee094f653 gateway/access: wire get_role + is_enabled into HTTP routes
Two of the four #[allow(dead_code)] methods in access.rs were dead
because nothing exposed them externally. access.rs itself is fine —
list_roles, set_role, can_access all have live callers. But get_role
and is_enabled were shaped as public API with no surface to call
them through.

Fix adds two small routes under /access (where the rest of the
access surface lives):

  GET /access/roles/{agent}
    Calls AccessControl::get_role(agent). Returns 404 with a clear
    message when the agent isn't registered so clients distinguish
    "unknown agent" from "access denied." Part of P13-001
    (ops tooling needs per-agent role introspection).

  GET /access/enabled
    Calls AccessControl::is_enabled(). Returns {"enabled": bool}.
    Dashboards + ops tooling poll this to confirm auth posture of
    the running gateway — distinct from /health which answers
    "is the process up" vs "is access enforcement on."

#[allow(dead_code)] removed from both methods — they have live
callers now via these routes, and the linter will enforce that going
forward.

Still #[allow(dead_code)] on access.rs: masked_fields + log_query.
Both need cross-crate wiring:
  - masked_fields wants the agent's role + query response columns,
    called in response shaping (queryd returning to gateway path)
  - log_query wants post-execution audit, called after every SQL
    execution on the gateway boundary
Both are P13-001 phase 2 work — need AgentIdentity plumbed through
the /query nested router before the call sites make sense. Flagged
for follow-up.

Workspace warnings still at 0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:02:01 -05:00
root
91a38dc20b vectord/index_registry: add last_used + build_signature (scrum iter 11)
Scrum iter 11 on crates/vectord/src/index_registry.rs flagged two
concrete field gaps (90% confidence). Both were tagged UnitMismatch
/ missing-invariant.

IndexMeta gains two Optional fields:

  last_used: Option<DateTime<Utc>>
    PRD 11.3 — when this index was last searched against. Callers
    were reading created_at as a liveness proxy, which conflated
    "built" with "used." IndexRegistry::touch_used(name) stamps the
    field on every hit; incremental re-embed can now skip cold
    indexes without misattributing "fresh build" to "recent use."

  build_signature: Option<String>
    PRD 11.3 — stable SHA-256 of (sorted source files + chunk_size
    + overlap + model_version). compute_build_signature() in the
    same module is deterministic: file-order-invariant, changes on
    chunk param, changes on model version. Lets incremental re-embed
    answer "has anything changed since last build?" without scanning
    the source Parquet.
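
The signature recipe, sketched in TypeScript (inputs from this message;
the real compute_build_signature() is Rust):

  import { createHash } from "node:crypto";

  function buildSignature(sourceFiles: string[], chunkSize: number,
                          overlap: number, modelVersion: string): string {
    const canon = [...sourceFiles].sort().join("\n")   // file-order-invariant
      + `\n${chunkSize}|${overlap}|${modelVersion}`;   // changes on chunk params/model
    return createHash("sha256").update(canon).digest("hex");
  }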

Both fields are #[serde(default)] — the ~40 existing .json meta
files under vectors/meta/ load unchanged. Backward-compat verified
by the explicit `index_meta_deserializes_without_new_fields_backcompat`
test.

7 new tests:
  - build_signature_is_deterministic
  - build_signature_order_invariant (sorted internally)
  - build_signature_changes_on_chunk_param
  - build_signature_changes_on_model_version
  - touch_used_updates_last_used
  - touch_used_is_noop_on_missing_index
  - index_meta_deserializes_without_new_fields_backcompat

Call-site fixes: crates/vectord/src/refresh.rs:294 and
crates/vectord/src/service.rs:244 both construct IndexMeta with
fully-literal init, default the new fields to None. One
indentation cleanup on service.rs (a pre-existing visual issue on
id_prefix: None).

Workspace warnings still at 0. touch_used() isn't wired into the
search hot path yet — follow-up commit when the search handlers can
adopt it without a broader refactor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:00:09 -05:00
root
6532938e85 gateway/tools: truth gate for model-provided SQL (iter 11 CF-1+CF-2)
Scrum iter 11 flagged crates/gateway/src/tools/service.rs with two
95%-confidence critical failures:

  CF-1: "Direct SQL execution from model-provided parameters without
         explicit validation or sanitization" (line 68, 95% conf)
  CF-2: "No permission check performed before executing SQL query;
         access control is bypassed entirely" (line 102, 90% conf)

CF-1 is the real one — same security gap as queryd /sql had before
P42-002 (9cc0ceb). Tool invocations build SQL from a template +
model-provided params, then state.query_fn.execute(&sql) runs it.
No truth-gate check between build and execute meant an adversarial
model could emit DROP TABLE / DELETE FROM / TRUNCATE inside a param
and bypass queryd's gate by routing through the tool surface instead.

Fix mirrors the queryd SQL gate exactly:
  - ToolState grows an Arc<TruthStore> field
  - main.rs constructs it via truth::sql_query_guard_store()
    (shared default — same destructive-verb block as queryd)
  - call_tool evaluates the built SQL against "sql_query" task class
    BEFORE executing
  - Any Reject/Block outcome → 403 FORBIDDEN + log_invocation row
    marked success=false with the rule message
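
The gate's effect, sketched in TypeScript (the real check evaluates the
shared TruthStore in Rust; the verb list echoes this message):

  const DESTRUCTIVE = /\b(DROP\s+TABLE|DELETE\s+FROM|TRUNCATE)\b/i;

  // Evaluate built SQL BEFORE executing; Reject/Block → 403 + audit row.
  function gateToolSql(sql: string) {
    return DESTRUCTIVE.test(sql)
      ? { allowed: false, status: 403, success: false,
          message: "blocked by sql_query truth gate" }
      : { allowed: true, status: 200, success: true };
  }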

CF-2 (access control) is P13-001 territory — needs AccessControl
wiring into queryd first, still open. Flagged in memory.

Workspace warnings still at 0. Pattern is now:
  queryd /sql        → truth::sql_query_guard_store (9cc0ceb)
  gateway /tools     → truth::sql_query_guard_store (this commit)
  execution_loop     → truth::default_truth_store (51a1aa3)
All three surfaces that pipe SQL or spec-shaped data through to the
substrate now gate it. Any new SQL-executing surface should follow
the same pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:52:29 -05:00
root
de8fb10f52 phase-42: truth/ repo-root dir + TOML rule loader
Some checks failed
lakehouse/auditor 4 blocking issues: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
Phase 42 PRD (docs/CONTROL_PLANE_PRD.md:144): "truth/ dir at repo
root — rule files, versioned in git." Didn't exist. Landing both the
dir + its loader.

New files:

  truth/
    README.md                — documents file format, rule shape,
                               composition model (file rules are
                               additive on top of in-code default_
                               truth_store), explicit non-goals
                               (no hot reload, no inheritance)
    staffing.fill.toml       — 2 staffing.fill rules:
                               endorsed-count-matches-target,
                               city-required (both Reject via
                               FieldEmpty)
    staffing.any.toml        — 1 staffing.any rule:
                               no-destructive-sql-in-context via
                               FieldContainsAny (parallel to the
                               queryd SQL gate we already ship)

  crates/truth/src/loader.rs — load_from_dir(store, dir)
                             — 5 tests: happy path, duplicate-ID
                               rejection within files, duplicate-ID
                               rejection against in-code rules,
                               non-toml files skipped, missing-dir
                               error. Alphabetical file order for
                               reproducible error messages.

  crates/truth/src/lib.rs    — new pub fn all_rule_ids() helper on
                               TruthStore so the loader can detect
                               collisions without breaching the
                               private `rules` field.

  crates/truth/Cargo.toml    — adds `toml` workspace dep.
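
The loader's contract from the tests above, sketched in TypeScript (the
real load_from_dir is Rust):

  function loadRuleIds(inCode: Set<string>,
                       files: { name: string; ids: string[] }[]): Set<string> {
    const seen = new Set(inCode);                   // additive on top of defaults
    for (const f of [...files].sort((a, b) => a.name.localeCompare(b.name))) {
      if (!f.name.endsWith(".toml")) continue;      // non-toml files skipped
      for (const id of f.ids) {
        if (seen.has(id))                           // dup vs file OR in-code rules
          throw new Error(`duplicate rule id "${id}" in ${f.name}`);
        seen.add(id);
      }
    }
    return seen;
  }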

Composition model: file rules are ADDITIVE on top of what
default_truth_store() registers in code. Operators can tune
thresholds/needles/descriptions at the file layer without a code
deploy. Schema changes (new RuleCondition variants) still need a
code bump.

Integration hook (not in this commit, flagged for follow-up):
main.rs should call loader::load_from_dir(&mut store, "truth/")
after default_truth_store() so file-backed rules take effect on
gateway boot. Deliberately separate: this commit lands the
machinery; wiring it in happens when the team is ready to own
the rule file lifecycle.

Total: 37 truth tests green (was 32). Workspace warnings still 0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:44:23 -05:00
root
0b3bd28cf8 phase-40: Gemini + Claude provider adapters
Phase 40 PRD (docs/CONTROL_PLANE_PRD.md:82-83) listed:
  - crates/aibridge/src/providers/gemini.rs
  - crates/aibridge/src/providers/claude.rs

Neither existed. Landing both now, in gateway/src/v1/ (matches the
existing ollama.rs + openrouter.rs sibling pattern — aibridge's
providers/ is for the adapter *trait* abstractions, v1/ holds the
concrete /v1/chat dispatchers that know the wire format).

gemini.rs:
  - POST https://generativelanguage.googleapis.com/v1beta/models/
    {model}:generateContent?key=<API_KEY>
  - Auth: query-string key (not bearer)
  - Maps messages → contents+parts (Gemini's wire shape),
    extracts from candidates[0].content.parts[0].text
  - 3 tests: key resolution, body serialization (camelCase
    generationConfig + maxOutputTokens), prefix-strip

claude.rs:
  - POST https://api.anthropic.com/v1/messages
  - Auth: x-api-key header + anthropic-version: 2023-06-01
  - Carries system prompt in top-level `system` field (not
    messages[]). Extracts from content[0].text where type=="text"
  - 4 tests: key resolution, body serialization with/without
    system field, prefix-strip
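
Illustrative request bodies for both adapters (serde_json sketch;
shapes follow the wire formats described above, values are
placeholders):

  use serde_json::{json, Value};

  // Gemini: messages map to contents[].parts[]; auth rides on the
  // URL as ?key=<API_KEY>, so the body carries no credentials.
  fn gemini_body(user_text: &str) -> Value {
      json!({
          "contents": [{ "role": "user", "parts": [{ "text": user_text }] }],
          "generationConfig": { "maxOutputTokens": 1024 }
      })
  }

  // Claude: system prompt is a top-level field, never a messages[]
  // entry; auth is the x-api-key header + anthropic-version.
  fn claude_body(system: Option<&str>, user_text: &str) -> Value {
      let mut body = json!({
          "model": "<model>",
          "max_tokens": 1024,
          "messages": [{ "role": "user", "content": user_text }]
      });
      if let Some(s) = system {
          body["system"] = json!(s);
      }
      body
  }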

v1/mod.rs:
  + V1State.gemini_key + claude_key Option<String>
  + resolve_provider() strips "gemini/" and "claude/" prefixes
  + /v1/chat dispatcher handles "gemini" + "claude"/"anthropic"
  + 2 new resolve_provider tests (prefix + strip per adapter)

main.rs:
  + Construct both keys at startup via resolve_*_key() helpers.
    Missing keys log at debug (not warn) since these are optional
    providers — unlike OpenRouter which is the rescue rung.

Every /v1/chat error path mirrors the existing pattern:
  - 503 SERVICE_UNAVAILABLE when key isn't configured
  - 502 BAD_GATEWAY with the provider's error text when the
    upstream call fails
  - Response shape always the OpenAI-compatible ChatResponse

Workspace warnings still at 0. 9 new tests pass.

Pre-existing test failure `executor_prompt_includes_surfaced_candidates`
at execution_loop/mod.rs:1550 is unrelated (fails on pristine HEAD too —
PR fixture divergence).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:41:31 -05:00
root
b5b0c00efe phase-43: new crates/validator — trait, staffing impls, devops scaffold
Some checks failed
lakehouse/auditor 3 blocking issues: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
Phase 43 PRD (docs/CONTROL_PLANE_PRD.md:161) was the one audit finding
truly unimplemented — no crate, no trait, no tests, no workspace entry.
Neither PHASES.md nor the source tree had any Phase 43 presence.
Genuine greenfield gap.

Lands the scaffold as a real crate, registered in workspace Cargo.toml:

  crates/validator/
    src/lib.rs            — Validator trait, Artifact enum (5 variants:
                            FillProposal, EmailDraft, Playbook,
                            TerraformPlan, AnsiblePlaybook), Report,
                            Finding, Severity, ValidationError
    src/staffing/mod.rs   — staffing validators module root
    src/staffing/fill.rs  — FillValidator (schema-level: fills array
                            + per-fill {candidate_id, name}). 4 tests.
                            Worker-existence + status + geo checks
                            are TODO v2 (need catalog query handle).
    src/staffing/email.rs — EmailValidator (to/body schema + SMS ≤160
                            + email subject ≤78). 4 tests. PII scan +
                            name-consistency TODO v2.
    src/staffing/playbook.rs — PlaybookValidator (operation prefix,
                            endorsed_names non-empty + ≤ target×2,
                            fingerprint present per Phase 25). 5 tests.
    src/devops.rs         — TerraformValidator + AnsibleValidator
                            scaffolds. Return Unimplemented — keeps
                            dispatcher shape stable, surfaces a clear
                            "phase 43 not wired" signal instead of
                            silently passing or panicking.

Total: 15 tests, all green. Covers the happy paths, the common
failure modes (missing fields, overfull arrays, length violations),
and the dispatch-error path (wrong artifact type into wrong validator).
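
The trait shape, sketched (signatures assumed; payload types here are
stand-ins for whatever the real Artifact variants carry):

  pub enum Artifact {
      FillProposal(serde_json::Value),
      EmailDraft(serde_json::Value),
      Playbook(serde_json::Value),
      TerraformPlan(String),
      AnsiblePlaybook(String),
  }

  pub enum Severity { Info, Warn, Block }

  pub struct Finding {
      pub severity: Severity,
      pub message: String,
  }

  pub struct Report {
      pub findings: Vec<Finding>,
  }

  pub enum ValidationError {
      WrongArtifact,  // dispatch-error path: artifact fed to wrong validator
      Unimplemented,  // what the devops scaffolds return today
  }

  pub trait Validator {
      fn validate(&self, artifact: &Artifact) -> Result<Report, ValidationError>;
  }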

Still open from Phase 43 (v2 work, beyond scaffold):
  - FillValidator catalog integration (worker-existence, status,
    geo/role match) — needs catalog handle in constructor
  - EmailValidator PII scan (shared::pii::strip_pii integration) +
    name-consistency cross-check
  - Execution loop wiring: generate → validate → observer correction
    + retry (bounded by max_iterations=3) — spans crates, follow-up
  - Observer logging: validation results to data/_observer/ops.jsonl
    and data/_kb/outcomes.jsonl
  - Scenario fixture tests against tests/multi-agent/playbooks/*

Workspace warnings still at 0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:35:22 -05:00
root
2f1b9c9768 phase-39+41: land promised artifacts — providers.toml, activation.rs, profiles/
Three PRD gaps closed in one coherent batch — all were cosmetic or
scaffold-shaped, now real files:

Phase 39 (PRD:57):
  + config/providers.toml — provider registry (name/base_url/auth/
    default_model) for ollama, ollama_cloud, openrouter. Commented
    stubs for gemini + claude pending adapter work. Secrets stay in
    /etc/lakehouse/secrets.toml or env, NEVER inline.

Phase 41 (PRD:115):
  + crates/vectord/src/activation.rs — ActivationTracker with the
    PRD-named single-flight guard ("refuse new activation if one is
    pending/running"). Per-profile granularity — activating A doesn't
    block B. 5 tests cover the full state machine. Handler body stays
    in service.rs for now; tracker usage integration is a follow-up.

Phase 41 (PRD:113):
  + crates/shared/src/profiles/ with 4 submodules:
      * execution.rs — `pub use crate::types::ModelProfile as
        ExecutionProfile` (backward-compat rename per PRD)
      * retrieval.rs — top_k, rerank_top_k, freshness cutoff,
        playbook boost, sensitivity-gate enforcement
      * memory.rs — playbook boost ceiling, history cap, doc
        staleness, auto-retire-on-failure
      * observer.rs — failure cluster size, alert cooldown, ring
        size, langfuse forwarding
    All fields `#[serde(default)]` so existing ModelProfile files
    load unchanged.
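
For example, observer.rs comes out roughly like this (field names from
the list above; exact types are assumptions):

  #[derive(serde::Deserialize, Default)]
  pub struct ObserverProfile {
      #[serde(default)]
      pub failure_cluster_size: usize,
      #[serde(default)]
      pub alert_cooldown_secs: u64,
      #[serde(default)]
      pub ring_size: usize,
      #[serde(default)]
      pub langfuse_forwarding: bool,
  }

A ModelProfile file with none of these keys still deserializes: every
missing field just takes its Default.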

Still open from the same phases:
  - Gemini + Claude provider adapters (Phase 40 — 100-200 LOC each)
  - Full activate_profile handler extraction into activation.rs
    (Phase 41 — module-structure refactor)
  - Catalogd CRUD endpoints for retrieval/memory/observer profiles
    (Phase 41 — exists at list level, no create/update/delete yet)
  - truth/ repo-root directory for file-backed rules (Phase 42 —
    TOML loader + schema)
  - crates/validator crate (Phase 43 — full greenfield)

Workspace warnings still at 0. 5 new tests, all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:32:40 -05:00
root
021c1b557f agent.ts: route generateCloud through /v1/chat (Phase 44 migration)
Phase 44 PRD (docs/CONTROL_PLANE_PRD.md:204) explicitly lists
`tests/multi-agent/agent.ts::generate()` as a migration target:
every internal LLM caller must flow through /v1/chat so usage
accounting + audit trail see all traffic.

generateCloud() was bypassing the gateway entirely — direct POST to
OLLAMA_CLOUD_URL/api/generate with the bearer key. This meant:
  - /v1/usage missed every agent.ts cloud call
  - No gateway-side caching, rate-limiting, or cost gating
  - Callers needed OLLAMA_CLOUD_KEY in env (leak risk; gateway
    already owns the key)

Migration:
  - Endpoint: OLLAMA_CLOUD_URL/api/generate → GATEWAY/v1/chat
  - Body shape: {prompt,options.num_predict,options.temperature} →
    OpenAI-compatible {messages[],temperature,max_tokens}
  - provider: "ollama_cloud" explicit in the request
  - Response extraction: data.response → data.choices[0].message.content
  - OLLAMA_CLOUD_KEY no longer required in agent.ts env
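
The before/after body shapes, sketched as JSON via serde_json (a Rust
stand-in for the actual TypeScript change; values illustrative):

  use serde_json::json;

  fn main() {
      // Before: Ollama-native body, posted straight to /api/generate
      let old_body = json!({
          "model": "<model>",
          "prompt": "<prompt>",
          "options": { "num_predict": 1024, "temperature": 0.2 }
      });

      // After: OpenAI-compatible body to GATEWAY/v1/chat, provider explicit
      let new_body = json!({
          "model": "<model>",
          "provider": "ollama_cloud",
          "messages": [{ "role": "user", "content": "<prompt>" }],
          "temperature": 0.2,
          "max_tokens": 1024
      });

      // Extraction moves from old["response"] to
      // new["choices"][0]["message"]["content"].
      println!("{old_body}\n{new_body}");
  }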

Phase 44 gate verified: `grep localhost:3200/generate|/api/generate`
now only hits (a) the ollama_cloud.rs adapter itself (legit — it's
the gateway-side direct caller) and (b) this comment explaining the
migration history. Zero non-adapter code paths to /api/generate.

generate() (local Ollama) still goes direct to :3200 — that's the
t1_hot path. Phase 44 PRD focuses on cloud callers; hot-path local
generation deliberately stays direct for latency.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:27:54 -05:00
root
049a4b69fb truth: split staffing + devops into dedicated modules (Phase 42 PRD)
Phase 42 PRD (docs/CONTROL_PLANE_PRD.md:137) specified:
  - crates/truth/src/staffing.rs — staffing rule shapes
  - crates/truth/src/devops.rs — scaffold for DevOps long-horizon

PHASES.md marked Phase 42 done, but the rule sets lived inline in
default_truth_store() in lib.rs. It worked, but didn't match the PRD's
module separation — and that separation matters when the long-horizon
phase fleshes out devops rules: "Keeps the dispatcher signature stable
so no refactor needed later."

Fix: extract staffing_rules() into staffing.rs (5 rules, unchanged
behavior) + create devops.rs with an empty scaffold. default_truth_store
becomes a one-line composition:
    devops::devops_rules(staffing::staffing_rules(TruthStore::new()))
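
Stubbed sketch of the shape (the real modules register actual rules
instead of passing the store straight through):

  pub struct TruthStore;
  impl TruthStore {
      pub fn new() -> Self { TruthStore }
  }

  mod staffing {
      use super::TruthStore;
      pub fn staffing_rules(store: TruthStore) -> TruthStore {
          store // real module: registers the 5 staffing rules
      }
  }

  mod devops {
      use super::TruthStore;
      pub fn devops_rules(store: TruthStore) -> TruthStore {
          store // deliberate no-op until long-horizon rules land
      }
  }

  pub fn default_truth_store() -> TruthStore {
      devops::devops_rules(staffing::staffing_rules(TruthStore::new()))
  }

Taking the store by value and handing it back is what keeps the
composition a one-liner.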

4 new tests in the submodules cover:
  - staffing_rules registers expected count (regression guard)
  - blacklisted worker fails the client-not-blacklisted rule
  - missing deadline fires Reject via FieldEmpty condition
  - devops scaffold is a no-op for now

Total truth tests: 28 → 32. Workspace warnings still at 0.

Still open from Phase 42 (flagged, not in this commit):
  - `truth/` dir at repo root for file-backed rule loading (TOML/YAML).
    Rules are in-code today; loader work is a separate feature.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:25:54 -05:00
root
ed85620558 scrum: filter table-header words from bug_fingerprint extraction
Iter 11 surfaced "DeadCode:Flag" in the matrix — a noisy pattern_key
where "Flag" is the table column HEADER kimi produces for structured
review output, not an actual Rust identifier.

Kimi's standard format on recent iters:
  | # | Change                    | Flag       | Confidence |
  | 1 | Wire AgentIdentity into.. | Boundary.. | 92%        |

The extractor's KEYWORDS set already filtered Rust grammar words
(self, mut, async, etc) and the FLAG_VARIANTS themselves. Adding
markdown-layout words (Flag, Change, Confidence, PRD, Plan) closes
the last common noise class.
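
The filter shape, illustratively (set contents abridged; the real
extractor keeps its full KEYWORDS list):

  use std::collections::HashSet;

  fn is_noise_token(token: &str) -> bool {
      // Rust grammar words + FLAG_VARIANTS were already filtered;
      // markdown-layout words are the newly added class.
      let keywords: HashSet<&str> =
          ["self", "mut", "async"].into_iter().collect();
      let layout_words: HashSet<&str> =
          ["Flag", "Change", "Confidence", "PRD", "Plan"].into_iter().collect();
      keywords.contains(token) || layout_words.contains(token)
  }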

One-line addition — empirically validated against the iter 11
vectord trace that produced DeadCode:Flag. Future iters won't
reproduce that specific noise.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:22:50 -05:00
root
08cc960115 vectord: Phase 41 gate fixes — 202 ACCEPTED + /profile/jobs/{id} alias
Phase 41 PRD (docs/CONTROL_PLANE_PRD.md:121) gate:
  "Activate a profile → returns 202 in <100ms → job completes in
   background → /vectors/profile/jobs/{id} shows progress"

Two concrete mismatches to PRD:

1. activate_profile returned HTTP 200, not 202. Fix: wrap the Json
   return in (StatusCode::ACCEPTED, Json(...)) so the async semantics
   are visible at the status-code level.

2. The PRD quotes GET /vectors/profile/jobs/{id} but code only exposed
   /vectors/jobs/{id}. Fix: add an alias route — same get_job handler,
   second URL matches what the PRD's polling example documents.
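
Both fixes in axum terms (handler bodies elided; the activation route
path is assumed, and ":id" is the pre-0.8 capture syntax):

  use axum::{http::StatusCode, routing::{get, post}, Json, Router};

  async fn activate_profile() -> (StatusCode, Json<serde_json::Value>) {
      // (StatusCode, Json<T>) implements IntoResponse, so the async
      // contract is visible at the status-code level:
      (StatusCode::ACCEPTED, Json(serde_json::json!({ "job_id": "<id>" })))
  }

  async fn get_job() -> Json<serde_json::Value> {
      Json(serde_json::json!({ "status": "running" }))
  }

  fn routes() -> Router {
      Router::new()
          .route("/vectors/profile/activate", post(activate_profile))
          .route("/vectors/jobs/:id", get(get_job))         // original
          .route("/vectors/profile/jobs/:id", get(get_job)) // PRD alias
  }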

Still open from Phase 41 (flagged for follow-up, bigger scope):
  - crates/shared/src/profiles/ module with ExecutionProfile,
    RetrievalProfile, MemoryProfile, ObserverProfile types — PRD
    claims them, file doesn't exist; ModelProfile still does all
    four roles today. This is a real schema-refactor, not 6-line work.
  - crates/vectord/src/activation.rs with ActivationTracker — the
    activation logic lives inline in service.rs; extracting it is
    a module-structure change.
  - Phase 37 hot-swap stress test in tests/multi-agent/run_stress.ts
    Phase 3 — PRD says it must pass, current state unknown.

Workspace warnings still at 0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:21:49 -05:00
root
24b06d80b2 mcp: register gitea-mcp server — closes Phase 40 repo-ops gap
Phase 40 PRD (docs/CONTROL_PLANE_PRD.md:91) claimed:
  "Gitea MCP reconnect — the MCP server binary still installed at
   /home/profit/.bun/install/cache/gitea-mcp@0.0.10/ gets wired into
   mcp-server/index.ts tool registry."

The PHASES.md checkbox marked this done, but audit found:
  - gitea-mcp binary exists in bun cache (verified)
  - Zero references to gitea/list_prs/open_pr in mcp-server/index.ts
  - No entry for "gitea" in .mcp.json

The PRD's architectural description ("wired into mcp-server/index.ts
tool registry") is conceptually wrong — gitea-mcp is a peer MCP server
that the MCP host (Claude Code) connects to directly, not a library
to import. Correct wiring: register it in .mcp.json so Claude Code
spawns both lakehouse's MCP server AND gitea-mcp as separate children,
each exposing their own tools.

This commit adds the "gitea" entry to .mcp.json pointing at bunx
gitea-mcp with GITEA_HOST set to git.agentview.dev.

OPERATOR STEP (one-time): before restarting Claude Code, generate a
personal access token at https://git.agentview.dev/user/settings/
applications and replace the SET_ME_... placeholder in
GITEA_ACCESS_TOKEN. Token needs at minimum `read:repository,
write:issue, read:user` scopes for list_prs/open_pr/comment_on_issue.

Still open from Phase 40 (not in this commit, bigger scope):
  - crates/aibridge/src/providers/gemini.rs (claimed, missing)
  - crates/aibridge/src/providers/claude.rs (claimed, missing)
These are ~100-200 lines each (full HTTP adapter + auth + request
shape mapping). Flag as follow-up commits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:19:46 -05:00
root
999abd6999 gateway/v1: model-prefix routing closes Phase 39 PRD gate
Some checks failed
lakehouse/auditor 4 blocking issues: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
Phase 39 PRD (docs/CONTROL_PLANE_PRD.md:62) promised:
  "/v1/chat routes by `model` field: prefix match
   (e.g. openrouter/anthropic/claude-3.5-sonnet → OpenRouter;
   bare names → Ollama)"

Actual behavior required clients to pass `provider: "openrouter"`
explicitly. Bare `model: "openrouter/..."` would fall through to the
"unknown provider ''" error. PRD gate never actually passed.

Fix: resolve_provider(&ChatRequest) picks (provider, effective_model):
  - explicit `req.provider` wins, model passes through unchanged
  - else strip "openrouter/" prefix → provider="openrouter", model
    without prefix (OpenRouter API expects "openai/gpt-4o-mini",
    not "openrouter/openai/gpt-4o-mini")
  - else strip "cloud/" prefix → provider="ollama_cloud"
  - else default provider="ollama"

Adapter calls use Cow<ChatRequest>: borrowed when no strip is needed
(zero alloc), owned when we need to build a new model string. Keeps
the hot path allocation-free for the common case.
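
The resolution shape (ChatRequest abridged to the relevant fields):

  use std::borrow::Cow;

  #[derive(Clone)]
  struct ChatRequest { model: String, provider: Option<String> }

  fn resolve_provider(req: &ChatRequest) -> (String, Cow<'_, ChatRequest>) {
      // explicit provider wins; model passes through untouched
      if let Some(p) = &req.provider {
          return (p.clone(), Cow::Borrowed(req));
      }
      // prefix match: strip so the upstream API sees its own model id
      if let Some(rest) = req.model.strip_prefix("openrouter/") {
          let mut owned = req.clone(); // the Owned case is why Clone is derived
          owned.model = rest.to_string();
          return ("openrouter".to_string(), Cow::Owned(owned));
      }
      if let Some(rest) = req.model.strip_prefix("cloud/") {
          let mut owned = req.clone();
          owned.model = rest.to_string();
          return ("ollama_cloud".to_string(), Cow::Owned(owned));
      }
      // bare names default to local Ollama: borrowed, zero alloc
      ("ollama".to_string(), Cow::Borrowed(req))
  }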

ChatRequest gains #[derive(Clone)] — needed for the Owned variant.
5 new tests pin the resolution semantics including the
"explicit provider + prefixed model" corner case (trust the caller,
don't double-strip).

Workspace warnings unchanged at 0.

Still not shipped from Phase 39: config/providers.toml — hardcoded
match arms work fine in practice, centralizing them is cosmetic.
Flag as a follow-up if a 4th provider lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:16:36 -05:00
root
0cf1b7c45a scrum_master: env-configurable tree-split threshold + shard size
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
Hard-coded constants (FILE_TREE_SPLIT_THRESHOLD=6000, FILE_SHARD_SIZE=3500)
were tuned for Rust source files in crates/<crate>/src/*.rs. Running
the pipeline against /root/llm-team-ui/llm_team_ui.py (13K lines, ~400KB)
would produce ~200 shards per review at the default size — not viable.

Two env vars now:
  - LH_SCRUM_TREE_SPLIT_THRESHOLD — when tree-split fires (default 6000)
  - LH_SCRUM_SHARD_SIZE — bytes per shard (default 3500)
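
Override-with-default shape, as a Rust stand-in (the pipeline itself
is TypeScript):

  fn env_usize(key: &str, default: usize) -> usize {
      std::env::var(key).ok().and_then(|v| v.parse().ok()).unwrap_or(default)
  }

  fn main() {
      let split_threshold = env_usize("LH_SCRUM_TREE_SPLIT_THRESHOLD", 6000);
      let shard_size = env_usize("LH_SCRUM_SHARD_SIZE", 3500);
      println!("split={split_threshold} shard={shard_size}");
  }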

For the big-Python case the CLAUDE.md in /root/llm-team-ui/ recommends
LH_SCRUM_TREE_SPLIT_THRESHOLD=20000, LH_SCRUM_SHARD_SIZE=12000 which
brings the 13K-line file down to ~35 shards — same ballpark as a
typical Rust file review.

No default change. Existing lakehouse runs unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:02:45 -05:00
root
81bae108f4 gateway/tools: collapse ToolRegistry::new() and new_with_defaults() into one
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
Two constructors existed with a subtle trap:

  - `new()` had `#[allow(dead_code)]` and called `register_defaults()`
    via `tokio::task::block_in_place(...)` — a sync wrapper hack around
    an async method, fragile and unused.
  - `new_with_defaults()` was misleadingly named — it created the empty
    registry WITHOUT registering defaults, despite the name.

main.rs was doing the right thing: `new_with_defaults()` + explicit
`.register_defaults().await`. The misleading name was a landmine
for future callers.

Fix: delete the dead `new()` with its block_in_place hack, rename
`new_with_defaults()` → `new()` (Rust idiom — `new` is the canonical
constructor), and add a docstring spelling out the required
`.register_defaults().await` follow-up call. Single clear API.
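
In miniature (signatures assumed):

  struct ToolRegistry;
  impl ToolRegistry {
      /// Creates an empty registry. Callers must still run
      /// `register_defaults().await` before serving traffic.
      fn new() -> Self { ToolRegistry }
      async fn register_defaults(&mut self) { /* async tool setup */ }
  }

  async fn boot() {
      let mut registry = ToolRegistry::new();
      registry.register_defaults().await; // explicit, no block_in_place
  }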

Update the one caller in main.rs. Workspace warnings still at 0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 06:44:18 -05:00
root
5df4d48109 cleanup: drop two #[allow] attributes that were hiding real dead code
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
  - ingestd/src/service.rs: top-of-file `#[allow(unused_imports)]`
    was masking genuinely unused `delete` and `patch` routing
    constructors in an axum import block. Removed the attribute,
    trimmed the imports to only `get` and `post` (what's actually
    used). Any future over-import now trips the unused_imports
    lint immediately instead of being silently allowed.

  - gateway/src/v1/truth.rs: `truth_router()` was a 4-line stub
    wrapping a single `/context` route — carried `#[allow(dead_code)]`
    because v1/mod.rs wires `get(truth::context)` directly onto its
    own router, bypassing this helper. Zero callers across the
    workspace. Deleted the function + allow + now-unused Router
    import. Left a breadcrumb comment pointing to the real wiring.

Workspace warnings: 0 (lib + tests). Each #[allow] removed raises
the bar on future code entering these modules — the linter now
catches the same classes of bugs at PR time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 06:42:49 -05:00
root
ffdc842ec3 ingestd: scope test-only imports into the test module
schema_evolution.rs carried two `#[allow(unused_imports)]` attributes
on its top-level imports:
  - `Schema` was imported at crate level but only used in test code
  - `Arc` was imported at crate level but only used in test code
  - `DataType` and `SchemaRef` were actually used (28 references) — the
    allow on that line was cargo-culted.

Fix: drop the allows, move Schema + Arc into the #[cfg(test)] block
where they're actually used. The non-test build no longer imports
symbols it doesn't need. Test build still works because the imports
are now in the test module's scope.
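
The scoping move in miniature (std-only stand-ins for Schema and the
arrow types):

  // crate scope keeps only what non-test code actually uses

  #[cfg(test)]
  mod tests {
      use std::sync::Arc; // test-only: lives here now, not at crate scope

      #[test]
      fn fixture_builds() {
          let schema = Arc::new(vec!["field_a", "field_b"]);
          assert_eq!(schema.len(), 2);
      }
  }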

Workspace warnings still at 0 (lib + tests). Net: -3 import lines
from crate scope, +2 into test scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 06:41:15 -05:00
root
12e615bb5d ingestd/vectord: remove two fragile unwraps on Option paths
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
Both were technically safe — guarded above by map_or(true, ...) and
Some(entry) assignment respectively — but relied on multi-line
invariants that a future refactor could easily break.

  - ingestd/watcher.rs:80: path.file_name().unwrap() on a path that
    was already checked via map_or(true, ...) two lines up. Fix:
    let-else binds filename once, no double lookup, no unwrap.

  - vectord/promotion.rs:145: file.current.as_ref().unwrap() called
    TWICE on the same line to log config + trial_id. Guard via
    `if let Some(cur) = &file.current` so the log gracefully skips
    if the invariant ever breaks instead of panicking at runtime.

Both are drop-in semantically: happy path identical, error path now
graceful-skip instead of panic. Workspace warnings still at 0.
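
Both shapes, in miniature:

  use std::path::Path;

  // let-else: bind once or bail, no double lookup, no unwrap
  fn filename_of(path: &Path) -> Option<&str> {
      let Some(name) = path.file_name() else { return None; };
      name.to_str()
  }

  struct Trial { config: String, trial_id: String }
  struct PromotedFile { current: Option<Trial> }

  // if-let: the log gracefully skips if the invariant ever breaks,
  // instead of panicking at runtime
  fn log_promotion(file: &PromotedFile) {
      if let Some(cur) = &file.current {
          println!("promoted config={} trial={}", cur.config, cur.trial_id);
      }
  }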

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 06:39:40 -05:00