lakehouse

profit/lakehouse

Fork 0

Commit Graph

Author	SHA1	Message	Date
root	52bb216c2d	mode_compare: grounding check + control flag + emoji-tolerant section detection Some checks failed lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts Three fixes after the playbook_only confabulation surfaced in 2026-04-26 experiment (8 'findings' on a 333-line file all citing lines 378-945 — fully fabricated from pathway-memory pattern names). (1) Aggregator regex bug — section detection failed on emoji-prefixed markdown headers like `## 🔎 Ranked Findings`. The original regex required word chars right after #{1,3}\s+, so the patches table header `## 🛠️ Concrete Patch Suggestions` was never recognized as a stop boundary, double-counting every finding. Fix: allow non-letter chars (emoji/space) between # and the keyword. (2) Grounding check — for each finding row in the response, extract backtick-quoted symbols + cited line numbers; verify symbols exist in the actual focus file and lines fall within EOF. Computes grounded/total ratio per mode. Surfaces 'OOB' (out-of-bounds) count explicitly so confabulation is visible at a glance. Confirms what hand-grading found: codereview_playbook_only's 8 findings on service.rs were 1/8 grounded with 7 OOB. (3) Control mode tagging — codereview_null and codereview_playbook_only are designed as falsifiers (baseline / lossy ceiling) and their numerical wins should never be read as recommendations. Output marks them with ⚗ glyph + warning footer. Per-mode aggregate is now sorted by groundedness, not raw count. Per-mode-vs-lakehouse comparison uses grounded findings, not raw — so confabulation can no longer score a "win". Updated SCRUM_MASTER_SPEC.md with refactor timeline pointing at the 2026-04-25/26 commits (observer fix, relevance filter, retire wire, mode router, enrichment runner, parameterized experiment). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 01:44:21 -05:00
root	e1ef185868	docs: add MATRIX_AGENT_HANDOVER notes + cross-link from SCRUM_MASTER_SPEC Some checks failed lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts The matrix-agent-validated repo split + Ansible deploy pipeline are otherwise undocumented in this repo. This new doc explains: - the relationship between lakehouse and matrix-agent-validated - where the playbook lives and what it provisions - the critical distinction: matrix-test (10.111.129.50 Incus container) is the REAL destination, while 192.168.1.145 is a smoke-test VPS only (partial deploy, no services, do not treat as production) - what landed today (observer fix, HANDOVER.md render, relevance filter) Added a cross-link block at the top of SCRUM_MASTER_SPEC.md so the canonical scrum handoff doc points at the new MATRIX_AGENT_HANDOVER doc. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-25 23:54:42 -05:00
root	fd92a9a0d0	docs: SCRUM_MASTER_SPEC.md — single handoff artifact for the scrum loop Some checks failed lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts Fresh-session artifact so work is recoverable if the branch is reopened in a new Claude Code session without context. Covers: - 9-rung ladder (kimi-k2:1t through local qwen3.5:latest) - tree-split reducer (files >6KB sharded + map→reduce) - schema_v4 KB rows in data/_kb/scrum_reviews.jsonl - auto-applier 5 hardened gates (confidence, size, cargo-green, warning-count, rationale-diff) - pathway_memory (ADR-021) — narrow fingerprint + hot-swap gate + semantic-correctness layer (SemanticFlag, BugFingerprint) - HTTP surface on gateway (/vectors/pathway/*) - current state (12 traces, 11 fingerprints, 0 hot-swaps — probation) - commit history on scrum/auto-apply-19814 since iter-5 baseline - how-to-run (env vars, service restarts) - where things live (code pointers table) - known gotchas (LLM Team mode registry, restart requirements) Paired updates (not in this commit, live outside the repo): - /home/profit/CLAUDE.md — active workstream pointer + notes - /root/.claude/skills/read-mem/SKILL.md — SCRUM_MASTER_SPEC.md added to the loading list + ADR-021 glossary - memory/project_scrum_pipeline.md — updated with iter-9 state - memory/feedback_semantic_correctness_via_matrix.md — updated with end-to-end proof evidence Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 06:15:53 -05:00

Author

SHA1

Message

Date

root

52bb216c2d

mode_compare: grounding check + control flag + emoji-tolerant section detection

lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts

Three fixes after the playbook_only confabulation surfaced in
2026-04-26 experiment (8 'findings' on a 333-line file all citing
lines 378-945 — fully fabricated from pathway-memory pattern names).

(1) Aggregator regex bug — section detection failed on emoji-prefixed
markdown headers like `## 🔎 Ranked Findings`. The original regex
required word chars right after #{1,3}\s+, so the patches table
header `## 🛠️ Concrete Patch Suggestions` was never recognized as
a stop boundary, double-counting every finding. Fix: allow
non-letter chars (emoji/space) between # and the keyword.

(2) Grounding check — for each finding row in the response, extract
backtick-quoted symbols + cited line numbers; verify symbols exist
in the actual focus file and lines fall within EOF. Computes
grounded/total ratio per mode. Surfaces 'OOB' (out-of-bounds) count
explicitly so confabulation is visible at a glance. Confirms what
hand-grading found: codereview_playbook_only's 8 findings on
service.rs were 1/8 grounded with 7 OOB.

(3) Control mode tagging — codereview_null and codereview_playbook_only
are designed as falsifiers (baseline / lossy ceiling) and their
numerical wins should never be read as recommendations. Output
marks them with ⚗ glyph + warning footer.

Per-mode aggregate is now sorted by groundedness, not raw count.
Per-mode-vs-lakehouse comparison uses grounded findings, not raw —
so confabulation can no longer score a "win".

Updated SCRUM_MASTER_SPEC.md with refactor timeline pointing at
the 2026-04-25/26 commits (observer fix, relevance filter, retire
wire, mode router, enrichment runner, parameterized experiment).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-26 01:44:21 -05:00

root

e1ef185868

docs: add MATRIX_AGENT_HANDOVER notes + cross-link from SCRUM_MASTER_SPEC

lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts

The matrix-agent-validated repo split + Ansible deploy pipeline are
otherwise undocumented in this repo. This new doc explains:
- the relationship between lakehouse and matrix-agent-validated
- where the playbook lives and what it provisions
- the critical distinction: matrix-test (10.111.129.50 Incus container)
  is the REAL destination, while 192.168.1.145 is a smoke-test VPS only
  (partial deploy, no services, do not treat as production)
- what landed today (observer fix, HANDOVER.md render, relevance filter)

Added a cross-link block at the top of SCRUM_MASTER_SPEC.md so the
canonical scrum handoff doc points at the new MATRIX_AGENT_HANDOVER doc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-25 23:54:42 -05:00

root

fd92a9a0d0

docs: SCRUM_MASTER_SPEC.md — single handoff artifact for the scrum loop

lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts

Fresh-session artifact so work is recoverable if the branch is reopened
in a new Claude Code session without context. Covers:

  - 9-rung ladder (kimi-k2:1t through local qwen3.5:latest)
  - tree-split reducer (files >6KB sharded + map→reduce)
  - schema_v4 KB rows in data/_kb/scrum_reviews.jsonl
  - auto-applier 5 hardened gates (confidence, size, cargo-green,
    warning-count, rationale-diff)
  - pathway_memory (ADR-021) — narrow fingerprint + hot-swap gate +
    semantic-correctness layer (SemanticFlag, BugFingerprint)
  - HTTP surface on gateway (/vectors/pathway/*)
  - current state (12 traces, 11 fingerprints, 0 hot-swaps — probation)
  - commit history on scrum/auto-apply-19814 since iter-5 baseline
  - how-to-run (env vars, service restarts)
  - where things live (code pointers table)
  - known gotchas (LLM Team mode registry, restart requirements)

Paired updates (not in this commit, live outside the repo):
  - /home/profit/CLAUDE.md — active workstream pointer + notes
  - /root/.claude/skills/read-mem/SKILL.md — SCRUM_MASTER_SPEC.md added
    to the loading list + ADR-021 glossary
  - memory/project_scrum_pipeline.md — updated with iter-9 state
  - memory/feedback_semantic_correctness_via_matrix.md — updated with
    end-to-end proof evidence

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-24 06:15:53 -05:00

3 Commits