Architectural snapshot of the lakehouse codebase at the point where the
full matrix-driven agent loop with Mem0 versioning + deletion was
validated end-to-end.
WHAT THIS REPO IS
A clean single-commit snapshot of the lakehouse code. Heavy test data
(.parquet datasets, vector indexes) excluded — see REPLICATION.md for
regen path. Full lakehouse history at git.agentview.dev/profit/lakehouse.
WHAT WAS PROVEN
- Vector retrieval across multi-corpora matrix (chicago_permits + entity
briefs + sec_tickers + distilled procedural + llm_team runs)
- Observer hand-review (cloud + heuristic fallback) gating each candidate
- Local-model agent loop (qwen3.5:latest) with tool use + scratchpad
- Playbook seal on success → next-iter retrieval surfaces it as preamble
- Mem0 versioning + deletion in pathway_memory:
* UPSERT: ADD on new workflow, UPDATE bumps replay_count on identical
* REVISE: chains versions, parent.superseded_at + superseded_by stamped
* RETIRE: marks specific trace retired with reason, excluded from retrieval
* HISTORY: walks chain root→tip, cycle-safe
KEY DIRECTORIES
- crates/vectord/src/pathway_memory.rs — Mem0 ops live here
- crates/vectord/src/playbook_memory.rs — original Mem0 reference
- tests/agent_test/ — local-model agent harness + PRD + session archives
- scripts/dump_raw_corpus.sh — MinIO bucket dump (raw test corpus)
- scripts/vectorize_raw_corpus.ts — corpus → vector indexes
- scripts/analyze_chicago_contracts.ts — real inference pipeline
- scripts/seal_agent_playbook.ts — Mem0 upsert from agent traces
Replication: see REPLICATION.md for Debian 13 clean install + cloud-only
adaptation (no local Ollama).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lakehouse PR-Bot
A local sub-agent that reads the PRD, asks a cloud model for a small change proposal, applies it, runs tests, and opens a draft PR on Gitea. Manual-only right now — no systemd, no cron. Run one cycle at a time and watch.
Run one cycle
cd /home/profit/lakehouse
bun run bot/cycle.ts
Prerequisites:
- Working tree must be clean (git status shows nothing). The bot refuses on a dirty tree so its changes don't mix with in-flight work.
- Sidecar running on :3200, gateway on :3100, observer on :3800.
- At least one PRD line tagged [bot-eligible] (see below).
- Gitea PAT configured in ~/.git-credentials (already set up).
Tag a gap as bot-eligible
Add [bot-eligible] to any PRD line the bot is allowed to work on:
- [ ] Add a unit test for parse_city_state covering "South Bend, IN" edge case [bot-eligible]
The bot scans docs/PRD.md for these tags. Each tagged line becomes a candidate.
Start small — one tag at a time — until you trust the loop.
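The tag scan can be pictured with the sketch below. It is not the bot's actual code: the function name findEligibleGaps and the Gap shape are hypothetical, and the only assumption taken from the text is that a candidate is any PRD line containing the literal [bot-eligible] tag.

```typescript
// Hypothetical sketch: collect PRD lines tagged [bot-eligible].
// The Gap shape and function name are illustrative, not the bot's real API.
interface Gap {
  line: number; // 1-based line number in the PRD
  text: string; // line text with the checkbox and tag stripped
}

const TAG = "[bot-eligible]";

function findEligibleGaps(prd: string): Gap[] {
  const gaps: Gap[] = [];
  prd.split("\n").forEach((raw, i) => {
    if (!raw.includes(TAG)) return; // untagged lines are never candidates
    const text = raw
      .replace(TAG, "")
      .replace(/^\s*-\s*\[[ xX]\]\s*/, "") // drop a leading "- [ ]" checkbox
      .trim();
    gaps.push({ line: i + 1, text });
  });
  return gaps;
}
```

Each returned entry corresponds to one candidate; with a single tag in the PRD the list has exactly one element.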
Stop it mid-cycle
Create the pause file:
touch /home/profit/lakehouse/bot.paused
The next bot/cycle.ts invocation exits immediately with skipped_pause.
(It does NOT kill a cycle already in-flight — use Ctrl-C or pkill -f bot/cycle.ts for that.)
Budget
- 20 cloud calls/day, 160k tokens/day (hard ceiling, see bot/cost.ts).
- Tracked in data/_bot/cost-YYYY-MM-DD.json.
- Resets at UTC midnight.
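A minimal sketch of how such a daily gate might work, assuming the ledger is keyed by UTC date. The Usage shape, the estimatedTokens parameter, and both function names are hypothetical; only the two limits and the file name pattern come from the list above.

```typescript
// Hypothetical sketch of a daily budget gate. The limits mirror the list above;
// the Usage shape and function names are assumptions, not bot/cost.ts itself.
interface Usage {
  calls: number;
  tokens: number;
}

const MAX_CALLS_PER_DAY = 20;
const MAX_TOKENS_PER_DAY = 160_000;

// Ledger path for a given instant. Keying by UTC date means a new file
// (and therefore a fresh budget) starts at UTC midnight.
function costFileFor(now: Date): string {
  const day = now.toISOString().slice(0, 10); // YYYY-MM-DD, always UTC
  return `data/_bot/cost-${day}.json`;
}

// True when one more call fits under both ceilings.
function hasBudget(used: Usage, estimatedTokens: number): boolean {
  return (
    used.calls + 1 <= MAX_CALLS_PER_DAY &&
    used.tokens + estimatedTokens <= MAX_TOKENS_PER_DAY
  );
}
```

Deriving the file name from toISOString() avoids local-timezone drift, so the reset lands at UTC midnight regardless of where the bot runs.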
Cycle outcomes
Every run writes a result to data/_bot/cycles/{cycle_id}.json and POSTs an event to the observer on :3800.
Possible outcomes:
| Outcome | Meaning |
|---|---|
| ok | PR opened |
| cycle_noop | proposal applied, every file was identical to current content (Mem0 NOOP); no test run, no PR |
| skipped_pause | pause file present |
| skipped_cost | daily budget exhausted |
| skipped_policy | policy.shouldRunCycle said no |
| skipped_dirty_tree | uncommitted changes present |
| skipped_no_gap | no [bot-eligible] tags in PRD |
| model_failed | sidecar/cloud model errored or returned unparseable JSON |
| proposal_rejected | policy.scoreProposal rejected it (size, path, etc.) |
| apply_failed | file write errored |
| tests_failed | cargo or bun test red |
| pr_skipped_by_policy | green tests, but policy.shouldOpenPR said no |
| pr_failed | Gitea API or git push errored |
Mem0-aligned apply semantics
apply.ts categorizes every proposed file into one of three modes:
| Mode | Trigger | Action |
|---|---|---|
| ADD | is_new: true and file doesn't exist | Create the file |
| UPDATE | is_new: false, file exists, content differs from current | Overwrite |
| NOOP | is_new: false, file exists, content matches current exactly | Skip (no write, no diff) |
If every file in the proposal is NOOP, the cycle short-circuits to cycle_noop before running tests or opening a PR. Mismatched shapes (is_new:true but file exists, or is_new:false but file missing) become apply_failed — model state confusion is surfaced, not papered over.
PR bodies report the three counts separately so reviewers see what actually changed vs. what was confirmed identical.
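The categorization rules can be expressed as a small pure function. This is a sketch under stated assumptions, not the contents of apply.ts: the ProposedFile shape, the MISMATCH label (standing in for the cases that become apply_failed), and the rule that an empty proposal is not a cycle_noop are all hypothetical.

```typescript
// Hypothetical sketch of the ADD / UPDATE / NOOP decision described above.
// MISMATCH stands in for the shapes that the bot reports as apply_failed.
type ApplyMode = "ADD" | "UPDATE" | "NOOP" | "MISMATCH";

interface ProposedFile {
  path: string;
  is_new: boolean;
  content: string;
}

// `current` is the file's existing content, or null when the file does not exist.
function classify(file: ProposedFile, current: string | null): ApplyMode {
  const exists = current !== null;
  if (file.is_new) return exists ? "MISMATCH" : "ADD"; // is_new:true but file exists
  if (!exists) return "MISMATCH"; // is_new:false but file missing
  return current === file.content ? "NOOP" : "UPDATE";
}

// Assumption: only a non-empty proposal whose files are all NOOP short-circuits.
function isCycleNoop(modes: ApplyMode[]): boolean {
  return modes.length > 0 && modes.every((m) => m === "NOOP");
}
```

Keeping the decision free of file I/O makes the mismatch cases easy to test in isolation; the caller reads the file (or notes its absence) and passes the result in.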
How the bot compounds over cycles
Every finished cycle persists a CycleResult to data/_bot/cycles/{cycle_id}.json. At the start of the next cycle, bot/kb.ts::loadHistory(gap_id) scans that directory, filters to prior cycles on the same gap, and returns the five most recent outcomes (ok, tests_failed, proposal_rejected, cycle_noop, etc.) with their reasons and touched files.
Those outcomes are summarized into a compact block and injected into the cloud prompt before asking for a new proposal. The model sees things like:
Prior attempts on this gap (3 most recent):
- 2026-04-22 03:15 UTC — tests_failed
reason: cargo test -p vectord::lance failed on field_type_coerce
- 2026-04-22 02:48 UTC — proposal_rejected
reason: touched forbidden path (docs/ADR-): docs/ADR-019-update.md
- 2026-04-22 01:23 UTC — ok PR: https://git.agentview.dev/profit/lakehouse/pulls/142
reason: PR #142 opened
Learn from these: build on what worked, avoid paths that failed.
This is the same compounding pattern scenario.ts uses via kb.loadRecommendation — the bot's cycles are the bot's memory. No embedding, no separate jsonl, no cross-run orchestration required. First cycle on a new gap skips the block cleanly.
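A rough sketch of the history lookup and prompt block, for illustration only. The CycleResult fields shown here (in particular gap_id and finished_at) and both function names are assumptions; the real bot/kb.ts reads the JSON files from disk, which this sketch replaces with an in-memory array.

```typescript
// Hypothetical sketch of selecting and summarizing prior cycles for one gap.
// Field names are assumed; the real CycleResult may differ.
interface CycleResult {
  cycle_id: string;
  gap_id: string;
  outcome: string;
  reason: string;
  finished_at: string; // ISO 8601 UTC timestamp, e.g. "2026-04-22T03:15:00Z"
}

// Keep only cycles on the same gap, newest first, capped at `limit`.
function recentHistory(all: CycleResult[], gapId: string, limit = 5): CycleResult[] {
  return all
    .filter((c) => c.gap_id === gapId)
    // Same-format ISO 8601 UTC strings sort chronologically as plain text.
    .sort((a, b) => b.finished_at.localeCompare(a.finished_at))
    .slice(0, limit);
}

// Render the block injected into the prompt; empty on the first cycle for a gap.
function formatHistory(history: CycleResult[]): string {
  if (history.length === 0) return "";
  const lines = history.map(
    (c) => `- ${c.finished_at} ${c.outcome}\n  reason: ${c.reason}`,
  );
  return [`Prior attempts on this gap (${history.length} most recent):`, ...lines].join("\n");
}
```

Returning an empty string for a gap with no history lets the caller concatenate the block unconditionally.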
The observer events POSTed on every cycle carry the same data, so GET :3800/stats?source=bot aggregates "% of bot cycles on similar gaps that landed a PR" without extra plumbing.
Where YOU edit
bot/policy.ts — four small functions that define what the bot does. The rest
(propose.ts, apply.ts, test.ts, pr.ts, cycle.ts, kb.ts) is mechanical
orchestration — you shouldn't need to touch it unless a pipeline changes.
One policy upgrade the history opens up: in shouldRunCycle, you can now bail
out if the same gap has failed N times in a row. Example addition:
import { loadHistory, statsFor } from "./kb.ts";
// inside shouldRunCycle — but you'd need the gap, which is picked later.
// A more natural place is scoreProposal: if prior N failures on this
// gap's path+summary pattern, reject before testing.
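One way the "failed N times in a row" check could be written is sketched below. It is deliberately independent of kb.ts: it takes a newest-first list of outcome strings rather than calling loadHistory, and the choice of which outcomes count as failures, the default threshold of 3, and both function names are assumptions.

```typescript
// Hypothetical sketch: count the current streak of failed attempts on a gap.
// Which outcomes count as "failed" is a policy choice; this set is an assumption.
const FAILURE_OUTCOMES = new Set([
  "model_failed",
  "proposal_rejected",
  "apply_failed",
  "tests_failed",
]);

// `outcomes` must be ordered newest first.
function consecutiveFailures(outcomes: string[]): number {
  let streak = 0;
  for (const outcome of outcomes) {
    if (!FAILURE_OUTCOMES.has(outcome)) break; // any non-failure ends the streak
    streak++;
  }
  return streak;
}

// Reject further work on the gap once the streak reaches the threshold.
function shouldSkipGap(outcomes: string[], maxFailures = 3): boolean {
  return consecutiveFailures(outcomes) >= maxFailures;
}
```

Because the streak stops at the first non-failure, a single ok or cycle_noop in between resets the count.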
Hard-coded guardrails (not in policy — can't be disabled)
- Bot never deletes files (apply.ts has no delete path).
- Bot never touches .git/, secrets, lakehouse.toml, docs/ADR-*, docs/DECISIONS.md, docs/PRD.md, /etc/, /root/, Cargo.lock.
- All paths validated for traversal (path/../) and repo-escape.
- PRs always open as draft, never auto-merge.
- Budget check is before the policy function runs, so there is no way to override the daily cap from policy.ts.
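The traversal and repo-escape checks might look like the sketch below. It is an illustration, not the bot's guardrail code: the function name is hypothetical, the forbidden lists cover only part of the set above (the ambiguous "secrets" entry is omitted, and /etc/ and /root/ are caught by the repo-escape test rather than listed), and symlinks are not resolved.

```typescript
import { isAbsolute, relative, resolve, sep } from "node:path";

// Hypothetical sketch: a partial subset of the forbidden paths listed above.
const FORBIDDEN_PREFIXES = [".git/", "docs/ADR-"];
const FORBIDDEN_FILES = ["lakehouse.toml", "docs/DECISIONS.md", "docs/PRD.md", "Cargo.lock"];

function isPathAllowed(repoRoot: string, proposed: string): boolean {
  // Resolve first so "a/../b" style traversal is collapsed before any check.
  const abs = resolve(repoRoot, proposed);
  const rel = relative(repoRoot, abs);

  // Repo-escape: the resolved path is the root itself or lies outside it.
  if (rel === "" || rel === ".." || rel.startsWith(".." + sep) || isAbsolute(rel)) {
    return false;
  }

  const normalized = rel.split(sep).join("/"); // compare with forward slashes
  if (FORBIDDEN_FILES.includes(normalized)) return false;
  return !FORBIDDEN_PREFIXES.some((prefix) => normalized.startsWith(prefix));
}
```

Checking the resolved path rather than the raw string means a proposal such as docs/../Cargo.lock is rejected just like Cargo.lock.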