golangLAKEHOUSE/docs/SCRUM.md

# Scrum Test: Matrix Agent Validated Hardening Sprint

## Mission

Run a Scrum-style technical validation against this repository:

https://git.agentview.dev/profit/matrix-agent-validated.git

Do not add features first. Treat the codebase as a validated prototype that now needs production-hardening pressure.

The goal is to produce a hard evidence report and a prioritized sprint backlog.

## Core Questions

1. Can the repo be cloned, built, and smoke-tested from a clean environment?
2. Are the claimed validated paths actually covered by repeatable tests?
3. Where does the system rely on demo assumptions, hardcoded paths, permissive fallbacks, or unsafe string construction?
4. Which failures would corrupt trust in the agent loop?
5. What must be fixed before this becomes a reusable agent-memory substrate?

## Required Inspection Areas

### 1. Build and Test Surface

Inspect:

- Cargo workspace
- Rust crates
- Bun/TypeScript MCP server
- Python sidecar
- tests/
- justfile
- REPLICATION.md
- systemd units
- scripts/

Run or prepare the following commands where possible:

```bash
just --list
cargo check --workspace
cargo test --workspace
(cd mcp-server && bun install && bun test) || true
bun run tests/agent_test/agent_harness.ts || true
```

If heavy data or external services are missing, do not fake success. Record the blocker and define a mock/minimal fixture path.
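
One way to keep blockers visible is to classify every command as passed, failed, or blocked instead of swallowing its exit status. The following is a minimal sketch, not part of the repository: `StepResult`, `run_step`, and the "blocked" status are hypothetical names chosen for illustration.

```python
import shutil
import subprocess
from dataclasses import dataclass


@dataclass
class StepResult:
    command: str
    status: str  # "passed", "failed", or "blocked"
    detail: str


def run_step(argv, timeout=600):
    """Run one command and classify the outcome without masking failures."""
    command = " ".join(argv)
    # A missing executable is a blocker to record, not a test failure.
    if shutil.which(argv[0]) is None:
        return StepResult(command, "blocked", f"{argv[0]} not found on PATH")
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return StepResult(command, "blocked", f"timed out after {timeout}s")
    if proc.returncode == 0:
        return StepResult(command, "passed", "exit 0")
    # Keep the last output line as a pointer into the full log.
    lines = (proc.stderr or proc.stdout).strip().splitlines()
    tail = lines[-1] if lines else ""
    return StepResult(command, "failed", f"exit {proc.returncode}: {tail}")
```

For example, `run_step(["cargo", "check", "--workspace"])` would yield a "blocked" result on a machine without Cargo, which can then be written into the report as a blocker.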

### 2. Security and Trust Boundary Review

Search for:

- raw SQL interpolation
- shell command execution
- open CORS
- unauthenticated mutation endpoints
- pass-through proxy routes
- hardcoded absolute paths
- secrets in repo
- fail-open review behavior
- unbounded file reads/writes
- unsafe JSON parsing assumptions

Pay special attention to:

- mcp-server/index.ts
- mcp-server/observer.ts
- crates/vectord/src/pathway_memory.rs
- crates/vectord/src/playbook_memory.rs
- scripts/
- sidecar/
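
A first pass over several of these search targets can be mechanized with a line-based pattern scan. The sketch below is an assumption-laden starting point: the regular expressions are rough heuristics invented for illustration, `scan_text` and `scan_tree` are hypothetical helpers, and every hit is only a lead to inspect by hand, not a confirmed defect.

```python
import re
from pathlib import Path

# Heuristic patterns only; expect both false positives and misses.
PATTERNS = {
    "raw_sql_interpolation": re.compile(
        r"\b(?:SELECT|INSERT|UPDATE|DELETE)\b[^\n]*(?:\$\{|\{\}|%s)", re.IGNORECASE
    ),
    "hardcoded_home_path": re.compile(r"/home/[\w.-]+/"),
    "open_cors": re.compile(r"Access-Control-Allow-Origin[\"']?\s*[:,]\s*[\"']\*"),
}


def scan_text(label, text):
    """Return (label, line_number, pattern_name) for every heuristic hit."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((label, line_no, name))
    return findings


def scan_tree(root, suffixes=(".rs", ".ts", ".py", ".sh")):
    """Scan every source file under root that has one of the given suffixes."""
    findings = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            findings.extend(scan_text(str(path), path.read_text(errors="replace")))
    return findings
```

Because each finding carries a file label and line number, it can be pasted directly into the risk register as a pointer for manual review.
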

### 3. Agent Validation Review

Verify whether the following claims are actually enforced by tests:

- vector retrieval across corpora
- observer hand-review gates candidates
- successful playbooks are sealed
- retrieval surfaces prior playbooks on later runs
- Mem0-style ADD / UPDATE / REVISE / RETIRE / HISTORY behavior works
- retired traces are excluded from retrieval
- history chains are cycle-safe
- agent claims can be verified against SQL truth
- cloud-only adaptation works without local Ollama

Create a table:

| Claim | Code Location | Existing Test | Missing Test | Risk |
| --- | --- | --- | --- | --- |

### 4. Scrum Backlog Output

Create a prioritized backlog using this format:

#### Sprint 0: Reproducibility Gate

Goal: make the repo provably runnable.

Stories:

- As an operator, I can run one command and know which dependencies are missing.
- As an operator, I can run a minimal fixture test without the 470MB data payload.
- As an operator, I can verify gateway, sidecar, observer, and MCP health with one command.

Acceptance:

- `just verify` exists.
- `just smoke` runs without large datasets.
- failure output is structured JSON.
- no test claims success when dependencies are missing.
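
To make the health-check story and the structured-JSON criterion concrete, here is a minimal sketch of an aggregate check. It is illustrative only: `http_probe` and `verify` are hypothetical helpers, and the real service URLs are not given in this document, so they would have to come from configuration.

```python
import json
import urllib.request
from typing import Callable, Dict


def http_probe(url: str, timeout: float = 2.0) -> Callable[[], None]:
    """Build a probe that raises unless the URL answers with HTTP 200."""
    def probe() -> None:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            if resp.status != 200:
                raise RuntimeError(f"unexpected status {resp.status}")
    return probe


def verify(probes: Dict[str, Callable[[], None]]) -> dict:
    """Run every probe and report per-service results; never assume success."""
    services = {}
    for name, probe in probes.items():
        try:
            probe()
            services[name] = {"ok": True}
        except Exception as exc:
            services[name] = {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
    # An empty probe set counts as failure so a misconfigured run cannot pass.
    overall = all(s["ok"] for s in services.values()) and bool(services)
    return {"ok": overall, "services": services}


def to_json(report: dict) -> str:
    return json.dumps(report, sort_keys=True)
```

Passing probes in as callables keeps the aggregation logic testable without any running service; a command such as `just verify` could wrap this and exit non-zero whenever `ok` is false.
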

#### Sprint 1: Trust Boundary Gate

Goal: prevent agent trust collapse.

Stories:

- Replace raw SQL string interpolation with validated query builders or parameterized calls.
- Change observer `/review` failure from fail-open accept to explicit degraded/cycle verdict.
- Add auth or localhost-only guardrails for mutation endpoints.
- Add schema validation for every public endpoint.
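
The difference between interpolated and parameterized SQL in the first story can be shown with a small regression-style example. This sketch uses Python's `sqlite3` purely as a stand-in, since the repository's actual query engine is not described here; the `playbooks` table and both functions are hypothetical.

```python
import sqlite3

INJECTION = "x' OR '1'='1"


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE playbooks (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO playbooks (name) VALUES (?)", [("alpha",), ("beta",)])
    return conn


def find_unsafe(conn, name):
    # Vulnerable: caller-controlled text is spliced into the SQL string.
    return conn.execute(f"SELECT name FROM playbooks WHERE name = '{name}'").fetchall()


def find_safe(conn, name):
    # The driver binds the value, so quotes in it cannot change the query shape.
    return conn.execute("SELECT name FROM playbooks WHERE name = ?", (name,)).fetchall()
```

With the injection string, the interpolated version returns every row while the parameterized version returns none. That is the shape of a test that fails before the fix and passes after it.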

Acceptance:

- SQL injection tests fail before fix and pass after fix.
- observer crash cannot auto-accept unsafe candidate output.
- mutation endpoints require configured token or local-only mode.
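
The fail-closed behavior required of the observer can be sketched as a thin wrapper around the review call. The verdict names and the `call_observer` callable below are assumptions made for illustration; the real observer interface may differ.

```python
from typing import Callable

ACCEPT, REJECT, DEGRADED = "accept", "reject", "degraded"


def review_fail_closed(candidate: str, call_observer: Callable[[str], str]) -> str:
    """Return the observer verdict, or DEGRADED if the observer cannot be trusted."""
    try:
        verdict = call_observer(candidate)
    except Exception:
        # A crash or timeout must never be read as approval.
        return DEGRADED
    if verdict not in (ACCEPT, REJECT):
        # Unknown or malformed verdicts are treated like an outage.
        return DEGRADED
    return verdict


def is_sealable(verdict: str) -> bool:
    """Only an explicit accept allows a playbook to be sealed."""
    return verdict == ACCEPT
```

The design choice is that acceptance is the only state reached by positive evidence; every other path, including unexpected output, lands in a non-accepting verdict.
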

#### Sprint 2: Memory Correctness Gate

Goal: prove Mem0/pathway memory cannot poison itself.

Stories:

- Add tests for ADD, UPDATE, REVISE, RETIRE, HISTORY.
- Add cycle detection tests.
- Add retired-trace retrieval exclusion tests.
- Add duplicate trace `replay_count` tests.
- Add corrupted memory row recovery test.

Acceptance:

- deterministic fixture corpus
- all memory operations covered
- every memory mutation emits audit/event receipt
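
Two of these properties, cycle-safe history and retired-trace exclusion, are easy to pin down against a toy model before testing the real store. The following in-memory sketch is hypothetical throughout: `Trace`, `history`, and `retrieve` are illustrative stand-ins, not the data model in pathway_memory.rs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Trace:
    trace_id: str
    text: str
    supersedes: Optional[str] = None  # id of the trace this one revises
    retired: bool = False


class HistoryCycleError(Exception):
    pass


def history(store: Dict[str, Trace], trace_id: str) -> List[str]:
    """Walk the supersedes chain from newest to oldest, refusing to loop."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = trace_id
    while current is not None and current in store:
        if current in seen:
            raise HistoryCycleError(f"cycle at {current}")
        seen.add(current)
        chain.append(current)
        current = store[current].supersedes
    return chain


def retrieve(store: Dict[str, Trace], query: str) -> List[str]:
    """Return ids of non-retired traces whose text contains the query."""
    return sorted(t.trace_id for t in store.values() if not t.retired and query in t.text)
```

A deterministic fixture for the real tests could follow the same pattern: a handful of traces with one revision chain, one retired entry, and one deliberately corrupted link.
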

#### Sprint 3: Agent Loop Reality Gate

Goal: prove the agent loop works across actual workflows.

Stories:

- Build deterministic mini corpus.
- Run search → verify → observer review → playbook seal → second-run retrieval.
- Add negative case where observer rejects hallucinated claim.
- Add regression for health endpoint content-type mismatch.

Acceptance:

- single command proves the full loop
- generated report includes input hash, output hash, verdict, and memory mutation receipt
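
The report fields named in the last criterion could be assembled as below. This is a minimal sketch under assumptions: SHA-256 is one reasonable hash choice rather than a stated requirement, and the field names and `loop_report` function are hypothetical.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def loop_report(input_bytes: bytes, output_bytes: bytes, verdict: str, receipt_id: str) -> str:
    """Serialize one loop run as a JSON record with content hashes."""
    report = {
        "input_hash": sha256_hex(input_bytes),
        "output_hash": sha256_hex(output_bytes),
        "verdict": verdict,
        "memory_mutation_receipt": receipt_id,
    }
    # sort_keys keeps the serialized report byte-stable across runs.
    return json.dumps(report, sort_keys=True)
```

Because the output is deterministic for identical inputs, two runs over the same mini corpus can be compared byte for byte.
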

#### Sprint 4: Deployment Gate

Goal: turn REPLICATION.md into executable deployment validation.

Stories:

- Convert REPLICATION.md validation section into scripts.
- Add env var template.
- Add config validation.
- Remove hardcoded `/home/profit/lakehouse` paths.
- Add systemd readiness checks.

Acceptance:

- fresh clone can run `just doctor`
- missing env vars are reported clearly
- no absolute path assumptions remain unless configured
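
Config validation of this kind can be sketched in a few lines. The variable names `LAKEHOUSE_ROOT` and `OBSERVER_URL` are invented for illustration, as are both helper functions; the real template would define its own set.

```python
from pathlib import PurePosixPath
from typing import List, Mapping

# Hypothetical variable names; the real env template defines the actual list.
REQUIRED_VARS = ("LAKEHOUSE_ROOT", "OBSERVER_URL")


def validate_env(env: Mapping[str, str], required=REQUIRED_VARS) -> List[str]:
    """Return one readable problem per missing or empty variable."""
    problems = []
    for name in required:
        if not env.get(name, "").strip():
            problems.append(f"{name} is not set")
    return problems


def resolve_under_root(env: Mapping[str, str], relative: str) -> str:
    """Build a path from the configured root instead of a hardcoded prefix."""
    root = env.get("LAKEHOUSE_ROOT", "").strip()
    if not root:
        raise ValueError("LAKEHOUSE_ROOT is not set")
    if PurePosixPath(relative).is_absolute():
        raise ValueError(f"expected a relative path, got {relative}")
    return str(PurePosixPath(root) / relative)
```

A `just doctor` recipe could call `validate_env(os.environ)` and print the returned list, so a fresh clone reports every missing variable in one pass rather than failing on the first.
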

## Required Final Deliverables

Create:

- reports/scrum/matrix-agent-scrum-test.md
- reports/scrum/risk-register.md
- reports/scrum/claim-coverage-table.md
- reports/scrum/sprint-backlog.md
- reports/scrum/acceptance-gates.md

Do not rewrite the system yet.

First produce the reports only.

## Scoring Model

Use this scoring:

- Reproducibility: 0–10
- Test Coverage: 0–10
- Trust Boundary Safety: 0–10
- Agent Memory Correctness: 0–10
- Deployment Readiness: 0–10
- Maintainability: 0–10

Mark each score with evidence.
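
The evidence requirement can be enforced mechanically when scores are recorded. The sketch below is one hypothetical way to do it: the `Score` class is an illustration, with the category names taken from the list above.

```python
from dataclasses import dataclass, field
from typing import List

CATEGORIES = (
    "Reproducibility",
    "Test Coverage",
    "Trust Boundary Safety",
    "Agent Memory Correctness",
    "Deployment Readiness",
    "Maintainability",
)


@dataclass
class Score:
    category: str
    value: int
    evidence: List[str] = field(default_factory=list)

    def __post_init__(self):
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")
        if not 0 <= self.value <= 10:
            raise ValueError(f"score out of range: {self.value}")
        # A score with no supporting evidence is rejected outright.
        if not any(item.strip() for item in self.evidence):
            raise ValueError(f"{self.category} score has no evidence")
```

Constructing a `Score` without at least one non-empty evidence entry raises immediately, so an unsupported number cannot reach the report.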

## Final Rule

No vibes. No “appears to work.” Every claim must point to:

- file path
- line/function
- command output
- test result
- missing evidence

That’s the move: **don’t refactor yet. Put the repo under oath first.**