agent-governance/docs/MEMORY_LAYER.md

# External Memory Layer
> Token-efficient persistent storage for large outputs, transcripts, and context.
## Overview
The External Memory Layer stores and retrieves large content outside the token window. Instead of including full outputs in prompts, agents store content in memory and work with summaries and retrieval references.
## Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│                          Token Window                           │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │ Checkpoint  │  │   STATUS    │  │   Memory References     │  │
│  │  Summary    │  │  Summaries  │  │  [ID] summary (tokens)  │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│                      External Memory Layer                      │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │   SQLite    │  │   Chunks    │  │       DragonflyDB       │  │
│  │ (metadata)  │  │   (files)   │  │    (hot cache, opt)     │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
```
## Token Thresholds
| Content Size | Storage Strategy |
|-------------|------------------|
| < 500 tokens | Stored inline in the database |
| 500-4000 tokens | Stored in a compressed file, plus a summary |
| > 4000 tokens | Auto-chunked into multiple files, plus a parent summary |
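
The table above amounts to a three-way dispatch on token count. The following is a minimal sketch of that decision, not the tool's actual code: `choose_strategy` and its return labels are hypothetical names, and it assumes that exactly 500 and exactly 4000 tokens both fall in the middle tier, as the table's ranges suggest.

```python
# Thresholds taken from the table; the constant names are illustrative.
INLINE_MAX_TOKENS = 500    # below this, content is stored inline in the database
CHUNK_MIN_TOKENS = 4000    # above this, content is auto-chunked


def choose_strategy(token_count: int) -> str:
    """Return a label for the storage tier (hypothetical helper)."""
    if token_count < 0:
        raise ValueError("token_count must be non-negative")
    if token_count < INLINE_MAX_TOKENS:
        return "inline"
    if token_count <= CHUNK_MIN_TOKENS:
        return "compressed_file"
    return "chunked"
```

For example, a 3200-token build log would land in the compressed-file tier, while a 12000-token test run would be chunked.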
## CLI Commands
### Store Content
```bash
# Store inline content
memory log "Test results: all 42 tests passed"
# Store from file
memory log --file /path/to/large-output.txt --tag "test-results"
# Store from stdin (common pattern)
pytest tests/ 2>&1 | memory log --stdin --tag "pytest" --checkpoint ckpt-xxx
# Store with directory linkage
memory log --file output.txt --directory ./pipeline --tag "validation"
```
### Retrieve Content
```bash
# Get full entry (includes content if small, or loads from file)
memory fetch mem-20260123-123456-abcd1234
# Get just the summary (token-efficient)
memory fetch mem-20260123-123456-abcd1234 --summary-only
# Get specific chunk (for large entries)
memory fetch mem-20260123-123456-abcd1234 --chunk 2
```
### List and Search
```bash
# List recent entries
memory list --limit 10
# Filter by type
memory list --type output --limit 20
# Filter by directory
memory list --directory ./tests
# Search content
memory search "error" --limit 5
```
### Memory References
```bash
# Get references linked to a checkpoint
memory refs --checkpoint ckpt-20260123-123456
# Get references for a directory
memory refs --directory ./pipeline
```
### Maintenance
```bash
# Show statistics
memory stats
# Prune old entries
memory prune --keep-days 7 --keep-entries 500
```
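
How `--keep-days` and `--keep-entries` combine is not spelled out here. The sketch below shows one plausible reading, stated as an assumption: an entry survives if it is among the newest `keep_entries` entries or is younger than `keep_days`. The function name and the `(entry_id, created_at)` tuple shape are hypothetical.

```python
from datetime import datetime, timedelta


def select_for_pruning(entries, now, keep_days=7, keep_entries=500):
    """Return the IDs to prune from a list of (entry_id, created_at) pairs.

    Assumed policy (not confirmed by the tool): an entry is pruned only if
    it is older than `keep_days` AND outside the newest `keep_entries`.
    """
    cutoff = now - timedelta(days=keep_days)
    newest_first = sorted(entries, key=lambda e: e[1], reverse=True)
    protected = {entry_id for entry_id, _ in newest_first[:keep_entries]}
    return [
        entry_id
        for entry_id, created_at in newest_first
        if entry_id not in protected and created_at < cutoff
    ]
```

Under the stricter alternative reading, where either limit alone is enough to delete an entry, the `and` in the final filter would become `or`.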
## Integration with Checkpoint
When `checkpoint now` runs, it:
1. Collects references to recent memory entries
2. Includes a memory summary (entry counts, total tokens)
3. Stores lightweight refs instead of full content
```bash
# Checkpoint includes memory refs
checkpoint now --notes "After test run"
# View memory info in checkpoint report
checkpoint report
# Shows:
# [MEMORY REFERENCES]
# mem-xxx: pytest results (12000 tokens)
# mem-yyy: build output (3200 tokens)
```
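
To illustrate what "lightweight refs" could look like, here is a sketch that builds a memory section holding only IDs, summaries, and token counts, then renders it in the report format shown above. `MemoryRef`, `memory_section`, and `render_report` are hypothetical names, and the dictionary layout is an assumption rather than the checkpoint's real schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryRef:
    """A pointer to a memory entry; the full content stays in memory storage."""
    entry_id: str
    summary: str
    tokens: int


def memory_section(refs):
    """Build the lightweight memory block a checkpoint might carry."""
    return {
        "count": len(refs),
        "total_tokens": sum(r.tokens for r in refs),
        "refs": [
            {"id": r.entry_id, "summary": r.summary, "tokens": r.tokens}
            for r in refs
        ],
    }


def render_report(section):
    """Render the section in the style of the `checkpoint report` output."""
    lines = ["[MEMORY REFERENCES]"]
    for ref in section["refs"]:
        lines.append(f"  {ref['id']}: {ref['summary']} ({ref['tokens']} tokens)")
    return "\n".join(lines)
```

The point of the design is that the checkpoint grows by a few dozen tokens per entry, regardless of how large the stored output is.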
## Integration with STATUS
STATUS.md files can include memory pointers for detailed context:
```markdown
## Context References
- Test Results: `mem-20260123-123456-abcd` (12000 tokens)
- Build Log: `mem-20260123-123457-efgh` (3200 tokens)
Use `memory fetch <id>` to retrieve full content.
```
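
An agent resuming work may want to extract these pointers programmatically. The sketch below parses lines in the exact format shown above; the regular expression and the `parse_memory_pointers` helper are illustrative assumptions, and real STATUS.md files may use variations this pattern would miss.

```python
import re

# Matches lines such as:
#   - Test Results: `mem-20260123-123456-abcd` (12000 tokens)
POINTER_RE = re.compile(
    r"^- (?P<label>[^:\n]+): `(?P<id>mem-[A-Za-z0-9-]+)` \((?P<tokens>\d+) tokens\)$",
    re.MULTILINE,
)


def parse_memory_pointers(status_text):
    """Return (label, entry_id, tokens) tuples found in STATUS.md text."""
    return [
        (m["label"], m["id"], int(m["tokens"]))
        for m in POINTER_RE.finditer(status_text)
    ]
```

Each returned ID can then be passed to `memory fetch <id> --summary-only` before deciding whether the full content is needed.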
## Agent Guidelines
### When to Use Memory
1. **Large outputs** - If output would exceed ~500 tokens, store it
2. **Test results** - Store full test output, reference summary
3. **Build logs** - Store full log, include just errors inline
4. **Generated code** - Store in memory, reference in plan
### Pattern: Store and Reference
```python
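import subprocess

# Instead of including large output in the response:
#   "Here are all 500 lines of test output: ..."
# do this:
# 1. Run the tests, then pipe the full output into `memory log --stdin`
result = subprocess.run(["pytest"], capture_output=True, text=True)
stored = subprocess.run(
    ["memory", "log", "--stdin"],
    input=result.stdout, capture_output=True, text=True,
)
print(stored.stdout)  # the CLI output includes the new entry ID
# 2. Reference it in the response
#   "Test completed. Full output stored in mem-xxx (2400 tokens).
#   Summary: 42 passed, 3 failed. Failed tests: test_auth, test_db, test_cache"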
```
### Pattern: Chunk Retrieval
For very large content (>4000 tokens), memory auto-chunks:
```bash
# Store 50KB log file
memory log --file build.log
# Output: ID: mem-xxx, Chunks: 12
# Retrieve specific chunk
memory fetch mem-xxx --chunk 5
# Or get just the summary
memory fetch mem-xxx --summary-only
```
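
The chunking algorithm itself is not documented here. As a rough illustration, the sketch below splits text on line boundaries so that each chunk stays within a token budget. The `chunk_lines` name, the 4-characters-per-token estimate, and the default budget are all assumptions, not values taken from the tool.

```python
def chunk_lines(text, max_tokens=4000, chars_per_token=4):
    """Split text into chunks on line boundaries (illustrative sketch).

    Token counts are estimated as len(text) / chars_per_token. A single
    line longer than the budget is kept whole as its own oversized chunk.
    """
    budget = max_tokens * chars_per_token
    chunks, current, size = [], [], 0
    for line in text.splitlines(keepends=True):
        # Start a new chunk when adding this line would exceed the budget.
        if current and size + len(line) > budget:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks
```

Splitting on line boundaries keeps individual log lines and stack-trace frames intact, which makes a single fetched chunk easier to read on its own.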
## Reset/Recovery Workflow
After a context reset or session restart:
### Step 1: Load Checkpoint
```bash
checkpoint load
# Shows: phase, dependencies, memory refs, status summary
```
### Step 2: Check Memory References
```bash
# See what's in memory
memory refs --checkpoint ckpt-latest
# Output:
# mem-abc: pytest results (12000 tokens)
# mem-def: deployment log (8000 tokens)
```
### Step 3: Fetch Needed Context
```bash
# Get summary of test results
memory fetch mem-abc --summary-only
# "42 tests: 40 passed, 2 failed (test_auth, test_db)"
# If needed, get specific chunk
memory fetch mem-abc --chunk 0 # First chunk with failures
```
### Step 4: Resume Work
```bash
# Check directory status
cat ./tests/STATUS.md
# Shows current phase, pending tasks, memory refs
# Continue where you left off
status update ./tests --task "Fixing test_auth failure"
```
## Memory Entry Types
| Type | Purpose | Example |
|------|---------|---------|
| `transcript` | Full conversation logs | Chat history |
| `output` | Command/tool outputs | Test results, build logs |
| `summary` | Generated summaries | Checkpoint summaries |
| `context` | Saved context state | Variables, environment |
| `chunk` | Part of larger entry | Auto-generated |
## Storage Details
### SQLite Database (`memory/memory.db`)
- Entry metadata (ID, type, timestamps)
- Content for small entries
- Summaries
- Links to checkpoints/directories
- Tags for searching
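
The actual schema of `memory/memory.db` is not shown in this document. The sketch below is one hypothetical layout that would cover the items listed above, using Python's standard `sqlite3` module; every table and column name is an assumption.

```python
import sqlite3

# Hypothetical schema: one row per entry, plus a tag table for searching.
SCHEMA = """
CREATE TABLE IF NOT EXISTS entries (
    id            TEXT PRIMARY KEY,
    type          TEXT NOT NULL,
    created_at    TEXT NOT NULL,
    tokens        INTEGER NOT NULL,
    summary       TEXT,
    content       TEXT,   -- NULL when the content lives in chunk files
    checkpoint_id TEXT,
    directory     TEXT
);
CREATE TABLE IF NOT EXISTS tags (
    entry_id TEXT NOT NULL REFERENCES entries(id),
    tag      TEXT NOT NULL,
    PRIMARY KEY (entry_id, tag)
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def find_by_tag(conn, tag):
    """Return (id, summary, tokens) for entries with the tag, newest first."""
    rows = conn.execute(
        "SELECT e.id, e.summary, e.tokens FROM entries e "
        "JOIN tags t ON t.entry_id = e.id "
        "WHERE t.tag = ? ORDER BY e.created_at DESC",
        (tag,),
    )
    return rows.fetchall()
```

Keeping summaries and token counts in the metadata row means `--summary-only` style lookups never have to touch the compressed files.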
### Chunk Files (`memory/chunks/`)
- Gzip-compressed content
- Named by entry ID
- Auto-pruned after 30 days
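
Gzip-compressed chunk files can be written and read with Python's standard `gzip` module. In this sketch the `<entry_id>.<index>.gz` naming scheme and both helper names are assumptions; the document only states that files are named by entry ID.

```python
import gzip
from pathlib import Path


def write_chunk(chunk_dir, entry_id, index, text):
    """Write one chunk as gzip-compressed UTF-8 text and return its path."""
    chunk_dir = Path(chunk_dir)
    chunk_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: <entry_id>.<chunk index>.gz
    path = chunk_dir / f"{entry_id}.{index}.gz"
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        fh.write(text)
    return path


def read_chunk(path):
    """Read a chunk written by write_chunk back into a string."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return fh.read()
```

Log and test output is highly repetitive, so gzip usually shrinks it considerably on disk.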
### DragonflyDB (optional)
- Hot cache for recent entries
- 1-hour TTL
- Faster retrieval for active work
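
The hot-cache behavior can be illustrated with a read-through cache that expires entries after one hour. This sketch uses an in-process dictionary as a stand-in for DragonflyDB so that it runs without a server; `TTLCache`, `fetch_with_cache`, and the `load_from_disk` callback are hypothetical names.

```python
import time


class TTLCache:
    """In-memory stand-in for a cache with per-key expiry."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._items = {}

    def set(self, key, value):
        self._items[key] = (self._clock() + self._ttl, value)

    def get(self, key):
        item = self._items.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._items[key]  # expired: drop it and report a miss
            return None
        return value


def fetch_with_cache(cache, entry_id, load_from_disk):
    """Return the cached content, loading and caching it on a miss."""
    cached = cache.get(entry_id)
    if cached is not None:
        return cached
    value = load_from_disk(entry_id)
    cache.set(entry_id, value)
    return value
```

Because the cache is optional, a miss simply falls back to SQLite and the chunk files; nothing depends on the cache for correctness.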
## Best Practices
1. **Store proactively** - Don't wait for context overflow
2. **Tag consistently** - Use meaningful tags for search
3. **Link to context** - Connect to checkpoints and directories
4. **Use summaries** - Fetch summary first, full content only if needed
5. **Prune regularly** - Keep memory lean with periodic pruning
## Example Session
```bash
# 1. Run tests, store output
pytest tests/ 2>&1 | memory log --stdin --tag "pytest" --tag "integration"
# Stored: mem-20260123-100000-abcd (8500 tokens, 3 chunks)
# 2. Create checkpoint with memory ref
checkpoint now --notes "Integration tests complete"
# 3. Later, after context reset, recover
checkpoint load
# Phase: Testing, Memory: 1 entry (8500 tokens)
memory fetch mem-20260123-100000-abcd --summary-only
# "Integration tests: 156 passed, 2 failed
# Failures: test_oauth_flow (line 234), test_rate_limit (line 567)"
# 4. Get specific failure details
memory fetch mem-20260123-100000-abcd --chunk 1
# (Shows chunk containing the failures)
# 5. Continue work
status update ./tests --task "Fixing test_oauth_flow"
```
---
*Part of the [Agent Governance System](/opt/agent-governance/docs/ARCHITECTURE.md)*