Context Management System
Unified guide to checkpoints, STATUS files, and the external memory layer.
Overview
The Agent Governance System uses three integrated components to maintain context across long-running sessions while staying within token limits:
| Component | Purpose | Token Impact |
|---|---|---|
| Checkpoints | Capture full session state at points in time | ~3000 tokens |
| STATUS Files | Track per-directory progress and tasks | ~50 tokens each |
| Memory Layer | Store large outputs externally with summaries | Minimal (refs only) |
┌────────────────────────────────────────────────────────────────────┐
│                     CONTEXT MANAGEMENT SYSTEM                      │
├────────────────────────────────────────────────────────────────────┤
│                                                                    │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────────────┐  │
│  │  Checkpoint  │◄───│    STATUS    │◄───│     Memory Layer     │  │
│  │              │    │    Files     │    │                      │  │
│  │ • Phase      │    │              │    │ • Large outputs      │  │
│  │ • Tasks      │    │ • Per-dir    │    │ • Auto-chunked       │  │
│  │ • Deps       │    │ • Progress   │    │ • Summarized         │  │
│  │ • Memory refs│    │ • Tasks      │    │ • Searchable         │  │
│  │ • Status sum │    │ • Memory ptr │    │                      │  │
│  └──────────────┘    └──────────────┘    └──────────────────────┘  │
│         │                   │                      │               │
│         └───────────────────┴──────────────────────┘               │
│                             │                                      │
│                   ┌─────────▼─────────┐                            │
│                   │   Token Window    │                            │
│                   │   (Summaries +    │                            │
│                   │    References)    │                            │
│                   └───────────────────┘                            │
└────────────────────────────────────────────────────────────────────┘
Quick Reference
CLI Commands
# Checkpoints
checkpoint now --notes "Description" # Create checkpoint
checkpoint load # Load latest
checkpoint report # Combined status view
checkpoint timeline # History view
# Status
status sweep # Check all directories
status sweep --fix # Create missing files
status update <dir> --phase <phase> # Update directory
status dashboard # Overview
# Memory
memory log "content" # Store content
memory log --stdin # Store from pipe
memory fetch <id> # Retrieve full
memory fetch <id> -s # Summary only
memory list # List entries
memory search "query" # Search
memory stats # Statistics
Component Details
1. Checkpoints
Checkpoints capture complete session state at a point in time.
What's Captured:
- Project phase and status
- Completed phases list
- Active tasks
- Dependency states (Vault, DragonflyDB, Ledger)
- Directory status summary (aggregated from STATUS files)
- Memory references (links to stored content)
- Variables and recent outputs
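The captured state above can be pictured as a single JSON-serializable record. The following is a minimal sketch, not the actual `checkpoint.py` schema: every field name, the dependency status values, and the random ID suffix are assumptions chosen to mirror the list and the ID shape `ckpt-20260123-123456-abcd` used in this guide.

```python
import json
import secrets
import string
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def new_checkpoint_id(now: datetime) -> str:
    """Build an ID shaped like ckpt-20260123-123456-abcd (random 4-character suffix)."""
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(secrets.choice(alphabet) for _ in range(4))
    return f"ckpt-{now:%Y%m%d-%H%M%S}-{suffix}"


@dataclass
class Checkpoint:
    # Hypothetical field names; each one mirrors an item in the "What's Captured" list.
    checkpoint_id: str
    phase: str
    phase_status: str
    completed_phases: list[str] = field(default_factory=list)
    active_tasks: list[str] = field(default_factory=list)
    dependencies: dict[str, str] = field(default_factory=dict)
    status_summary: dict[str, int] = field(default_factory=dict)
    memory_refs: list[str] = field(default_factory=list)
    variables: dict[str, str] = field(default_factory=dict)
    notes: str = ""

    def to_json(self) -> str:
        """Serialize the whole record; sorted keys keep diffs between checkpoints stable."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


ckpt = Checkpoint(
    checkpoint_id=new_checkpoint_id(datetime(2026, 1, 23, 12, 34, 56, tzinfo=timezone.utc)),
    phase="example-phase",          # illustrative value
    phase_status="in_progress",
    active_tasks=["fix auth bug"],
    dependencies={"vault": "ok", "dragonfly": "ok", "ledger": "ok"},  # assumed status values
    status_summary={"complete": 3, "in_progress": 1},
    memory_refs=["mem-20260123-100000-abcd"],
    notes="Completed pipeline consolidation",
)
print(ckpt.to_json())
```

Keeping memory references as IDs rather than content is what lets a record like this stay small enough to load back into the token window.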
When to Create:
- After completing significant work
- Before context-heavy operations
- At natural stopping points
- Automatically when updating STATUS (creates delta)
Commands:
# Create with notes
checkpoint now --notes "Completed pipeline consolidation"
# Load latest
checkpoint load
# Load specific
checkpoint load ckpt-20260123-123456-abcd
# View combined report
checkpoint report
# View history
checkpoint timeline --limit 10
# Compare checkpoints
checkpoint diff --from ckpt-aaa --to ckpt-bbb
Data Location: /opt/agent-governance/checkpoint/storage/
2. STATUS Files
Every directory has README.md and STATUS.md for local context.
README.md Contains:
- Directory purpose
- Key files and their roles
- Interfaces/APIs
- Current status badge
- Link to parent architecture
STATUS.md Contains:
- Current phase (complete/in_progress/blocked/needs_review/not_started)
- Task checklist
- Dependencies
- Issues/blockers
- Activity log with timestamps
Phase Values:
| Phase | Icon | Meaning |
|---|---|---|
| `complete` | ✅ | Work finished |
| `in_progress` | 🚧 | Active development |
| `blocked` | ❗ | Waiting on dependencies |
| `needs_review` | ⚠️ | Requires attention |
| `not_started` | ⬜ | No work begun |
Commands:
# Check all directories
status sweep
# Auto-create missing files
status sweep --fix
# Update a directory
status update ./pipeline --phase complete --task "Unified with core.py"
# View dashboard
status dashboard
# Initialize new directory
status init ./new-module
Integration with Checkpoints:
- `status update` automatically creates a checkpoint delta
- Checkpoints aggregate status from all directories
- Use `--no-checkpoint` to skip delta creation
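To illustrate what "aggregate status from all directories" could look like, here is a hedged sketch that rolls per-directory phase values up into counts and a completion percentage. The function name, the input shape (directory path mapped to phase), and the output keys are assumptions; the real aggregation in the checkpoint tool may differ.

```python
from collections import Counter

# Phase values as listed in the Phase Values table.
PHASES = ("complete", "in_progress", "blocked", "needs_review", "not_started")


def summarize(directory_phases: dict[str, str]) -> dict:
    """Aggregate per-directory phases into counts plus a completion percentage."""
    unknown = {d: p for d, p in directory_phases.items() if p not in PHASES}
    if unknown:
        raise ValueError(f"unknown phase values: {unknown}")
    counts = Counter(directory_phases.values())
    total = len(directory_phases)
    # Completion is the share of directories whose phase is "complete".
    percent = round(100 * counts["complete"] / total, 1) if total else 0.0
    return {
        "total": total,
        "by_phase": {phase: counts.get(phase, 0) for phase in PHASES},
        "completion_percent": percent,
    }


# Directory names below are illustrative only.
summary = summarize({
    "./pipeline": "complete",
    "./checkpoint": "complete",
    "./memory": "in_progress",
    "./tests": "blocked",
})
print(summary["by_phase"], summary["completion_percent"])
```

A summary of this size costs only a few dozen tokens, which is why it can be embedded in every checkpoint.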
3. Memory Layer
External storage for large content that would overwhelm the token window.
Token Thresholds:
| Size | Strategy |
|---|---|
| < 500 tokens | Inline in database |
| 500-4000 tokens | File + summary |
| > 4000 tokens | Auto-chunked + summary |
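The thresholds in the table can be sketched as a small decision function plus a greedy chunker. This is an illustration under stated assumptions, not the code in `memory.py`: the strategy names, the handling of the exact boundary values (500 and 4000 both fall in the middle tier here), the four-characters-per-token estimate, and the line-based chunking are all hypothetical.

```python
INLINE_LIMIT = 500   # below this: stored inline in the database
CHUNK_LIMIT = 4000   # above this: auto-chunked


def estimate_tokens(text: str) -> int:
    """Rough estimate of about 4 characters per token (an assumption, not a real tokenizer)."""
    return max(1, len(text) // 4)


def choose_strategy(tokens: int) -> str:
    """Map a token count to one of the three storage strategies in the table."""
    if tokens < INLINE_LIMIT:
        return "inline"
    if tokens <= CHUNK_LIMIT:
        return "file"
    return "chunked"


def split_into_chunks(lines: list[str], max_tokens: int = CHUNK_LIMIT) -> list[list[str]]:
    """Greedily pack whole lines into chunks of at most max_tokens each.

    A single line larger than max_tokens still becomes its own (oversized) chunk.
    """
    chunks, current, used = [], [], 0
    for line in lines:
        cost = estimate_tokens(line)
        if current and used + cost > max_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append(line)
        used += cost
    if current:
        chunks.append(current)
    return chunks


for count in (120, 500, 4000, 4001):
    print(count, choose_strategy(count))
```

Splitting on line boundaries keeps each chunk readable on its own, which matters when only one chunk is fetched back into the conversation.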
Entry Types:
| Type | Purpose |
|---|---|
| `output` | Command outputs, logs |
| `transcript` | Conversation logs |
| `context` | Saved state |
| `chunk` | Part of larger entry |
Commands:
# Store content
memory log "Test output here"
memory log --file results.txt --tag "tests"
pytest 2>&1 | memory log --stdin --tag "pytest"
# Retrieve
memory fetch mem-xxx # Full content
memory fetch mem-xxx -s # Summary only
memory fetch mem-xxx --chunk 2 # Specific chunk
# Browse
memory list --type output --limit 10
memory search "error"
memory refs --checkpoint ckpt-xxx
# Maintain
memory stats
memory prune --keep-days 7
Data Location:
- Metadata: `/opt/agent-governance/memory/memory.db`
- Chunks: `/opt/agent-governance/memory/chunks/`
Integration Patterns
Pattern 1: Store Large Output
When a command produces large output:
# Instead of printing 10000 lines to chat
pytest tests/ 2>&1 | memory log --stdin --tag "pytest" --directory ./tests
# Report summary in chat
echo "Tests complete. Results stored in mem-xxx (8500 tokens)."
echo "Summary: 156 passed, 2 failed (test_auth, test_db)"
Pattern 2: Link Memory to Status
In STATUS.md, reference stored content:
## Context References
- Full test output: `mem-20260123-100000-abcd` (8500 tokens)
- Build log: `mem-20260123-100001-efgh` (3200 tokens)
Retrieve with: `memory fetch <id> --summary-only`
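Because the references follow a regular shape, a tool can recover them from STATUS.md text after a context reset. The sketch below assumes the exact bullet layout shown above and an ID format inferred from the examples in this guide (`mem-` plus date, time, and a 4-character suffix); the function name and the returned keys are hypothetical.

```python
import re

# Matches bullets such as: - Build log: `mem-20260123-100001-efgh` (3200 tokens)
REF_PATTERN = re.compile(
    r"^- (?P<label>.+?): `(?P<id>mem-\d{8}-\d{6}-[a-z0-9]{4})` \((?P<tokens>\d+) tokens\)$",
    re.MULTILINE,
)


def extract_memory_refs(status_text: str) -> list[dict]:
    """Return label, memory ID, and token count for each reference bullet found."""
    return [
        {"label": m["label"], "id": m["id"], "tokens": int(m["tokens"])}
        for m in REF_PATTERN.finditer(status_text)
    ]


sample = """## Context References
- Full test output: `mem-20260123-100000-abcd` (8500 tokens)
- Build log: `mem-20260123-100001-efgh` (3200 tokens)
"""

refs = extract_memory_refs(sample)
for ref in refs:
    print(ref["id"], ref["tokens"])
print("total tokens referenced:", sum(r["tokens"] for r in refs))
```

Summing the token counts before fetching anything gives a quick check on whether the full content would fit in the remaining window.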
Pattern 3: Checkpoint with Memory
Checkpoints automatically include memory references:
# Create checkpoint after storing output
memory log --file build.log --tag "build"
checkpoint now --notes "Build complete"
# Later, checkpoint report shows memory refs
checkpoint report
# [MEMORY]
# 3 entries, 12000 total tokens
Pattern 4: Recovery Workflow
After context reset:
# 1. Load checkpoint
checkpoint load
# Shows: phase, deps, status summary, memory refs
# 2. Check directory status
checkpoint report
# Shows: active directories, completion %
# 3. Get memory summaries
memory list --limit 5
memory fetch mem-xxx -s # Summary only
# 4. Fetch specific chunk if needed
memory fetch mem-xxx --chunk 2
# 5. Resume work
status update ./current-dir --task "Resuming: fix auth bug"
Token Efficiency
Before Memory Layer
Token Window:
├── System prompt: 2000 tokens
├── Conversation: 5000 tokens
├── Large output: 15000 tokens ← Problem!
└── Total: 22000 tokens (overflow risk)
With Memory Layer
Token Window:
├── System prompt: 2000 tokens
├── Conversation: 5000 tokens
├── Memory reference: 50 tokens ← "mem-xxx: test results (15000 tokens)"
├── Summary: 200 tokens ← "156 passed, 2 failed..."
└── Total: 7250 tokens (safe)
External Memory:
└── Full content: 15000 tokens (retrievable on demand)
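The arithmetic behind the two layouts can be checked directly. This short sketch uses the figures from the trees above; the dictionary keys are just labels chosen for the example.

```python
def window_total(parts: dict[str, int]) -> int:
    """Sum the token cost of everything currently in the window."""
    return sum(parts.values())


before = {"system_prompt": 2000, "conversation": 5000, "large_output": 15000}
after = {
    "system_prompt": 2000,
    "conversation": 5000,
    "memory_reference": 50,
    "summary": 200,
}

saved = window_total(before) - window_total(after)
percent_saved = round(100 * saved / window_total(before))

print(window_total(before))  # 22000
print(window_total(after))   # 7250
print(saved, percent_saved)  # 14750 67
```

Replacing the 15000-token output with a 250-token reference and summary cuts the window by roughly two thirds, while the full content stays retrievable on demand.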
Token Budget Guidelines
| Content | Recommended Action |
|---|---|
| < 500 tokens | Include inline |
| 500-2000 tokens | Consider storing, include summary |
| > 2000 tokens | Always store, use summary + refs |
| > 10000 tokens | Store chunked, fetch chunks as needed |
File Locations
/opt/agent-governance/
├── checkpoint/
│   ├── checkpoint.py          # Checkpoint manager
│   ├── storage/               # Checkpoint JSON files
│   └── templates/             # Summary templates
├── memory/
│   ├── memory.py              # Memory manager
│   ├── memory.db              # SQLite metadata
│   ├── chunks/                # Compressed content files
│   └── summaries/             # Generated summaries
├── bin/
│   ├── checkpoint             # CLI wrapper
│   ├── status                 # CLI wrapper
│   └── memory                 # CLI wrapper
├── docs/
│   ├── CONTEXT_MANAGEMENT.md  # This file
│   ├── MEMORY_LAYER.md        # Memory details
│   └── STATUS_PROTOCOL.md     # Status details
└── */
    ├── README.md              # Directory overview
    └── STATUS.md              # Directory status
Best Practices
For Checkpoints
- Create checkpoints at natural stopping points
- Include descriptive notes
- Use `checkpoint report` to verify state
- Review timeline before resuming work
For STATUS Files
- Update status when entering/leaving directories
- Keep task lists current
- Document blockers immediately
- Link to relevant memory entries
For Memory
- Store proactively (before context fills)
- Use meaningful tags for searchability
- Fetch summaries first, full content only when needed
- Prune regularly to manage storage
For Recovery
- Always start with `checkpoint load`
- Use `checkpoint report` for overview
- Check `status dashboard` for active work
- Fetch memory summaries before full content
Troubleshooting
"Checkpoint not found"
checkpoint list # See available checkpoints
"Memory entry not found"
memory list --limit 50 # Browse recent entries
memory search "keyword" # Search by content
"STATUS.md missing"
status sweep --fix # Auto-create missing files
"Token window overflow"
# Store large content externally
memory log --stdin < large-output.txt
# Use summary instead of full content
memory fetch mem-xxx -s
Part of the Agent Governance System