# Context Management System

> Unified guide to checkpoints, STATUS files, and the external memory layer.

## Overview

The Agent Governance System uses three integrated components to maintain context across long-running sessions while staying within token limits:

| Component | Purpose | Token Impact |
|-----------|---------|--------------|
| **Checkpoints** | Capture full session state at points in time | ~3000 tokens |
| **STATUS Files** | Track per-directory progress and tasks | ~50 tokens each |
| **Memory Layer** | Store large outputs externally with summaries | Minimal (refs only) |

```
┌─────────────────────────────────────────────────────────────────────┐
│                      CONTEXT MANAGEMENT SYSTEM                      │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────────────┐   │
│  │  Checkpoint  │◄───│    STATUS    │◄───│     Memory Layer     │   │
│  │              │    │    Files     │    │                      │   │
│  │ • Phase      │    │              │    │ • Large outputs      │   │
│  │ • Tasks      │    │ • Per-dir    │    │ • Auto-chunked       │   │
│  │ • Deps       │    │ • Progress   │    │ • Summarized         │   │
│  │ • Memory refs│    │ • Tasks      │    │ • Searchable         │   │
│  │ • Status sum │    │ • Memory ptr │    │                      │   │
│  └──────────────┘    └──────────────┘    └──────────────────────┘   │
│          │                   │                      │               │
│          └───────────────────┴──────────────────────┘               │
│                              │                                      │
│                    ┌─────────▼─────────┐                            │
│                    │   Token Window    │                            │
│                    │   (Summaries +    │                            │
│                    │    References)    │                            │
│                    └───────────────────┘                            │
└─────────────────────────────────────────────────────────────────────┘
```

## Quick Reference

### CLI Commands

```bash
# Checkpoints
checkpoint now --notes "Description"   # Create checkpoint
checkpoint load                        # Load latest
checkpoint report                      # Combined status view
checkpoint timeline                    # History view

# Status
status sweep                           # Check all directories
status sweep --fix                     # Create missing files
status update <dir> --phase <phase>    # Update directory
status dashboard                       # Overview

# Memory
memory log "content"                   # Store content
memory log --stdin                     # Store from pipe
memory fetch <id>                      # Retrieve full
memory fetch <id> -s                   # Summary only
memory list                            # List entries
memory search "query"                  # Search
memory stats                           # Statistics
```

## Component Details

### 1. Checkpoints

Checkpoints capture complete session state at a point in time.

**What's Captured:**
- Project phase and status
- Completed phases list
- Active tasks
- Dependency states (Vault, DragonflyDB, Ledger)
- Directory status summary (aggregated from STATUS files)
- Memory references (links to stored content)
- Variables and recent outputs

**When to Create:**
- After completing significant work
- Before context-heavy operations
- At natural stopping points
- Automatically when updating STATUS (creates delta)

**Commands:**
```bash
# Create with notes
checkpoint now --notes "Completed pipeline consolidation"

# Load latest
checkpoint load

# Load specific
checkpoint load ckpt-20260123-123456-abcd

# View combined report
checkpoint report

# View history
checkpoint timeline --limit 10

# Compare checkpoints
checkpoint diff --from ckpt-aaa --to ckpt-bbb
```

**Data Location:** `/opt/agent-governance/checkpoint/storage/`

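To make the captured fields listed earlier in this section more concrete, here is a minimal sketch of how a checkpoint could be modeled and serialized to JSON. The `Checkpoint` class, its field names, and the `make_checkpoint_id` helper are assumptions for illustration only; the ID shape is inferred from the `ckpt-20260123-123456-abcd` example, and the real schema used by `checkpoint.py` may differ.

```python
import json
import secrets
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def make_checkpoint_id(now: datetime) -> str:
    """Build an ID shaped like ckpt-YYYYMMDD-HHMMSS-xxxx (assumed format)."""
    suffix = secrets.token_hex(2)  # 4 random hex characters
    return f"ckpt-{now:%Y%m%d-%H%M%S}-{suffix}"


@dataclass
class Checkpoint:
    """Hypothetical checkpoint record; field names are illustrative."""
    checkpoint_id: str
    phase: str
    completed_phases: list[str] = field(default_factory=list)
    active_tasks: list[str] = field(default_factory=list)
    dependencies: dict[str, str] = field(default_factory=dict)    # e.g. Vault, DragonflyDB, Ledger
    status_summary: dict[str, int] = field(default_factory=dict)  # phase -> directory count
    memory_refs: list[str] = field(default_factory=list)          # IDs only, never full content
    notes: str = ""

    def to_json(self) -> str:
        """Serialize the record as it might be written to storage/."""
        return json.dumps(asdict(self), indent=2)


ckpt = Checkpoint(
    checkpoint_id=make_checkpoint_id(datetime(2026, 1, 23, 12, 34, 56, tzinfo=timezone.utc)),
    phase="in_progress",
    active_tasks=["fix auth bug"],
    dependencies={"vault": "ok", "dragonfly": "ok", "ledger": "ok"},
    status_summary={"complete": 3, "in_progress": 1},
    memory_refs=["mem-20260123-100000-abcd"],
    notes="Completed pipeline consolidation",
)
print(ckpt.to_json())
```

Storing only memory IDs in the record, rather than the stored content itself, is what would keep a checkpoint near the ~3000-token budget quoted in the overview.
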
---

### 2. STATUS Files

Every directory has `README.md` and `STATUS.md` for local context.

**README.md Contains:**
- Directory purpose
- Key files and their roles
- Interfaces/APIs
- Current status badge
- Link to parent architecture

**STATUS.md Contains:**
- Current phase (`complete`, `in_progress`, `blocked`, `needs_review`, or `not_started`)
- Task checklist
- Dependencies
- Issues/blockers
- Activity log with timestamps

**Phase Values:**
| Phase | Icon | Meaning |
|-------|------|---------|
| `complete` | ✅ | Work finished |
| `in_progress` | 🚧 | Active development |
| `blocked` | ❗ | Waiting on dependencies |
| `needs_review` | ⚠️ | Requires attention |
| `not_started` | ⬜ | No work begun |

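The phase values above map naturally onto an enumeration. The sketch below is illustrative only: the `Phase` enum and the `completion_percent` helper are hypothetical names, and the real `status` tool may compute its completion figure differently.

```python
from enum import Enum


class Phase(Enum):
    """Phase labels and icons taken from the Phase Values table."""
    COMPLETE = ("complete", "✅")
    IN_PROGRESS = ("in_progress", "🚧")
    BLOCKED = ("blocked", "❗")
    NEEDS_REVIEW = ("needs_review", "⚠️")
    NOT_STARTED = ("not_started", "⬜")

    def __init__(self, label: str, icon: str) -> None:
        self.label = label
        self.icon = icon

    @classmethod
    def from_label(cls, label: str) -> "Phase":
        """Look up a phase by the string stored in STATUS.md."""
        for phase in cls:
            if phase.label == label:
                return phase
        raise ValueError(f"unknown phase: {label!r}")


def completion_percent(phases: dict[str, str]) -> float:
    """Percentage of directories marked complete (assumed aggregation rule)."""
    if not phases:
        return 0.0
    done = sum(1 for label in phases.values() if Phase.from_label(label) is Phase.COMPLETE)
    return 100.0 * done / len(phases)


directories = {
    "./pipeline": "complete",
    "./tests": "in_progress",
    "./docs": "complete",
    "./memory": "blocked",
}
for path, label in directories.items():
    print(f"{Phase.from_label(label).icon} {path}: {label}")
print(f"Completion: {completion_percent(directories):.0f}%")
```

Rejecting unknown labels with a `ValueError` is a deliberate choice in this sketch: a typo in a STATUS file surfaces immediately instead of silently counting as incomplete.
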
**Commands:**
```bash
# Check all directories
status sweep

# Auto-create missing files
status sweep --fix

# Update a directory
status update ./pipeline --phase complete --task "Unified with core.py"

# View dashboard
status dashboard

# Initialize new directory
status init ./new-module
```

**Integration with Checkpoints:**
- `status update` automatically creates a checkpoint delta
- Checkpoints aggregate status from all directories
- Use `--no-checkpoint` to skip delta creation

---

### 3. Memory Layer

External storage for large content that would overwhelm the token window.

**Token Thresholds:**
| Size | Strategy |
|------|----------|
| < 500 tokens | Inline in database |
| 500-4000 tokens | File + summary |
| > 4000 tokens | Auto-chunked + summary |

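The thresholds above amount to a three-way decision on token count. The following sketch shows one way that decision and the chunking step could look. The function names are hypothetical, and the 4000-token chunk size is an assumption: this guide does not state how large each chunk is or how tokens are counted.

```python
def storage_strategy(token_count: int) -> str:
    """Pick a storage strategy using the thresholds from the table."""
    if token_count < 500:
        return "inline"   # stored directly in the database
    if token_count <= 4000:
        return "file"     # written to a file, with a summary
    return "chunked"      # split into chunks, with a summary


def split_into_chunks(tokens: list[str], chunk_size: int = 4000) -> list[list[str]]:
    """Split a token list into consecutive chunks (chunk size is an assumption)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


tokens = ["tok"] * 8500  # stand-in for an 8500-token test log
print(storage_strategy(len(tokens)))
print([len(chunk) for chunk in split_into_chunks(tokens)])
```

With these assumptions, an 8500-token output is classified as `chunked` and split into three pieces, so a single piece can later be retrieved with `memory fetch <id> --chunk <n>` instead of loading the whole entry.
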
**Entry Types:**
| Type | Purpose |
|------|---------|
| `output` | Command outputs, logs |
| `transcript` | Conversation logs |
| `context` | Saved state |
| `chunk` | Part of larger entry |

**Commands:**
```bash
# Store content
memory log "Test output here"
memory log --file results.txt --tag "tests"
pytest 2>&1 | memory log --stdin --tag "pytest"

# Retrieve
memory fetch mem-xxx             # Full content
memory fetch mem-xxx -s          # Summary only
memory fetch mem-xxx --chunk 2   # Specific chunk

# Browse
memory list --type output --limit 10
memory search "error"
memory refs --checkpoint ckpt-xxx

# Maintain
memory stats
memory prune --keep-days 7
```

**Data Location:**
- Metadata: `/opt/agent-governance/memory/memory.db`
- Chunks: `/opt/agent-governance/memory/chunks/`

---

## Integration Patterns

### Pattern 1: Store Large Output

When a command produces large output:

```bash
# Instead of printing 10000 lines to chat
pytest tests/ 2>&1 | memory log --stdin --tag "pytest" --directory ./tests

# Report summary in chat
echo "Tests complete. Results stored in mem-xxx (8500 tokens)."
echo "Summary: 156 passed, 2 failed (test_auth, test_db)"
```

### Pattern 2: Link Memory to Status

In STATUS.md, reference stored content:

```markdown
## Context References

- Full test output: `mem-20260123-100000-abcd` (8500 tokens)
- Build log: `mem-20260123-100001-efgh` (3200 tokens)

Retrieve with: `memory fetch <id> --summary-only`
```

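References written in the format shown above are easy to recover programmatically, for example when aggregating memory pointers into a checkpoint. This is a small sketch under stated assumptions: the regular expression assumes IDs shaped like the `mem-20260123-100000-abcd` examples, and `parse_memory_refs` is a hypothetical helper, not part of the documented CLI.

```python
import re

# Matches: `mem-YYYYMMDD-HHMMSS-xxxx` (N tokens)  -- ID shape inferred from the examples
MEMORY_REF = re.compile(r"`(mem-\d{8}-\d{6}-[a-z0-9]{4})`\s*\((\d+) tokens\)")


def parse_memory_refs(status_text: str) -> list[tuple[str, int]]:
    """Return (memory_id, token_count) pairs found in STATUS.md text."""
    return [(m.group(1), int(m.group(2))) for m in MEMORY_REF.finditer(status_text)]


status_md = """\
## Context References

- Full test output: `mem-20260123-100000-abcd` (8500 tokens)
- Build log: `mem-20260123-100001-efgh` (3200 tokens)
"""

refs = parse_memory_refs(status_md)
for memory_id, token_count in refs:
    print(f"{memory_id}: {token_count} tokens")
print("Total stored:", sum(count for _, count in refs), "tokens")
```

Keeping the token count next to each ID lets a reader judge whether to fetch the summary or the full entry before spending any of the token window.
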
### Pattern 3: Checkpoint with Memory

Checkpoints automatically include memory references:

```bash
# Create checkpoint after storing output
memory log --file build.log --tag "build"
checkpoint now --notes "Build complete"

# Later, checkpoint report shows memory refs
checkpoint report
# [MEMORY]
# 3 entries, 12000 total tokens
```

### Pattern 4: Recovery Workflow

After context reset:

```bash
# 1. Load checkpoint
checkpoint load
# Shows: phase, deps, status summary, memory refs

# 2. Check directory status
checkpoint report
# Shows: active directories, completion %

# 3. Get memory summaries
memory list --limit 5
memory fetch mem-xxx -s   # Summary only

# 4. Fetch specific chunk if needed
memory fetch mem-xxx --chunk 2

# 5. Resume work
status update ./current-dir --task "Resuming: fix auth bug"
```

---

## Token Efficiency

### Before Memory Layer
```
Token Window:
├── System prompt:     2000 tokens
├── Conversation:      5000 tokens
├── Large output:     15000 tokens  ← Problem!
└── Total:            22000 tokens (overflow risk)
```

### With Memory Layer
```
Token Window:
├── System prompt:     2000 tokens
├── Conversation:      5000 tokens
├── Memory reference:    50 tokens  ← "mem-xxx: test results (15000 tokens)"
├── Summary:            200 tokens  ← "156 passed, 2 failed..."
└── Total:             7250 tokens (safe)

External Memory:
└── Full content:     15000 tokens (retrievable on demand)
```

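The two budgets above can be checked with a few lines of arithmetic. The sketch below simply re-adds the figures from the breakdowns; the dictionary keys and the `window_total` helper are illustrative names.

```python
def window_total(parts: dict[str, int]) -> int:
    """Sum the token cost of everything held in the token window."""
    return sum(parts.values())


before = {"system_prompt": 2000, "conversation": 5000, "large_output": 15000}
after = {
    "system_prompt": 2000,
    "conversation": 5000,
    "memory_reference": 50,  # pointer to the stored entry
    "summary": 200,          # short summary kept in the window
}

saved = window_total(before) - window_total(after)
print(window_total(before))  # 22000
print(window_total(after))   # 7250
print(saved)                 # 14750
```

In this example the 15000-token output is replaced by 250 tokens of reference and summary, which frees 14750 tokens of the window.
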
### Token Budget Guidelines

| Content | Recommended Action |
|---------|-------------------|
| < 500 tokens | Include inline |
| 500-2000 tokens | Consider storing, include summary |
| 2000-10000 tokens | Always store, use summary + refs |
| > 10000 tokens | Store chunked, fetch chunks as needed |

---

## File Locations

```
/opt/agent-governance/
├── checkpoint/
│   ├── checkpoint.py           # Checkpoint manager
│   ├── storage/                # Checkpoint JSON files
│   └── templates/              # Summary templates
├── memory/
│   ├── memory.py               # Memory manager
│   ├── memory.db               # SQLite metadata
│   ├── chunks/                 # Compressed content files
│   └── summaries/              # Generated summaries
├── bin/
│   ├── checkpoint              # CLI wrapper
│   ├── status                  # CLI wrapper
│   └── memory                  # CLI wrapper
├── docs/
│   ├── CONTEXT_MANAGEMENT.md   # This file
│   ├── MEMORY_LAYER.md         # Memory details
│   └── STATUS_PROTOCOL.md      # Status details
└── */
    ├── README.md               # Directory overview
    └── STATUS.md               # Directory status
```

---

## Best Practices

### For Checkpoints
1. Create checkpoints at natural stopping points
2. Include descriptive notes
3. Use `checkpoint report` to verify state
4. Review timeline before resuming work

### For STATUS Files
1. Update status when entering/leaving directories
2. Keep task lists current
3. Document blockers immediately
4. Link to relevant memory entries

### For Memory
1. Store proactively (before context fills)
2. Use meaningful tags for searchability
3. Fetch summaries first, full content only when needed
4. Prune regularly to manage storage

### For Recovery
1. Always start with `checkpoint load`
2. Use `checkpoint report` for overview
3. Check `status dashboard` for active work
4. Fetch memory summaries before full content

---

## Troubleshooting

### "Checkpoint not found"
```bash
checkpoint list   # See available checkpoints
```

### "Memory entry not found"
```bash
memory list --limit 50    # Browse recent entries
memory search "keyword"   # Search by content
```

### "STATUS.md missing"
```bash
status sweep --fix   # Auto-create missing files
```

### "Token window overflow"
```bash
# Store large content externally
memory log --stdin < large-output.txt
# Use summary instead of full content
memory fetch mem-xxx -s
```

---

*Part of the [Agent Governance System](/opt/agent-governance/docs/ARCHITECTURE.md)*