agent-governance/docs/CONTEXT_MANAGEMENT.md

Context Management System

Unified guide to checkpoints, STATUS files, and the external memory layer.

Overview

The Agent Governance System uses three integrated components to maintain context across long-running sessions while staying within token limits:

| Component | Purpose | Token Impact |
|-----------|---------|--------------|
| Checkpoints | Capture full session state at points in time | ~3000 tokens |
| STATUS Files | Track per-directory progress and tasks | ~50 tokens each |
| Memory Layer | Store large outputs externally with summaries | Minimal (refs only) |

┌─────────────────────────────────────────────────────────────────────┐
│                    CONTEXT MANAGEMENT SYSTEM                         │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│   ┌──────────────┐    ┌──────────────┐    ┌──────────────────────┐  │
│   │  Checkpoint  │◄───│    STATUS    │◄───│   Memory Layer       │  │
│   │              │    │    Files     │    │                      │  │
│   │ • Phase      │    │              │    │ • Large outputs      │  │
│   │ • Tasks      │    │ • Per-dir    │    │ • Auto-chunked       │  │
│   │ • Deps       │    │ • Progress   │    │ • Summarized         │  │
│   │ • Memory refs│    │ • Tasks      │    │ • Searchable         │  │
│   │ • Status sum │    │ • Memory ptr │    │                      │  │
│   └──────────────┘    └──────────────┘    └──────────────────────┘  │
│          │                   │                      │               │
│          └───────────────────┴──────────────────────┘               │
│                              │                                       │
│                    ┌─────────▼─────────┐                            │
│                    │   Token Window    │                            │
│                    │   (Summaries +    │                            │
│                    │    References)    │                            │
│                    └───────────────────┘                            │
└─────────────────────────────────────────────────────────────────────┘

Quick Reference

CLI Commands

# Checkpoints
checkpoint now --notes "Description"     # Create checkpoint
checkpoint load                          # Load latest
checkpoint report                        # Combined status view
checkpoint timeline                      # History view

# Status
status sweep                             # Check all directories
status sweep --fix                       # Create missing files
status update <dir> --phase <phase>      # Update directory
status dashboard                         # Overview

# Memory
memory log "content"                     # Store content
memory log --stdin                       # Store from pipe
memory fetch <id>                        # Retrieve full
memory fetch <id> -s                     # Summary only
memory list                              # List entries
memory search "query"                    # Search
memory stats                             # Statistics

Component Details

1. Checkpoints

Checkpoints capture complete session state at a point in time.

What's Captured:

  • Project phase and status
  • Completed phases list
  • Active tasks
  • Dependency states (Vault, DragonflyDB, Ledger)
  • Directory status summary (aggregated from STATUS files)
  • Memory references (links to stored content)
  • Variables and recent outputs
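
As a rough illustration of the list above, the captured state could be modeled as a single record that serializes to JSON (the storage directory is described elsewhere in this guide as holding checkpoint JSON files). This is a minimal sketch, not the layout used by `checkpoint.py`: the `Checkpoint` class, its field names, and the `new_checkpoint_id` helper are all hypothetical.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json
import secrets


@dataclass
class Checkpoint:
    """Hypothetical shape of one checkpoint; field names are illustrative only."""
    phase: str
    phase_status: str
    completed_phases: list[str] = field(default_factory=list)
    active_tasks: list[str] = field(default_factory=list)
    dependencies: dict[str, str] = field(default_factory=dict)    # e.g. {"vault": "ok"}
    status_summary: dict[str, int] = field(default_factory=dict)  # phase value -> directory count
    memory_refs: list[str] = field(default_factory=list)          # IDs of stored memory entries
    variables: dict[str, str] = field(default_factory=dict)
    notes: str = ""
    checkpoint_id: str = ""


def new_checkpoint_id(now=None) -> str:
    """Build an ID shaped like the ckpt-YYYYMMDD-HHMMSS-xxxx examples in this guide."""
    now = now or datetime.now(timezone.utc)
    # token_hex(2) yields four hex characters for the random suffix.
    return f"ckpt-{now:%Y%m%d-%H%M%S}-{secrets.token_hex(2)}"


if __name__ == "__main__":
    cp = Checkpoint(phase="phase-8", phase_status="in_progress",
                    checkpoint_id=new_checkpoint_id())
    print(json.dumps(asdict(cp), indent=2))
```

Holding references to memory entries rather than their content is what keeps a checkpoint small enough to load back into the token window.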

When to Create:

  • After completing significant work
  • Before context-heavy operations
  • At natural stopping points
  • Automatically when updating STATUS (creates delta)

Commands:

# Create with notes
checkpoint now --notes "Completed pipeline consolidation"

# Load latest
checkpoint load

# Load specific
checkpoint load ckpt-20260123-123456-abcd

# View combined report
checkpoint report

# View history
checkpoint timeline --limit 10

# Compare checkpoints
checkpoint diff --from ckpt-aaa --to ckpt-bbb

Data Location: /opt/agent-governance/checkpoint/storage/


2. STATUS Files

Every directory has a README.md and a STATUS.md that provide local context.

README.md Contains:

  • Directory purpose
  • Key files and their roles
  • Interfaces/APIs
  • Current status badge
  • Link to parent architecture

STATUS.md Contains:

  • Current phase (complete/in_progress/blocked/needs_review/not_started)
  • Task checklist
  • Dependencies
  • Issues/blockers
  • Activity log with timestamps

Phase Values:

| Phase | Icon | Meaning |
|-------|------|---------|
| complete | | Work finished |
| in_progress | 🚧 | Active development |
| blocked | | Waiting on dependencies |
| needs_review | ⚠️ | Requires attention |
| not_started | | No work begun |
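
The phase values above form a closed set, so a tool reading STATUS.md would likely want to reject anything else. The sketch below shows one way to do that. The `Phase` enum and `parse_phase` helper are hypothetical names, not part of the `status` CLI.

```python
from enum import Enum


class Phase(str, Enum):
    """The five phase values listed in this guide."""
    COMPLETE = "complete"
    IN_PROGRESS = "in_progress"
    BLOCKED = "blocked"
    NEEDS_REVIEW = "needs_review"
    NOT_STARTED = "not_started"


def parse_phase(value: str) -> Phase:
    """Normalize whitespace and case, then reject unknown phase values."""
    try:
        return Phase(value.strip().lower())
    except ValueError:
        allowed = ", ".join(p.value for p in Phase)
        raise ValueError(f"unknown phase {value!r}; expected one of: {allowed}") from None


if __name__ == "__main__":
    print(parse_phase("in_progress").value)
```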

Commands:

# Check all directories
status sweep

# Auto-create missing files
status sweep --fix

# Update a directory
status update ./pipeline --phase complete --task "Unified with core.py"

# View dashboard
status dashboard

# Initialize new directory
status init ./new-module

Integration with Checkpoints:

  • status update automatically creates a checkpoint delta
  • Checkpoints aggregate status from all directories
  • Use --no-checkpoint to skip delta creation
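
To make the aggregation step concrete, here is a minimal sketch of how per-directory phases might be rolled up into the summary a checkpoint stores. The `summarize` function and the shape of its result are assumptions for illustration; the real aggregation may track more fields.

```python
from collections import Counter


def summarize(statuses: dict[str, str]) -> dict:
    """Roll up {directory: phase} into counts and a completion percentage."""
    counts = Counter(statuses.values())
    total = len(statuses)
    done = counts.get("complete", 0)
    return {
        "total": total,
        "by_phase": dict(counts),
        # Guard against an empty project to avoid dividing by zero.
        "completion_pct": round(100 * done / total, 1) if total else 0.0,
    }


if __name__ == "__main__":
    print(summarize({
        "./pipeline": "complete",
        "./memory": "in_progress",
        "./checkpoint": "complete",
        "./bin": "blocked",
    }))
```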

3. Memory Layer

External storage for large content that would overwhelm the token window.

Token Thresholds:

| Size | Strategy |
|------|----------|
| < 500 tokens | Inline in database |
| 500-4000 tokens | File + summary |
| > 4000 tokens | Auto-chunked + summary |

Entry Types:

| Type | Purpose |
|------|---------|
| output | Command outputs, logs |
| transcript | Conversation logs |
| context | Saved state |
| chunk | Part of larger entry |
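
The Token Thresholds table above can be read as a small decision function followed by a splitting step. The sketch below assumes a crude estimate of four characters per token and splits on line boundaries. Both are illustrative choices, and `estimate_tokens`, `storage_strategy`, and `chunk_lines` are hypothetical names rather than functions from `memory.py`.

```python
def estimate_tokens(text: str) -> int:
    """Crude estimate (about 4 characters per token); an assumption, not a real tokenizer."""
    return max(1, len(text) // 4) if text else 0


def storage_strategy(token_count: int) -> str:
    """Map a token count to the strategy named in the thresholds table."""
    if token_count < 0:
        raise ValueError("token_count must be non-negative")
    if token_count < 500:
        return "inline"
    if token_count <= 4000:
        return "file+summary"
    return "chunked+summary"


def chunk_lines(text: str, max_tokens: int = 4000) -> list[str]:
    """Split text on line boundaries so each chunk stays within max_tokens.

    A single line larger than max_tokens becomes its own oversized chunk.
    """
    chunks, current, used = [], [], 0
    for line in text.splitlines(keepends=True):
        cost = estimate_tokens(line)
        if current and used + cost > max_tokens:
            chunks.append("".join(current))
            current, used = [], 0
        current.append(line)
        used += cost
    if current:
        chunks.append("".join(current))
    return chunks


if __name__ == "__main__":
    log = "".join(f"line {i}: ok\n" for i in range(5000))
    tokens = estimate_tokens(log)
    print(tokens, storage_strategy(tokens), len(chunk_lines(log)))
```

Splitting on line boundaries keeps log lines intact, which makes `memory fetch <id> --chunk N` output easier to read than a fixed-width character split would.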

Commands:

# Store content
memory log "Test output here"
memory log --file results.txt --tag "tests"
pytest 2>&1 | memory log --stdin --tag "pytest"

# Retrieve
memory fetch mem-xxx                    # Full content
memory fetch mem-xxx -s                 # Summary only
memory fetch mem-xxx --chunk 2          # Specific chunk

# Browse
memory list --type output --limit 10
memory search "error"
memory refs --checkpoint ckpt-xxx

# Maintain
memory stats
memory prune --keep-days 7

Data Location:

  • Metadata: /opt/agent-governance/memory/memory.db
  • Chunks: /opt/agent-governance/memory/chunks/
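
Because the metadata lives in SQLite, age-based pruning such as `memory prune --keep-days 7` reduces to a single dated `DELETE`. The sketch below is only an illustration: the `entries` table, its `created_at` column (ISO 8601 text in UTC), and the `prune` function are assumptions, not the actual `memory.db` schema. A real prune would also have to remove the matching files under `chunks/`.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def prune(conn: sqlite3.Connection, keep_days: int, now=None) -> int:
    """Delete metadata rows older than keep_days; return how many were removed."""
    now = now or datetime.now(timezone.utc)
    # ISO 8601 strings in the same timezone sort chronologically as text.
    cutoff = (now - timedelta(days=keep_days)).isoformat()
    cur = conn.execute("DELETE FROM entries WHERE created_at < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entries (id TEXT PRIMARY KEY, created_at TEXT)")
    now = datetime.now(timezone.utc)
    conn.execute("INSERT INTO entries VALUES (?, ?)",
                 ("mem-old", (now - timedelta(days=10)).isoformat()))
    conn.execute("INSERT INTO entries VALUES (?, ?)",
                 ("mem-new", (now - timedelta(days=1)).isoformat()))
    print(prune(conn, keep_days=7))
```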

Integration Patterns

Pattern 1: Store Large Output

When a command produces large output:

# Instead of printing 10000 lines to chat
pytest tests/ 2>&1 | memory log --stdin --tag "pytest" --directory ./tests

# Report summary in chat
echo "Tests complete. Results stored in mem-xxx (8500 tokens)."
echo "Summary: 156 passed, 2 failed (test_auth, test_db)"

Pattern 2: Reference Stored Content

In STATUS.md, reference the stored content by ID:

## Context References

- Full test output: `mem-20260123-100000-abcd` (8500 tokens)
- Build log: `mem-20260123-100001-efgh` (3200 tokens)

Retrieve with: `memory fetch <id> --summary-only`
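
A section like the one above is regular enough to generate rather than write by hand. The following sketch renders it from a list of `(label, memory ID, token count)` tuples. The `render_context_refs` helper is hypothetical, and nothing in this guide states that the `status` CLI generates the section.

```python
def render_context_refs(refs: list[tuple[str, str, int]]) -> str:
    """Render a '## Context References' section in the format shown above."""
    lines = ["## Context References", ""]
    for label, mem_id, tokens in refs:
        lines.append(f"- {label}: `{mem_id}` ({tokens} tokens)")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_context_refs([
        ("Full test output", "mem-20260123-100000-abcd", 8500),
        ("Build log", "mem-20260123-100001-efgh", 3200),
    ]))
```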

Pattern 3: Checkpoint with Memory

Checkpoints automatically include memory references:

# Create checkpoint after storing output
memory log --file build.log --tag "build"
checkpoint now --notes "Build complete"

# Later, checkpoint report shows memory refs
checkpoint report
# [MEMORY]
#   3 entries, 12000 total tokens

Pattern 4: Recovery Workflow

After context reset:

# 1. Load checkpoint
checkpoint load
# Shows: phase, deps, status summary, memory refs

# 2. Check directory status
checkpoint report
# Shows: active directories, completion %

# 3. Get memory summaries
memory list --limit 5
memory fetch mem-xxx -s  # Summary only

# 4. Fetch specific chunk if needed
memory fetch mem-xxx --chunk 2

# 5. Resume work
status update ./current-dir --task "Resuming: fix auth bug"
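
The fixed, read-only part of the recovery workflow (steps 1 to 3, up to listing memory entries) could be wrapped in a small script. This sketch assumes the `checkpoint` and `memory` wrappers are on `PATH`. The `run_recovery` function and its `dry_run` default are hypothetical, and the steps that need a specific memory ID or directory are left to the operator.

```python
import shlex
import subprocess

# The ID-independent commands from the workflow above, in order.
RECOVERY_STEPS = [
    ["checkpoint", "load"],
    ["checkpoint", "report"],
    ["memory", "list", "--limit", "5"],
]


def run_recovery(dry_run: bool = True) -> list[str]:
    """Run (or, with dry_run, only list) the recovery commands in order."""
    executed = []
    for cmd in RECOVERY_STEPS:
        executed.append(shlex.join(cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(cmd, check=True)
    return executed


if __name__ == "__main__":
    for line in run_recovery(dry_run=True):
        print(line)
```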

Token Efficiency

Before Memory Layer

Token Window:
├── System prompt: 2000 tokens
├── Conversation: 5000 tokens
├── Large output: 15000 tokens  ← Problem!
└── Total: 22000 tokens (overflow risk)

With Memory Layer

Token Window:
├── System prompt: 2000 tokens
├── Conversation: 5000 tokens
├── Memory reference: 50 tokens  ← "mem-xxx: test results (15000 tokens)"
├── Summary: 200 tokens          ← "156 passed, 2 failed..."
└── Total: 7250 tokens (safe)

External Memory:
└── Full content: 15000 tokens (retrievable on demand)
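
The two budgets above can be checked with simple arithmetic. The snippet below uses only the figures from this section; the dictionary keys and the `window_total` helper are illustrative names.

```python
def window_total(parts: dict[str, int]) -> int:
    """Sum the token cost of everything held in the window."""
    return sum(parts.values())


before = {"system_prompt": 2000, "conversation": 5000, "large_output": 15000}
after = {"system_prompt": 2000, "conversation": 5000,
         "memory_reference": 50, "summary": 200}

saved = window_total(before) - window_total(after)

if __name__ == "__main__":
    print(window_total(before), window_total(after), saved)  # 22000 7250 14750
```

In this example, replacing the 15000-token output with a 50-token reference and a 200-token summary removes 14750 tokens, roughly two thirds of the original window.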

Token Budget Guidelines

| Content | Recommended Action |
|---------|--------------------|
| < 500 tokens | Include inline |
| 500-2000 tokens | Consider storing, include summary |
| 2000-10000 tokens | Always store, use summary + refs |
| > 10000 tokens | Store chunked, fetch chunks as needed |

File Locations

/opt/agent-governance/
├── checkpoint/
│   ├── checkpoint.py          # Checkpoint manager
│   ├── storage/               # Checkpoint JSON files
│   └── templates/             # Summary templates
├── memory/
│   ├── memory.py              # Memory manager
│   ├── memory.db              # SQLite metadata
│   ├── chunks/                # Compressed content files
│   └── summaries/             # Generated summaries
├── bin/
│   ├── checkpoint             # CLI wrapper
│   ├── status                 # CLI wrapper
│   └── memory                 # CLI wrapper
├── docs/
│   ├── CONTEXT_MANAGEMENT.md  # This file
│   ├── MEMORY_LAYER.md        # Memory details
│   └── STATUS_PROTOCOL.md     # Status details
└── */
    ├── README.md              # Directory overview
    └── STATUS.md              # Directory status

Best Practices

For Checkpoints

  1. Create checkpoints at natural stopping points
  2. Include descriptive notes
  3. Use checkpoint report to verify state
  4. Review timeline before resuming work

For STATUS Files

  1. Update status when entering/leaving directories
  2. Keep task lists current
  3. Document blockers immediately
  4. Link to relevant memory entries

For Memory

  1. Store proactively (before context fills)
  2. Use meaningful tags for searchability
  3. Fetch summaries first, full content only when needed
  4. Prune regularly to manage storage

For Recovery

  1. Always start with checkpoint load
  2. Use checkpoint report for overview
  3. Check status dashboard for active work
  4. Fetch memory summaries before full content

Troubleshooting

"Checkpoint not found"

checkpoint list  # See available checkpoints

"Memory entry not found"

memory list --limit 50  # Browse recent entries
memory search "keyword"  # Search by content

"STATUS.md missing"

status sweep --fix  # Auto-create missing files

"Token window overflow"

# Store large content externally
memory log --stdin < large-output.txt
# Use summary instead of full content
memory fetch mem-xxx -s

Part of the Agent Governance System