
# Agent Governance System

> A comprehensive framework for governing AI agent execution with security, auditability, and coordination.

## Overview

The Agent Governance System provides infrastructure for running AI agents with:
- **Tiered permissions** (T0 observer, T1 executor, T2 admin)
- **Audit trails** via SQLite ledger
- **Secure credentials** via HashiCorp Vault
- **State coordination** via DragonflyDB
- **Pipeline orchestration** for multi-agent workflows
- **Context management** for long-running sessions
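
The tiered permission model can be pictured as a minimum-tier check per action. The following is a minimal sketch, not the project's implementation: it assumes the tiers are strictly ordered (T0 < T1 < T2) and that a higher tier inherits everything a lower tier may do. The action names and the `REQUIRED_TIER` table are hypothetical.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Tiers named in the overview; treating them as ordered is an assumption."""
    T0_OBSERVER = 0
    T1_EXECUTOR = 1
    T2_ADMIN = 2


# Hypothetical mapping from action name to the minimum tier allowed to perform it.
REQUIRED_TIER = {
    "read_status": Tier.T0_OBSERVER,
    "run_pipeline": Tier.T1_EXECUTOR,
    "revoke_agent": Tier.T2_ADMIN,
}


def is_allowed(agent_tier: Tier, action: str) -> bool:
    """Return True if the agent's tier meets the action's minimum tier.

    Actions missing from REQUIRED_TIER are denied by default.
    """
    required = REQUIRED_TIER.get(action)
    if required is None:
        return False
    return agent_tier >= required
```

Denying unknown actions by default keeps the check fail-closed, so a newly added action stays blocked until it is given an explicit tier.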

## Quick Start

```bash
# Check system status
checkpoint load                    # Load session state
status dashboard                   # View directory progress
memory stats                       # Check memory usage

# Create checkpoint after work
checkpoint now --notes "Description of completed work"
```

## Key Components

| Directory | Purpose | Status |
|-----------|---------|--------|
| `pipeline/` | Pipeline DSL and core definitions | ✅ Complete |
| `runtime/` | Agent lifecycle and governance | ✅ Complete |
| `checkpoint/` | Session state management | ✅ Complete |
| `memory/` | External memory layer | ✅ Complete |
| `teams/` | Hierarchical team framework | ✅ Complete |
| `analytics/` | Learning and pattern detection | ✅ Complete |
| `tests/` | Test suites including chaos tests | 🚧 In Progress |

## CLI Tools

### Context Management

```bash
# Checkpoints - session state snapshots
checkpoint now --notes "..."       # Create checkpoint
checkpoint load                    # Load latest
checkpoint report                  # Combined status view
checkpoint timeline                # History

# Status - per-directory tracking
status sweep                       # Check all directories
status update <dir> --phase <p>    # Update status
status dashboard                   # Overview

# Memory - large content storage
memory log --stdin                 # Store from pipe
memory fetch <id> -s               # Get summary
memory list                        # Browse entries
```

### Agent Operations

```bash
# Run chaos tests
python tests/multi-agent-chaos/orchestrator.py

# Validate pipelines
python pipeline/pipeline.py validate <file.yaml>
```

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                     Agent Governance                         │
├──────────────┬──────────────┬──────────────┬───────────────┤
│   Agents     │   Pipeline   │   Runtime    │   Context     │
│              │              │              │               │
│ • T0 Observer│ • DSL Parser │ • Lifecycle  │ • Checkpoints │
│ • T1 Executor│ • Stages     │ • Governance │ • STATUS      │
│ • T2 Admin   │ • Templates  │ • Revocation │ • Memory      │
├──────────────┴──────────────┴──────────────┴───────────────┤
│                    Infrastructure                            │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌────────────┐  │
│  │  Vault   │  │ Dragonfly│  │  Ledger  │  │  Evidence  │  │
│  │ (secrets)│  │  (state) │  │  (audit) │  │ (artifacts)│  │
│  └──────────┘  └──────────┘  └──────────┘  └────────────┘  │
└─────────────────────────────────────────────────────────────┘
```
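
To make the Ledger (audit) box in the diagram concrete, here is a minimal sketch of an append-only audit table built with Python's standard `sqlite3` module. The table name, columns, and function names are assumptions for illustration; the actual schema under `ledger/` may differ.

```python
import sqlite3
from datetime import datetime, timezone


def open_ledger(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a ledger database with a hypothetical audit_events table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS audit_events (
            id          INTEGER PRIMARY KEY AUTOINCREMENT,
            recorded_at TEXT NOT NULL,
            agent_id    TEXT NOT NULL,
            tier        TEXT NOT NULL,
            action      TEXT NOT NULL,
            outcome     TEXT NOT NULL
        )
        """
    )
    return conn


def record_event(conn: sqlite3.Connection, agent_id: str, tier: str,
                 action: str, outcome: str) -> int:
    """Append one audit row with a UTC timestamp and return its row id."""
    cur = conn.execute(
        "INSERT INTO audit_events (recorded_at, agent_id, tier, action, outcome) "
        "VALUES (?, ?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), agent_id, tier, action, outcome),
    )
    conn.commit()
    return cur.lastrowid
```

For example, `record_event(open_ledger(), "agent-1", "T1", "run_pipeline", "success")` writes a single row to an in-memory database; passing a file path to `open_ledger` would persist it.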

## Documentation

| Document | Description |
|----------|-------------|
| [ARCHITECTURE.md](docs/ARCHITECTURE.md) | Full system design |
| [CONTEXT_MANAGEMENT.md](docs/CONTEXT_MANAGEMENT.md) | Checkpoints, STATUS, Memory |
| [MEMORY_LAYER.md](docs/MEMORY_LAYER.md) | External memory details |
| [STATUS_PROTOCOL.md](docs/STATUS_PROTOCOL.md) | Directory status protocol |

## Directory Structure

```
agent-governance/
├── agents/           # Agent implementations (T0, T1, T2)
├── analytics/        # Learning and pattern detection
├── bin/              # CLI tools (checkpoint, status, memory)
├── checkpoint/       # Session state management
├── docs/             # Documentation
├── evidence/         # Audit evidence packages
├── integrations/     # External integrations (GitHub, Slack)
├── ledger/           # SQLite audit ledger
├── memory/           # External memory layer
├── orchestrator/     # Multi-agent orchestration
├── pipeline/         # Pipeline DSL and templates
├── preflight/        # Pre-execution validation
├── runtime/          # Agent lifecycle governance
├── sandbox/          # Sandboxed execution (Terraform, Ansible)
├── schemas/          # JSON schemas
├── teams/            # Hierarchical team framework
├── tests/            # Test suites
└── wrappers/         # Tool wrappers
```

## Current Status

```
Progress: ███████░░░░░░░░░░░░░░░░░░░░░░░ 23%

✅ Complete:       14 directories
🚧 In Progress:     5 directories
```

Run `status dashboard` for current details.

## Recovery After Reset

```bash
# 1. Load checkpoint
checkpoint load

# 2. View combined status
checkpoint report

# 3. Check memory
memory list --limit 5

# 4. Resume work
status update ./target-dir --task "Resuming work"
```
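
The first three recovery steps could be scripted so that the sequence stops at the first failure. This is a sketch under assumptions: it presumes the `checkpoint` and `memory` tools are on `PATH` and signal failure with a non-zero exit code, neither of which this README confirms. Step 4 is left out because its target directory depends on the work being resumed.

```python
import subprocess

# Steps 1-3 from the recovery procedure, as argument lists.
RECOVERY_STEPS = [
    ["checkpoint", "load"],
    ["checkpoint", "report"],
    ["memory", "list", "--limit", "5"],
]


def run_recovery(steps=RECOVERY_STEPS, runner=subprocess.run):
    """Run each step in order and return the list of steps that succeeded.

    Stops at the first step that exits non-zero or whose executable is missing.
    The runner parameter exists so the function can be exercised without the CLI.
    """
    completed = []
    for cmd in steps:
        try:
            result = runner(cmd, check=False)
        except FileNotFoundError:
            break  # Tool not installed or not on PATH.
        if result.returncode != 0:
            break
        completed.append(cmd)
    return completed
```

Comparing the returned list with `RECOVERY_STEPS` shows which step, if any, needs attention before resuming work.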

## Dependencies

| Service | Purpose | Port |
|---------|---------|------|
| HashiCorp Vault | Secrets management | 8200 |
| DragonflyDB | State coordination | 6379 |
| SQLite | Audit ledger | File |
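
Before starting agents, it can help to confirm that the network services in the table are reachable. The sketch below checks only that a TCP connection succeeds on the listed ports; it says nothing about whether Vault is unsealed or DragonflyDB is healthy. The `127.0.0.1` hosts and the function names are assumptions.

```python
import socket

# Ports come from the Dependencies table; the localhost hosts are an assumption.
DEFAULT_SERVICES = {
    "vault": ("127.0.0.1", 8200),
    "dragonfly": ("127.0.0.1", 6379),
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(services=DEFAULT_SERVICES) -> dict:
    """Map each service name to whether its TCP port accepted a connection."""
    return {
        name: is_port_open(host, port)
        for name, (host, port) in services.items()
    }
```

Calling `check_services()` returns a dictionary keyed by service name, which makes it easy to report every unreachable dependency at once rather than failing on the first.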

---

*Phase 8: Production Hardening - In Progress*

**Completed Phases:** 1-7 ✅ | Foundation, Vault, Pipeline, Promotion/Revocation, Agent Bootstrap, DSL/Templates/Testing, Teams/Learning