Orchestrator changes:
- Force-spawn GAMMA on iteration_limit before abort
- GAMMA.synthesize() creates emergency handoff payload
- loadRecoveryContext() logs "Resuming from {task_id} handoff"
- POST to /api/pipeline/log for resume message visibility
AgentGamma changes:
- Add synthesize() method for emergency abort synthesis
- Merges existing proposals into coherent handoff
- Stores as synthesis_type: "abort_recovery"
Server changes:
- Add POST /api/pipeline/log endpoint for orchestrator logging
- Recovery pipeline properly inherits GAMMA synthesis
Test coverage:
- test_auto_recovery.py: 6 unit tests
- test_e2e_auto_recovery.py: 5 E2E tests
- test_supervisor_recovery.py: 3 supervisor tests
  - Success on attempt 2 (recovery works)
  - Max failures (3 retries then FAILED)
  - Success on attempt 1 (no recovery needed)
Recovery flow:
1. iteration_limit triggers
2. GAMMA force-spawned for emergency synthesis
3. Handoff dumped with GAMMA synthesis
4. Exit code 3 triggers auto-recovery
5. Recovery pipeline loads handoff
6. Logs "Resuming from {prior_pipeline} handoff"
7. Repeat up to 3 times or until success
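
The recovery flow above can be sketched as a small retry loop. This is only an illustration, not the orchestrator's actual code: `run_pipeline` is a hypothetical callable, and reading "up to 3 times" as three recovery runs after the initial attempt is an assumption.

```python
from typing import Callable, List, Optional, Tuple

EXIT_SUCCESS = 0
EXIT_RECOVERABLE = 3  # per the flow above: exit code 3 triggers auto-recovery
MAX_RECOVERIES = 3    # assumption: 3 recovery runs after the initial attempt


def run_with_recovery(
    run_pipeline: Callable[[Optional[str]], Tuple[int, str]],
) -> Tuple[str, List[str]]:
    """Run a pipeline, resuming from the previous handoff on exit code 3.

    run_pipeline(prior_id) is a hypothetical callable returning
    (exit_code, pipeline_id); prior_id is None on the first attempt.
    """
    log: List[str] = []
    prior_id: Optional[str] = None
    for _ in range(1 + MAX_RECOVERIES):
        if prior_id is not None:
            log.append(f"Resuming from {prior_id} handoff")
        exit_code, pipeline_id = run_pipeline(prior_id)
        if exit_code == EXIT_SUCCESS:
            return "COMPLETED", log
        if exit_code != EXIT_RECOVERABLE:
            return "FAILED", log  # other errors are not retried in this sketch
        prior_id = pipeline_id  # a handoff was dumped; the next run resumes from it
    return "FAILED", log
```

A run that hits iteration_limit once and then succeeds returns `("COMPLETED", ["Resuming from <id> handoff"])`; a run that keeps hitting the limit ends as `FAILED` after the final recovery attempt.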
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Agent Governance System
Production-grade framework for governing AI agent execution with multi-agent orchestration, Vault-backed security, real-time observability, and consensus-driven workflows.
Status: Phase 12 COMPLETE | Tests: 295/295 passing | Coverage: All 12 phases validated
Quick Start
# Check system health
checkpoint load # Load session state
checkpoint report # View combined status
validate-phases --verbose # Run full validation (295 tests)
# Run the orchestration dashboard
cd /opt/agent-governance/ui && bun run server.ts
# Dashboard: http://localhost:3000
# Bug tracking
bugs list --status open # View open bugs
bugs log -m "Description" --severity high # Log new bug
# Pipeline operations
pipeline spawn --plan <plan_id> --tier 1 # Spawn pipeline agents
Architecture Overview
┌─────────────────────────────────────────────────────────────────────────────────┐
│ GOVERNANCE LAYER │
│ ┌──────────────────┐ ┌───────────────────┐ ┌─────────────────────────────┐ │
│ │ HashiCorp Vault │ │ DragonflyDB │ │ SQLite Ledger │ │
│ │ │ │ │ │ │ │
│ │ - Per-pipeline │ │ - Blackboard │ │ - agent_actions │ │
│ │ token mgmt │ │ - Metrics │ │ - agent_metrics │ │
│ │ - 2hr TTL + │ │ - Consensus │ │ - violations │ │
│ │ auto-renewal │ │ - Message bus │ │ - promotions │ │
│ │ - Observability │ │ - Error budgets │ │ - tenants/projects │ │
│ │ revocation │ │ - WebSocket pub │ │ - marketplace │ │
│ └──────────────────┘ └───────────────────┘ └─────────────────────────────┘ │
├─────────────────────────────────────────────────────────────────────────────────┤
│ ORCHESTRATION LAYER │
│ ┌─────────────────────────────────────────────────────────────────────────┐ │
│ │ Multi-Agent Pipeline │ │
│ │ │ │
│ │ SPAWN ──► RUNNING ──► REPORT ──► ORCHESTRATING ──► COMPLETED │ │
│ │ │ │ │ │ │ │ │
│ │ Issue Agent Report ALPHA+BETA Consensus │ │
│ │ Vault Status Ready Parallel Achieved │ │
│ │ Token Updates │ │ │
│ │ Error/Stuck? │ │
│ │ │ YES │ │
│ │ SPAWN GAMMA │ │
│ │ (Mediator) │ │
│ └─────────────────────────────────────────────────────────────────────────┘ │
├─────────────────────────────────────────────────────────────────────────────────┤
│ AGENT LAYER │
│ ┌───────────────┐ ┌───────────────┐ ┌───────────────┐ ┌─────────────────┐ │
│ │ Agent ALPHA │ │ Agent BETA │ │ Agent GAMMA │ │ Governed LLM │ │
│ │ (Research) │◄─┼─► (Synthesis) │◄─┼─► (Mediator) │ │ (T0/T1/T2) │ │
│ │ │ │ │ │ │ │ │ │
│ │ Parallel │ │ Direct │ │ Spawned on: │ │ - llm-planner │ │
│ │ Execution │ │ Messages │ │ - Stuck 30s │ │ - tier0-agent │ │
│ │ │ │ │ │ - Conflict 3 │ │ - tier1-agent │ │
│ │ │ │ │ │ - Complex .8 │ │ │ │
│ └───────┬───────┘ └───────┬───────┘ └───────────────┘ └─────────────────┘ │
│ └──────────────────┴──────────────────────────────────────────────────│
│ │ │
│ ┌──────────▼──────────┐ │
│ │ Blackboard │ │
│ │ - problem │ │
│ │ - solutions[] │ │
│ │ - progress │ │
│ │ - consensus │ │
│ └─────────────────────┘ │
├─────────────────────────────────────────────────────────────────────────────────┤
│ UI / API LAYER │
│ ┌─────────────────────────────────────────────────────────────────────────┐ │
│ │ Orchestration Dashboard (Bun + WebSocket) │ │
│ │ - Real-time pipeline status - Agent lifecycle cards │ │
│ │ - Consensus failure alerts - Fallback action buttons │ │
│ │ - Log streaming - Metrics display │ │
│ └─────────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────────┘
Core Components
| Component | Purpose | Status |
|---|---|---|
| agents/ | ALPHA/BETA/GAMMA multi-agent + T0/T1/T2 governed agents | Complete |
| ui/ | Orchestration dashboard with WebSocket real-time updates | Complete |
| pipeline/ | Pipeline DSL, templates, and execution engine | Complete |
| orchestrator/ | Multi-agent coordination with consensus tracking | Complete |
| observability/ | Prometheus metrics, distributed tracing, structured logging | Complete |
| marketplace/ | Agent template registry with FTS5 search | Complete |
| checkpoint/ | Session state management and recovery | Complete |
| ledger/ | SQLite audit trail with multi-tenant support | Complete |
| testing/ | 295 tests across 12 phases + chaos testing | Complete |
Key Workflows
Multi-Agent Pipeline
- Spawn: Pipeline is created with an objective and issued a Vault token (2hr TTL, auto-renew)
- Running: ALPHA (research) and BETA (synthesis) agents work in parallel
- Orchestrating: Agents communicate via blackboard + direct messages
- Consensus: Proposals evaluated, votes counted, conflicts resolved
- GAMMA Spawn: Mediator is spawned if agents are stuck >30s, conflicts >3, or complexity >0.8
- Completion: Final consensus achieved or fallback action taken
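
The GAMMA spawn conditions listed above amount to a simple threshold check. A minimal sketch follows; the `PipelineSignals` fields and the order in which triggers are tested are assumptions made for illustration, not the orchestrator's real data model.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds quoted in the workflow above.
STUCK_SECONDS = 30.0
MAX_CONFLICTS = 3
COMPLEXITY_LIMIT = 0.8


@dataclass
class PipelineSignals:
    """Hypothetical snapshot of the values an orchestrator would track."""
    seconds_since_progress: float
    conflict_count: int
    complexity: float  # assumed to be a score between 0 and 1


def gamma_spawn_reason(signals: PipelineSignals) -> Optional[str]:
    """Return which trigger fired, or None if GAMMA is not needed."""
    if signals.seconds_since_progress > STUCK_SECONDS:
        return "stuck"
    if signals.conflict_count > MAX_CONFLICTS:
        return "conflict"
    if signals.complexity > COMPLEXITY_LIMIT:
        return "complexity"
    return None
```

All comparisons are strict, matching the ">" in the list: a pipeline sitting exactly at 30s, 3 conflicts, and 0.8 complexity does not spawn GAMMA in this sketch.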
Consensus Failure Handling
When agents fail to reach consensus:
- Rerun Same: Spawn fresh ALPHA/BETA with failure context
- Rerun with GAMMA: Force mediator agent for conflict resolution
- Escalate Tier: Increase agent permissions and retry
- Accept Partial: Complete with best available proposal
- Download Log: Export full context for manual review
Vault Token Lifecycle
Pipeline Start
│
▼
┌─────────────────────────────────────┐
│ 1. Request Token (AppRole) │
│ TTL: 2 hours, renewable │
│ Policy: pipeline-agent │
└─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ 2. Store in Redis (encrypted) │
│ Key: pipeline:{id}:vault_token │
└─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ 3. Pass to ALPHA, BETA, GAMMA │
│ Auto-renewal every 30 min │
└─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ 4. Observability monitors usage │
│ Revoke on policy violation │
└─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ 5. Revoke on completion/error │
└─────────────────────────────────────┘
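
The lifecycle above can be modelled without a running Vault to make the timing explicit. This is a sketch under stated assumptions: the class and function names are hypothetical, and it assumes each renewal resets the TTL to a full 2 hours from the renewal time.

```python
from dataclasses import dataclass

TOKEN_TTL_S = 2 * 60 * 60    # 2 hour TTL (step 1)
RENEW_INTERVAL_S = 30 * 60   # auto-renewal every 30 min (step 3)


@dataclass
class PipelineToken:
    pipeline_id: str
    expires_at: float
    last_renewed_at: float
    revoked: bool = False

    @property
    def storage_key(self) -> str:
        # Key format from step 2 of the diagram.
        return f"pipeline:{self.pipeline_id}:vault_token"

    def is_valid(self, now: float) -> bool:
        return not self.revoked and now < self.expires_at


def issue_token(pipeline_id: str, now: float) -> PipelineToken:
    """Step 1: issue a renewable token with the full TTL."""
    return PipelineToken(pipeline_id, expires_at=now + TOKEN_TTL_S, last_renewed_at=now)


def renew_if_due(token: PipelineToken, now: float) -> bool:
    """Step 3: extend the TTL once a renewal interval has elapsed."""
    if not token.is_valid(now) or now - token.last_renewed_at < RENEW_INTERVAL_S:
        return False
    token.expires_at = now + TOKEN_TTL_S  # assumption: renewal grants a fresh full TTL
    token.last_renewed_at = now
    return True


def revoke(token: PipelineToken) -> None:
    """Steps 4 and 5: revoke on violation, completion, or error."""
    token.revoked = True
```

Times are passed in explicitly rather than read from the clock, which keeps the model deterministic and easy to unit test.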
CLI Tools
Context Management
# Checkpoints - session state snapshots
checkpoint now --notes "..." # Create checkpoint
checkpoint load # Load latest
checkpoint report # Combined status view
checkpoint timeline # History
# Status - per-directory tracking
status sweep # Check all directories
status update <dir> --phase <p> # Update status
status dashboard # Overview
# Memory - large content storage
memory log --stdin # Store from pipe
memory fetch <id> -s # Get summary
memory list # Browse entries
Bug Tracking
bugs list # List all bugs
bugs list --status open # Filter by status
bugs list --severity high # Filter by severity
bugs log -m "Description" # Log new bug
bugs update <id> resolved # Update status
bugs get <id> # Get details
bugs scan # Scan for anomalies
bugs status # Summary view
Pipeline Operations
# Validation
validate-phases --verbose # Full 12-phase validation
# Pipeline management (via dashboard API)
curl -X POST localhost:3000/api/spawn \
-d '{"plan_id":"...", "tier":1}'
# Consensus handling
curl 'localhost:3000/api/pipeline/consensus/status?pipeline_id=...'
curl -X POST localhost:3000/api/pipeline/consensus/fallback \
-d '{"pipeline_id":"...", "action":"rerun_gamma"}'
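
For scripted use, the fallback call above can also be built from Python with the standard library. This is a sketch only: the helper name is hypothetical, and sending a JSON `Content-Type` header is an assumption about what the dashboard server expects.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3000"  # dashboard address from Quick Start


def build_fallback_request(pipeline_id: str, action: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the consensus fallback endpoint."""
    body = json.dumps({"pipeline_id": pipeline_id, "action": action}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/pipeline/consensus/fallback",
        data=body,
        headers={"Content-Type": "application/json"},  # assumption: server accepts JSON
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends it; building and sending are kept separate so the payload can be inspected or logged first.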
Phase Completion Status
| Phase | Name | Tests | Status |
|---|---|---|---|
| 1 | Foundation | 12/12 | Complete |
| 2 | Secrets Management | 14/14 | Complete |
| 3 | Agent Execution | 19/19 | Complete |
| 4 | Promotion & Revocation | 16/16 | Complete |
| 5 | Bootstrap & Checkpointing | 22/22 | Complete |
| 6 | Multi-Agent Orchestration | 56/56 | Complete |
| 7 | Monitoring & Learning | 46/46 | Complete |
| 8 | Production Hardening | 31/31 | Complete |
| 9 | External Integrations | - | Framework retained, external deprecated |
| 10 | Multi-Tenant Support | 18/18 | Complete |
| 11 | Agent Marketplace | 16/16 | Complete |
| 12 | Observability | 21/21 | Complete |
| Total | | 295/295 | Complete |
Dependencies
| Service | Purpose | Endpoint |
|---|---|---|
| HashiCorp Vault | Secrets, token management | https://127.0.0.1:8200 |
| DragonflyDB | State, metrics, pub/sub | redis://127.0.0.1:6379 |
| SQLite | Audit ledger, marketplace | File-based |
| Bun | TypeScript runtime | Local |
| OpenRouter | LLM API gateway | External |
Directory Structure
agent-governance/
├── agents/ # Agent implementations
│ ├── multi-agent/ # ALPHA/BETA/GAMMA orchestrator
│ ├── llm-planner/ # Python LLM agent
│ ├── llm-planner-ts/ # TypeScript LLM agent
│ ├── tier0-agent/ # Observer tier (read-only)
│ └── tier1-agent/ # Executor tier (write)
├── bin/ # CLI tools
├── checkpoint/ # Session state management
├── docs/ # Documentation
├── evidence/ # Audit evidence packages
├── integrations/ # Integration framework
├── ledger/ # SQLite audit ledger + API
├── marketplace/ # Agent template registry
├── memory/ # External memory layer
├── observability/ # Metrics, tracing, logging
├── orchestrator/ # Pipeline orchestration
├── pipeline/ # Pipeline DSL and templates
├── preflight/ # Pre-execution validation
├── sandbox/ # Terraform/Ansible sandbox
├── testing/ # Test framework + oversight
├── tests/ # Test suites (295 tests)
└── ui/ # Orchestration dashboard
Documentation
Architecture & Design
| Document | Description |
|---|---|
| ARCHITECTURE.md | Full system design |
| MULTI_AGENT_PIPELINE_ARCHITECTURE.md | Pipeline flow, Vault tokens, agent lifecycle |
| PHASE_DEPENDENCY_ANALYSIS.md | Phase dependencies and order |
Implementation & Operations
| Document | Description |
|---|---|
| PRODUCTION_PIPELINE.md | Implementation plan and production workflows |
| ENGINEERING_GUIDE.md | Runtime governance spec and quick reference |
| CREDENTIALS_SETUP.md | Vault and DragonflyDB setup |
Context & Memory
| Document | Description |
|---|---|
| CONTEXT_MANAGEMENT.md | Checkpoints, STATUS, Memory |
| STATUS_PROTOCOL.md | Directory status protocol |
| MEMORY_LAYER.md | External memory layer details |
Agent Documentation
| Document | Description |
|---|---|
| agents/README.md | Agent foundation and tier system |
| tier0-guide.md | Tier 0 agent guide |
External References
| Resource | Description |
|---|---|
| HashiCorp Vault | Secrets management documentation |
| Bun Runtime | TypeScript runtime documentation |
| DragonflyDB | Redis-compatible database docs |
Production Constraints
Token Revocation Triggers
| Condition | Threshold | Action |
|---|---|---|
| Error rate | > 5 errors/minute | Revoke + spawn diagnostic |
| Stuck agent | > 60 seconds no progress | Revoke agent token only |
| Policy violation | Any CRITICAL | Immediate full revocation |
| Resource abuse | > 100 API calls/minute | Rate limit, then revoke |
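
The trigger table above maps naturally onto a single decision function. The sketch below uses the table's thresholds and action labels; the `AgentMetrics` fields and the precedence (critical violations first, then errors, stuck agents, and API volume) are assumptions, since the table does not say which rule wins when several apply.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentMetrics:
    """Hypothetical per-agent readings collected by observability."""
    errors_per_minute: float
    seconds_without_progress: float
    critical_violation: bool
    api_calls_per_minute: float


def revocation_action(m: AgentMetrics) -> Optional[str]:
    """Return the action from the table for the first matching condition."""
    if m.critical_violation:
        return "immediate full revocation"
    if m.errors_per_minute > 5:
        return "revoke + spawn diagnostic"
    if m.seconds_without_progress > 60:
        return "revoke agent token only"
    if m.api_calls_per_minute > 100:
        return "rate limit, then revoke"
    return None  # no threshold exceeded
```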
Consensus Requirements
- Pipelines remain in ORCHESTRATING until consensus is achieved
- Exit code 0 = success, 1 = error, 2 = consensus failure
- Failure context recorded to DragonflyDB for retry attempts
- User must explicitly accept partial output to complete without consensus
Recovery After Reset
# 1. Load checkpoint
checkpoint load
# 2. View combined status
checkpoint report
# 3. Check active bugs
bugs list --status open
# 4. Resume pipeline if needed
curl 'localhost:3000/api/pipeline/consensus/status?pipeline_id=...'
API Endpoints
Pipeline Control
| Endpoint | Method | Description |
|---|---|---|
| /api/spawn | POST | Spawn pipeline with plan |
| /api/pipeline/continue | POST | Trigger orchestration |
| /api/pipeline/orchestration | GET | Get orchestration status |
| /api/pipeline/token | GET | Get pipeline token status |
| /api/pipeline/revoke | POST | Revoke pipeline token |
| /api/active-pipelines | GET | List active pipelines |
| /api/pipeline/consensus/status | GET | Consensus status |
| /api/pipeline/consensus/fallback | POST | Execute fallback action |
Observability
| Endpoint | Method | Description |
|---|---|---|
| /api/observability/errors | GET | Error summary |
| /api/observability/handoff | POST | Generate handoff report |
Phase 12: Observability - COMPLETE
All 12 phases validated | 295/295 tests passing | Last updated: 2026-01-24