# AI Agent Governance System Architecture

**Version:** 0.2.0
**Status:** Active Development
**Last Updated:** 2026-01-23

---

## Table of Contents

1. [Executive Summary](#executive-summary)
2. [System Architecture](#system-architecture)
3. [Core Components](#core-components)
4. [Agent Taxonomy](#agent-taxonomy)
5. [Runtime Governance](#runtime-governance)
6. [Multi-Agent Coordination](#multi-agent-coordination)
7. [Current Capabilities](#current-capabilities)
8. [Engineering Focus Areas](#engineering-focus-areas)
9. [Sample Implementations](#sample-implementations)
10. [Future Potential](#future-potential)
11. [File Structure](#file-structure)
12. [Running the System](#running-the-system)
13. [Contributing](#contributing)
14. [References](#references)

---

## Executive Summary

This system implements a **governed AI agent framework** designed for safe, auditable, and scalable automation. The architecture enforces:

- **Trust-tiered access control** via HashiCorp Vault
- **Real-time governance** via DragonflyDB
- **Structured agent lifecycles** with mandatory phases
- **Multi-agent coordination** with parallel execution and conditional spawning
- **Complete audit trails** via SQLite ledger

The system prioritizes **legibility over magic**: every agent action must be explainable, reproducible, and auditable.

---

## System Architecture

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                              GOVERNANCE LAYER                               │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────────────────┐  │
│  │   HashiCorp     │  │   DragonflyDB   │  │      SQLite Ledger          │  │
│  │     Vault       │  │   (Runtime)     │  │      (Audit)                │  │
│  │                 │  │                 │  │                             │  │
│  │ • Policies      │  │ • State         │  │ • agent_actions             │  │
│  │ • Secrets       │  │ • Locks         │  │ • agent_metrics             │  │
│  │ • AppRole Auth  │  │ • Heartbeats    │  │ • violations                │  │
│  │ • Token Leases  │  │ • Errors        │  │ • promotions                │  │
│  └─────────────────┘  │ • Blackboard    │  └─────────────────────────────┘  │
│                       │ • Messages      │                                   │
│                       └─────────────────┘                                   │
├─────────────────────────────────────────────────────────────────────────────┤
│                             ORCHESTRATION LAYER                             │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                       Multi-Agent Orchestrator                        │  │
│  │  • Parallel agent execution        • Spawn condition monitoring       │  │
│  │  • Performance metrics             • Consensus coordination           │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
├─────────────────────────────────────────────────────────────────────────────┤
│                                 AGENT LAYER                                 │
│  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐  ┌─────────────┐   │
│  │  Agent ALPHA  │  │  Agent BETA   │  │  Agent GAMMA  │  │  Governed   │   │
│  │  (Research)   │  │  (Synthesis)  │  │  (Mediator)   │  │  LLM Agent  │   │
│  │               │  │               │  │               │  │             │   │
│  │  Parallel     │◄─┼─► Direct     ─┼──┼─► Spawned     │  │  Single     │   │
│  │  Execution    │  │   Messages    │  │   on          │  │  Pipeline   │   │
│  └───────┬───────┘  └───────┬───────┘  │   Condition   │  └──────┬──────┘   │
│          │                  │          └───────────────┘         │          │
│          └──────────┬───────┴────────────────────────────────────┘          │
│                     │                                                       │
│              ┌──────▼──────┐                                                │
│              │ Blackboard  │  (Shared Memory)                               │
│              │ • problem   │                                                │
│              │ • solutions │                                                │
│              │ • progress  │                                                │
│              │ • consensus │                                                │
│              └─────────────┘                                                │
├─────────────────────────────────────────────────────────────────────────────┤
│                            INFRASTRUCTURE LAYER                             │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────────────────┐  │
│  │   OpenRouter    │  │   Bun Runtime   │  │      WireGuard VPN          │  │
│  │   (LLM API)     │  │  (TypeScript)   │  │      (Network)              │  │
│  └─────────────────┘  └─────────────────┘  └─────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────────┘
```

---

## Core Components

### 1. HashiCorp Vault (Policy Engine)

**Purpose:** Centralized secrets management and trust-tier enforcement.

```
Location: https://127.0.0.1:8200
Storage:  /opt/vault/data
Policies: /opt/vault/policies/t{0-4}-*.hcl
```

**Key Features:**
- AppRole authentication for agents
- Dynamic secret generation
- Token TTLs based on trust tier
- Immediate revocation capabilities

### 2. DragonflyDB (Runtime State)

**Purpose:** Real-time agent state, coordination, and governance signals.

```
Location:    redis://127.0.0.1:6379
Credentials: vault:secret/data/services/dragonfly
```

**Keyspace Design:**
```
agent:{id}:packet       → Instruction packet (JSON)
agent:{id}:state        → Runtime state (JSON)
agent:{id}:errors       → Error counters (Hash)
agent:{id}:heartbeat    → Last seen (String + TTL)
agent:{id}:lock         → Execution lock (String + TTL)
task:{id}:active_agent  → Current agent
task:{id}:artifacts     → Artifact references (List)
blackboard:{task}:*     → Shared memory sections
msg:{task}:*            → Direct message channels
revocations:ledger      → Revocation history (List)
handoff:{task}:latest   → Handoff objects (JSON)
```

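The keyspace above lends itself to small key-builder helpers, so key strings are not hand-assembled at every call site. The following is a minimal sketch: the helper names (`agentKey`, `taskKey`, `blackboardKey`, `handoffKey`) are hypothetical and not taken from the codebase; only the key patterns come from the keyspace listing.

```typescript
// Hypothetical helpers: only the key patterns come from the keyspace listing.
type AgentField = "packet" | "state" | "errors" | "heartbeat" | "lock";
type TaskField = "active_agent" | "artifacts";

const agentKey = (agentId: string, field: AgentField): string =>
  `agent:${agentId}:${field}`;

const taskKey = (taskId: string, field: TaskField): string =>
  `task:${taskId}:${field}`;

const blackboardKey = (taskId: string, section: string): string =>
  `blackboard:${taskId}:${section}`;

const handoffKey = (taskId: string): string => `handoff:${taskId}:latest`;

console.log(agentKey("agent-001", "lock")); // agent:agent-001:lock
console.log(blackboardKey("task-001", "consensus")); // blackboard:task-001:consensus
```

Restricting the field argument to a union type means a misspelled suffix fails at compile time rather than silently writing to an unused key.
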
### 3. SQLite Ledger (Audit Trail)

**Purpose:** Immutable record of all agent actions for compliance and replay.

```
Location: /opt/agent-governance/ledger/governance.db
```

**Schema:**
```sql
CREATE TABLE agent_actions (
    id INTEGER PRIMARY KEY,
    timestamp TEXT,
    agent_id TEXT,
    agent_version TEXT,
    tier INTEGER,
    action TEXT,
    decision TEXT,
    confidence REAL,
    success INTEGER,
    error_type TEXT,
    error_message TEXT
);

CREATE TABLE agent_metrics (
    agent_id TEXT PRIMARY KEY,
    current_tier INTEGER,
    total_runs INTEGER,
    compliant_runs INTEGER,
    consecutive_compliant INTEGER,
    last_active_at TEXT
);
```

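To make the `agent_actions` schema concrete, the sketch below maps a typed record onto a parameterized `INSERT` statement. It is illustrative only: `AgentAction` and `buildActionInsert` are hypothetical names, the storing of `success` as 0 or 1 is an assumption based on the `INTEGER` column type, and the block deliberately stops short of executing the statement so it stays independent of any particular SQLite driver.

```typescript
// Mirrors the agent_actions columns; `id` is omitted because SQLite assigns it.
interface AgentAction {
  timestamp: string; // ISO 8601 text
  agent_id: string;
  agent_version: string;
  tier: number;
  action: string;
  decision: string;
  confidence: number;
  success: boolean;
  error_type: string | null;
  error_message: string | null;
}

type SqlValue = string | number | null;

// Hypothetical helper: returns SQL text plus positional parameters.
function buildActionInsert(a: AgentAction): { sql: string; params: SqlValue[] } {
  const entries: [string, SqlValue][] = [
    ["timestamp", a.timestamp],
    ["agent_id", a.agent_id],
    ["agent_version", a.agent_version],
    ["tier", a.tier],
    ["action", a.action],
    ["decision", a.decision],
    ["confidence", a.confidence],
    ["success", a.success ? 1 : 0], // assumption: booleans stored as 0/1
    ["error_type", a.error_type],
    ["error_message", a.error_message],
  ];
  const columns = entries.map(([column]) => column).join(", ");
  const placeholders = entries.map(() => "?").join(", ");
  return {
    sql: `INSERT INTO agent_actions (${columns}) VALUES (${placeholders})`,
    params: entries.map(([, value]) => value),
  };
}

const example = buildActionInsert({
  timestamp: "2026-01-23T12:00:00Z",
  agent_id: "agent-001",
  agent_version: "0.2.0",
  tier: 0,
  action: "generate_plan",
  decision: "EXECUTE",
  confidence: 0.9,
  success: true,
  error_type: null,
  error_message: null,
});
console.log(example.sql);
```
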
---

## Agent Taxonomy

### Trust Tiers

| Tier | Name | Capabilities | Token TTL |
|------|------|-------------|-----------|
| 0 | Observer | Read docs, inventory, logs; Generate plans | 1h |
| 1 | Operator | Sandbox SSH, basic Proxmox, Ansible check-mode | 30m |
| 2 | Builder | Sandbox admin, create frameworks/modules | 30m |
| 3 | Executor | Staging access, limited prod read, root-controlled | 15m |
| 4 | Architect | Policy read, governance write, requires approval | 15m |

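
The tier table can be expressed as a lookup that other components consult when reasoning about token lifetimes. This is a sketch under stated assumptions: `TIERS` and `tokenExpiresAt` are hypothetical names, and in the real system the TTL is presumably enforced by Vault itself rather than computed client-side.

```typescript
type Tier = 0 | 1 | 2 | 3 | 4;

interface TierSpec {
  name: string;
  ttlSeconds: number;
}

// Names and TTLs copied from the trust tier table.
const TIERS: Record<Tier, TierSpec> = {
  0: { name: "Observer", ttlSeconds: 60 * 60 },
  1: { name: "Operator", ttlSeconds: 30 * 60 },
  2: { name: "Builder", ttlSeconds: 30 * 60 },
  3: { name: "Executor", ttlSeconds: 15 * 60 },
  4: { name: "Architect", ttlSeconds: 15 * 60 },
};

// Hypothetical helper: when would a token issued at `issuedAt` expire?
function tokenExpiresAt(tier: Tier, issuedAt: Date): Date {
  return new Date(issuedAt.getTime() + TIERS[tier].ttlSeconds * 1000);
}

const issued = new Date("2026-01-23T12:00:00Z");
console.log(tokenExpiresAt(3, issued).toISOString()); // 2026-01-23T12:15:00.000Z
```
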
### Agent Lifecycle Phases

```
BOOTSTRAP → PREFLIGHT → PLAN → EXECUTE → VERIFY → PACKAGE → REPORT → EXIT
    │           │        │        │        │         │        │       │
    │           │        │        │        │         │        │       └─ Release lock
    │           │        │        │        │         │        └─ Generate report
    │           │        │        │        │         └─ Collect artifacts
    │           │        │        │        └─ Verify results
    │           │        │        └─ Execute plan (if approved)
    │           │        └─ Generate plan artifact
    │           └─ Scope/dependency checks
    └─ Read revocations, load packet, acquire lock
```

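The lifecycle can be modeled as a strictly ordered state machine. The sketch below is one way to encode it; `nextPhase` and `canTransition` are hypothetical names, and the rule that any phase may jump straight to `EXIT` (for example after a revocation) is an assumption rather than something the diagram states.

```typescript
const PHASES = [
  "BOOTSTRAP", "PREFLIGHT", "PLAN", "EXECUTE",
  "VERIFY", "PACKAGE", "REPORT", "EXIT",
] as const;

type Phase = (typeof PHASES)[number];

// Returns the phase that follows `current`, or null once EXIT is reached.
function nextPhase(current: Phase): Phase | null {
  const next = PHASES[PHASES.indexOf(current) + 1];
  return next ?? null;
}

// Assumption: besides the next phase in order, any phase may go to EXIT early.
function canTransition(from: Phase, to: Phase): boolean {
  if (from === "EXIT") return false;
  return to === nextPhase(from) || to === "EXIT";
}

console.log(nextPhase("PLAN")); // EXECUTE
console.log(canTransition("PLAN", "VERIFY")); // false (EXECUTE cannot be skipped)
```
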
### Error Budget System

```json
{
  "max_total_errors": 8,
  "max_same_error_repeats": 2,
  "max_procedure_violations": 1
}
```

**Automatic Revocation Triggers:**
- `procedure_violations >= 1`
- `same_error >= max_same_error_repeats`
- `total_errors >= max_total_errors`
- Missing required artifact after EXECUTE
- Forbidden action detected

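
The three counter-based triggers can be checked with a small pure function. This is a sketch, not the `GovernanceManager` implementation: the `ErrorCounts` shape and the reason strings are hypothetical, and the two non-counter triggers (missing artifact, forbidden action) are out of scope here. With the default budget shown above, two occurrences of the same error are enough to fail the check.

```typescript
interface ErrorBudget {
  max_total_errors: number;
  max_same_error_repeats: number;
  max_procedure_violations: number;
}

// Hypothetical shape: occurrences per error type plus a violation counter.
interface ErrorCounts {
  byType: Record<string, number>;
  procedure_violations: number;
}

// Returns [ok, reason]; the reason strings are illustrative.
function checkErrorBudget(counts: ErrorCounts, budget: ErrorBudget): [boolean, string] {
  const perType = Object.keys(counts.byType).map((k) => counts.byType[k] ?? 0);
  const total = perType.reduce((sum, n) => sum + n, 0);
  const worstRepeat = perType.reduce((max, n) => Math.max(max, n), 0);

  if (counts.procedure_violations >= budget.max_procedure_violations) {
    return [false, "PROCEDURE_VIOLATION"];
  }
  if (worstRepeat >= budget.max_same_error_repeats) {
    return [false, "SAME_ERROR_REPEATED"];
  }
  if (total >= budget.max_total_errors) {
    return [false, "TOTAL_ERRORS_EXCEEDED"];
  }
  return [true, "WITHIN_BUDGET"];
}

const budget: ErrorBudget = {
  max_total_errors: 8,
  max_same_error_repeats: 2,
  max_procedure_violations: 1,
};

console.log(checkErrorBudget({ byType: { LLM_ERROR: 1 }, procedure_violations: 0 }, budget));
console.log(checkErrorBudget({ byType: { LLM_ERROR: 2 }, procedure_violations: 0 }, budget));
```
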
---

## Runtime Governance

### Instruction Packets

Every agent receives a structured instruction packet defining its mission:

```typescript
interface InstructionPacket {
  agent_id: string;
  task_id: string;
  created_for: string;
  objective: string;
  deliverables: string[];
  constraints: {
    scope: string[];
    forbidden: string[];
    required_steps: string[];
  };
  success_criteria: string[];
  error_budget: ErrorBudget;
  escalation_rules: string[];
  created_at: string;
}
```

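The `constraints` block of a packet is what an agent consults before acting. The sketch below shows one possible check; `checkAction` and `missingSteps` are hypothetical helpers, and matching actions by exact string is an assumption (the real system may use patterns or prefixes).

```typescript
// Same shape as InstructionPacket["constraints"].
interface Constraints {
  scope: string[];
  forbidden: string[];
  required_steps: string[];
}

type ActionVerdict = "ALLOWED" | "FORBIDDEN" | "OUT_OF_SCOPE";

// Assumption: exact string matching; forbidden entries win over scope entries.
function checkAction(c: Constraints, action: string): ActionVerdict {
  if (c.forbidden.includes(action)) return "FORBIDDEN";
  if (!c.scope.includes(action)) return "OUT_OF_SCOPE";
  return "ALLOWED";
}

// Required steps that have not been completed yet, in packet order.
function missingSteps(c: Constraints, completed: string[]): string[] {
  return c.required_steps.filter((step) => !completed.includes(step));
}

const constraints: Constraints = {
  scope: ["read_docs", "generate_plan"],
  forbidden: ["ssh_prod"],
  required_steps: ["preflight", "plan", "verify"],
};

console.log(checkAction(constraints, "generate_plan")); // ALLOWED
console.log(checkAction(constraints, "ssh_prod")); // FORBIDDEN
console.log(missingSteps(constraints, ["preflight"])); // [ 'plan', 'verify' ]
```
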
### Handoff Objects

When an agent is revoked, it must create a handoff for the next agent:

```typescript
interface HandoffObject {
  task_id: string;
  previous_agent_id: string;
  revoked: boolean;
  revocation_reason: { type: string; details: string };
  last_known_state: { phase: string; step: string };
  what_was_tried: string[];
  blocking_issue: string;
  required_next_actions: string[];
  constraints_reminder: string[];
  artifacts: string[];
}
```

---

## Multi-Agent Coordination

### Communication Patterns

**1. Direct Messaging (Point-to-Point)**
```
Agent ALPHA ──PROPOSAL──► Agent BETA
Agent BETA  ──FEEDBACK──► Agent ALPHA
Agent GAMMA ──HANDOFF───► ALL
```

**2. Blackboard (Shared Memory)**
```
┌──────────────────────────────────────────────────┐
│                    BLACKBOARD                    │
├─────────────┬────────────┬───────────┬───────────┤
│ problem     │ solutions  │ progress  │ consensus │
├─────────────┼────────────┼───────────┼───────────┤
│ objective   │ proposal_1 │ eval_1    │ votes     │
│ analysis    │ proposal_2 │ eval_2    │ final     │
│ constraints │ synthesis  │ gamma_res │           │
└─────────────┴────────────┴───────────┴───────────┘
```

### Conditional Agent Spawning

Agent GAMMA is spawned when any of the following thresholds is reached:

| Condition | Threshold | Description |
|-----------|-----------|-------------|
| STUCK | 30s | Agents inactive for 30+ seconds |
| CONFLICT | 3 | 3+ unresolved proposal conflicts |
| COMPLEXITY | 0.8 | Task complexity score > 0.8 |
| SUCCESS | 1.0 | Task complete, validation needed |

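
The threshold table translates directly into a predicate over a snapshot of task signals. This is a sketch only: `SpawnSignals`, its field names, and `triggeredConditions` are hypothetical, and reading the SUCCESS threshold of 1.0 as a completion fraction is an assumption.

```typescript
type SpawnCondition = "STUCK" | "CONFLICT" | "COMPLEXITY" | "SUCCESS";

// Hypothetical snapshot of the signals the orchestrator would monitor.
interface SpawnSignals {
  idleSeconds: number; // time since the last agent activity
  unresolvedConflicts: number; // open proposal conflicts
  complexityScore: number; // 0..1
  completion: number; // 0..1, assumption: 1.0 means the task is complete
}

// Thresholds copied from the spawn condition table.
const THRESHOLDS = { STUCK: 30, CONFLICT: 3, COMPLEXITY: 0.8, SUCCESS: 1.0 } as const;

function triggeredConditions(s: SpawnSignals): SpawnCondition[] {
  const hits: SpawnCondition[] = [];
  if (s.idleSeconds >= THRESHOLDS.STUCK) hits.push("STUCK");
  if (s.unresolvedConflicts >= THRESHOLDS.CONFLICT) hits.push("CONFLICT");
  if (s.complexityScore > THRESHOLDS.COMPLEXITY) hits.push("COMPLEXITY"); // strictly greater
  if (s.completion >= THRESHOLDS.SUCCESS) hits.push("SUCCESS");
  return hits;
}

console.log(
  triggeredConditions({ idleSeconds: 45, unresolvedConflicts: 3, complexityScore: 0.8, completion: 0.5 })
); // [ 'STUCK', 'CONFLICT' ]
```

Note that COMPLEXITY uses a strict comparison, following the table's "> 0.8", while the other conditions fire at the threshold itself.
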
### Consensus Mechanism

```typescript
interface ConsensusVote {
  agent: AgentRole;
  proposal_id: string;
  vote: "ACCEPT" | "REJECT" | "ABSTAIN";
  reasoning: string;
  timestamp: string;
}

// Consensus requires:
// 1. All required agents have voted
// 2. Accept votes > Reject votes
// 3. No rejects from required agents
```

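The three consensus rules listed in the comments can be evaluated as follows. This is a sketch under assumptions: `AgentRole` is taken to be the three roles named in this document, `hasConsensus` is a hypothetical function, and the input is assumed to hold at most one vote per agent, all for the same proposal.

```typescript
type AgentRole = "ALPHA" | "BETA" | "GAMMA"; // assumption: roles named in this document

interface ConsensusVote {
  agent: AgentRole;
  proposal_id: string;
  vote: "ACCEPT" | "REJECT" | "ABSTAIN";
  reasoning: string;
  timestamp: string;
}

// Assumes `votes` holds at most one vote per agent for a single proposal.
function hasConsensus(votes: ConsensusVote[], required: AgentRole[]): boolean {
  const voters = new Set(votes.map((v) => v.agent));
  const allRequiredVoted = required.every((role) => voters.has(role)); // rule 1
  const accepts = votes.filter((v) => v.vote === "ACCEPT").length;
  const rejects = votes.filter((v) => v.vote === "REJECT").length;
  const requiredRejected = votes.some(
    (v) => v.vote === "REJECT" && required.includes(v.agent)
  ); // rule 3
  return allRequiredVoted && accepts > rejects && !requiredRejected; // rule 2 in the middle
}

const vote = (agent: AgentRole, v: ConsensusVote["vote"]): ConsensusVote => ({
  agent,
  proposal_id: "proposal_1",
  vote: v,
  reasoning: "example",
  timestamp: "2026-01-23T12:00:00Z",
});

console.log(hasConsensus([vote("ALPHA", "ACCEPT"), vote("BETA", "ACCEPT")], ["ALPHA", "BETA"])); // true
console.log(hasConsensus([vote("ALPHA", "ACCEPT")], ["ALPHA", "BETA"])); // false (BETA has not voted)
```
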
---

## Current Capabilities

### Implemented Features

| Feature | Status | Location |
|---------|--------|----------|
| Vault policy engine | ✅ Complete | `/opt/vault/policies/` |
| Trust tier system (T0-T4) | ✅ Complete | Vault policies |
| DragonflyDB runtime | ✅ Complete | `runtime/governance.py` |
| SQLite audit ledger | ✅ Complete | `ledger/governance.db` |
| Single-agent pipeline | ✅ Complete | `llm-planner-ts/governed-agent.ts` |
| Multi-agent parallel execution | ✅ Complete | `multi-agent/orchestrator.ts` |
| Blackboard shared memory | ✅ Complete | `multi-agent/coordination.ts` |
| Direct messaging | ✅ Complete | `multi-agent/coordination.ts` |
| Conditional spawning | ✅ Complete | `multi-agent/orchestrator.ts` |
| Performance metrics | ✅ Complete | `multi-agent/coordination.ts` |
| Error budget tracking | ✅ Complete | `GovernanceManager` class |
| Revocation handling | ✅ Complete | `GovernanceManager` class |

### Performance Benchmarks

| Metric | Single Agent | Multi-Agent (3) |
|--------|-------------|-----------------|
| Avg. task duration | 60-120s | 45-90s |
| Messages per task | N/A | 20-30 |
| Blackboard ops | 5-10 | 40-60 |
| LLM calls | 2-4 | 6-12 |

---

## Engineering Focus Areas

### 1. Pipeline Programming

**Goal:** Create composable, reusable agent pipelines.

**Current Work:**
```typescript
// Pipeline stages as composable functions
type PipelineStage<T, U> = (input: T, context: GovernanceContext) => Promise<U>;

// Example pipeline composition
// Note: the PACKAGE stage is named packageArtifacts because `package` is a
// reserved word in strict-mode JavaScript and cannot be used as an identifier.
const agentPipeline = compose(
  bootstrap,
  preflight,
  plan,
  execute,
  verify,
  packageArtifacts,
  report
);
```

**Planned Features:**
- Stage-level error handling and retry
- Pipeline branching based on conditions
- Pipeline templates for common patterns
- Hot-swappable stages for testing

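
To illustrate how stage composition and the planned stage-level retry could fit together, here is a minimal sketch. It is not the project's `compose`: for simplicity every stage shares one payload type `T` (the `PipelineStage<T, U>` type above allows the type to change between stages), `GovernanceContext` is reduced to a hypothetical one-field interface, and `withRetry` is an invented helper name.

```typescript
// Hypothetical minimal context; the real GovernanceContext is not shown in this document.
interface GovernanceContext {
  agentId: string;
}

// Simplification: every stage consumes and produces the same payload type.
type Stage<T> = (input: T, ctx: GovernanceContext) => Promise<T>;

// Runs stages in order, feeding each stage's output to the next.
function compose<T>(...stages: Stage<T>[]): Stage<T> {
  return async (input, ctx) => {
    let value = input;
    for (const stage of stages) {
      value = await stage(value, ctx);
    }
    return value;
  };
}

// Wraps a stage so it is attempted up to maxAttempts times before giving up.
function withRetry<T>(stage: Stage<T>, maxAttempts: number): Stage<T> {
  return async (input, ctx) => {
    let lastError: unknown;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return await stage(input, ctx);
      } catch (err) {
        lastError = err;
      }
    }
    throw lastError;
  };
}

// Usage: each stage appends its name to a log; the plan stage fails once.
let planCalls = 0;
const record = (name: string): Stage<string[]> => async (log) => [...log, name];
const flakyPlan: Stage<string[]> = async (log) => {
  planCalls++;
  if (planCalls < 2) throw new Error("transient failure");
  return [...log, "plan"];
};

const pipeline = compose(record("bootstrap"), withRetry(flakyPlan, 3), record("execute"));
pipeline([], { agentId: "agent-001" }).then((log) => console.log(log.join(" -> ")));
// bootstrap -> plan -> execute
```

Keeping retry as a wrapper rather than a feature of `compose` means individual stages can opt in, which matters for stages such as EXECUTE where blind retries may not be safe.
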
### 2. Bun Integration

**Goal:** Leverage Bun's performance for agent execution.

**Current Implementation:**
```typescript
// File: agents/llm-planner-ts/governed-agent.ts
import { $ } from "bun";
import { Database } from "bun:sqlite";

// Shell commands via Bun
const result = await $`curl -sk ...`.json();

// SQLite via Bun native
const db = new Database("/opt/agent-governance/ledger/governance.db");
```

**Advantages:**
- 4x faster startup than Node.js
- Native TypeScript execution
- Built-in SQLite support
- Shell command integration
- Excellent npm compatibility

**Planned Enhancements:**
- Bun's built-in test runner integration
- Bun's native WebSocket for real-time coordination
- Bun's worker threads for parallel LLM calls

### 3. Testing Framework

**Goal:** Enable long-term iteration with confidence.

**Architecture:**
```
┌─────────────────────────────────────────────────────────────┐
│                      TESTING FRAMEWORK                      │
├─────────────────────────────────────────────────────────────┤
│ Unit Tests          │ Integration Tests     │ E2E Tests     │
│ ──────────          │ ─────────────────     │ ─────────     │
│ • Agent methods     │ • Vault + Agent       │ • Full task   │
│ • Blackboard ops    │ • Redis + Agent       │ • Multi-agent │
│ • Message parsing   │ • LLM + Governance    │ • Failure     │
│ • Error handling    │ • Pipeline stages     │   recovery    │
├─────────────────────────────────────────────────────────────┤
│ Mock Infrastructure │ Test Scenarios        │ Metrics       │
│ ─────────────────── │ ──────────────        │ ───────       │
│ • MockVault         │ • Happy path          │ • Duration    │
│ • MockDragonfly     │ • Error budget        │ • Coverage    │
│ • MockLLM           │ • Revocation          │ • Flakiness   │
│ • MockBlackboard    │ • Consensus fail      │ • Regression  │
└─────────────────────────────────────────────────────────────┘
```

**Test Categories:**

```typescript
// 1. Unit Tests - Isolated component testing
describe("GovernanceManager", () => {
  it("should track error budget correctly", async () => {
    const gov = new MockGovernanceManager();
    await gov.recordError("agent-1", "LLM_ERROR", "timeout");
    const counts = await gov.getErrorCounts("agent-1");
    expect(counts.total_errors).toBe(1);
  });
});

// 2. Integration Tests - Component interaction
describe("Agent + Vault Integration", () => {
  it("should bootstrap with valid token", async () => {
    const agent = new GovernedAgent("test-agent");
    const [ok, msg] = await agent.bootstrap();
    expect(ok).toBe(true);
  });
});

// 3. Scenario Tests - Full workflow validation
describe("Multi-Agent Scenarios", () => {
  it("should spawn GAMMA on complexity threshold", async () => {
    const orchestrator = new TestOrchestrator();
    const metrics = await orchestrator.runTask(highComplexityTask);
    expect(metrics.gamma_spawned).toBe(true);
    expect(metrics.gamma_spawn_reason).toBe("COMPLEXITY");
  });
});
```

**Mock Infrastructure:**

```typescript
// MockLLM for deterministic testing
class MockLLM {
  private responses: Map<string, string> = new Map();

  // Register a canned response for any prompt containing `pattern`
  setResponse(pattern: string, response: string) {
    this.responses.set(pattern, response);
  }

  // Return the first registered response whose pattern appears in the prompt
  async complete(prompt: string): Promise<string> {
    for (const [pattern, response] of this.responses) {
      if (prompt.includes(pattern)) return response;
    }
    // Fallback when no pattern matches
    return '{"confidence": 0.5, "steps": []}';
  }
}

// MockDragonfly for state testing
class MockDragonfly {
  private store: Map<string, any> = new Map();

  async set(key: string, value: any) { this.store.set(key, value); }
  async get(key: string) { return this.store.get(key); }

  // Hashes are kept as a nested Map under the key (one possible in-memory model)
  async hSet(key: string, field: string, value: any) {
    const hash: Map<string, any> = this.store.get(key) ?? new Map();
    hash.set(field, value);
    this.store.set(key, hash);
  }
}
```

---

## Sample Implementations

### Single Governed Agent (TypeScript/Bun)

```typescript
// File: governed-agent.ts
import { GovernanceManager } from "./coordination";

class GovernedAgent {
  // Assigned in bootstrap(); the definite-assignment assertion satisfies strict mode
  private gov!: GovernanceManager;

  constructor(private readonly agentId: string) {}

  async bootstrap(): Promise<[boolean, string]> {
    this.gov = new GovernanceManager();
    await this.gov.connect();

    // Read revocation ledger
    const revocations = await this.gov.getRecentRevocations(50);
    for (const rev of revocations) {
      if (rev.agent_id === this.agentId) {
        return [false, "AGENT_PREVIOUSLY_REVOKED"];
      }
    }

    // Load instruction packet
    const packet = await this.gov.getPacket(this.agentId);
    if (!packet) return [false, "NO_INSTRUCTION_PACKET"];

    // Acquire lock
    if (!await this.gov.acquireLock(this.agentId)) {
      return [false, "CANNOT_ACQUIRE_LOCK"];
    }

    return [true, "BOOTSTRAP_COMPLETE"];
  }

  async transition(phase: string, step: string): Promise<boolean> {
    // Signal liveness and keep the execution lock from expiring
    await this.gov.heartbeat(this.agentId);
    await this.gov.refreshLock(this.agentId);

    // Revoke the agent as soon as its error budget is exhausted
    const [ok, reason] = await this.gov.checkErrorBudget(this.agentId);
    if (!ok) {
      await this.gov.revokeAgent(this.agentId, "ERROR_BUDGET_EXCEEDED", reason);
      return false;
    }

    await this.gov.setState({ phase, step, /* ... */ });
    return true;
  }
}
```

### Multi-Agent Orchestrator

```typescript
// File: orchestrator.ts
class MultiAgentOrchestrator {
  private alphaAgent: AgentAlpha;
  private betaAgent: AgentBeta;
  private gammaAgent?: AgentGamma;
  private monitorInterval?: ReturnType<typeof setInterval>;
  // Assumption: recorded by checkSpawnConditions() when a threshold trips
  private spawnReason?: string;
  // stateManager, spawnController, and metrics are initialized elsewhere (omitted here)

  async runTask(task: TaskDefinition): Promise<Metrics> {
    // Launch ALPHA and BETA in parallel
    const alphaPromise = this.alphaAgent.run(task);
    const betaPromise = this.betaAgent.run(task);

    // Monitor spawn conditions every 2 seconds
    this.monitorInterval = setInterval(() => {
      this.checkSpawnConditions().catch(console.error);
    }, 2000);

    try {
      await Promise.all([alphaPromise, betaPromise]);
    } finally {
      // Stop monitoring even if an agent fails, so the timer does not leak
      clearInterval(this.monitorInterval);
    }

    // Spawn GAMMA if needed
    if (this.shouldSpawnGamma()) {
      await this.spawnGamma(this.spawnReason);
      await this.gammaAgent?.run(task);
    }

    return this.metrics.finalize();
  }

  private async checkSpawnConditions() {
    // Check stuck, conflict, complexity thresholds
    const stuckAgents = await this.stateManager.detectStuckAgents(30);
    if (stuckAgents.length > 0) {
      this.spawnReason = "STUCK";
      await this.spawnController.updateCondition("STUCK", stuckAgents.length);
    }
  }
}
```

---

## Future Potential

### Short-Term (Q1 2026)

1. **Pipeline DSL**
   ```yaml
   pipeline:
     name: infrastructure-deploy
     stages:
       - plan:
           agent: planner
           artifacts: [terraform-plan]
       - review:
           type: human-gate
           timeout: 30m
       - execute:
           agent: executor
           requires: [plan]
   ```

2. **Agent Templates**
   - Pre-configured agents for common tasks
   - Terraform specialist
   - Ansible specialist
   - Code review specialist

3. **Enhanced Testing**
   - Chaos testing for agent resilience
   - Load testing for multi-agent scaling
   - Regression test suite

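
The Pipeline DSL from item 1 above could map onto typed structures once the YAML is parsed. The following is a sketch only: the DSL is still a planned feature, so `StageSpec`, `PipelineSpec`, and `validatePipeline` are hypothetical, as is the rule that a stage may only `require` stages declared before it.

```typescript
// Hypothetical in-memory form of the YAML after parsing.
interface StageSpec {
  name: string;
  agent?: string;
  type?: "human-gate";
  timeout?: string;
  artifacts?: string[];
  requires?: string[];
}

interface PipelineSpec {
  name: string;
  stages: StageSpec[];
}

// Returns a list of problems; an empty list means the pipeline looks consistent.
function validatePipeline(p: PipelineSpec): string[] {
  const errors: string[] = [];
  const seen = new Set<string>();
  for (const stage of p.stages) {
    if (seen.has(stage.name)) errors.push(`duplicate stage: ${stage.name}`);
    if (!stage.agent && stage.type !== "human-gate") {
      errors.push(`stage ${stage.name} needs an agent or a human-gate type`);
    }
    // Assumption: dependencies must point at stages declared earlier
    for (const dep of stage.requires ?? []) {
      if (!seen.has(dep)) errors.push(`stage ${stage.name} requires unknown or later stage: ${dep}`);
    }
    seen.add(stage.name);
  }
  return errors;
}

const deploy: PipelineSpec = {
  name: "infrastructure-deploy",
  stages: [
    { name: "plan", agent: "planner", artifacts: ["terraform-plan"] },
    { name: "review", type: "human-gate", timeout: "30m" },
    { name: "execute", agent: "executor", requires: ["plan"] },
  ],
};

console.log(validatePipeline(deploy)); // []
```
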
### Medium-Term (Q2-Q3 2026)

1. **Hierarchical Agent Teams**
   ```
   Team Lead Agent
   ├── Research Team (3 agents)
   ├── Implementation Team (2 agents)
   └── Review Team (2 agents)
   ```

2. **Learning from History**
   - Analyze past task completions
   - Suggest optimizations
   - Predict failure patterns

3. **External Integrations**
   - GitHub PR automation
   - Slack notifications
   - PagerDuty escalations

### Long-Term (2027+)

1. **Self-Optimizing Pipelines**
   - Agents propose pipeline improvements
   - A/B testing of agent strategies
   - Automatic tier promotion

2. **Cross-System Orchestration**
   - Multiple infrastructure targets
   - Hybrid cloud coordination
   - Edge deployment agents

---

## File Structure

```
/opt/agent-governance/
├── docs/
│   └── ARCHITECTURE.md           # This document
├── ledger/
│   └── governance.db             # SQLite audit trail
├── runtime/
│   ├── governance.py             # Python governance manager
│   └── monitors.py               # Monitor agents
└── agents/
    ├── llm-planner/
    │   ├── agent.py              # Python LLM agent
    │   └── governed_agent.py     # Python governed agent
    ├── llm-planner-ts/
    │   ├── index.ts              # Basic TypeScript agent
    │   └── governed-agent.ts     # Full governed agent (Bun)
    └── multi-agent/
        ├── types.ts              # Type definitions
        ├── coordination.ts       # Blackboard, messaging, metrics
        ├── agents.ts             # Alpha, Beta, Gamma agents
        └── orchestrator.ts       # Multi-agent orchestrator

/opt/vault/
├── policies/
│   ├── t0-observer.hcl
│   ├── t1-operator.hcl
│   ├── t2-builder.hcl
│   ├── t3-executor.hcl
│   └── t4-architect.hcl
└── init-keys.json                # Vault credentials (chmod 600)
```

---

## Running the System

### Prerequisites

```bash
# Vault must be running and unsealed
docker ps | grep vault

# DragonflyDB must be running
docker ps | grep dragonfly

# Bun must be installed
~/.bun/bin/bun --version
```

### Single Agent Test

```bash
cd /opt/agent-governance/agents/llm-planner-ts
~/.bun/bin/bun run governed-agent.ts \
  "agent-001" \
  "task-001" \
  "Design a microservices architecture"
```

### Multi-Agent Test

```bash
cd /opt/agent-governance/agents/multi-agent
~/.bun/bin/bun run orchestrator.ts \
  "Design a distributed event-driven analytics platform" \
  --timeout 120
```

---

## Contributing

When adding new features:

1. **Follow the agent lifecycle** - All agents must implement the standard phases
2. **Log to the ledger** - Every action must be auditable
3. **Respect error budgets** - Check and track errors properly
4. **Write tests** - Unit, integration, and scenario tests required
5. **Document changes** - Update this architecture document

---

## References

- [Agent Foundation Document](/root/agents_foundation.md)
- [Runtime Governance Spec](/root/agent_runtime_governance.md)
- [Implementation Plan](/root/agent-taxonomy-implementation-plan.md)
- [Vault Documentation](https://developer.hashicorp.com/vault/docs)
- [Bun Documentation](https://bun.sh/docs)