profit 77655c298c Initial commit: Agent Governance System Phase 8

Phase 8 Production Hardening with complete governance infrastructure:

- Vault integration with tiered policies (T0-T4)
- DragonflyDB state management
- SQLite audit ledger
- Pipeline DSL and templates
- Promotion/revocation engine
- Checkpoint system for session persistence
- Health manager and circuit breaker for fault tolerance
- GitHub/Slack integrations
- Architectural test pipeline with bug watcher, suggestion engine, council review
- Multi-agent chaos testing framework

Test Results:
- Governance tests: 68/68 passing
- E2E workflow: 16/16 passing
- Phase 2 Vault: 14/14 passing
- Integration tests: 27/27 passing

Coverage: 57.6% average across 12 phases

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-23 22:07:06 -05:00


Engineering Quick Reference Guide

Getting Started

Prerequisites

# Check Vault is running
curl -sk https://127.0.0.1:8200/v1/sys/health | jq .

# Check DragonflyDB is running
redis-cli -p 6379 -a "$(jq -r .root_token /opt/vault/init-keys.json)" PING

# Check Bun is installed
~/.bun/bin/bun --version  # Should be 1.3.6+
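
The `1.3.6+` requirement can also be checked from a script. The sketch below is a minimal version comparison; the helper name `meetsMinimumVersion` is hypothetical and not part of the repository, and it assumes plain `major.minor.patch` strings without pre-release suffixes.

```typescript
// Hypothetical helper: compares dotted numeric versions such as "1.3.6".
// Assumes plain major.minor.patch strings (no "-beta" or build suffixes).
function meetsMinimumVersion(actual: string, minimum: string): boolean {
  const a = actual.trim().split(".").map(Number);
  const m = minimum.trim().split(".").map(Number);
  const length = Math.max(a.length, m.length);
  for (let i = 0; i < length; i++) {
    const left = a[i] ?? 0; // missing components count as 0
    const right = m[i] ?? 0;
    if (left !== right) return left > right;
  }
  return true; // versions are equal
}
```

Inside a Bun script, `Bun.version` holds the running version string, so `meetsMinimumVersion(Bun.version, "1.3.6")` would be one way to guard startup.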

Project Structure

/opt/agent-governance/
├── docs/                    # Documentation
│   ├── ARCHITECTURE.md      # System architecture
│   └── ENGINEERING_GUIDE.md # This file
├── ledger/                  # Audit trail
│   └── governance.db        # SQLite database
├── runtime/                 # Python governance
│   ├── governance.py        # Core governance manager
│   └── monitors.py          # Monitor agents
├── agents/
│   ├── llm-planner/         # Python agents
│   ├── llm-planner-ts/      # TypeScript single-agent
│   └── multi-agent/         # Multi-agent system
└── testing/
    └── framework.ts         # Test framework

Quick Commands

Run Single Agent

cd /opt/agent-governance/agents/llm-planner-ts
~/.bun/bin/bun run governed-agent.ts <agent_id> <task_id> "<objective>"

# Example
~/.bun/bin/bun run governed-agent.ts \
  "agent-001" \
  "task-001" \
  "Design a caching strategy for the API"

Run Multi-Agent System

cd /opt/agent-governance/agents/multi-agent
~/.bun/bin/bun run orchestrator.ts "<objective>" --timeout 120

# Example
~/.bun/bin/bun run orchestrator.ts \
  "Design a distributed event system" \
  --timeout 90

Run Tests

cd /opt/agent-governance/testing
~/.bun/bin/bun run framework.ts

Check Agent State (DragonflyDB)

# Connect to DragonflyDB
redis-cli -p 6379

# Check agent state
GET agent:<agent_id>:state
HGETALL agent:<agent_id>:errors
GET agent:<agent_id>:packet

# Check blackboard
HGETALL blackboard:<task_id>:solutions
HGETALL blackboard:<task_id>:progress

# Check metrics
HGETALL metrics:<task_id>
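
The key names used in these commands follow a small set of patterns. As a sketch, they could be centralized in TypeScript so that agents and tests do not build them by hand. The helper names below are hypothetical; the key layout is copied from the commands above, and the lock key from the Common Issues table.

```typescript
// Hypothetical key builders; only the key layout comes from this guide.
const agentKeys = (agentId: string) => ({
  state: `agent:${agentId}:state`,
  errors: `agent:${agentId}:errors`,
  packet: `agent:${agentId}:packet`,
  lock: `agent:${agentId}:lock`,
});

const blackboardKey = (taskId: string, section: string): string =>
  `blackboard:${taskId}:${section}`;

const metricsKey = (taskId: string): string => `metrics:${taskId}`;

console.log(agentKeys("agent-001").state);
console.log(blackboardKey("task-001", "solutions"));
```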

Check Vault Secrets

# Get root token
export VAULT_TOKEN="$(jq -r .root_token /opt/vault/init-keys.json)"

# List secrets
curl -sk -H "X-Vault-Token: $VAULT_TOKEN" \
  https://127.0.0.1:8200/v1/secret/metadata | jq .

# Read a secret
curl -sk -H "X-Vault-Token: $VAULT_TOKEN" \
  https://127.0.0.1:8200/v1/secret/data/api-keys/openrouter | jq .data.data
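
The `jq .data.data` filter reflects the Vault KV version 2 response layout, where the secret payload is nested under `data.data` next to `data.metadata`. A minimal TypeScript sketch of the same unwrapping follows; `extractSecret` and the `api_key` field are assumptions for illustration, not names from the repository.

```typescript
// Shape of a Vault KV v2 read response (only the fields used here).
interface KvV2Response<T> {
  data: { data: T; metadata?: Record<string, unknown> };
}

// Hypothetical helper mirroring `jq .data.data`: returns the secret payload.
function extractSecret<T>(response: KvV2Response<T>): T {
  if (!response?.data?.data) {
    throw new Error("Unexpected Vault response: missing data.data");
  }
  return response.data.data;
}

// Example with a made-up payload; the field name `api_key` is an assumption.
const sampleResponse: KvV2Response<{ api_key: string }> = {
  data: { data: { api_key: "sk-example" }, metadata: { version: 1 } },
};
console.log(extractSecret(sampleResponse).api_key);
```

In an agent, the response object would come from an HTTP call to the same URL with the `X-Vault-Token` header; TLS verification (skipped by `curl -k` above) would need separate handling.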

Development Patterns

Adding a New Agent

// 1. Extend BaseAgent or implement from scratch
import { BaseAgent } from "./agents";

class MyAgent extends BaseAgent {
  constructor(taskId: string, /* dependencies */) {
    super("MY_ROLE", taskId, /* ... */);
  }

  // Handle incoming messages
  protected async handleMessage(msg: AgentMessage): Promise<void> {
    await super.handleMessage(msg);
    switch (msg.type) {
      case "PROPOSAL": await this.handleProposal(msg); break;
      case "QUERY": await this.handleQuery(msg); break;
    }
  }

  // Main execution loop
  async run(task: TaskDefinition): Promise<void> {
    await this.updateState({ status: "WORKING" });

    // Phase 1: Initialize
    // Phase 2: Process
    // Phase 3: Complete

    await this.updateState({ status: "COMPLETED" });
  }
}

Writing to the Blackboard

// Write a solution proposal
await this.writeToBlackboard("solutions", proposalId, {
  name: "My Solution",
  approach: "...",
  confidence: 0.8,
});

// Read problem analysis
const analysis = await this.readFromBlackboard("problem", "analysis");

// Record a vote
await this.blackboard.recordVote({
  agent: this.role,
  proposal_id: proposalId,
  vote: "ACCEPT",
  reasoning: "...",
  timestamp: now(),
});
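
Recorded votes can be tallied per proposal when checking for consensus. The sketch below assumes the vote shape shown in the `recordVote` call; the `"REJECT"` value, the string timestamp, and the `tallyVotes` helper are assumptions for illustration rather than the repository's implementation.

```typescript
// Vote shape taken from the recordVote call; "REJECT" is an assumed
// counterpart to "ACCEPT", and the timestamp type is assumed to be a string.
interface Vote {
  agent: string;
  proposal_id: string;
  vote: "ACCEPT" | "REJECT";
  reasoning: string;
  timestamp: string;
}

interface VoteTally {
  accept: number;
  reject: number;
}

// Hypothetical helper: counts ACCEPT and REJECT votes per proposal.
function tallyVotes(votes: Vote[]): Map<string, VoteTally> {
  const tally = new Map<string, VoteTally>();
  for (const v of votes) {
    const entry = tally.get(v.proposal_id) ?? { accept: 0, reject: 0 };
    if (v.vote === "ACCEPT") entry.accept += 1;
    else entry.reject += 1;
    tally.set(v.proposal_id, entry);
  }
  return tally;
}
```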

Sending Direct Messages

// Send proposal to specific agent
await this.sendMessage("BETA", "PROPOSAL", {
  proposal_id: id,
  proposal: data,
});

// Broadcast to all agents
await this.sendMessage("ALL", "SYNC", {
  event: "STATUS_UPDATE",
  progress: 0.5,
});

// Respond to a query
await this.sendMessage(msg.from as AgentRole, "RESPONSE", {
  answer: result,
}, msg.id);  // correlation_id
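
The last call passes `msg.id` so that the receiver can match the response to its original query. A minimal sketch of that correlation pattern follows. The message envelope, the ID scheme, and the `buildMessage`/`buildReply` helpers are assumptions for illustration; only the role names, message types, and the `correlation_id` concept come from this guide.

```typescript
// Assumed envelope; the real AgentMessage type lives in the repository.
type AgentRole = "ALPHA" | "BETA" | "GAMMA";
type MessageType = "PROPOSAL" | "QUERY" | "RESPONSE" | "SYNC";

interface AgentMessage {
  id: string;
  from: AgentRole;
  to: AgentRole | "ALL";
  type: MessageType;
  payload: Record<string, unknown>;
  correlation_id?: string;
}

let messageCounter = 0; // illustrative ID source only

function buildMessage(
  from: AgentRole,
  to: AgentRole | "ALL",
  type: MessageType,
  payload: Record<string, unknown>,
  correlationId?: string,
): AgentMessage {
  messageCounter += 1;
  const base = { id: `msg-${messageCounter}`, from, to, type, payload };
  return correlationId ? { ...base, correlation_id: correlationId } : base;
}

// A reply goes back to the sender and carries the query's id for correlation.
function buildReply(
  query: AgentMessage,
  from: AgentRole,
  payload: Record<string, unknown>,
): AgentMessage {
  return buildMessage(from, query.from, "RESPONSE", payload, query.id);
}
```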

Error Handling

// Record an error
const counts = await this.gov.recordError(
  this.agentId,
  "LLM_ERROR",
  error.message
);

// Check error budget
const [ok, reason] = await this.gov.checkErrorBudget(this.agentId);
if (!ok) {
  await this.gov.revokeAgent(
    this.agentId,
    "ERROR_BUDGET_EXCEEDED",
    reason!
  );
  return false;
}

// Record a procedure violation
await this.gov.recordViolation(this.agentId, "EXECUTE_WITHOUT_PLAN");
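
To make the `[ok, reason]` convention concrete, here is a minimal sketch of a budget check as a pure function. The count fields, the limit fields, and the thresholds are hypothetical; the real limits are defined by the governance manager and are not documented in this guide.

```typescript
// Hypothetical shapes; field names and limits are assumptions.
interface ErrorCounts {
  total_errors: number;
  procedure_violations: number;
}

interface ErrorBudget {
  max_total_errors: number;
  max_procedure_violations: number;
}

// Returns [ok, reason]: reason is null while the agent is within budget.
function evaluateErrorBudget(
  counts: ErrorCounts,
  budget: ErrorBudget,
): [boolean, string | null] {
  if (counts.procedure_violations > budget.max_procedure_violations) {
    return [
      false,
      `procedure_violations=${counts.procedure_violations} exceeds ${budget.max_procedure_violations}`,
    ];
  }
  if (counts.total_errors > budget.max_total_errors) {
    return [false, `total_errors=${counts.total_errors} exceeds ${budget.max_total_errors}`];
  }
  return [true, null];
}
```

Returning a tuple keeps the call site identical to the `checkErrorBudget` usage above: destructure, test `ok`, and pass `reason` on to the revocation call.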

Testing Patterns

Unit Test with Mocks

import { describe, it, expect } from "bun:test";
import { MockDragonfly, MockLLM, createTestContext } from "./framework";

describe("MyAgent", () => {
  it("should handle proposals correctly", async () => {
    const ctx = createTestContext();

    // Set up mock LLM response
    ctx.mockLLM.setResponse("evaluate", JSON.stringify({
      score: 0.85,
      accepted: true,
    }));

    // Set up state in mock DragonflyDB
    await ctx.mockDragonfly.set(
      `agent:${ctx.agentId}:packet`,
      JSON.stringify(/* packet */)
    );

    // Run test
    // ... agent execution ...

    // Assert
    const state = await ctx.mockDragonfly.get(`agent:${ctx.agentId}:state`);
    expect(state).not.toBeNull();  // fail clearly if the agent never wrote its state
    expect(JSON.parse(state!).status).toBe("COMPLETED");
  });
});

Integration Test

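import { describe, it, expect } from "bun:test";
// Assumption: GovernedAgent is exported from governed-agent.ts; adjust the path to your layout.
import { GovernedAgent } from "../agents/llm-planner-ts/governed-agent";

describe("Agent + Vault Integration", () => {
  it("should authenticate with Vault", async () => {
    // Uses a real Vault instance in a test environment
    const agent = new GovernedAgent("test-agent");

    try {
      const [ok] = await agent.bootstrap();
      expect(ok).toBe(true);
    } finally {
      // Always clean up, even if the assertion fails
      await agent.cleanup();
    }
  });
});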

Scenario Test

const scenario: TestScenario = {
  name: "Multi-Agent Consensus",
  description: "ALPHA and BETA reach consensus",
  setup: async () => {
    // Pre-populate blackboard
  },
  execute: async (ctx) => {
    const orchestrator = new MultiAgentOrchestrator();
    await orchestrator.initialize();
    await orchestrator.runTask(task);  // task: a TaskDefinition prepared elsewhere
  },
  assertions: async (ctx) => {
    const metrics = await getMetrics(ctx.taskId);
    expect(metrics.final_consensus).toBe(true);
  },
  cleanup: async () => {
    // Clean up DragonflyDB keys
  },
};
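
A scenario object like this needs a runner that calls its four hooks in order. The sketch below is one possible runner, not the one in `framework.ts`; the `ScenarioContext` shape and the `runScenario` name are assumptions. The point it illustrates is that `cleanup` runs in a `finally` block, so DragonflyDB keys are removed even when `execute` or `assertions` throws.

```typescript
// Assumed context and scenario shapes, based on the fields used above.
interface ScenarioContext {
  taskId: string;
}

interface TestScenario {
  name: string;
  description: string;
  setup: () => Promise<void>;
  execute: (ctx: ScenarioContext) => Promise<void>;
  assertions: (ctx: ScenarioContext) => Promise<void>;
  cleanup: () => Promise<void>;
}

interface ScenarioResult {
  name: string;
  passed: boolean;
  error?: string;
}

// Hypothetical runner: setup -> execute -> assertions, with cleanup always last.
async function runScenario(
  scenario: TestScenario,
  ctx: ScenarioContext,
): Promise<ScenarioResult> {
  try {
    await scenario.setup();
    await scenario.execute(ctx);
    await scenario.assertions(ctx);
    return { name: scenario.name, passed: true };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { name: scenario.name, passed: false, error: message };
  } finally {
    await scenario.cleanup();
  }
}
```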

Key Files Reference

Governance Manager (TypeScript)

File: agents/llm-planner-ts/governed-agent.ts
Class: GovernanceManager

Key Methods:
- connect() / disconnect()
- createPacket(packet)
- getPacket(agentId)
- setState(state) / getState(agentId)
- acquireLock(agentId) / releaseLock(agentId)
- heartbeat(agentId)
- recordError(agentId, type, message)
- checkErrorBudget(agentId)
- revokeAgent(agentId, reason, details)
- registerArtifact(taskId, type, reference)
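
Because `acquireLock` and `releaseLock` come as a pair, a small wrapper can guarantee the release. The sketch below assumes `acquireLock` resolves to a boolean, which is not confirmed by this guide; `withAgentLock` and the `LockManager` interface are hypothetical names.

```typescript
// Subset of GovernanceManager used here. The return types are assumptions.
interface LockManager {
  acquireLock(agentId: string): Promise<boolean>;
  releaseLock(agentId: string): Promise<void>;
}

// Hypothetical wrapper: runs `work` while holding the agent lock and
// releases the lock afterwards, whether `work` succeeds or throws.
async function withAgentLock<T>(
  gov: LockManager,
  agentId: string,
  work: () => Promise<T>,
): Promise<T> {
  const acquired = await gov.acquireLock(agentId);
  if (!acquired) {
    throw new Error("CANNOT_ACQUIRE_LOCK");
  }
  try {
    return await work();
  } finally {
    await gov.releaseLock(agentId);
  }
}
```

Releasing in `finally` addresses the usual cause of `CANNOT_ACQUIRE_LOCK` listed under Common Issues: a previous run that exited without releasing its lock.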

Multi-Agent Coordination

File: agents/multi-agent/coordination.ts

Classes:
- Blackboard         # Shared memory
- MessageBus         # Direct messaging
- AgentStateManager  # State tracking
- SpawnController    # Conditional spawning
- MetricsCollector   # Performance metrics

Agent Implementations

File: agents/multi-agent/agents.ts

Classes:
- BaseAgent     # Abstract base
- AgentAlpha    # Research/Analysis
- AgentBeta     # Implementation/Synthesis
- AgentGamma    # Mediator (conditionally spawned)

Debugging Tips

Enable Verbose Logging

// Add to agent constructor
this.log = (msg: string) => {
  const elapsed = ((Date.now() - this.startTime) / 1000).toFixed(3);
  console.log(`[${elapsed}s] [${this.role}] [DEBUG] ${msg}`);
  console.log(`  State: ${JSON.stringify(this.state)}`);
};

Inspect DragonflyDB State

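# Stream every command the server receives, in real time
redis-cli -p 6379 MONITOR

# List agent, blackboard, and message key names
# (KEYS walks the whole keyspace; on a busy instance prefer `redis-cli --scan --pattern "agent:*"`)
redis-cli -p 6379 KEYS "agent:*"
redis-cli -p 6379 KEYS "blackboard:*"
redis-cli -p 6379 KEYS "msg:*"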

Check Ledger

sqlite3 /opt/agent-governance/ledger/governance.db \
  "SELECT * FROM agent_actions ORDER BY timestamp DESC LIMIT 10"

Common Issues

| Issue | Cause | Fix |
|-------|-------|-----|
| `CANNOT_ACQUIRE_LOCK` | Previous agent didn't release | `DEL agent:<id>:lock` |
| `NO_INSTRUCTION_PACKET` | Packet not created | Create packet before running |
| `AGENT_PREVIOUSLY_REVOKED` | Agent in revocation ledger | Use new agent ID |
| `LLM_ERROR` | OpenRouter API issue | Check API key in Vault |
| JSON parse error | LLM returned markdown | Improve JSON extraction |
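
For the JSON parse error case, one way to improve extraction is to strip a Markdown code fence before parsing and fall back to the outermost `{...}` span. The helper below is a sketch under that assumption; `extractJson` is a hypothetical name, not a function from the repository.

```typescript
// Three backticks, built indirectly so this example can itself sit in a fence.
const FENCE = "`".repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "i");

// Hypothetical helper: pulls a JSON value out of an LLM reply that may be
// wrapped in a Markdown code fence or surrounded by prose.
function extractJson(reply: string): unknown {
  const fenced = reply.match(FENCED_BLOCK);
  const candidate = (fenced ? fenced[1] : reply).trim();
  try {
    return JSON.parse(candidate);
  } catch {
    // Fall back to the outermost {...} span inside the text.
    const start = candidate.indexOf("{");
    const end = candidate.lastIndexOf("}");
    if (start === -1 || end <= start) {
      throw new Error("No JSON object found in LLM reply");
    }
    return JSON.parse(candidate.slice(start, end + 1));
  }
}
```

The fallback only handles a single top-level object; replies containing several objects or a bare array in prose would need a stricter parser.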

Performance Tuning

LLM Call Optimization

// Use lower max_tokens for simple tasks
const response = await this.llm.chat.completions.create({
  model: this.model,
  messages: [...],
  max_tokens: task.complexity === "low" ? 1000 : 4000,
  temperature: 0.3,  // Lower = more deterministic
});

Reduce Blackboard Overhead

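// Batch writes: issue them concurrently instead of awaiting each one in turn
// (assumes each entry carries its own key and value)
const entries = [...];
await Promise.all(
  entries.map((entry) =>
    this.blackboard.write(section, entry.key, entry.value, this.role)
  )
);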

// Read a whole section once instead of issuing multiple readFromBlackboard calls
const solutions = await this.blackboard.readSection("solutions");

Monitor Interval Tuning

// Adjust based on task duration
const monitorIntervalMs = task.timeout_seconds > 60 ? 5000 : 2000;
this.monitorInterval = setInterval(() => {
  this.monitorConditions();
}, monitorIntervalMs);
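
An interval started this way keeps firing until it is cleared. The sketch below wraps the same interval rule in a helper that returns a stop function and catches failures from the check; `pickMonitorIntervalMs` and `startMonitor` are hypothetical names, and only the 60-second threshold and the 5000/2000 ms values come from the snippet above.

```typescript
// Mirrors the rule above: tasks with longer timeouts are polled less often.
function pickMonitorIntervalMs(timeoutSeconds: number): number {
  return timeoutSeconds > 60 ? 5000 : 2000;
}

// Hypothetical wrapper: starts polling and returns a function that stops it.
function startMonitor(
  timeoutSeconds: number,
  check: () => void | Promise<void>,
): () => void {
  const handle = setInterval(() => {
    // Route sync throws and async rejections to the log so a failing
    // check does not surface as an unhandled rejection.
    Promise.resolve()
      .then(check)
      .catch((err) => console.error("monitor check failed:", err));
  }, pickMonitorIntervalMs(timeoutSeconds));
  return () => clearInterval(handle);
}
```

Calling the returned function when the task completes or times out prevents the timer from keeping the process alive.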
