agent-governance/docs/ENGINEERING_GUIDE.md

# Engineering Quick Reference Guide

## Getting Started

### Prerequisites

```bash
# Check Vault is running
curl -sk https://127.0.0.1:8200/v1/sys/health | jq .

# Check DragonflyDB is running
redis-cli -p 6379 -a "$(jq -r .root_token /opt/vault/init-keys.json)" PING

# Check Bun is installed
~/.bun/bin/bun --version  # Should be 1.3.6+
```

### Project Structure

```
/opt/agent-governance/
├── docs/                    # Documentation
│   ├── ARCHITECTURE.md      # System architecture
│   └── ENGINEERING_GUIDE.md # This file
├── ledger/                  # Audit trail
│   └── governance.db        # SQLite database
├── runtime/                 # Python governance
│   ├── governance.py        # Core governance manager
│   └── monitors.py          # Monitor agents
├── agents/
│   ├── llm-planner/         # Python agents
│   ├── llm-planner-ts/      # TypeScript single-agent
│   └── multi-agent/         # Multi-agent system
└── testing/
    └── framework.ts         # Test framework
```

---

## Quick Commands

### Run Single Agent

```bash
cd /opt/agent-governance/agents/llm-planner-ts
~/.bun/bin/bun run governed-agent.ts <agent_id> <task_id> "<objective>"

# Example
~/.bun/bin/bun run governed-agent.ts \
  "agent-001" \
  "task-001" \
  "Design a caching strategy for the API"
```

### Run Multi-Agent System

```bash
cd /opt/agent-governance/agents/multi-agent
~/.bun/bin/bun run orchestrator.ts "<objective>" --timeout 120

# Example
~/.bun/bin/bun run orchestrator.ts \
  "Design a distributed event system" \
  --timeout 90
```

### Run Tests

```bash
cd /opt/agent-governance/testing
~/.bun/bin/bun run framework.ts
```

### Check Agent State (DragonflyDB)

```bash
# Connect to DragonflyDB
redis-cli -p 6379

# Check agent state
GET agent:<agent_id>:state
HGETALL agent:<agent_id>:errors
GET agent:<agent_id>:packet

# Check blackboard
HGETALL blackboard:<task_id>:solutions
HGETALL blackboard:<task_id>:progress

# Check metrics
HGETALL metrics:<task_id>
```
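
The key names above follow a small set of patterns. As a minimal sketch, they could be built in TypeScript with helpers like the ones below. The function names `agentKey`, `blackboardKey`, and `metricsKey` are hypothetical and are not part of the codebase; only the key patterns themselves come from this guide.

```typescript
// Hypothetical helpers (not part of the codebase) that build the
// DragonflyDB key names used in the redis-cli commands above.
type AgentField = "state" | "errors" | "packet" | "lock";

// agent:<agent_id>:<field>
function agentKey(agentId: string, field: AgentField): string {
  return `agent:${agentId}:${field}`;
}

// blackboard:<task_id>:<section>
function blackboardKey(taskId: string, section: string): string {
  return `blackboard:${taskId}:${section}`;
}

// metrics:<task_id>
function metricsKey(taskId: string): string {
  return `metrics:${taskId}`;
}

console.log(agentKey("agent-001", "state"));          // agent:agent-001:state
console.log(blackboardKey("task-001", "solutions"));  // blackboard:task-001:solutions
console.log(metricsKey("task-001"));                  // metrics:task-001
```

Centralizing key construction this way keeps typos in key names from silently reading an empty value.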

### Check Vault Secrets

```bash
# Get root token
export VAULT_TOKEN=$(cat /opt/vault/init-keys.json | jq -r .root_token)

# List secrets (KV v2 listing requires the LIST method, not GET)
curl -sk -X LIST -H "X-Vault-Token: $VAULT_TOKEN" \
  https://127.0.0.1:8200/v1/secret/metadata | jq .

# Read a secret
curl -sk -H "X-Vault-Token: $VAULT_TOKEN" \
  https://127.0.0.1:8200/v1/secret/data/api-keys/openrouter | jq .data.data
```

---

## Development Patterns

### Adding a New Agent

```typescript
// 1. Extend BaseAgent or implement from scratch
import { BaseAgent } from "./agents";

class MyAgent extends BaseAgent {
  constructor(taskId: string, /* dependencies */) {
    super("MY_ROLE", taskId, /* ... */);
  }

  // Handle incoming messages
  protected async handleMessage(msg: AgentMessage): Promise<void> {
    await super.handleMessage(msg);
    switch (msg.type) {
      case "PROPOSAL": await this.handleProposal(msg); break;
      case "QUERY": await this.handleQuery(msg); break;
    }
  }

  // Main execution loop
  async run(task: TaskDefinition): Promise<void> {
    await this.updateState({ status: "WORKING" });

    // Phase 1: Initialize
    // Phase 2: Process
    // Phase 3: Complete

    await this.updateState({ status: "COMPLETED" });
  }
}
```

### Writing to the Blackboard

```typescript
// Write a solution proposal
await this.writeToBlackboard("solutions", proposalId, {
  name: "My Solution",
  approach: "...",
  confidence: 0.8,
});

// Read problem analysis
const analysis = await this.readFromBlackboard("problem", "analysis");

// Record a vote
await this.blackboard.recordVote({
  agent: this.role,
  proposal_id: proposalId,
  vote: "ACCEPT",
  reasoning: "...",
  timestamp: now(),
});
```

### Sending Direct Messages

```typescript
// Send proposal to specific agent
await this.sendMessage("BETA", "PROPOSAL", {
  proposal_id: id,
  proposal: data,
});

// Broadcast to all agents
await this.sendMessage("ALL", "SYNC", {
  event: "STATUS_UPDATE",
  progress: 0.5,
});

// Respond to a query
await this.sendMessage(msg.from as AgentRole, "RESPONSE", {
  answer: result,
}, msg.id);  // correlation_id
```

### Error Handling

```typescript
// Record an error
const counts = await this.gov.recordError(
  this.agentId,
  "LLM_ERROR",
  error.message
);

// Check error budget
const [ok, reason] = await this.gov.checkErrorBudget(this.agentId);
if (!ok) {
  await this.gov.revokeAgent(
    this.agentId,
    "ERROR_BUDGET_EXCEEDED",
    reason!
  );
  return false;
}

// Record a procedure violation
await this.gov.recordViolation(this.agentId, "EXECUTE_WITHOUT_PLAN");
```
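
To illustrate the idea behind `recordError` and `checkErrorBudget`, here is a minimal in-memory sketch of an error budget. It is not the `GovernanceManager` implementation: the class name `InMemoryErrorBudget`, the `ErrorLimits` shape, and the threshold values are all assumptions made for illustration, and the real counters presumably live in DragonflyDB (for example in the `agent:<agent_id>:errors` hash).

```typescript
// Sketch only: the limits and names below are illustrative assumptions,
// not the thresholds or API used by GovernanceManager.
interface ErrorLimits {
  maxTotal: number;   // allowed errors across all types
  maxPerType: number; // allowed errors of any single type
}

class InMemoryErrorBudget {
  private readonly limits: ErrorLimits;
  private readonly counts = new Map<string, Map<string, number>>();

  constructor(limits: ErrorLimits) {
    this.limits = limits;
  }

  // Increment the counter for (agentId, type) and return the new count.
  recordError(agentId: string, type: string): number {
    const perType = this.counts.get(agentId) ?? new Map<string, number>();
    const next = (perType.get(type) ?? 0) + 1;
    perType.set(type, next);
    this.counts.set(agentId, perType);
    return next;
  }

  // Return [true, undefined] while within budget, else [false, reason].
  checkErrorBudget(agentId: string): [boolean, string | undefined] {
    const perType = this.counts.get(agentId);
    if (!perType) return [true, undefined];

    let total = 0;
    for (const [type, count] of Array.from(perType.entries())) {
      total += count;
      if (count > this.limits.maxPerType) {
        return [false, `${type} occurred ${count} times (limit ${this.limits.maxPerType})`];
      }
    }
    if (total > this.limits.maxTotal) {
      return [false, `${total} total errors (limit ${this.limits.maxTotal})`];
    }
    return [true, undefined];
  }
}

// Usage: the third LLM_ERROR exceeds the assumed per-type limit of 2.
const budget = new InMemoryErrorBudget({ maxTotal: 5, maxPerType: 2 });
budget.recordError("agent-001", "LLM_ERROR");
budget.recordError("agent-001", "LLM_ERROR");
budget.recordError("agent-001", "LLM_ERROR");
console.log(budget.checkErrorBudget("agent-001")[0]); // false
```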

---

## Testing Patterns

### Unit Test with Mocks

```typescript
import { describe, it, expect } from "bun:test";
import { MockDragonfly, MockLLM, createTestContext } from "./framework";

describe("MyAgent", () => {
  it("should handle proposals correctly", async () => {
    const ctx = createTestContext();

    // Set up mock LLM response
    ctx.mockLLM.setResponse("evaluate", JSON.stringify({
      score: 0.85,
      accepted: true,
    }));

    // Set up state in mock DragonflyDB
    await ctx.mockDragonfly.set(
      `agent:${ctx.agentId}:packet`,
      JSON.stringify(/* packet */)
    );

    // Run test
    // ... agent execution ...

    // Assert
    const state = await ctx.mockDragonfly.get(`agent:${ctx.agentId}:state`);
    expect(JSON.parse(state).status).toBe("COMPLETED");
  });
});
```

### Integration Test

```typescript
describe("Agent + Vault Integration", () => {
  it("should authenticate with Vault", async () => {
    // Use real Vault but test environment
    const agent = new GovernedAgent("test-agent");

    const [ok, msg] = await agent.bootstrap();
    expect(ok).toBe(true);

    await agent.cleanup();
  });
});
```

### Scenario Test

```typescript
const scenario: TestScenario = {
  name: "Multi-Agent Consensus",
  description: "ALPHA and BETA reach consensus",
  setup: async () => {
    // Pre-populate blackboard
  },
  execute: async (ctx) => {
    const orchestrator = new MultiAgentOrchestrator();
    await orchestrator.initialize();
    await orchestrator.runTask(task);
  },
  assertions: async (ctx) => {
    const metrics = await getMetrics(ctx.taskId);
    expect(metrics.final_consensus).toBe(true);
  },
  cleanup: async () => {
    // Clean up DragonflyDB keys
  },
};
```

---

## Key Files Reference

### Governance Manager (TypeScript)

```
File: agents/llm-planner-ts/governed-agent.ts
Class: GovernanceManager

Key Methods:
- connect() / disconnect()
- createPacket(packet)
- getPacket(agentId)
- setState(state) / getState(agentId)
- acquireLock(agentId) / releaseLock(agentId)
- heartbeat(agentId)
- recordError(agentId, type, message)
- checkErrorBudget(agentId)
- revokeAgent(agentId, reason, details)
- registerArtifact(taskId, type, reference)
```

### Multi-Agent Coordination

```
File: agents/multi-agent/coordination.ts

Classes:
- Blackboard         # Shared memory
- MessageBus         # Direct messaging
- AgentStateManager  # State tracking
- SpawnController    # Conditional spawning
- MetricsCollector   # Performance metrics
```

### Agent Implementations

```
File: agents/multi-agent/agents.ts

Classes:
- BaseAgent     # Abstract base
- AgentAlpha    # Research/Analysis
- AgentBeta     # Implementation/Synthesis
- AgentGamma    # Mediator (conditionally spawned)
```

---

## Debugging Tips

### Enable Verbose Logging

```typescript
// Add to agent constructor
this.log = (msg: string) => {
  const elapsed = ((Date.now() - this.startTime) / 1000).toFixed(3);
  console.log(`[${elapsed}s] [${this.role}] [DEBUG] ${msg}`);
  console.log(`  State: ${JSON.stringify(this.state)}`);
};
```

### Inspect DragonflyDB State

```bash
# Stream every command the server processes, in real time
redis-cli -p 6379 MONITOR

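# List agent, blackboard, and message keys
# (--scan iterates incrementally; KEYS blocks the server while it walks the keyspace)
redis-cli -p 6379 --scan --pattern "agent:*"
redis-cli -p 6379 --scan --pattern "blackboard:*"
redis-cli -p 6379 --scan --pattern "msg:*"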
```

### Check Ledger

```bash
sqlite3 /opt/agent-governance/ledger/governance.db \
  "SELECT * FROM agent_actions ORDER BY timestamp DESC LIMIT 10"
```

### Common Issues

| Issue | Cause | Fix |
|-------|-------|-----|
| `CANNOT_ACQUIRE_LOCK` | Previous agent didn't release | `DEL agent:<id>:lock` |
| `NO_INSTRUCTION_PACKET` | Packet not created | Create packet before running |
| `AGENT_PREVIOUSLY_REVOKED` | Agent in revocation ledger | Use new agent ID |
| `LLM_ERROR` | OpenRouter API issue | Check API key in Vault |
| JSON parse error | LLM returned markdown | Improve JSON extraction |
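
For the last row, "improve JSON extraction" usually means tolerating a Markdown code fence or surrounding prose in the model's reply. The following is a minimal sketch of one way to do that; `extractJson` is a hypothetical helper, not a function from this codebase, and it only handles replies whose payload is a JSON object.

```typescript
// Hypothetical helper: extract a JSON value from an LLM reply that may be
// wrapped in a Markdown code fence or surrounded by prose.
const FENCE = "`".repeat(3); // three backticks, built indirectly to keep this block fence-safe

function extractJson(raw: string): unknown {
  // 1. Prefer the contents of a fenced block (with or without a "json" tag).
  const fencePattern = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "i");
  const fenced = raw.match(fencePattern);
  const candidate = (fenced ? fenced[1] : raw).trim();

  // 2. Try the candidate as-is.
  try {
    return JSON.parse(candidate);
  } catch {
    // Not pure JSON; fall through to the brace scan below.
  }

  // 3. Fall back to the span between the first "{" and the last "}".
  const start = candidate.indexOf("{");
  const end = candidate.lastIndexOf("}");
  if (start !== -1 && end > start) {
    return JSON.parse(candidate.slice(start, end + 1));
  }
  throw new SyntaxError("No JSON object found in LLM response");
}

// Usage: a reply with leading prose around the object.
const parsed = extractJson('Sure! {"score": 0.85, "accepted": true}') as {
  score: number;
  accepted: boolean;
};
console.log(parsed.accepted); // true
```

The helper still throws when nothing parseable is found, so callers can record the failure (for example as an `LLM_ERROR`) rather than continue with partial data.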

---

## Performance Tuning

### LLM Call Optimization

```typescript
// Use lower max_tokens for simple tasks
const response = await this.llm.chat.completions.create({
  model: this.model,
  messages: [/* ... */],
  max_tokens: task.complexity === "low" ? 1000 : 4000,
  temperature: 0.3,  // Lower = more deterministic
});
```

### Reduce Blackboard Overhead

```typescript
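// Batch writes: issue them concurrently instead of awaiting each one in turn
// (the { key, value } entry shape is assumed for illustration)
const entries: Array<{ key: string; value: unknown }> = [/* ... */];
await Promise.all(
  entries.map(({ key, value }) =>
    this.blackboard.write(section, key, value, this.role)
  )
);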

// Read section once instead of multiple reads
const solutions = await this.blackboard.readSection("solutions");
// Instead of multiple readFromBlackboard calls
```

### Monitor Interval Tuning

```typescript
// Adjust based on task duration
const monitorIntervalMs = task.timeout_seconds > 60 ? 5000 : 2000;
this.monitorInterval = setInterval(() => {
  this.monitorConditions();
}, monitorIntervalMs);
```

---

## Useful Links

- [Bun Documentation](https://bun.sh/docs)
- [Redis Commands](https://redis.io/commands)
- [Vault API](https://developer.hashicorp.com/vault/api-docs)
- [OpenRouter API](https://openrouter.ai/docs)