
# Testing

> Test utilities, mocks, and architectural oversight for the Agent Governance System

## Overview

This directory provides two complementary testing systems:

1. **framework.ts** - TypeScript testing framework with mocks for agent development
2. **oversight/** - Python architectural test pipeline for continuous system validation

## Quick Start

### TypeScript Framework

```bash
cd /opt/agent-governance/testing

# Default: REAL mode - requires Vault and DragonflyDB
bun run framework.ts

# Explicit mock mode - uses mocks with clear warnings
bun run framework.ts --use-mocks

# Validate services without running tests
bun run framework.ts --validate-only

# Hybrid mode - real where available, mocks otherwise
bun run framework.ts --hybrid
```

**Important:** In the default REAL mode, tests fail if the required real services (Vault and DragonflyDB) are unavailable. Pass `--use-mocks` to opt in to mock mode explicitly.

### Python Oversight Pipeline

```bash
# Run full validation
cd /opt/agent-governance
python3 -c "from testing.oversight import ArchitecturalTestPipeline; print(ArchitecturalTestPipeline().run())"

# Quick validation
python3 -c "from testing.oversight import ArchitecturalTestPipeline; print(ArchitecturalTestPipeline().run_quick_validation())"
```

## Components

### framework.ts (1158 lines)

TypeScript testing framework with Bun-native test support and explicit mock control.

| Class | Description |
|-------|-------------|
| `MockVault` | Simulates HashiCorp Vault (secrets, tokens, policies) |
| `MockDragonfly` | Simulates DragonflyDB (strings, hashes, lists, pub/sub) |
| `RealDragonfly` | Real DragonflyDB client for integration tests |
| `MockLLM` | Simulates LLM responses with latency/failure injection |
| `TestHarness` | Runs test scenarios with mode awareness |
| `TestContext` | Shared context tracking mock usage |

| Function | Description |
|----------|-------------|
| `validateServices()` | Checks Vault, DragonflyDB, and required files |
| `createTestContext()` | Creates the test context; fails in REAL mode if services are unavailable |

| Mode | Behavior |
|------|----------|
| `REAL` (default) | Fails if services unavailable |
| `MOCK` (`--use-mocks`) | Uses mocks with clear warnings |
| `HYBRID` (`--hybrid`) | Real where available, mocks otherwise |
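
The mode rules in the table can be sketched as two small functions. This is a minimal illustration, not the code in `framework.ts`: the names `TestMode`, `resolveMode`, and `pickBackend` are hypothetical, and giving `--use-mocks` precedence when both flags are passed is an assumption.

```typescript
// Hypothetical names for illustration; not exported by framework.ts.
type TestMode = "REAL" | "MOCK" | "HYBRID";

// Map CLI flags to a mode.
// Assumption: --use-mocks takes precedence if both flags are present.
function resolveMode(argv: string[]): TestMode {
  if (argv.includes("--use-mocks")) return "MOCK";
  if (argv.includes("--hybrid")) return "HYBRID";
  return "REAL"; // default: real services are required
}

// Decide which backend a single service gets under the given mode.
function pickBackend(mode: TestMode, serviceAvailable: boolean): "real" | "mock" {
  if (mode === "MOCK") return "mock"; // always mocked, even if the service is up
  if (serviceAvailable) return "real"; // REAL and HYBRID both prefer the real service
  if (mode === "HYBRID") return "mock"; // HYBRID falls back to a mock
  // REAL mode with a missing service is a hard failure.
  throw new Error("REAL mode requires the service; pass --use-mocks to opt in to mocks");
}
```

Under these assumptions, `pickBackend(resolveMode(["--hybrid"]), false)` falls back to `"mock"`, while the same call with no flags throws.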

**Pre-built Scenarios:**
- `happyPath` - Agent completes its task successfully
- `errorBudgetExceeded` - Agent is revoked after exceeding its error budget
- `stuckDetection` - GAMMA is spawned when an agent is detected as stuck
- `conflictResolution` - Conflict between multiple proposals

### oversight/ (~4000 lines)

Python architectural test pipeline for multi-layer oversight.

| Module | Lines | Description |
|--------|-------|-------------|
| `pipeline.py` | 476 | Main orchestrator |
| `bug_watcher.py` | 713 | Real-time anomaly detection |
| `suggestion_engine.py` | 656 | AI-driven fix recommendations |
| `council.py` | 648 | Multi-agent decision making |
| `phase_validator.py` | 640 | Phase coverage validation |
| `error_injector.py` | 576 | Controlled fault injection |
| `reporter.py` | 455 | Comprehensive reporting |

See [oversight/README.md](./oversight/README.md) for detailed documentation.

## Usage Examples

### Creating Test Context

```typescript
import { createTestContext, generateInstructionPacket } from './framework';

const ctx = createTestContext();
const packet = generateInstructionPacket('task-1', 'agent-1', 'Test objective');

// Set up mock responses
ctx.mockLLM.setResponse('plan', '{"confidence": 0.9}');
ctx.mockVault.setSecret('test/key', { value: 'secret' });
await ctx.mockDragonfly.set('key', 'value');
```
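
To illustrate what latency and failure injection in a mock can look like, here is a standalone sketch. It is not the framework's `MockLLM`: the class `SketchMockLLM`, its `complete` method, and the `InjectionOptions` fields are all hypothetical. Only `setResponse` mirrors a call shown in the example above.

```typescript
// Standalone sketch; SketchMockLLM and InjectionOptions are hypothetical names.
interface InjectionOptions {
  latencyMs: number; // artificial delay before each response
  failureRate: number; // probability in [0, 1] that a call rejects
  random?: () => number; // injectable RNG so tests stay deterministic
}

class SketchMockLLM {
  private responses = new Map<string, string>();
  private opts: InjectionOptions;

  constructor(opts: InjectionOptions) {
    this.opts = opts;
  }

  // Register a canned response for a prompt key.
  setResponse(key: string, response: string): void {
    this.responses.set(key, response);
  }

  // Return the canned response after the configured delay,
  // or reject when the random roll falls below failureRate.
  async complete(key: string): Promise<string> {
    await new Promise<void>((resolve) => setTimeout(resolve, this.opts.latencyMs));
    const roll = (this.opts.random ?? Math.random)();
    if (roll < this.opts.failureRate) {
      throw new Error(`injected failure for "${key}"`);
    }
    const response = this.responses.get(key);
    if (response === undefined) {
      throw new Error(`no canned response for "${key}"`);
    }
    return response;
  }
}
```

Passing a fixed `random` function (for example `() => 0` to force a failure) keeps a test deterministic. Leaving it unset falls back to `Math.random`.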

### Running Oversight Pipeline

```python
from testing.oversight import ArchitecturalTestPipeline

pipeline = ArchitecturalTestPipeline()

# Full validation
report = pipeline.run()

# Validate specific phase
result = pipeline.validate_phase(5)  # Phase 5: Agent Bootstrapping

# Quick status check
status = pipeline.get_status()
```

### Error Injection Testing

```python
from testing.oversight import ErrorInjector

injector = ErrorInjector(safe_mode=True)  # safe_mode: won't modify files
try:
    injector.inject('missing_config')
    # ... run tests ...
finally:
    # Always clean up the injected fault, even if a test raises
    injector.cleanup()
```

## Test Results

| Suite | Passed | Failed | Coverage |
|-------|--------|--------|----------|
| framework.ts | 4 | 0 | 100% |
| oversight imports | 7 | 0 | 100% |

## Status

**COMPLETE**

See [STATUS.md](./STATUS.md) for detailed progress tracking.

## Architecture Reference

Part of the [Agent Governance System](/opt/agent-governance/docs/ARCHITECTURE.md).

Parent: [Project Root](/opt/agent-governance)

---
*Last updated: 2026-01-24*