Testing
Test utilities, mocks, and architectural oversight for the Agent Governance System
Overview
This directory provides two complementary testing systems:
- framework.ts - TypeScript testing framework with mocks for agent development
- oversight/ - Python architectural test pipeline for continuous system validation
Quick Start
TypeScript Framework
cd /opt/agent-governance/testing
# Default: REAL mode - requires Vault and DragonflyDB
bun run framework.ts
# Explicit mock mode - uses mocks with clear warnings
bun run framework.ts --use-mocks
# Validate services without running tests
bun run framework.ts --validate-only
# Hybrid mode - real where available, mocks otherwise
bun run framework.ts --hybrid
Important: Tests fail by default if real services are unavailable. Use --use-mocks to explicitly enable mock mode.
Python Oversight Pipeline
# Run full validation
cd /opt/agent-governance
python3 -c "from testing.oversight import ArchitecturalTestPipeline; print(ArchitecturalTestPipeline().run())"
# Quick validation
python3 -c "from testing.oversight import ArchitecturalTestPipeline; print(ArchitecturalTestPipeline().run_quick_validation())"
Components
framework.ts (1158 lines)
TypeScript testing framework with Bun-native test support and explicit mock control.
| Class | Description |
|---|---|
| MockVault | Simulates HashiCorp Vault (secrets, tokens, policies) |
| MockDragonfly | Simulates DragonflyDB (strings, hashes, lists, pub/sub) |
| RealDragonfly | Real DragonflyDB client for integration tests |
| MockLLM | Simulates LLM responses with latency/failure injection |
| TestHarness | Runs test scenarios with mode awareness |
| TestContext | Shared context tracking mock usage |
| Function | Description |
|---|---|
| validateServices() | Check Vault, DragonflyDB, required files |
| createTestContext() | Create context, fails if REAL mode + services unavailable |
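As a rough illustration of the kind of check validateServices() describes, the Python sketch below probes TCP reachability and file presence. It is not the framework.ts implementation. The hosts and ports are assumptions (the upstream default ports for Vault and for DragonflyDB's Redis-compatible listener), and `is_reachable`, `check_services`, and `missing_files` are hypothetical helper names.

```python
import os
import socket

# Assumed endpoints: upstream default ports, not values read from this
# repository's configuration.
SERVICES = {
    "vault": ("127.0.0.1", 8200),
    "dragonfly": ("127.0.0.1", 6379),
}


def is_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


def check_services(services=SERVICES) -> dict:
    """Map each service name to whether its endpoint accepted a connection."""
    return {name: is_reachable(host, port) for name, (host, port) in services.items()}


def missing_files(paths) -> list:
    """Return the subset of paths that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]
```

A successful TCP connection only shows that something is listening. It does not show that Vault is unsealed or that a token is valid, so a real check would likely go further.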
| Mode | Behavior |
|---|---|
| REAL (default) | Fails if services unavailable |
| MOCK (--use-mocks) | Uses mocks with clear warnings |
| HYBRID (--hybrid) | Real where available, mocks otherwise |
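The three modes can be read as a small decision function over service availability. The Python sketch below is only an illustration of that rule, not code from framework.ts. `Mode`, `ServicesUnavailable`, and `resolve_backends` are hypothetical names.

```python
from enum import Enum


class Mode(Enum):
    REAL = "real"
    MOCK = "mock"
    HYBRID = "hybrid"


class ServicesUnavailable(RuntimeError):
    """Raised when REAL mode is requested but a service is down."""


def resolve_backends(mode: Mode, available: dict) -> dict:
    """Decide, per service, whether to use the 'real' or 'mock' backend.

    `available` maps a service name to True if the real service is up.
    """
    if mode is Mode.MOCK:
        # Explicit mock mode: never touch real services.
        return {name: "mock" for name in available}
    if mode is Mode.HYBRID:
        # Use the real service where it is up, fall back to a mock otherwise.
        return {name: "real" if ok else "mock" for name, ok in available.items()}
    # REAL mode: any unavailable service is a hard failure.
    missing = sorted(name for name, ok in available.items() if not ok)
    if missing:
        raise ServicesUnavailable("REAL mode requires: " + ", ".join(missing))
    return {name: "real" for name in available}
```

For example, `resolve_backends(Mode.HYBRID, {"vault": True, "dragonfly": False})` returns `{"vault": "real", "dragonfly": "mock"}`, while the same input in REAL mode raises `ServicesUnavailable`.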
Pre-built Scenarios:
- happyPath - Agent completes successfully
- errorBudgetExceeded - Agent revoked on errors
- stuckDetection - GAMMA spawn when stuck
- conflictResolution - Multi-proposal conflict
oversight/ (~4000 lines)
Python architectural test pipeline for multi-layer oversight.
| Module | Lines | Description |
|---|---|---|
| pipeline.py | 476 | Main orchestrator |
| bug_watcher.py | 713 | Real-time anomaly detection |
| suggestion_engine.py | 656 | AI-driven fix recommendations |
| council.py | 648 | Multi-agent decision making |
| phase_validator.py | 640 | Phase coverage validation |
| error_injector.py | 576 | Controlled fault injection |
| reporter.py | 455 | Comprehensive reporting |
See oversight/README.md for detailed documentation.
Usage Examples
Creating Test Context
import { createTestContext, generateInstructionPacket } from './framework';
const ctx = createTestContext();
const packet = generateInstructionPacket('task-1', 'agent-1', 'Test objective');
// Set up mock responses
ctx.mockLLM.setResponse('plan', '{"confidence": 0.9}');
ctx.mockVault.setSecret('test/key', { value: 'secret' });
await ctx.mockDragonfly.set('key', 'value');
Running Oversight Pipeline
from testing.oversight import ArchitecturalTestPipeline
pipeline = ArchitecturalTestPipeline()
# Full validation
report = pipeline.run()
# Validate specific phase
result = pipeline.validate_phase(5) # Phase 5: Agent Bootstrapping
# Quick status check
status = pipeline.get_status()
Error Injection Testing
from testing.oversight import ErrorInjector
injector = ErrorInjector(safe_mode=True) # Won't modify files
injector.inject('missing_config')
# ... run tests ...
injector.cleanup()
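If the tests between inject() and cleanup() raise, the cleanup call in the example above would be skipped. One possible way to guard against that is a context manager, sketched below. It assumes only the inject() and cleanup() methods shown above. `injected_fault` is a hypothetical helper, and `FakeInjector` is a stand-in that records calls so the sketch runs without the oversight package.

```python
from contextlib import contextmanager


@contextmanager
def injected_fault(injector, fault_name):
    """Inject fault_name; once injected, always run cleanup(), even on error."""
    injector.inject(fault_name)
    try:
        yield injector
    finally:
        injector.cleanup()


class FakeInjector:
    """Hypothetical stand-in for ErrorInjector that only records calls."""

    def __init__(self):
        self.calls = []

    def inject(self, name):
        self.calls.append(("inject", name))

    def cleanup(self):
        self.calls.append(("cleanup",))


fake = FakeInjector()
try:
    with injected_fault(fake, "missing_config"):
        raise AssertionError("simulated test failure")
except AssertionError:
    pass

print(fake.calls)  # [('inject', 'missing_config'), ('cleanup',)]
```

With the real class, the same pattern would presumably read `with injected_fault(ErrorInjector(safe_mode=True), 'missing_config'):`, with the tests in the body.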
Test Results
| Suite | Passed | Failed | Coverage |
|---|---|---|---|
| framework.ts | 4 | 0 | 100% |
| oversight imports | 7 | 0 | 100% |
Status
COMPLETE
See STATUS.md for detailed progress tracking.
Architecture Reference
Part of the Agent Governance System.
Parent: Project Root
Last updated: 2026-01-24