
Testing

Test utilities, mocks, and architectural oversight for the Agent Governance System

Overview

This directory provides two complementary testing systems:

  1. framework.ts - TypeScript testing framework with mocks for agent development
  2. oversight/ - Python architectural test pipeline for continuous system validation

Quick Start

TypeScript Framework

cd /opt/agent-governance/testing

# Default: REAL mode - requires Vault and DragonflyDB
bun run framework.ts

# Explicit mock mode - uses mocks with clear warnings
bun run framework.ts --use-mocks

# Validate services without running tests
bun run framework.ts --validate-only

# Hybrid mode - real where available, mocks otherwise
bun run framework.ts --hybrid

Important: Tests fail by default if real services are unavailable. Use --use-mocks to explicitly enable mock mode.

Python Oversight Pipeline

# Run full validation
cd /opt/agent-governance
python3 -c "from testing.oversight import ArchitecturalTestPipeline; print(ArchitecturalTestPipeline().run())"

# Quick validation
python3 -c "from testing.oversight import ArchitecturalTestPipeline; print(ArchitecturalTestPipeline().run_quick_validation())"

Components

framework.ts (1158 lines)

TypeScript testing framework with Bun-native test support and explicit mock control.

| Class | Description |
|-------|-------------|
| MockVault | Simulates HashiCorp Vault (secrets, tokens, policies) |
| MockDragonfly | Simulates DragonflyDB (strings, hashes, lists, pub/sub) |
| RealDragonfly | Real DragonflyDB client for integration tests |
| MockLLM | Simulates LLM responses with latency/failure injection |
| TestHarness | Runs test scenarios with mode awareness |
| TestContext | Shared context tracking mock usage |

| Function | Description |
|----------|-------------|
| validateServices() | Check Vault, DragonflyDB, required files |
| createTestContext() | Create context, fails if REAL mode + services unavailable |

| Mode | Behavior |
|------|----------|
| REAL (default) | Fails if services unavailable |
| MOCK (--use-mocks) | Uses mocks with clear warnings |
| HYBRID (--hybrid) | Real where available, mocks otherwise |
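
The three modes amount to a small per-service decision rule. The sketch below restates that rule in Python for illustration only; it is not the code in framework.ts, and the names `Mode` and `resolve_backend` are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    REAL = "real"      # default
    MOCK = "mock"      # --use-mocks
    HYBRID = "hybrid"  # --hybrid


def resolve_backend(mode: Mode, service_available: bool) -> str:
    """Return "real" or "mock" for one service; raise in REAL mode if it is down."""
    if mode is Mode.MOCK:
        # Mock mode never touches the real service, even if it is reachable.
        return "mock"
    if service_available:
        return "real"
    if mode is Mode.HYBRID:
        # Hybrid mode falls back to a mock only for the missing service.
        return "mock"
    raise RuntimeError("REAL mode: service unavailable (use --use-mocks to opt in to mocks)")


if __name__ == "__main__":
    # Print the outcome for every mode/availability combination.
    for mode in Mode:
        for available in (True, False):
            try:
                outcome = resolve_backend(mode, available)
            except RuntimeError as exc:
                outcome = f"fails ({exc})"
            print(f"{mode.name:6} available={available}: {outcome}")
```

REAL mode unavailability is modeled as an exception rather than a return value so that a missing service cannot be mistaken for a silent downgrade to mocks, which matches the "tests fail by default" behavior described in Quick Start.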

Pre-built Scenarios:

  • happyPath - Agent completes successfully
  • errorBudgetExceeded - Agent is revoked when its error budget is exceeded
  • stuckDetection - GAMMA spawn when stuck
  • conflictResolution - Multi-proposal conflict

oversight/ (~4000 lines)

Python architectural test pipeline for multi-layer oversight.

| Module | Lines | Description |
|--------|-------|-------------|
| pipeline.py | 476 | Main orchestrator |
| bug_watcher.py | 713 | Real-time anomaly detection |
| suggestion_engine.py | 656 | AI-driven fix recommendations |
| council.py | 648 | Multi-agent decision making |
| phase_validator.py | 640 | Phase coverage validation |
| error_injector.py | 576 | Controlled fault injection |
| reporter.py | 455 | Comprehensive reporting |

See oversight/README.md for detailed documentation.

Usage Examples

Creating Test Context

import { createTestContext, generateInstructionPacket } from './framework';

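// In REAL mode (the default) createTestContext() fails if services are unavailable;
// run with --use-mocks to exercise the mocks configured below.
const ctx = createTestContext();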
const packet = generateInstructionPacket('task-1', 'agent-1', 'Test objective');

// Set up mock responses
ctx.mockLLM.setResponse('plan', '{"confidence": 0.9}');
ctx.mockVault.setSecret('test/key', { value: 'secret' });
await ctx.mockDragonfly.set('key', 'value');

Running Oversight Pipeline

from testing.oversight import ArchitecturalTestPipeline

pipeline = ArchitecturalTestPipeline()

# Full validation
report = pipeline.run()

# Validate specific phase
result = pipeline.validate_phase(5)  # Phase 5: Agent Bootstrapping

# Quick status check
status = pipeline.get_status()

Error Injection Testing

from testing.oversight import ErrorInjector

injector = ErrorInjector(safe_mode=True)  # Won't modify files
injector.inject('missing_config')
# ... run tests ...
injector.cleanup()
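
If a test raises between inject() and cleanup(), the injected fault is left in place. One way to guard against that is a small context manager, sketched below under the assumption that the injector exposes only the inject() and cleanup() methods used in the example. The `injected_fault` helper and the `StubInjector` stand-in are hypothetical and are not part of testing.oversight; the stub exists only so the sketch runs on its own.

```python
from contextlib import contextmanager


@contextmanager
def injected_fault(injector, fault_name):
    """Inject a fault, then always call cleanup(), even if the body raises."""
    injector.inject(fault_name)
    try:
        yield injector
    finally:
        injector.cleanup()


class StubInjector:
    """Hypothetical stand-in with the same inject()/cleanup() surface."""

    def __init__(self):
        self.active = []

    def inject(self, name):
        self.active.append(name)

    def cleanup(self):
        self.active.clear()


stub = StubInjector()
try:
    with injected_fault(stub, "missing_config"):
        # Simulate a test that fails while the fault is active.
        raise RuntimeError("simulated test failure")
except RuntimeError:
    pass

# cleanup() has run despite the failure, so no fault remains active.
print(stub.active)
```

With the real class, the same helper would be called as `with injected_fault(ErrorInjector(safe_mode=True), 'missing_config'):` around the test run.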

Test Results

| Suite | Passed | Failed | Coverage |
|-------|--------|--------|----------|
| framework.ts | 4 | 0 | 100% |
| oversight imports | 7 | 0 | 100% |

Status

COMPLETE

See STATUS.md for detailed progress tracking.

Architecture Reference

Part of the Agent Governance System.

Parent: Project Root


Last updated: 2026-01-24