# Testing
> Test utilities, mocks, and architectural oversight for the Agent Governance System
## Overview
This directory provides two complementary testing systems:
1. **framework.ts** - TypeScript testing framework with mocks for agent development
2. **oversight/** - Python architectural test pipeline for continuous system validation
## Quick Start
### TypeScript Framework
```bash
cd /opt/agent-governance/testing
# Default: REAL mode - requires Vault and DragonflyDB
bun run framework.ts
# Explicit mock mode - uses mocks with clear warnings
bun run framework.ts --use-mocks
# Validate services without running tests
bun run framework.ts --validate-only
# Hybrid mode - real where available, mocks otherwise
bun run framework.ts --hybrid
```
**Important:** Tests fail by default if real services are unavailable. Use `--use-mocks` to explicitly enable mock mode.
### Python Oversight Pipeline
```bash
# Run full validation
cd /opt/agent-governance
python3 -c "from testing.oversight import ArchitecturalTestPipeline; print(ArchitecturalTestPipeline().run())"
# Quick validation
python3 -c "from testing.oversight import ArchitecturalTestPipeline; print(ArchitecturalTestPipeline().run_quick_validation())"
```
## Components
### framework.ts (1158 lines)
TypeScript testing framework with Bun-native test support and explicit mock control.

| Class | Description |
|-------|-------------|
| `MockVault` | Simulates HashiCorp Vault (secrets, tokens, policies) |
| `MockDragonfly` | Simulates DragonflyDB (strings, hashes, lists, pub/sub) |
| `RealDragonfly` | Real DragonflyDB client for integration tests |
| `MockLLM` | Simulates LLM responses with latency/failure injection |
| `TestHarness` | Runs test scenarios with mode awareness |
| `TestContext` | Shared context tracking mock usage |

| Function | Description |
|----------|-------------|
| `validateServices()` | Check Vault, DragonflyDB, required files |
| `createTestContext()` | Create context, fails if REAL mode + services unavailable |

| Mode | Behavior |
|------|----------|
| `REAL` (default) | Fails if services unavailable |
| `MOCK` (`--use-mocks`) | Uses mocks with clear warnings |
| `HYBRID` (`--hybrid`) | Real where available, mocks otherwise |

**Pre-built Scenarios:**
- `happyPath` - Agent completes successfully
- `errorBudgetExceeded` - Agent is revoked after exceeding its error budget
- `stuckDetection` - GAMMA is spawned when the agent is stuck
- `conflictResolution` - Conflict between multiple proposals
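
The mode table above can be read as a small decision procedure. The sketch below restates it in Python purely as an illustration; it is not the code in `framework.ts`. The names `Mode`, `resolve_mode`, and `pick_backend` are hypothetical, and the precedence when both flags are passed is an assumption.

```python
from enum import Enum


class Mode(Enum):
    REAL = "real"
    MOCK = "mock"
    HYBRID = "hybrid"


def resolve_mode(argv: list[str]) -> Mode:
    """Map the documented CLI flags to a mode; REAL is the default."""
    # Assumption: --use-mocks wins if both flags are given.
    if "--use-mocks" in argv:
        return Mode.MOCK
    if "--hybrid" in argv:
        return Mode.HYBRID
    return Mode.REAL


def pick_backend(mode: Mode, service_available: bool) -> str:
    """Return "real" or "mock" for one service, or raise in REAL mode."""
    if mode is Mode.MOCK:
        return "mock"
    if service_available:
        return "real"
    if mode is Mode.HYBRID:
        return "mock"  # fall back only when the real service is unavailable
    raise RuntimeError("REAL mode requires the service; pass --use-mocks to use mocks")
```

Raising in REAL mode rather than silently substituting a mock mirrors the documented default: a run without real services should fail loudly unless mocks were requested explicitly.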
### oversight/ (~4000 lines)
Python architectural test pipeline for multi-layer oversight.

| Module | Lines | Description |
|--------|-------|-------------|
| `pipeline.py` | 476 | Main orchestrator |
| `bug_watcher.py` | 713 | Real-time anomaly detection |
| `suggestion_engine.py` | 656 | AI-driven fix recommendations |
| `council.py` | 648 | Multi-agent decision making |
| `phase_validator.py` | 640 | Phase coverage validation |
| `error_injector.py` | 576 | Controlled fault injection |
| `reporter.py` | 455 | Comprehensive reporting |

See [oversight/README.md](./oversight/README.md) for detailed documentation.
## Usage Examples
### Creating Test Context
```typescript
import { createTestContext, generateInstructionPacket } from './framework';

// In the default REAL mode this fails if required services are unavailable.
const ctx = createTestContext();
const packet = generateInstructionPacket('task-1', 'agent-1', 'Test objective');

// Set up mock responses
ctx.mockLLM.setResponse('plan', '{"confidence": 0.9}');
ctx.mockVault.setSecret('test/key', { value: 'secret' });
await ctx.mockDragonfly.set('key', 'value');
```
### Running Oversight Pipeline
```python
from testing.oversight import ArchitecturalTestPipeline
pipeline = ArchitecturalTestPipeline()
# Full validation
report = pipeline.run()
# Validate specific phase
result = pipeline.validate_phase(5) # Phase 5: Agent Bootstrapping
# Quick status check
status = pipeline.get_status()
```
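
When several phases need checking in one run, it can help to keep one failing phase from aborting the rest. The helper below is a minimal sketch of that idea. `validate_phases` is a hypothetical name, and the shape of the value returned by `validate_phase` is not specified here, so results are stored as-is.

```python
from typing import Any, Callable, Iterable


def validate_phases(validate: Callable[[int], Any], phases: Iterable[int]) -> dict[int, Any]:
    """Call `validate` for each phase, recording either its result or the exception."""
    results: dict[int, Any] = {}
    for phase in phases:
        try:
            results[phase] = validate(phase)
        except Exception as exc:  # keep going so later phases still run
            results[phase] = exc
    return results
```

A possible call, assuming phases are numbered from 1, would be `validate_phases(pipeline.validate_phase, range(1, 6))`.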
### Error Injection Testing
```python
from testing.oversight import ErrorInjector

injector = ErrorInjector(safe_mode=True)  # safe_mode: won't modify files
try:
    injector.inject('missing_config')
    # ... run tests ...
finally:
    injector.cleanup()  # runs even if a test raises
```
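
If the inject/cleanup pairing is repeated across many tests, it could be wrapped in a context manager. This is a sketch under stated assumptions: it relies only on the `inject` and `cleanup` methods shown above, `injected_fault` is a hypothetical helper name, and `RecordingInjector` is a stand-in so the example runs without the real `ErrorInjector`.

```python
from contextlib import contextmanager
from typing import Iterator, Protocol


class Injector(Protocol):
    def inject(self, fault: str) -> None: ...
    def cleanup(self) -> None: ...


@contextmanager
def injected_fault(injector: Injector, fault: str) -> Iterator[Injector]:
    """Inject one fault and run cleanup afterwards, even if the body raises."""
    injector.inject(fault)  # if this raises, nothing was injected, so no cleanup
    try:
        yield injector
    finally:
        injector.cleanup()


class RecordingInjector:
    """Hypothetical stand-in for ErrorInjector; it only records the calls made."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def inject(self, fault: str) -> None:
        self.calls.append(f"inject:{fault}")

    def cleanup(self) -> None:
        self.calls.append("cleanup")
```

With the real class, usage would look like `with injected_fault(ErrorInjector(safe_mode=True), 'missing_config'): ...`, with the tests placed inside the `with` body.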
## Test Results
| Suite | Passed | Failed | Coverage |
|-------|--------|--------|----------|
| framework.ts | 4 | 0 | 100% |
| oversight imports | 7 | 0 | 100% |
## Status
**COMPLETE**
See [STATUS.md](./STATUS.md) for detailed progress tracking.
## Architecture Reference
Part of the [Agent Governance System](/opt/agent-governance/docs/ARCHITECTURE.md).
Parent: [Project Root](/opt/agent-governance)
---
*Last updated: 2026-01-24*