History

profit 8c6e7831e9 Add Phase 10-12 implementation: multi-tenant, marketplace, observability

Major additions:
- marketplace/: Agent template registry with FTS5 search, ratings, versioning
- observability/: Prometheus metrics, distributed tracing, structured logging
- ledger/migrations/: Database migration scripts for multi-tenant support
- tests/governance/: 15 new test files for phases 6-12 (295 total tests)
- bin/validate-phases: Full 12-phase validation script

New features:
- Multi-tenant support with tenant isolation and quota enforcement
- Agent marketplace with semantic versioning and search
- Observability with metrics, tracing, and log correlation
- Tier-1 agent bootstrap scripts

Updated components:
- ledger/api.py: Extended API for tenants, marketplace, observability
- ledger/schema.sql: Added tenant, project, marketplace tables
- testing/framework.ts: Enhanced test framework
- checkpoint/checkpoint.py: Improved checkpoint management

Archived:
- External integrations (Slack/GitHub/PagerDuty) moved to .archive/
- Old checkpoint files cleaned up

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-24 18:39:47 -05:00

oversight

framework.ts

README.md

STATUS.md

README.md

Testing

Test utilities, mocks, and architectural oversight for the Agent Governance System.

Overview

This directory provides two complementary testing systems:

- framework.ts: TypeScript testing framework with mocks for agent development
- oversight/: Python architectural test pipeline for continuous system validation

Quick Start

TypeScript Framework

cd /opt/agent-governance/testing

# Default: REAL mode - requires Vault and DragonflyDB
bun run framework.ts

# Explicit mock mode - uses mocks with clear warnings
bun run framework.ts --use-mocks

# Validate services without running tests
bun run framework.ts --validate-only

# Hybrid mode - real where available, mocks otherwise
bun run framework.ts --hybrid

Important: Tests fail by default if real services are unavailable. Use --use-mocks to explicitly enable mock mode.

Python Oversight Pipeline

# Run full validation
cd /opt/agent-governance
python3 -c "from testing.oversight import ArchitecturalTestPipeline; print(ArchitecturalTestPipeline().run())"

# Quick validation
python3 -c "from testing.oversight import ArchitecturalTestPipeline; print(ArchitecturalTestPipeline().run_quick_validation())"

Components

framework.ts (1158 lines)

TypeScript testing framework with Bun-native test support and explicit mock control.

| Class | Description |
|-------|-------------|
| `MockVault` | Simulates HashiCorp Vault (secrets, tokens, policies) |
| `MockDragonfly` | Simulates DragonflyDB (strings, hashes, lists, pub/sub) |
| `RealDragonfly` | Real DragonflyDB client for integration tests |
| `MockLLM` | Simulates LLM responses with latency/failure injection |
| `TestHarness` | Runs test scenarios with mode awareness |
| `TestContext` | Shared context tracking mock usage |

| Function | Description |
|----------|-------------|
| `validateServices()` | Checks Vault, DragonflyDB, and required files |
| `createTestContext()` | Creates a test context; fails in REAL mode if services are unavailable |

| Mode | Behavior |
|------|----------|
| `REAL` (default) | Fails if services are unavailable |
| `MOCK` (`--use-mocks`) | Uses mocks with clear warnings |
| `HYBRID` (`--hybrid`) | Real where available, mocks otherwise |
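
The mode table can be read as a small decision rule applied per service. The following is a minimal Python sketch of that rule, not the actual framework.ts code: `Mode`, `parse_mode`, and `choose_backend` are hypothetical names, and giving `--use-mocks` precedence when both flags are passed is an assumption.

```python
from enum import Enum


class Mode(Enum):
    REAL = "real"
    MOCK = "mock"
    HYBRID = "hybrid"


def parse_mode(argv):
    """Map the documented CLI flags to a mode; REAL is the default."""
    if "--use-mocks" in argv:  # assumption: wins if both flags are given
        return Mode.MOCK
    if "--hybrid" in argv:
        return Mode.HYBRID
    return Mode.REAL


def choose_backend(mode, service_available):
    """Return 'real' or 'mock' for one service, or raise in REAL mode."""
    if mode is Mode.MOCK:
        return "mock"
    if service_available:
        return "real"
    if mode is Mode.HYBRID:
        return "mock"  # fall back only when the real service is down
    raise RuntimeError(
        "service unavailable in REAL mode; pass --use-mocks to opt in to mocks"
    )
```

Under this reading, REAL mode never falls back to a mock silently: the caller has to opt in with a flag.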

Pre-built Scenarios:

- happyPath: Agent completes its task successfully
- errorBudgetExceeded: Agent is revoked after exceeding its error budget
- stuckDetection: A GAMMA spawn is triggered when the agent is stuck
- conflictResolution: A conflict between multiple proposals is resolved

oversight/ (~4000 lines)

Python architectural test pipeline for multi-layer oversight.

| Module | Lines | Description |
|--------|-------|-------------|
| `pipeline.py` | 476 | Main orchestrator |
| `bug_watcher.py` | 713 | Real-time anomaly detection |
| `suggestion_engine.py` | 656 | AI-driven fix recommendations |
| `council.py` | 648 | Multi-agent decision making |
| `phase_validator.py` | 640 | Phase coverage validation |
| `error_injector.py` | 576 | Controlled fault injection |
| `reporter.py` | 455 | Comprehensive reporting |

See oversight/README.md for detailed documentation.

Usage Examples

Creating Test Context

import { createTestContext, generateInstructionPacket } from './framework';

const ctx = createTestContext();
const packet = generateInstructionPacket('task-1', 'agent-1', 'Test objective');

// Set up mock responses
ctx.mockLLM.setResponse('plan', '{"confidence": 0.9}');
ctx.mockVault.setSecret('test/key', { value: 'secret' });
await ctx.mockDragonfly.set('key', 'value');

Running Oversight Pipeline

from testing.oversight import ArchitecturalTestPipeline

pipeline = ArchitecturalTestPipeline()

# Full validation
report = pipeline.run()

# Validate specific phase
result = pipeline.validate_phase(5)  # Phase 5: Agent Bootstrapping

# Quick status check
status = pipeline.get_status()
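
To check every phase in one pass, `validate_phase` could be called in a loop. This is a sketch under assumptions: it assumes phases are numbered 1 through 12 and that `validate_phase` may raise on failure. `validate_all_phases` and `StubPipeline` are hypothetical; the stub exists only so the example runs without the real package.

```python
def validate_all_phases(pipeline, first=1, last=12):
    """Call validate_phase for each phase and collect results by phase number."""
    results = {}
    for phase in range(first, last + 1):
        try:
            results[phase] = pipeline.validate_phase(phase)
        except Exception as exc:
            # Keep going so one failing phase does not hide the others.
            results[phase] = exc
    return results


class StubPipeline:
    """Hypothetical stand-in for ArchitecturalTestPipeline."""

    def validate_phase(self, phase):
        if phase == 7:
            raise ValueError("simulated failure")
        return f"phase {phase} ok"


if __name__ == "__main__":
    for phase, outcome in validate_all_phases(StubPipeline()).items():
        print(phase, outcome)
```

With the real package, an `ArchitecturalTestPipeline()` instance would be passed in place of the stub; the shape of each result depends on what `validate_phase` actually returns.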

Error Injection Testing

from testing.oversight import ErrorInjector

injector = ErrorInjector(safe_mode=True)  # Won't modify files
injector.inject('missing_config')
# ... run tests ...
injector.cleanup()
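
In the example above, `cleanup()` is skipped if the tests raise an exception. One way to guard against that is a context manager that always cleans up. This is a minimal sketch that relies only on the `inject` and `cleanup` methods shown above; `injected_fault` and `StubInjector` are hypothetical names, and the stub stands in for `ErrorInjector` so the example runs on its own.

```python
from contextlib import contextmanager


@contextmanager
def injected_fault(injector, fault_name):
    """Inject a fault for the duration of a with-block, then always clean up."""
    injector.inject(fault_name)
    try:
        yield injector
    finally:
        injector.cleanup()  # runs even if the body raises


class StubInjector:
    """Hypothetical stand-in that records calls instead of injecting faults."""

    def __init__(self):
        self.calls = []

    def inject(self, name):
        self.calls.append(("inject", name))

    def cleanup(self):
        self.calls.append(("cleanup",))


if __name__ == "__main__":
    injector = StubInjector()
    with injected_fault(injector, "missing_config"):
        pass  # ... run tests ...
    print(injector.calls)
```

With the real class, `ErrorInjector(safe_mode=True)` would be passed as `injector`.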

Test Results

| Suite | Passed | Failed | Coverage |
|-------|--------|--------|----------|
| framework.ts | 4 | 0 | 100% |
| oversight imports | 7 | 0 | 100% |

Status

COMPLETE

See STATUS.md for detailed progress tracking.

Architecture Reference

Part of the Agent Governance System.

Parent: Project Root

Last updated: 2026-01-24