# Architectural Test Pipeline

> Multi-layer oversight system ensuring no single hidden bug can compromise the Agent Governance System.

## Overview

The Architectural Test Pipeline provides continuous validation across all 12 phases through multiple oversight layers that monitor, analyze, review, and report on system health.

## Architecture

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                         ARCHITECTURAL TEST PIPELINE                         │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐          │
│  │ Bug Window      │───▶│ Suggestion      │───▶│ Council         │          │
│  │ Watcher         │    │ Engine          │    │ Review          │          │
│  │                 │    │                 │    │                 │          │
│  │ • Real-time     │    │ • Context-aware │    │ • Safety        │          │
│  │ • All phases    │    │ • Risk-ranked   │    │ • Performance   │          │
│  │ • Anomalies     │    │ • Auto-fixable  │    │ • Architecture  │          │
│  └────────┬────────┘    └────────┬────────┘    │ • Compliance    │          │
│           │                      │             │ • Quality       │          │
│           │                      │             └────────┬────────┘          │
│           │                      │                      │                   │
│           ▼                      ▼                      ▼                   │
│  ┌───────────────────────────────────────────────────────────────┐          │
│  │                        Phase Validator                        │          │
│  │ Phase 1 ✅ │ Phase 2 ✅ │ Phase 3 ✅ │ Phase 4 ✅ │ ...       │          │
│  │ Phase 5 ⭐ │ Phase 6 ✅ │ Phase 7 ✅ │ Phase 8 🚧 │ ...       │          │
│  └───────────────────────────────────────────────────────────────┘          │
│                                  │                                          │
│                                  ▼                                          │
│  ┌─────────────────┐    ┌─────────────────┐                                 │
│  │ Error Injector  │    │ Reporter        │                                 │
│  │                 │    │                 │                                 │
│  │ • Safe mode     │    │ • Markdown      │                                 │
│  │ • Scenarios     │    │ • Per-phase     │                                 │
│  │ • Validation    │    │ • Actions       │                                 │
│  └─────────────────┘    └─────────────────┘                                 │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

## Oversight Layers

### 1. Bug Window Watcher (`bug_watcher.py`)

Real-time monitoring of every pipeline stage.

**Features:**

- Monitors all 12 phases continuously
- Detects anomalies: errors, regressions, missing artifacts, state inconsistencies
- Links findings to phase, directory, STATUS.md, and checkpoint entries
- Persists to DragonflyDB for cross-session tracking

**Anomaly Types:**

| Type | Description | Severity Range |
|------|-------------|----------------|
| UNHANDLED_ERROR | Uncaught exceptions | Medium-Critical |
| REGRESSION | Behavior change from baseline | High |
| MISSING_ARTIFACT | Required file/config missing | Low-High |
| STATE_INCONSISTENCY | Status mismatch | Medium |
| DEPENDENCY_UNAVAILABLE | Vault/Dragonfly/Ledger down | Critical |
| SECURITY_VIOLATION | Unacknowledged violation | Critical |

### 2. Suggestion Engine (`suggestion_engine.py`)

AI-driven analysis using historical context.

**Features:**

- Gathers context from checkpoints, memory, STATUS files
- Pattern-based suggestions from known fixes
- Context-aware suggestions from historical outcomes
- Risk/impact ranking for prioritization

**Suggestion Ranking:**

```
Priority Score = Impact × (1 - Risk)

Impact Levels: transformative (1.0) > high (0.8) > medium (0.6) > low (0.4)
Risk Levels:   critical (0.2) < high (0.4) < medium (0.6) < low (0.8)
```

### 3. Council Review (`council.py`)

Multi-perspective review with 5 specialized reviewers.

**Reviewers:**

| Role | Focus | Risk Tolerance |
|------|-------|----------------|
| Safety | Security, access control | Very Low (0.2) |
| Performance | Latency, throughput | Medium (0.6) |
| Architecture | Design, maintainability | Medium (0.5) |
| Compliance | Governance, policies | Low (0.3) |
| Quality | Testing, documentation | Low (0.4) |

**Decision Types:**

- `AUTO_APPROVE` - Safe to auto-implement
- `HUMAN_APPROVE` - Approved, needs human
- `DEFER` - Needs more discussion
- `REJECT` - Do not implement
- `ESCALATE` - Needs higher authority

### 4. Phase Validator (`phase_validator.py`)

Ensures all phases have required components.

**Validation Levels:**

| Level | Description |
|-------|-------------|
| BASIC | Existence checks only |
| STANDARD | + Functionality tests |
| THOROUGH | + Integration tests |
| COMPREHENSIVE | + Chaos/edge cases |

**Special Attention: Phase 5**

Phase 5 (Agent Bootstrapping) receives extra validation as the current focus.

### 5. Error Injector (`error_injector.py`)

Controlled fault injection for testing oversight.

**Predefined Scenarios:**

| Scenario | Type | Target |
|----------|------|--------|
| missing_config | MISSING_FILE | agents/tier0-agent/config/agent.json |
| corrupted_status | INVALID_STATUS | checkpoint/STATUS.md |
| stale_checkpoint | STATE_INCONSISTENCY | checkpoint/storage/ |
| redis_key_missing | DEPENDENCY_FAILURE | oversight:watcher |
| violation_unacked | SECURITY_VIOLATION | violations table |
| blocked_directory | STATE_INCONSISTENCY | preflight/ |

**Safe Mode:** By default, injections are simulated without modifying files.

### 6. Reporter (`reporter.py`)

Comprehensive reporting with actionable insights.

**Report Sections:**

- Executive Summary
- Phase Status Matrix
- Bug Watcher Summary
- Suggestion Engine Summary
- Council Decisions
- Injection Test Results
- Pending Actions
- Critical Issues

## CLI Usage

```bash
# Full pipeline execution
oversight run

# With injection tests
oversight run --inject

# Focus on specific phase
oversight run --phase 5

# Quick validation
oversight quick

# Validate specific phase in detail
oversight validate --phase 5

# Generate report only
oversight report

# Show phase matrix
oversight matrix

# JSON output
oversight run --json
```

## Integration with Checkpoints/Status/Memory

### Checkpoints

- Watcher checks checkpoint consistency and staleness
- Suggestions reference recent checkpoint context
- Reports include checkpoint link

### STATUS.md Files

- Watcher scans STATUS.md for BLOCKED states
- Phase validator checks STATUS.md existence
- Reports track per-directory status

### Memory Layer

- Suggestion engine queries memory for related entries
- Context gathered from summaries directory
- Report counts available memory entries

## Running Tests

### Injection Test Suite

```bash
# Run all injection scenarios
oversight run --inject

# Or use injector directly
cd /opt/agent-governance/testing/oversight
python -m testing.oversight.error_injector test-all
```

### Expected Results

A healthy system should:

1. Detect all injected errors (100% detection rate)
2. Generate relevant suggestions (accurate quality)
3. Produce council decisions for each suggestion
4. Pass all injection tests

## Extending the Pipeline

### Adding a New Anomaly Type

1. Add to `AnomalyType` enum in `bug_watcher.py`
2. Add detection logic in `_run_phase_specific_checks()`
3. Add fix patterns in `SuggestionEngine.FIX_PATTERNS`

### Adding a New Council Reviewer

1. Add role to `ReviewerRole` enum in `council.py`
2. Create `ReviewerProfile` in `REVIEWERS` dict
3. Implement `__review()` method

### Adding a New Injection Scenario

1. Add to `SCENARIOS` dict in `error_injector.py`
2. Implement injection/cleanup in `_perform_injection()`

## File Structure

```
testing/oversight/
├── __init__.py           # Package exports
├── pipeline.py           # Main orchestrator
├── bug_watcher.py        # Real-time anomaly detection
├── suggestion_engine.py  # Fix recommendations
├── council.py            # Multi-agent review
├── phase_validator.py    # Phase coverage
├── error_injector.py     # Fault injection
├── reporter.py           # Report generation
├── README.md             # This file
└── reports/              # Generated reports
```

## Example Report

```
# Architectural Test Pipeline Report

**Generated:** 2026-01-23T12:00:00Z
**Report ID:** rpt-20260123-120000

## Executive Summary

- **Phases Validated:** 12
- **Average Coverage:** 75.3%
- **Total Anomalies:** 8
- **Critical Gaps:** 2

## Phase Status Matrix

| Phase | Name | Status | Coverage | Bugs |
|-------|------|--------|----------|------|
| 1 | Foundation | ✅ complete | 95.0% | 0 |
| 5 | Agent Bootstrapping | 🚧 in_progress | 80.0% | 2 |
| 8 | Production Hardening | ❌ blocked | 40.0% | 3 |
...
```

## Troubleshooting

### Pipeline Fails to Start

- Verify DragonflyDB is running: `redis-cli -p 6379 -a governance2026 PING`
- Check Vault status: `docker exec vault vault status`

### No Anomalies Detected

- Ensure STATUS.md files exist in directories
- Check checkpoint storage has recent entries

### Injection Tests Fail

- Verify safe mode is enabled (default)
- Check file permissions in target directories

## Related Documentation

- [CONTEXT_MANAGEMENT.md](../../docs/CONTEXT_MANAGEMENT.md) - Checkpoints and STATUS
- [MEMORY_LAYER.md](../../docs/MEMORY_LAYER.md) - External memory
- [STATUS_PROTOCOL.md](../../docs/STATUS_PROTOCOL.md) - Directory status protocol
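
## Appendix: Suggestion Ranking Sketch

The Suggestion Ranking formula from the Suggestion Engine section can be sketched as below. This is a minimal illustration that applies `Priority Score = Impact × (1 - Risk)` literally to the listed weights. The names `IMPACT_LEVELS`, `RISK_LEVELS`, `priority_score`, and `rank` are hypothetical and are not taken from `suggestion_engine.py`. Whether the listed risk figures are raw risk values or already their complement is an assumption to check against the real module.

```python
# Weights copied from the Suggestion Ranking block in this README.
# The dict and function names are illustrative assumptions, not the
# actual identifiers used in suggestion_engine.py.
IMPACT_LEVELS = {"transformative": 1.0, "high": 0.8, "medium": 0.6, "low": 0.4}
RISK_LEVELS = {"critical": 0.2, "high": 0.4, "medium": 0.6, "low": 0.8}


def priority_score(impact: str, risk: str) -> float:
    """Apply Priority Score = Impact x (1 - Risk) exactly as written."""
    return IMPACT_LEVELS[impact] * (1 - RISK_LEVELS[risk])


def rank(suggestions):
    """Sort (name, impact, risk) tuples by descending priority score."""
    return sorted(
        suggestions,
        key=lambda s: priority_score(s[1], s[2]),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical suggestions, made up for this example.
    candidates = [
        ("restore missing config", "high", "medium"),
        ("refresh stale checkpoint", "medium", "high"),
    ]
    for name, impact, risk in rank(candidates):
        print(f"{name}: {priority_score(impact, risk):.2f}")
```

Keeping the weights in plain dictionaries makes it easy to compare the literal formula against the engine's real ranking if the two ever disagree.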