agent-governance/docs/PRODUCTION_PIPELINE.md

Production Pipeline: Report → OpenRouter Orchestration

Overview

This document describes the automatic transition from the UI "view report" stage into the live multi-agent pipeline, including OpenRouter-driven parallel execution.

Created: 2026-01-24
Status: Implemented


Architecture Flow

┌─────────────────────────────────────────────────────────────────────────┐
│                         UI DASHBOARD                                     │
│                                                                          │
│  ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────────────┐  │
│  │  SPAWN   │───▶│ RUNNING  │───▶│  REPORT  │───▶│ AUTO-ORCHESTRATE │  │
│  │ Pipeline │    │ Agents   │    │  Stage   │    │   (NEW)          │  │
│  └──────────┘    └──────────┘    └──────────┘    └────────┬─────────┘  │
│                                                           │             │
└───────────────────────────────────────────────────────────┼─────────────┘
                                                            │
                                                            ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                    OPENROUTER ORCHESTRATION                              │
│                                                                          │
│  ┌─────────────────────────────────────────────────────────────────┐   │
│  │                    MultiAgentOrchestrator                        │   │
│  │                                                                   │   │
│  │  ┌─────────────┐              ┌─────────────┐                   │   │
│  │  │   ALPHA     │◄────────────▶│   BETA      │                   │   │
│  │  │  (Research) │   Messages   │ (Synthesis) │                   │   │
│  │  │   Python    │              │    Bun      │                   │   │
│  │  └──────┬──────┘              └──────┬──────┘                   │   │
│  │         │                            │                           │   │
│  │         └─────────┬──────────────────┘                           │   │
│  │                   │                                               │   │
│  │                   ▼                                               │   │
│  │            ┌─────────────┐                                       │   │
│  │            │   GAMMA     │  (Spawned on STUCK/CONFLICT)          │   │
│  │            │ (Mediator)  │                                       │   │
│  │            └─────────────┘                                       │   │
│  │                                                                   │   │
│  └─────────────────────────────────────────────────────────────────┘   │
│                                                                          │
│  Shared Infrastructure:                                                  │
│  • Blackboard (DragonflyDB) - Proposals, solutions, consensus           │
│  • MessageBus (Redis PubSub) - Agent coordination                       │
│  • MetricsCollector - Performance tracking                              │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
                                                            │
                                                            ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                       COMPLETION & AUDIT                                 │
│                                                                          │
│  • Results written to SQLite ledger                                     │
│  • Checkpoint created with final state                                  │
│  • WebSocket broadcast to UI                                            │
│  • Pipeline status → COMPLETED                                          │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

Implementation Components

1. Auto-Orchestration Trigger

Location: /opt/agent-governance/ui/server.ts

Trigger Conditions:

  • Pipeline reaches REPORT phase
  • All agents have completed or timed out
  • No critical failures blocking continuation

New Endpoint: POST /api/pipeline/continue

{
  pipeline_id: string;
  mode: "openrouter" | "local";  // openrouter = full LLM, local = mock
  model?: string;                // Default: anthropic/claude-sonnet-4
  timeout?: number;              // Default: 120s
}
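
The schema above can be enforced at the endpoint boundary. The following is a minimal sketch, not the server's actual handler: `parseContinueRequest` and `isMode` are hypothetical helpers, and the defaults are taken from the field comments in the schema (timeout in seconds).

```typescript
// Body of POST /api/pipeline/continue after defaults are applied.
type OrchestrationMode = "openrouter" | "local";

interface ContinueRequest {
  pipeline_id: string;
  mode: OrchestrationMode;
  model: string;
  timeout: number; // seconds
}

// Defaults as stated in the schema comments.
const DEFAULT_MODEL = "anthropic/claude-sonnet-4";
const DEFAULT_TIMEOUT_SECONDS = 120;

function isMode(value: unknown): value is OrchestrationMode {
  return value === "openrouter" || value === "local";
}

// Hypothetical helper: validates an untrusted JSON body and fills in defaults.
function parseContinueRequest(body: unknown): ContinueRequest {
  if (typeof body !== "object" || body === null) {
    throw new Error("request body must be a JSON object");
  }
  const raw = body as Record<string, unknown>;

  const pipelineId = raw["pipeline_id"];
  if (typeof pipelineId !== "string" || pipelineId.length === 0) {
    throw new Error("pipeline_id must be a non-empty string");
  }

  const mode = raw["mode"];
  if (!isMode(mode)) {
    throw new Error('mode must be "openrouter" or "local"');
  }

  let model = DEFAULT_MODEL;
  const rawModel = raw["model"];
  if (rawModel !== undefined) {
    if (typeof rawModel !== "string") {
      throw new Error("model must be a string when provided");
    }
    model = rawModel;
  }

  let timeout = DEFAULT_TIMEOUT_SECONDS;
  const rawTimeout = raw["timeout"];
  if (rawTimeout !== undefined) {
    if (typeof rawTimeout !== "number" || !Number.isFinite(rawTimeout) || rawTimeout <= 0) {
      throw new Error("timeout must be a positive number of seconds when provided");
    }
    timeout = rawTimeout;
  }

  return { pipeline_id: pipelineId, mode, model, timeout };
}
```

Rejecting malformed bodies before spawning anything keeps a bad request from consuming an orchestration slot.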

2. Parallel Agent Execution

Python Agent (ALPHA):

  • Path: /opt/agent-governance/agents/llm-planner/governed_agent.py
  • Role: Research, analysis, proposal generation
  • Runtime: Python 3.11 with venv

Bun Agent (BETA):

  • Path: /opt/agent-governance/agents/llm-planner-ts/governed-agent.ts
  • Role: Synthesis, evaluation, solution building
  • Runtime: Bun (cited as roughly 4x faster than Node.js; the gain depends on workload)

Coordination:

  • Both agents connect to same DragonflyDB instance
  • Shared Blackboard for structured data exchange
  • MessageBus for real-time communication
  • SpawnController monitors for GAMMA trigger conditions

3. OpenRouter Integration

Credential Flow:

Vault (secret/data/api-keys/openrouter)
    │
    ▼
getVaultSecret() in agent code
    │
    ▼
OpenAI client with baseURL: "https://openrouter.ai/api/v1"
    │
    ▼
Model: anthropic/claude-sonnet-4
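
The last two hops of this flow can be sketched without any SDK. The agents use the OpenAI client, but OpenRouter exposes an OpenAI-compatible chat completions route under the same base URL, so the sketch below only prepares a plain HTTP request. `buildChatRequest` is a hypothetical helper, and the API key is assumed to have already been read from Vault (the signature of `getVaultSecret()` is not shown in this document, so it is not called here).

```typescript
const OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  init: {
    method: "POST";
    headers: Record<string, string>;
    body: string;
  };
}

// Hypothetical helper: builds the request but does not send it,
// so it can be inspected or unit-tested without network access.
function buildChatRequest(
  apiKey: string,
  messages: ChatMessage[],
  model: string = "anthropic/claude-sonnet-4",
): PreparedRequest {
  if (apiKey.length === 0) {
    throw new Error("missing OpenRouter API key");
  }
  return {
    url: `${OPENROUTER_BASE_URL}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Usage (network call, shown as a comment only):
//   const req = buildChatRequest(apiKey, [{ role: "user", content: "Summarize the report" }]);
//   const res = await fetch(req.url, req.init);
```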

Rate Limiting:

  • Handled by OpenRouter API
  • Circuit breaker in governance layer (5 failures → open)
  • Per-agent token budget tracking
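
A minimal sketch of the "5 failures → open" rule follows. It is not the governance layer's implementation. It assumes the count is of consecutive failures (a success resets it) and that an open breaker stays open until reset manually; the document specifies neither, and no half-open state is modelled.

```typescript
class CircuitBreaker {
  private consecutiveFailures = 0;
  private open = false;
  private readonly failureThreshold: number;

  constructor(failureThreshold: number = 5) {
    if (!Number.isInteger(failureThreshold) || failureThreshold < 1) {
      throw new Error("failureThreshold must be a positive integer");
    }
    this.failureThreshold = failureThreshold;
  }

  // Callers check this before each OpenRouter request.
  allowsRequest(): boolean {
    return !this.open;
  }

  // Assumption: a success clears the failure streak while the breaker is closed.
  recordSuccess(): void {
    if (!this.open) {
      this.consecutiveFailures = 0;
    }
  }

  recordFailure(): void {
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= this.failureThreshold) {
      this.open = true;
    }
  }

  // Assumption: recovery is an explicit operator or supervisor action.
  reset(): void {
    this.consecutiveFailures = 0;
    this.open = false;
  }
}
```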

4. Error Handling & Failover

Level 1: Agent-Level Recovery

Error Budget per agent:
- max_total_errors: 8
- max_same_error_repeats: 2
- max_procedure_violations: 1

On budget exceeded → Agent revoked, handoff created
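
The budget can be tracked with three counters. This sketch makes assumptions the document does not settle: a limit counts as exceeded once its counter goes strictly above it, "same error" means errors sharing a caller-supplied key, and procedure violations are counted separately from total errors. `ErrorBudget` is a hypothetical class name.

```typescript
interface ErrorBudgetLimits {
  max_total_errors: number;
  max_same_error_repeats: number;
  max_procedure_violations: number;
}

// Values from the budget listed above.
const DEFAULT_LIMITS: ErrorBudgetLimits = {
  max_total_errors: 8,
  max_same_error_repeats: 2,
  max_procedure_violations: 1,
};

class ErrorBudget {
  private totalErrors = 0;
  private procedureViolations = 0;
  private readonly occurrences = new Map<string, number>();
  private readonly limits: ErrorBudgetLimits;

  constructor(limits: ErrorBudgetLimits = DEFAULT_LIMITS) {
    this.limits = limits;
  }

  // errorKey identifies "the same error", e.g. an error code or message hash.
  recordError(errorKey: string): void {
    this.totalErrors += 1;
    this.occurrences.set(errorKey, (this.occurrences.get(errorKey) ?? 0) + 1);
  }

  recordProcedureViolation(): void {
    this.procedureViolations += 1;
  }

  // True once any counter is strictly above its limit; the caller would
  // then revoke the agent and create the handoff.
  isExceeded(): boolean {
    if (this.totalErrors > this.limits.max_total_errors) return true;
    if (this.procedureViolations > this.limits.max_procedure_violations) return true;
    return Array.from(this.occurrences.values()).some(
      (count) => count > this.limits.max_same_error_repeats,
    );
  }
}
```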

Level 2: Pipeline-Level Recovery

On agent failure:
1. Record failure in DragonflyDB
2. Check if partner agent can continue alone
3. If both fail → Pipeline status = FAILED
4. Create checkpoint with failure details
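
The decision in steps 2 to 4 can be expressed as a pure function. This is a sketch under assumptions: the partner is treated as able to continue alone whenever it has not failed itself, and a checkpoint is requested after any failure, not only when both agents fail. `decideRecovery` and its types are hypothetical; recording the failure in DragonflyDB (step 1) is left to the caller.

```typescript
type AgentId = "ALPHA" | "BETA";
type AgentState = "running" | "completed" | "failed";

interface RecoveryDecision {
  pipelineStatus: "RUNNING" | "FAILED";
  continueWith: AgentId | null; // surviving agent, if exactly one failed
  createCheckpoint: boolean;
}

function decideRecovery(states: Record<AgentId, AgentState>): RecoveryDecision {
  const agents: AgentId[] = ["ALPHA", "BETA"];
  const failed = agents.filter((id) => states[id] === "failed");

  // No failure: nothing to recover from.
  if (failed.length === 0) {
    return { pipelineStatus: "RUNNING", continueWith: null, createCheckpoint: false };
  }

  // Both agents failed: the pipeline cannot continue.
  if (failed.length === agents.length) {
    return { pipelineStatus: "FAILED", continueWith: null, createCheckpoint: true };
  }

  // Exactly one failed: hand the remaining work to its partner.
  const survivor: AgentId = failed[0] === "ALPHA" ? "BETA" : "ALPHA";
  return { pipelineStatus: "RUNNING", continueWith: survivor, createCheckpoint: true };
}
```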

Level 3: Orchestration-Level Recovery

On orchestration timeout (120s default):
1. Force-stop running agents
2. Collect partial results from Blackboard
3. Generate partial report
4. Pipeline status = TIMEOUT

GAMMA Spawn Conditions:

Condition    Threshold                 Action
STUCK        30s no progress           Spawn GAMMA mediator
CONFLICT     3+ unresolved proposals   Spawn GAMMA to arbitrate
COMPLEXITY   Score > 0.8               Spawn GAMMA for decomposition
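
The three conditions reduce to a small predicate. The sketch below is illustrative and is not the SpawnController's code: the snapshot fields are hypothetical names, and the STUCK, CONFLICT, COMPLEXITY evaluation order is an assumption for the case where several conditions hold at once.

```typescript
type GammaReason = "STUCK" | "CONFLICT" | "COMPLEXITY";

interface GammaThresholds {
  stuckSeconds: number;       // seconds without progress
  conflictProposals: number;  // unresolved proposals
  complexityScore: number;    // score must be strictly above this
}

// Thresholds from the spawn-condition table.
const DEFAULT_THRESHOLDS: GammaThresholds = {
  stuckSeconds: 30,
  conflictProposals: 3,
  complexityScore: 0.8,
};

// Hypothetical view of the orchestration state at one point in time.
interface OrchestrationSnapshot {
  secondsSinceProgress: number;
  unresolvedProposals: number;
  complexityScore: number;
}

// Returns the reason GAMMA should be spawned, or null if no condition holds.
function gammaSpawnReason(
  snapshot: OrchestrationSnapshot,
  thresholds: GammaThresholds = DEFAULT_THRESHOLDS,
): GammaReason | null {
  if (snapshot.secondsSinceProgress >= thresholds.stuckSeconds) return "STUCK";
  if (snapshot.unresolvedProposals >= thresholds.conflictProposals) return "CONFLICT";
  if (snapshot.complexityScore > thresholds.complexityScore) return "COMPLEXITY";
  return null;
}
```

Note the asymmetry taken from the table: "3+" and "30s" are inclusive bounds, while the complexity score must be strictly greater than 0.8.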

API Endpoints

Existing (Modified)

  • POST /api/spawn - Creates a pipeline; the request body now accepts auto_continue: boolean
  • GET /api/checkpoint/report - Returns report with continuation status

New

  • POST /api/pipeline/continue - Triggers OpenRouter orchestration
  • GET /api/pipeline/{id}/orchestration - Gets orchestration status
  • POST /api/pipeline/{id}/stop - Emergency stop

WebSocket Events

New Events

// Orchestration started
{ type: "orchestration_started", data: { pipeline_id, model, agents: ["ALPHA", "BETA"] } }

// Agent spawned
{ type: "agent_spawned", data: { pipeline_id, agent_id, role, runtime } }

// Agent message
{ type: "agent_message", data: { pipeline_id, from, to, content } }

// GAMMA spawned (conditional)
{ type: "gamma_spawned", data: { pipeline_id, reason: "STUCK" | "CONFLICT" | "COMPLEXITY" } }

// Consensus reached
{ type: "consensus_reached", data: { pipeline_id, proposal_id, votes } }

// Orchestration complete
{ type: "orchestration_complete", data: { pipeline_id, status, results } }
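
On the client side these events fit a discriminated union keyed on `type`. The sketch below is an assumption-laden illustration: the document lists only field names, so the field types (strings, `unknown` for `votes` and `results`) are guesses, and `describeEvent` is a hypothetical helper.

```typescript
type GammaReason = "STUCK" | "CONFLICT" | "COMPLEXITY";

// Field types are assumed; only the field names come from the event list.
type OrchestrationEvent =
  | { type: "orchestration_started"; data: { pipeline_id: string; model: string; agents: string[] } }
  | { type: "agent_spawned"; data: { pipeline_id: string; agent_id: string; role: string; runtime: string } }
  | { type: "agent_message"; data: { pipeline_id: string; from: string; to: string; content: string } }
  | { type: "gamma_spawned"; data: { pipeline_id: string; reason: GammaReason } }
  | { type: "consensus_reached"; data: { pipeline_id: string; proposal_id: string; votes: unknown } }
  | { type: "orchestration_complete"; data: { pipeline_id: string; status: string; results: unknown } };

// Produces a one-line summary per event, e.g. for a dashboard activity log.
function describeEvent(event: OrchestrationEvent): string {
  const id = event.data.pipeline_id;
  switch (event.type) {
    case "orchestration_started":
      return `${id}: orchestration started with ${event.data.agents.join(", ")} on ${event.data.model}`;
    case "agent_spawned":
      return `${id}: ${event.data.agent_id} spawned as ${event.data.role} (${event.data.runtime})`;
    case "agent_message":
      return `${id}: message ${event.data.from} -> ${event.data.to}`;
    case "gamma_spawned":
      return `${id}: GAMMA spawned (${event.data.reason})`;
    case "consensus_reached":
      return `${id}: consensus reached on ${event.data.proposal_id}`;
    case "orchestration_complete":
      return `${id}: orchestration complete with status ${event.data.status}`;
    default: {
      // Compile-time exhaustiveness check: a new event type must be handled above.
      const unreachable: never = event;
      throw new Error(`unhandled event: ${JSON.stringify(unreachable)}`);
    }
  }
}
```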

Configuration

Environment Variables

# Enable auto-orchestration after report
AUTO_ORCHESTRATE=true

# Default model for OpenRouter
OPENROUTER_MODEL=anthropic/claude-sonnet-4

# Orchestration timeout (seconds)
ORCHESTRATION_TIMEOUT=120

# GAMMA spawn thresholds
GAMMA_STUCK_THRESHOLD=30
GAMMA_CONFLICT_THRESHOLD=3
GAMMA_COMPLEXITY_THRESHOLD=0.8
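
# Illustrative only (TypeScript, not shell): one way the server might read these variables.

These variables could be read into a typed config at startup. The sketch below is not the server's loader. It assumes the values shown above double as defaults when a variable is unset, and that AUTO_ORCHESTRATE is off unless set to exactly "true". `loadConfig` and `readNumber` are hypothetical names; the environment is passed in as a plain record so the function has no runtime dependency.

```typescript
type Env = Record<string, string | undefined>;

interface OrchestrationConfig {
  autoOrchestrate: boolean;
  model: string;
  timeoutSeconds: number;
  gamma: {
    stuckThresholdSeconds: number;
    conflictThreshold: number;
    complexityThreshold: number;
  };
}

// Parses a numeric variable, falling back when it is unset or empty.
function readNumber(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isFinite(value)) {
    throw new Error(`${name} must be numeric, got "${raw}"`);
  }
  return value;
}

function loadConfig(env: Env): OrchestrationConfig {
  return {
    // Assumption: anything other than the literal "true" disables auto-orchestration.
    autoOrchestrate: env["AUTO_ORCHESTRATE"] === "true",
    model: env["OPENROUTER_MODEL"] ?? "anthropic/claude-sonnet-4",
    timeoutSeconds: readNumber(env, "ORCHESTRATION_TIMEOUT", 120),
    gamma: {
      stuckThresholdSeconds: readNumber(env, "GAMMA_STUCK_THRESHOLD", 30),
      conflictThreshold: readNumber(env, "GAMMA_CONFLICT_THRESHOLD", 3),
      complexityThreshold: readNumber(env, "GAMMA_COMPLEXITY_THRESHOLD", 0.8),
    },
  };
}

// Usage in Bun or Node (comment only): const config = loadConfig(process.env);
```

Failing fast on a non-numeric threshold avoids silently running with NaN comparisons that would never trigger GAMMA.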

Vault Secrets Required

secret/data/api-keys/openrouter
  └── api_key: "sk-or-..."

secret/data/services/dragonfly
  └── password: "..."

Implementation Steps

Step 1: Add Auto-Continue Logic to UI Server

  • Add triggerOrchestration() function
  • Modify checkPipelineCompletion() to check for auto_continue
  • Add /api/pipeline/continue endpoint

Step 2: Connect to Multi-Agent Orchestrator

  • Spawn orchestrator.ts from UI via Bun.spawn()
  • Pass pipeline context (task_id, objective, model, timeout)
  • Wire up WebSocket events (orchestration_started, agent_message, consensus_reached, orchestration_complete)
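
Passing the pipeline context to the orchestrator might look like the sketch below. Only the `--timeout` flag is mentioned in this document; `--task-id`, `--objective`, and `--model` are hypothetical flag names used for illustration, as is the `buildOrchestratorCommand` helper. The path to orchestrator.ts is taken as a parameter because it is not given here.

```typescript
interface PipelineContext {
  task_id: string;
  objective: string;
  model: string;
  timeout: number; // seconds
}

// Builds the argument vector for the orchestrator process.
// Flag names other than --timeout are assumptions, not the orchestrator's documented CLI.
function buildOrchestratorCommand(orchestratorPath: string, ctx: PipelineContext): string[] {
  return [
    "bun",
    "run",
    orchestratorPath,
    "--task-id", ctx.task_id,
    "--objective", ctx.objective,
    "--model", ctx.model,
    "--timeout", String(ctx.timeout),
  ];
}

// Usage inside the Bun UI server (comment only, requires the Bun runtime):
//   const proc = Bun.spawn(buildOrchestratorCommand(path, ctx), { stdout: "pipe", stderr: "pipe" });
//   const exitCode = await proc.exited;
```

Building the command as an array keeps an objective containing spaces or quotes as a single argument, with no shell quoting involved.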

Step 3: Add Orchestration Status Tracking

  • Track orchestration state in Redis (ORCHESTRATING status)
  • Add orchestration_started_at timestamp
  • Create checkpoint on completion

Step 4: Implement Error Handling

  • Add timeout handling via orchestrator --timeout flag
  • Capture exit codes and error messages
  • Set ORCHESTRATION_FAILED or ORCHESTRATION_ERROR status on failure

Step 5: Test End-to-End

  • Spawn pipeline with objective
  • Verify report generation
  • Verify auto-trigger to orchestration
  • Verify parallel agent execution
  • Verify results collection

Demonstration Results (2026-01-24)

Successfully tested with pipeline-mksufe23:

  • Pipeline spawned → ALPHA/BETA ran → Report generated → Auto-orchestration triggered
  • GAMMA spawned on the COMPLEXITY condition (score above the 0.8 threshold)
  • Total orchestration time: 51.4 seconds
  • Final status: COMPLETED

Testing

Manual Test Command

# 1. Start UI server
cd /opt/agent-governance/ui && bun run server.ts

# 2. Spawn pipeline via API
curl -X POST http://localhost:3000/api/spawn \
  -H "Content-Type: application/json" \
  -d '{"objective": "Design a caching strategy", "auto_continue": true}'

# 3. Watch WebSocket for events
# Pipeline should: SPAWN → RUNNING → REPORT → ORCHESTRATE → COMPLETE

Validation Criteria

  • Pipeline reaches ORCHESTRATION phase automatically
  • Both ALPHA and BETA agents spawn
  • Agents communicate via MessageBus
  • Results appear in Blackboard
  • Final checkpoint created
  • Audit trail in SQLite

Rollback Plan

If orchestration fails repeatedly:

  1. Set AUTO_ORCHESTRATE=false
  2. Pipeline will stop at REPORT phase
  3. Manual intervention can trigger orchestration
  4. Review logs in /api/pipeline/logs

Document Version: 1.0
Last Updated: 2026-01-24