Production Pipeline: Report → OpenRouter Orchestration

Overview

This document describes the automatic transition from the UI "view report" stage into the live multi-agent pipeline, including OpenRouter-driven parallel execution.

Created: 2026-01-24
Status: Implemented

Architecture Flow

┌─────────────────────────────────────────────────────────────────────────┐
│                         UI DASHBOARD                                     │
│                                                                          │
│  ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────────────┐  │
│  │  SPAWN   │───▶│ RUNNING  │───▶│  REPORT  │───▶│ AUTO-ORCHESTRATE │  │
│  │ Pipeline │    │ Agents   │    │  Stage   │    │   (NEW)          │  │
│  └──────────┘    └──────────┘    └──────────┘    └────────┬─────────┘  │
│                                                           │             │
└───────────────────────────────────────────────────────────┼─────────────┘
                                                            │
                                                            ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                    OPENROUTER ORCHESTRATION                              │
│                                                                          │
│  ┌─────────────────────────────────────────────────────────────────┐   │
│  │                    MultiAgentOrchestrator                        │   │
│  │                                                                   │   │
│  │  ┌─────────────┐              ┌─────────────┐                   │   │
│  │  │   ALPHA     │◄────────────▶│   BETA      │                   │   │
│  │  │  (Research) │   Messages   │ (Synthesis) │                   │   │
│  │  │   Python    │              │    Bun      │                   │   │
│  │  └──────┬──────┘              └──────┬──────┘                   │   │
│  │         │                            │                           │   │
│  │         └─────────┬──────────────────┘                           │   │
│  │                   │                                               │   │
│  │                   ▼                                               │   │
│  │            ┌─────────────┐                                       │   │
│  │            │   GAMMA     │  (Spawned on STUCK/CONFLICT)          │   │
│  │            │ (Mediator)  │                                       │   │
│  │            └─────────────┘                                       │   │
│  │                                                                   │   │
│  └─────────────────────────────────────────────────────────────────┘   │
│                                                                          │
│  Shared Infrastructure:                                                  │
│  • Blackboard (DragonflyDB) - Proposals, solutions, consensus           │
│  • MessageBus (Redis PubSub) - Agent coordination                       │
│  • MetricsCollector - Performance tracking                              │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
                                                            │
                                                            ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                       COMPLETION & AUDIT                                 │
│                                                                          │
│  • Results written to SQLite ledger                                     │
│  • Checkpoint created with final state                                  │
│  • WebSocket broadcast to UI                                            │
│  • Pipeline status → COMPLETED                                          │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

Implementation Components

1. Auto-Orchestration Trigger

Location: /opt/agent-governance/ui/server.ts

Trigger Conditions:

Pipeline reaches REPORT phase
All agents have completed or timed out
No critical failures blocking continuation
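
The three trigger conditions can be combined into a single predicate. This is a minimal sketch, not the code in server.ts: `PipelineSnapshot`, `AgentState`, and `shouldAutoOrchestrate` are hypothetical names, and the source does not say how agent state is actually represented.

```typescript
// Hypothetical data model; the source names the conditions but not the types.
type AgentState = "RUNNING" | "COMPLETED" | "TIMED_OUT";

interface PipelineSnapshot {
  phase: string;            // current pipeline phase, e.g. "REPORT"
  agents: AgentState[];     // one entry per agent in the pipeline
  criticalFailures: number; // count of failures that block continuation
}

// Returns true only when all three documented trigger conditions hold.
function shouldAutoOrchestrate(p: PipelineSnapshot): boolean {
  const inReportPhase = p.phase === "REPORT";
  const allAgentsSettled = p.agents.every(
    (s) => s === "COMPLETED" || s === "TIMED_OUT",
  );
  return inReportPhase && allAgentsSettled && p.criticalFailures === 0;
}
```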

New Endpoint: POST /api/pipeline/continue

{
  pipeline_id: string;
  mode: "openrouter" | "local";  // openrouter = full LLM, local = mock
  model?: string;                // Default: anthropic/claude-sonnet-4
  timeout?: number;              // Seconds. Default: 120
}
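
One way the endpoint might validate this body and apply the documented defaults is sketched below. `ContinueRequest` and `parseContinueRequest` are hypothetical names, and the rejection rules (non-empty `pipeline_id`, positive `timeout`) are assumptions rather than documented behavior.

```typescript
interface ContinueRequest {
  pipeline_id: string;
  mode: "openrouter" | "local";
  model: string;   // defaulted when omitted
  timeout: number; // seconds, defaulted when omitted
}

// Validates an untyped JSON body and fills in the documented defaults.
// Throws on a missing pipeline_id or an unknown mode.
function parseContinueRequest(body: unknown): ContinueRequest {
  if (typeof body !== "object" || body === null) {
    throw new Error("body must be a JSON object");
  }
  const { pipeline_id, mode, model, timeout } = body as Record<string, unknown>;
  if (typeof pipeline_id !== "string" || pipeline_id === "") {
    throw new Error("pipeline_id is required");
  }
  if (mode !== "openrouter" && mode !== "local") {
    throw new Error('mode must be "openrouter" or "local"');
  }
  return {
    pipeline_id,
    mode,
    model: typeof model === "string" ? model : "anthropic/claude-sonnet-4",
    timeout: typeof timeout === "number" && timeout > 0 ? timeout : 120,
  };
}
```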

2. Parallel Agent Execution

Python Agent (ALPHA):

Path: /opt/agent-governance/agents/llm-planner/governed_agent.py
Role: Research, analysis, proposal generation
Runtime: Python 3.11 with venv

Bun Agent (BETA):

Path: /opt/agent-governance/agents/llm-planner-ts/governed-agent.ts
Role: Synthesis, evaluation, solution building
Runtime: Bun (4x faster than Node.js)

Coordination:

Both agents connect to the same DragonflyDB instance
Shared Blackboard for structured data exchange
MessageBus for real-time communication
SpawnController monitors for GAMMA trigger conditions

3. OpenRouter Integration

Credential Flow:

Vault (secret/data/api-keys/openrouter)
    │
    ▼
getVaultSecret() in agent code
    │
    ▼
OpenAI client with baseURL: "https://openrouter.ai/api/v1"
    │
    ▼
Model: anthropic/claude-sonnet-4
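
The flow above can be illustrated without any SDK. The sketch below extracts the key from a Vault KV v2 read response (where the secret sits under `data.data`) and builds, but does not send, a request to OpenRouter's OpenAI-compatible chat completions endpoint. The agents themselves use an OpenAI client with a custom `baseURL`. This plain-request version is a swapped-in simplification, and `extractApiKey` and `buildOpenRouterRequest` are hypothetical helpers, not the project's `getVaultSecret()`.

```typescript
// Shape of a Vault KV v2 read response: the secret fields sit under data.data.
interface VaultKv2Response {
  data: { data: Record<string, string> };
}

// Pulls api_key out of the secret stored at secret/data/api-keys/openrouter.
function extractApiKey(resp: VaultKv2Response): string {
  const key = resp.data.data["api_key"];
  if (!key) throw new Error("api_key missing from Vault secret");
  return key;
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Builds (does not send) a chat completion request for OpenRouter.
function buildOpenRouterRequest(
  apiKey: string,
  messages: ChatMessage[],
  model = "anthropic/claude-sonnet-4",
) {
  return {
    url: "https://openrouter.ai/api/v1/chat/completions",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}
```

The returned `url` and `init` can be passed directly to `fetch(url, init)`.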

Rate Limiting:

Handled by OpenRouter API
Circuit breaker in governance layer (5 failures → open)
Per-agent token budget tracking
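
A circuit breaker with the "5 failures → open" rule might look like the sketch below. Only the threshold of 5 comes from this document. Counting consecutive failures (reset on success), the 30-second cooldown, and the `CircuitBreaker` class itself are assumptions for illustration. The clock is passed in explicitly so the behavior is easy to test.

```typescript
class CircuitBreaker {
  private consecutiveFailures = 0;
  private openedAt: number | null = null; // null while the breaker is closed
  private readonly threshold: number;
  private readonly cooldownMs: number;

  constructor(threshold = 5, cooldownMs = 30_000) {
    this.threshold = threshold;   // from the document: 5 failures open the breaker
    this.cooldownMs = cooldownMs; // assumption: cooldown is not specified
  }

  // Whether a call may proceed at time `now` (milliseconds).
  canCall(now: number): boolean {
    if (this.openedAt === null) return true;
    // Open: allow a probe call only once the cooldown has elapsed.
    return now - this.openedAt >= this.cooldownMs;
  }

  // A successful call closes the breaker and resets the failure count.
  recordSuccess(): void {
    this.consecutiveFailures = 0;
    this.openedAt = null;
  }

  // A failed call opens the breaker once the threshold is reached.
  recordFailure(now: number): void {
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= this.threshold) this.openedAt = now;
  }
}
```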

4. Error Handling & Failover

Level 1: Agent-Level Recovery

Error Budget per agent:
- max_total_errors: 8
- max_same_error_repeats: 2
- max_procedure_violations: 1

On budget exceeded → Agent revoked, handoff created
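
The budget can be tracked with three counters, as in the sketch below. The limits are the ones listed above. The `ErrorBudgetTracker` class is hypothetical, and two readings are assumed: a limit is "exceeded" only when the count goes strictly above it, and procedure violations are counted separately from ordinary errors.

```typescript
interface ErrorBudget {
  max_total_errors: number;
  max_same_error_repeats: number;
  max_procedure_violations: number;
}

const DEFAULT_ERROR_BUDGET: ErrorBudget = {
  max_total_errors: 8,
  max_same_error_repeats: 2,
  max_procedure_violations: 1,
};

class ErrorBudgetTracker {
  private totalErrors = 0;
  private worstRepeat = 0; // highest count seen for any single error signature
  private procedureViolations = 0;
  private readonly countsBySignature = new Map<string, number>();
  private readonly budget: ErrorBudget;

  constructor(budget: ErrorBudget = DEFAULT_ERROR_BUDGET) {
    this.budget = budget;
  }

  // `signature` identifies "the same error", e.g. an error code or message.
  recordError(signature: string): void {
    this.totalErrors += 1;
    const count = (this.countsBySignature.get(signature) ?? 0) + 1;
    this.countsBySignature.set(signature, count);
    this.worstRepeat = Math.max(this.worstRepeat, count);
  }

  recordProcedureViolation(): void {
    this.procedureViolations += 1;
  }

  // True once any limit is passed; the caller would then revoke the agent
  // and create the handoff.
  isExceeded(): boolean {
    return (
      this.totalErrors > this.budget.max_total_errors ||
      this.worstRepeat > this.budget.max_same_error_repeats ||
      this.procedureViolations > this.budget.max_procedure_violations
    );
  }
}
```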

Level 2: Pipeline-Level Recovery

On agent failure:
1. Record failure in DragonflyDB
2. Check if partner agent can continue alone
3. If both fail → Pipeline status = FAILED
4. Create checkpoint with failure details
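
Steps 2 and 3 amount to a small decision over the set of failed agents. The sketch below assumes the partner "can continue alone" whenever it has not itself failed. The real check is not described in this document, and `decideAfterFailure` is a hypothetical name. Recording the failure and creating the checkpoint (steps 1 and 4) are left to the caller.

```typescript
type AgentId = "ALPHA" | "BETA";

interface FailureDecision {
  pipelineStatus: "RUNNING" | "FAILED";
  continueWith: AgentId | null; // the surviving agent, if any
}

const ALL_AGENTS: readonly AgentId[] = ["ALPHA", "BETA"];

// Decides the pipeline's fate given the agents that have failed so far.
function decideAfterFailure(failed: readonly AgentId[]): FailureDecision {
  const survivors = ALL_AGENTS.filter((a) => !failed.includes(a));
  if (survivors.length === 0) {
    // Both agents failed: the pipeline cannot continue.
    return { pipelineStatus: "FAILED", continueWith: null };
  }
  // Assumption: a surviving partner is always able to continue alone.
  return { pipelineStatus: "RUNNING", continueWith: survivors[0] ?? null };
}
```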

Level 3: Orchestration-Level Recovery

On orchestration timeout (120s default):
1. Force-stop running agents
2. Collect partial results from Blackboard
3. Generate partial report
4. Pipeline status = TIMEOUT

GAMMA Spawn Conditions:

Condition    Threshold                  Action
STUCK        30s no progress            Spawn GAMMA mediator
CONFLICT     3+ unresolved proposals    Spawn GAMMA to arbitrate
COMPLEXITY   Score > 0.8                Spawn GAMMA for decomposition
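
The spawn conditions translate directly into a threshold check. In this sketch the thresholds are the documented ones, while the signal names, the `gammaSpawnReason` function, and the priority order (STUCK, then CONFLICT, then COMPLEXITY when several apply) are assumptions. How SpawnController actually measures progress or complexity is not covered here.

```typescript
type GammaReason = "STUCK" | "CONFLICT" | "COMPLEXITY";

interface GammaThresholds {
  stuckSeconds: number;    // seconds without progress
  conflictCount: number;   // unresolved proposals
  complexityScore: number; // spawn when the score is strictly above this
}

const DEFAULT_GAMMA_THRESHOLDS: GammaThresholds = {
  stuckSeconds: 30,
  conflictCount: 3,
  complexityScore: 0.8,
};

// Hypothetical signals that a spawn controller might observe.
interface OrchestrationSignals {
  secondsSinceProgress: number;
  unresolvedProposals: number;
  complexityScore: number;
}

// Returns the first matching spawn reason, or null if GAMMA is not needed.
function gammaSpawnReason(
  s: OrchestrationSignals,
  t: GammaThresholds = DEFAULT_GAMMA_THRESHOLDS,
): GammaReason | null {
  if (s.secondsSinceProgress >= t.stuckSeconds) return "STUCK";
  if (s.unresolvedProposals >= t.conflictCount) return "CONFLICT";
  if (s.complexityScore > t.complexityScore) return "COMPLEXITY";
  return null;
}
```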

API Endpoints

Existing (Modified)

POST /api/spawn - Creates a pipeline; the request body now accepts auto_continue: boolean
GET /api/checkpoint/report - Returns report with continuation status

New

POST /api/pipeline/continue - Triggers OpenRouter orchestration
GET /api/pipeline/{id}/orchestration - Gets orchestration status
POST /api/pipeline/{id}/stop - Emergency stop

WebSocket Events

New Events

// Orchestration started
{ type: "orchestration_started", data: { pipeline_id, model, agents: ["ALPHA", "BETA"] } }

// Agent spawned
{ type: "agent_spawned", data: { pipeline_id, agent_id, role, runtime } }

// Agent message
{ type: "agent_message", data: { pipeline_id, from, to, content } }

// GAMMA spawned (conditional)
{ type: "gamma_spawned", data: { pipeline_id, reason: "STUCK" | "CONFLICT" | "COMPLEXITY" } }

// Consensus reached
{ type: "consensus_reached", data: { pipeline_id, proposal_id, votes } }

// Orchestration complete
{ type: "orchestration_complete", data: { pipeline_id, status, results } }
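
On the client side, these events fit a discriminated union keyed on `type`, which lets the compiler check that every event is handled. The field names below come from the list above. The field types (plain strings, `unknown` for `votes` and `results`) are assumptions, and `describeEvent` is a hypothetical helper.

```typescript
type OrchestrationEvent =
  | { type: "orchestration_started"; data: { pipeline_id: string; model: string; agents: string[] } }
  | { type: "agent_spawned"; data: { pipeline_id: string; agent_id: string; role: string; runtime: string } }
  | { type: "agent_message"; data: { pipeline_id: string; from: string; to: string; content: string } }
  | { type: "gamma_spawned"; data: { pipeline_id: string; reason: "STUCK" | "CONFLICT" | "COMPLEXITY" } }
  | { type: "consensus_reached"; data: { pipeline_id: string; proposal_id: string; votes: unknown } }
  | { type: "orchestration_complete"; data: { pipeline_id: string; status: string; results: unknown } };

// Produces a one-line summary; the switch is exhaustive over the union.
function describeEvent(e: OrchestrationEvent): string {
  switch (e.type) {
    case "orchestration_started":
      return `${e.data.pipeline_id}: started with ${e.data.model} (${e.data.agents.join(", ")})`;
    case "agent_spawned":
      return `${e.data.pipeline_id}: ${e.data.agent_id} spawned as ${e.data.role} on ${e.data.runtime}`;
    case "agent_message":
      return `${e.data.pipeline_id}: ${e.data.from} -> ${e.data.to}`;
    case "gamma_spawned":
      return `${e.data.pipeline_id}: GAMMA spawned (${e.data.reason})`;
    case "consensus_reached":
      return `${e.data.pipeline_id}: consensus on ${e.data.proposal_id}`;
    case "orchestration_complete":
      return `${e.data.pipeline_id}: finished with status ${e.data.status}`;
  }
}
```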

Configuration

Environment Variables

# Enable auto-orchestration after report
AUTO_ORCHESTRATE=true

# Default model for OpenRouter
OPENROUTER_MODEL=anthropic/claude-sonnet-4

# Orchestration timeout (seconds)
ORCHESTRATION_TIMEOUT=120

# GAMMA spawn thresholds
GAMMA_STUCK_THRESHOLD=30
GAMMA_CONFLICT_THRESHOLD=3
GAMMA_COMPLEXITY_THRESHOLD=0.8
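
A loader for these variables might look like the sketch below. The variable names and numeric defaults are the ones listed above. `OrchestrationConfig` and `loadOrchestrationConfig` are hypothetical, and two behaviors are assumed: `AUTO_ORCHESTRATE` is off unless set to exactly `true`, and unparsable numbers fall back to the default. The environment is passed in as a plain object so the function has no runtime dependency.

```typescript
interface OrchestrationConfig {
  autoOrchestrate: boolean;
  model: string;
  timeoutSeconds: number;
  gammaStuckSeconds: number;
  gammaConflictCount: number;
  gammaComplexityScore: number;
}

type Env = Record<string, string | undefined>;

// Reads a numeric variable, falling back when it is unset, empty, or not a number.
function readNumber(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return Number.isFinite(parsed) ? parsed : fallback;
}

function loadOrchestrationConfig(env: Env): OrchestrationConfig {
  return {
    autoOrchestrate: env["AUTO_ORCHESTRATE"] === "true",
    model: env["OPENROUTER_MODEL"] ?? "anthropic/claude-sonnet-4",
    timeoutSeconds: readNumber(env, "ORCHESTRATION_TIMEOUT", 120),
    gammaStuckSeconds: readNumber(env, "GAMMA_STUCK_THRESHOLD", 30),
    gammaConflictCount: readNumber(env, "GAMMA_CONFLICT_THRESHOLD", 3),
    gammaComplexityScore: readNumber(env, "GAMMA_COMPLEXITY_THRESHOLD", 0.8),
  };
}
```

In a Bun or Node process this would typically be called as `loadOrchestrationConfig(process.env)`.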

Vault Secrets Required

secret/data/api-keys/openrouter
  └── api_key: "sk-or-..."

secret/data/services/dragonfly
  └── password: "..."

Implementation Steps

Step 1: Add Auto-Continue Logic to UI Server

Add triggerOrchestration() function
Modify checkPipelineCompletion() to check for auto_continue
Add /api/pipeline/continue endpoint

Step 2: Connect to Multi-Agent Orchestrator

Spawn orchestrator.ts from the UI server via Bun.spawn()
Pass pipeline context (task_id, objective, model, timeout)
Wire up WebSocket events (orchestration_started, agent_message, consensus_reached, orchestration_complete)

Step 3: Add Orchestration Status Tracking

Track orchestration state in Redis (ORCHESTRATING status)
Add orchestration_started_at timestamp
Create checkpoint on completion

Step 4: Implement Error Handling

Add timeout handling via orchestrator --timeout flag
Capture exit codes and error messages
Set ORCHESTRATION_FAILED or ORCHESTRATION_ERROR status on failure
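
The mapping from a captured exit code to a status could be sketched as follows. This document does not say when `ORCHESTRATION_FAILED` applies versus `ORCHESTRATION_ERROR`. The split used here (non-zero exit means FAILED, a process that never produced an exit code means ERROR) is purely an assumption, as are the `OrchestratorExit` shape and the `classifyExit` name.

```typescript
// Hypothetical summary of an orchestrator run as seen by the UI server.
interface OrchestratorExit {
  exitCode: number | null; // null: no exit code, e.g. the process never started (assumption)
  stderr: string;          // captured error output
}

type OrchestrationOutcome =
  | { status: "COMPLETED" }
  | { status: "ORCHESTRATION_FAILED" | "ORCHESTRATION_ERROR"; message: string };

function classifyExit(exit: OrchestratorExit): OrchestrationOutcome {
  if (exit.exitCode === 0) return { status: "COMPLETED" };
  const message = exit.stderr.trim() || "no error output captured";
  if (exit.exitCode === null) {
    // Assumed meaning: the orchestrator could not be run at all.
    return { status: "ORCHESTRATION_ERROR", message };
  }
  // Assumed meaning: the orchestrator ran and reported failure.
  return {
    status: "ORCHESTRATION_FAILED",
    message: `exit code ${exit.exitCode}: ${message}`,
  };
}
```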

Step 5: Test End-to-End

Spawn pipeline with objective
Verify report generation
Verify auto-trigger to orchestration
Verify parallel agent execution
Verify results collection

Demonstration Results (2026-01-24)

Successfully tested with pipeline-mksufe23:

Pipeline spawned → ALPHA/BETA ran → Report generated → Auto-orchestration triggered
GAMMA spawned on the COMPLEXITY condition (score above the 0.8 threshold)
Total orchestration time: 51.4 seconds
Final status: COMPLETED

Testing

Manual Test Command

# 1. Start UI server
cd /opt/agent-governance/ui && bun run server.ts

# 2. Spawn pipeline via API
curl -X POST http://localhost:3000/api/spawn \
  -H "Content-Type: application/json" \
  -d '{"objective": "Design a caching strategy", "auto_continue": true}'

# 3. Watch WebSocket for events
# Pipeline should: SPAWN → RUNNING → REPORT → ORCHESTRATE → COMPLETE
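
When watching the event stream in step 3, the expected phase order can be checked mechanically. The sketch below takes the phase names from the comment above and assumes the caller has already collected the observed phases into an array. How phases are reported over the WebSocket is not specified here, and `phaseOrderIsValid` is a hypothetical helper.

```typescript
const EXPECTED_PHASES = ["SPAWN", "RUNNING", "REPORT", "ORCHESTRATE", "COMPLETE"] as const;

// True only when the observed phases match the expected sequence exactly.
function phaseOrderIsValid(observed: readonly string[]): boolean {
  if (observed.length !== EXPECTED_PHASES.length) return false;
  return EXPECTED_PHASES.every((phase, i) => observed[i] === phase);
}
```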

Validation Criteria

Pipeline reaches ORCHESTRATION phase automatically
Both ALPHA and BETA agents spawn
Agents communicate via MessageBus
Results appear in Blackboard
Final checkpoint created
Audit trail in SQLite

Rollback Plan

If orchestration fails repeatedly:

Set AUTO_ORCHESTRATE=false
Pipeline will stop at REPORT phase
Orchestration can still be triggered manually via POST /api/pipeline/continue
Review logs at /api/pipeline/logs

Document Version: 1.0
Last Updated: 2026-01-24
