6 Commits

Author SHA1 Message Date
profit
8c6e7831e9 Add Phase 10-12 implementation: multi-tenant, marketplace, observability
Major additions:
- marketplace/: Agent template registry with FTS5 search, ratings, versioning
- observability/: Prometheus metrics, distributed tracing, structured logging
- ledger/migrations/: Database migration scripts for multi-tenant support
- tests/governance/: 15 new test files for phases 6-12 (295 total tests)
- bin/validate-phases: Full 12-phase validation script

New features:
- Multi-tenant support with tenant isolation and quota enforcement
- Agent marketplace with semantic versioning and search
- Observability with metrics, tracing, and log correlation
- Tier-1 agent bootstrap scripts

Updated components:
- ledger/api.py: Extended API for tenants, marketplace, observability
- ledger/schema.sql: Added tenant, project, marketplace tables
- testing/framework.ts: Enhanced test framework
- checkpoint/checkpoint.py: Improved checkpoint management

Archived:
- External integrations (Slack/GitHub/PagerDuty) moved to .archive/
- Old checkpoint files cleaned up

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 18:39:47 -05:00
profit
4b5b3b0a2d Overhaul README.md to reflect Phase 12 complete architecture
Major updates:
- Architecture diagram: multi-agent pipeline with ALPHA/BETA/GAMMA agents
- Phase status: all 12 phases complete (295/295 tests)
- Added Vault token lifecycle documentation
- Added consensus failure handling workflows
- Added CLI reference: bugs, checkpoint, status, memory
- Added API endpoints documentation
- Added production constraints and revocation triggers
- Cross-linked to MULTI_AGENT_PIPELINE_ARCHITECTURE.md
- Created README_REFRESH_PLAN.md for tracking

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 18:31:26 -05:00
profit
09be7eff4b Add consensus failure handling with fallback options for multi-agent pipelines
Implements detection and recovery for when agents fail to reach consensus:
- Orchestrator exits with code 2 on consensus failure (distinct from error=1)
- Records failed run context (proposals, agent states, conflicts) to Dragonfly
- Provides fallback options: rerun same, rerun with GAMMA, escalate tier, accept partial
- Adds UI alert with action buttons for user-driven recovery
- Adds failure details modal and downloadable failure report
- Only marks pipeline complete when consensus achieved or user accepts fallback

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 18:24:19 -05:00
profit
8561d13728 Add Vault token management and observability integration for multi-agent pipelines
- Vault token issuance per pipeline with 2-hour TTL
- Automatic token renewal loop every 30 minutes
- Error budget tracking with threshold-based revocation
- Observability-driven token revocation for policy violations
- Diagnostic pipeline spawning on error threshold breach
- Structured handoff reports for error recovery
- Agent lifecycle status API
- New API endpoints: /api/pipeline/token, /api/pipeline/errors,
  /api/observability/handoff, /api/observability/diagnostic

Orchestrator now reports errors to parent pipeline's observability
system via PIPELINE_ID environment variable.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 17:45:20 -05:00
profit
0e6d113571 Add git sync verification report
Documents comparison between cloned repo and working tree.
Confirms all 338 files are properly synchronized.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-23 22:10:53 -05:00
profit
77655c298c Initial commit: Agent Governance System Phase 8
Phase 8 Production Hardening with complete governance infrastructure:

- Vault integration with tiered policies (T0-T4)
- DragonflyDB state management
- SQLite audit ledger
- Pipeline DSL and templates
- Promotion/revocation engine
- Checkpoint system for session persistence
- Health manager and circuit breaker for fault tolerance
- GitHub/Slack integrations
- Architectural test pipeline with bug watcher, suggestion engine, council review
- Multi-agent chaos testing framework

Test Results:
- Governance tests: 68/68 passing
- E2E workflow: 16/16 passing
- Phase 2 Vault: 14/14 passing
- Integration tests: 27/27 passing

Coverage: 57.6% average across 12 phases

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-23 22:07:06 -05:00