# Directory-Level README/STATUS Protocol

> Guidelines for maintaining documentation and status tracking across the Agent Governance System.

## Overview

Every significant subdirectory in the project should contain two files:

| File | Purpose |
|------|---------|
| `README.md` | Overview, purpose, key files, architecture references |
| `STATUS.md` | Ongoing status log with phase, tasks, timestamps, dependencies |

This protocol ensures:
- Consistent documentation across the codebase
- Clear status visibility for humans and agents
- Traceable progress through the project lifecycle

## File Templates

### README.md Structure

```markdown
# Directory Name

> One-line purpose description

## Overview
Brief explanation of what this directory contains.

## Key Files
| File | Description |
|------|-------------|
| `main.py` | Primary module |
| `config.yaml` | Configuration |

## Interfaces / APIs
Document any CLI commands, APIs, or interfaces.

## Status
**✅ COMPLETE** (or 🚧 IN_PROGRESS, ❗ BLOCKED, etc.)

See [STATUS.md](./STATUS.md) for detailed progress.

## Architecture Reference
Part of the [Agent Governance System](/opt/agent-governance/docs/ARCHITECTURE.md).
Parent: [Parent Directory](../)
```

### STATUS.md Structure

```markdown
# Status: Directory Name

## Current Phase
**🚧 IN_PROGRESS**

## Tasks
| Status | Task | Updated |
|--------|------|---------|
| ✓ | Initial implementation | 2026-01-20 |
| ☐ | Add tests | - |

## Dependencies
- Requires `pipeline/core.py`
- Depends on DragonflyDB service

## Issues / Blockers
- Waiting on schema finalization

## Activity Log
### 2026-01-23 10:30:00 UTC
- **Phase**: IN_PROGRESS
- **Action**: Added chaos testing
- **Details**: Implemented error injection framework
```
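
Activity Log entries in the template share one shape: a UTC timestamp heading followed by Phase, Action, and Details bullets. The following is a minimal sketch of rendering such an entry. The helper name `format_activity_entry` is hypothetical and is not part of the `status` CLI; the real tooling may format entries differently.

```python
from datetime import datetime, timezone
from typing import Optional


def format_activity_entry(phase: str, action: str, details: str,
                          when: Optional[datetime] = None) -> str:
    """Render one Activity Log entry in the shape used by the STATUS.md template."""
    # Hypothetical helper for illustration only.
    when = when or datetime.now(timezone.utc)
    # Normalize to UTC so every entry uses the same clock
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")
    return "\n".join([
        f"### {stamp}",
        f"- **Phase**: {phase}",
        f"- **Action**: {action}",
        f"- **Details**: {details}",
    ])


# Reproduces the sample entry from the template
entry = format_activity_entry(
    "IN_PROGRESS",
    "Added chaos testing",
    "Implemented error injection framework",
    when=datetime(2026, 1, 23, 10, 30, tzinfo=timezone.utc),
)
print(entry)
```

Passing `when` explicitly keeps the output reproducible; omitting it stamps the entry with the current UTC time.
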

## Status Phases

| Phase | Icon | Description |
|-------|------|-------------|
| `complete` | ✅ | Work is finished and verified |
| `in_progress` | 🚧 | Active development underway |
| `blocked` | ❗ | Waiting on external dependencies |
| `needs_review` | ⚠️ | Requires attention or review |
| `not_started` | ⬜ | No work has begun |
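
The phase table maps naturally onto an enumeration. The sketch below is one possible encoding, not the project's actual implementation: the class name `Phase` and the methods `from_key` and `label` are assumptions introduced for illustration. It accepts both the lowercase CLI spelling (`in_progress`) and the uppercase label spelling used in the templates (`IN_PROGRESS`).

```python
from enum import Enum


class Phase(Enum):
    """Status phases and icons, copied from the table above (names are illustrative)."""
    COMPLETE = ("complete", "✅")
    IN_PROGRESS = ("in_progress", "🚧")
    BLOCKED = ("blocked", "❗")
    NEEDS_REVIEW = ("needs_review", "⚠️")
    NOT_STARTED = ("not_started", "⬜")

    def __init__(self, key: str, icon: str) -> None:
        self.key = key
        self.icon = icon

    @classmethod
    def from_key(cls, key: str) -> "Phase":
        # Accept both "in_progress" (CLI) and "IN_PROGRESS" (template label)
        normalized = key.strip().lower()
        for phase in cls:
            if phase.key == normalized:
                return phase
        raise ValueError(f"unknown phase: {key!r}")

    def label(self) -> str:
        # Bold-label form used in the templates, e.g. "🚧 IN_PROGRESS"
        return f"{self.icon} {self.key.upper()}"


print(Phase.from_key("in_progress").label())
```

Rejecting unknown keys with `ValueError` is a design choice in this sketch; it surfaces typos in phase names early instead of writing them into `STATUS.md`.
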

## CLI Commands

The `status` command provides management tools:

```bash
# Check all directories for missing/outdated files
status sweep

# Auto-create missing README/STATUS files
status sweep --fix

# Update a directory's status
status update ./pipeline --phase complete
status update ./tests --phase in_progress --task "Add integration tests"

# Initialize files for a new directory
status init ./new-module

# Show project-wide dashboard
status dashboard

# View templates
status template readme
status template status
```

## Agent Workflow Integration

### When Entering a Directory

Agents should:
1. Read `README.md` to understand the directory's purpose
2. Read `STATUS.md` to see current progress and any blockers
3. Check dependencies before starting work

### While Working

Agents should:
1. Update `STATUS.md` when starting a significant task
2. Log activity entries for notable changes
3. Update task checkboxes (☐ to ✓ in the Tasks table) as work completes

### Before Leaving a Directory

Agents should:
1. Update `STATUS.md` with final state
2. Mark completed tasks as done
3. Add any new issues discovered
4. Update the timestamp

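### Example Agent Workflow

```python
import asyncio
import logging
import re
from pathlib import Path

logger = logging.getLogger(__name__)

async def run_status(*args: str) -> None:
    # Pass arguments as a list (no shell), so quotes in task text cannot break the command
    proc = await asyncio.create_subprocess_exec("status", *args)
    await proc.wait()

# At start of work
async def enter_directory(dir_path: str) -> None:
    # Read context
    readme = Path(dir_path, "README.md").read_text(encoding="utf-8")
    status = Path(dir_path, "STATUS.md").read_text(encoding="utf-8")

    # Log entry; the phase is the first bold label, e.g. **🚧 IN_PROGRESS**
    match = re.search(r"\*\*\W*(\w+)\*\*", status)
    logger.info("Entering %s, phase: %s", dir_path, match.group(1) if match else "unknown")

# During work
async def log_progress(dir_path: str, task: str) -> None:
    await run_status("update", dir_path, "--task", task)

# At end of work
async def exit_directory(dir_path: str, phase: str) -> None:
    await run_status("update", dir_path, "--phase", phase)
```
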
## Diagnostics

### Finding Incomplete Directories

```bash
# Show all non-complete directories
status dashboard

# Example output:
# ❗ BLOCKED:
#   integrations/slack (updated 3d ago)
# 🚧 IN_PROGRESS:
#   tests/multi-agent-chaos (updated today)
#   pipeline (updated 1d ago)
```
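
The example output groups directories by phase and shows how long ago each was updated. Below is a rough sketch of producing that layout. The helpers `age_label` and `render_dashboard` and the timestamps are assumptions made for illustration; this document does not show how `status dashboard` is implemented, and the real command may behave differently.

```python
from collections import defaultdict
from datetime import datetime, timezone


def age_label(updated: datetime, now: datetime) -> str:
    """Describe staleness in whole calendar days, as in the example output."""
    days = (now.date() - updated.date()).days
    return "updated today" if days <= 0 else f"updated {days}d ago"


def render_dashboard(entries, now: datetime) -> str:
    """entries: iterable of (directory, phase_label, updated) tuples."""
    grouped = defaultdict(list)
    for directory, phase_label, updated in entries:
        grouped[phase_label].append(f"  {directory} ({age_label(updated, now)})")
    lines = []
    # Dicts keep insertion order, so phases appear in first-seen order
    for phase_label, rows in grouped.items():
        lines.append(f"{phase_label}:")
        lines.extend(rows)
    return "\n".join(lines)


# Illustrative data chosen to mirror the example output
now = datetime(2026, 1, 23, 12, 0, tzinfo=timezone.utc)
output = render_dashboard([
    ("integrations/slack", "❗ BLOCKED", datetime(2026, 1, 20, 9, 0, tzinfo=timezone.utc)),
    ("tests/multi-agent-chaos", "🚧 IN_PROGRESS", datetime(2026, 1, 23, 8, 0, tzinfo=timezone.utc)),
    ("pipeline", "🚧 IN_PROGRESS", datetime(2026, 1, 22, 15, 0, tzinfo=timezone.utc)),
], now)
print(output)
```

The sketch does not filter out completed directories; a caller would drop those entries before rendering.
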

### Automated Sweeps

Run regular sweeps to catch missing documentation:

```bash
# Quick check
status sweep

# Auto-fix missing files
status sweep --fix
```

## Checkpoint Integration

The status system is integrated with the checkpoint system for unified state management.

### How They Work Together

1. **When `checkpoint now` runs**: The checkpoint captures an aggregate snapshot of all directory statuses, including:
   - Status summary (counts by phase)
   - List of all directories with their current phase
   - Timestamps and task counts

2. **When `status update` runs**: A lightweight checkpoint delta is automatically created to record the change (unless `--no-checkpoint` is specified).

3. **Unified reporting**: Use `checkpoint report` to see both checkpoint metadata and directory statuses side-by-side.
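
To make the aggregate snapshot from step 1 concrete, here is a minimal sketch of how it could be assembled. The `DirectoryStatus` fields, the `build_snapshot` function, and the dictionary layout are all assumptions for illustration; the actual checkpoint format is not specified in this document.

```python
from collections import Counter
from dataclasses import asdict, dataclass


@dataclass
class DirectoryStatus:
    # Hypothetical record of what one STATUS.md contributes to a snapshot
    path: str
    phase: str
    updated: str      # timestamp string, illustrative format
    open_tasks: int


def build_snapshot(statuses):
    """Combine per-directory statuses into one aggregate snapshot."""
    return {
        # Status summary: counts by phase
        "summary": dict(Counter(s.phase for s in statuses)),
        # Every directory with its phase, timestamp, and task count
        "directories": [asdict(s) for s in statuses],
    }


# Illustrative data only
snapshot = build_snapshot([
    DirectoryStatus("pipeline", "in_progress", "2026-01-22T15:00:00Z", 2),
    DirectoryStatus("tests/multi-agent-chaos", "in_progress", "2026-01-23T08:00:00Z", 1),
    DirectoryStatus("integrations/slack", "blocked", "2026-01-20T09:00:00Z", 3),
])
print(snapshot["summary"])  # {'in_progress': 2, 'blocked': 1}
```
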

### Key Commands

```bash
# Create checkpoint with directory status snapshot
checkpoint now --notes "Completed pipeline work"

# View combined checkpoint + status report
checkpoint report

# View timeline of checkpoints with status changes
checkpoint timeline

# Update status (auto-creates checkpoint delta)
status update ./pipeline --phase complete

# Update status without checkpoint
status update ./tests --phase in_progress --no-checkpoint
```

### Recovery After Reset

After a context reset or session restart:

1. **Load the latest checkpoint**:
   ```bash
   checkpoint load
   ```

2. **Inspect per-directory statuses**:
   ```bash
   checkpoint report
   # or
   status dashboard
   ```

3. **Resume work** by finding active directories:
   ```bash
   checkpoint report --phase in_progress
   ```

4. **Read the local `STATUS.md`** in the target directory for task-level context.

### Example: Full Recovery Workflow

```bash
# 1. Load checkpoint to get context
checkpoint load
# Shows: Phase, Dependencies, Status Summary

# 2. See what's active
checkpoint report --phase in_progress
# Lists: Active directories with their progress

# 3. Pick up work in a directory
cd /opt/agent-governance/pipeline
cat STATUS.md
# Read activity log, pending tasks

# 4. Continue work and update status
status update . --task "Resumed: adding validation" --phase in_progress
# Auto-creates checkpoint delta

# 5. When done, mark complete
status update . --phase complete
# Creates final checkpoint
```

## Directory Exclusions

The following directories are excluded from status tracking:

- `__pycache__` - Python cache
- `node_modules` - Node.js dependencies
- `.git` - Version control
- `logs` - Runtime logs
- `storage` - Generated data
- `credentials` - Sensitive data
- `workspace` - Agent workspaces
- `dragonfly-data` - Database files
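
A sweep that honors this exclusion list could look roughly like the sketch below. It is not the implementation of `status sweep`: the function name `find_missing` is hypothetical, and the sketch checks every non-excluded directory because this document does not define which subdirectories count as significant.

```python
import os
import tempfile
from pathlib import Path

# Directory names excluded from status tracking (from the list above)
EXCLUDED = {
    "__pycache__", "node_modules", ".git", "logs",
    "storage", "credentials", "workspace", "dragonfly-data",
}
REQUIRED = ("README.md", "STATUS.md")


def find_missing(root: str) -> dict:
    """Map each tracked directory (relative to root) to the required files it lacks."""
    missing = {}
    for current, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into excluded directories
        dirnames[:] = sorted(d for d in dirnames if d not in EXCLUDED)
        absent = [name for name in REQUIRED if name not in filenames]
        if absent:
            missing[os.path.relpath(current, root)] = absent
    return missing


# Demo on a throwaway tree: pipeline/ lacks STATUS.md, logs/ is excluded
with tempfile.TemporaryDirectory() as root:
    for rel in ("README.md", "STATUS.md", "pipeline/README.md", "logs/run.log"):
        path = Path(root, rel)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("")
    result = find_missing(root)

print(result)  # {'pipeline': ['STATUS.md']}
```

Pruning `dirnames` in place is what keeps the walk out of excluded trees; filtering the results afterwards would still traverse `node_modules` and similar large directories.
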

## Best Practices

1. **Keep README.md Stable**: Update only when structure changes
2. **Update STATUS.md Frequently**: Log all significant activity
3. **Use Clear Task Descriptions**: Be specific about what needs doing
4. **Document Blockers**: Help others understand dependencies
5. **Timestamp Everything**: Enable tracking of staleness
6. **Link to Architecture**: Reference parent docs for context

## Integration with Checkpoints

The status system integrates with the checkpoint system:

```bash
# After major changes, create checkpoint
checkpoint now --notes "Updated pipeline status to complete"

# Status can inform checkpoint decisions
status dashboard   # See what's in progress
checkpoint now     # Save current state
```

---

*Part of the [Agent Governance System](/opt/agent-governance/docs/ARCHITECTURE.md)*