root 41b0a99ed2 chore: add real content that was sitting untracked
Surfaced by today's untracked-files audit. None of these are accidents —
multiple are referenced by name in CLAUDE.md and memory files but were
never added.

Categories:
- docs/PHASE_AUDIT_GUIDE.md (106 LOC) — Claude Code phase audit guidance
- ops/systemd/lakehouse-langfuse-bridge.service — Langfuse bridge unit
- package.json — top-level npm manifest
- scripts/e2e_pipeline_check.sh + production_smoke.sh — real test scripts
- reports/kimi/audit-last-week*.md — the "Two reports live" CLAUDE.md cites
- tests/multi-agent/scenarios/ — 44 staffing scenarios (cutover decision A)
- tests/multi-agent/playbooks/ — 102 playbook records
- tests/battery/, tests/agent_test/PRD.md, tests/real-world/* — real tests
- sidecar/sidecar/{lab_ui,pipeline_lab}.py — 888 LOC dev-only UIs that
  remain in service post-sidecar-drop (commit ba928b1 explicitly kept them)

Sensitivity check: scenarios use synthetic company names ("Heritage Foods",
"Cornerstone Fabrication"); audit reports describe code findings only;
no PII or secrets surfaced.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 22:22:10 -05:00

# Scenario retrospective — Riverfront Steel, 2026-04-21
Executor: `mistral:latest` · Reviewer: `qwen2.5:latest` · Draft: `qwen2.5:latest`
## Events
| At | Kind | Role / Count | Pool | Fills | Turns | Dur(s) | Cites | Gaps |
|---|---|---|---|---|---|---|---|---|
| 08:00 | baseline_fill | Warehouse Associate × 3 | - | ✗ 0 | 0 | 28.9 | 0 | 1 |
| 10:30 | recurring | Machine Operator × 2 | 997 | ✓ 2 | 5 | 12.1 | 0 | 1 |
| 12:15 | expansion | Forklift Operator × 5 | - | ✗ 0 | 0 | 20.3 | 0 | 1 |
| 14:00 | emergency | Loader × 4 | - | ✗ 0 | 0 | 35.7 | 0 | 1 |
| 15:45 | misplacement | Warehouse Associate × 1 | - | ✗ 0 | 0 | 11.5 | 0 | 1 |
## Final roster
| Worker | Booked | Role | City, ST | Status |
|---|---|---|---|---|
| undefined Raymond G. Ward | 10:30 | Machine Operator | Toledo, OH | confirmed |
| undefined Pamela V. Green | 10:30 | Machine Operator | Toledo, OH | confirmed |
## Gap signals
### drift_or_tool
- **08:00** — invalid JSON from executor: JSON Parse error: Unable to parse JSON string | raw: {"kind":"plan","steps":["Verify one candidate from the current list using sql tool for SQL verification.","Propose_done with top 3 candidates who are Warehouse Associates in Toledo, OH."]}
{"kind":"tool_call","tool":"sql","args":{"query":"SELECT worker_id, name, role, city, state, availability FROM
- **12:15** — invalid JSON from executor: JSON Parse error: Expected ']' | raw: {"kind":"plan","steps":["1. Use hybrid_search to find Forklift Operators in Toledo, OH with high engagement and communications scores who are available for work at Riverfront Steel starting at 01:00 PM. The search should prioritize workers with team/collaboration signals (engagement, communications
- **14:00** — no consensus after 14 turns
- **15:45** — invalid JSON from executor: JSON Parse error: Expected '}' | raw: {"kind":"tool_call","tool":"hybrid_search", "args":{"index_name":"workers_500k_v1","sql_filter":"LOWER(role) LIKE '%warehouse%' AND city = 'Toledo' AND state = 'OH' AND availability > 0.5 AND shift = '08:00' AND worker_id NOT IN [, ] AND worker_id NOT IN ["EXCLUDE_WORKERS_ID1", "EXCLUDE_WORKERS_ID2"
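
The 08:00 raw payload appears to hold two JSON objects back to back (a `plan` followed by a `tool_call`), which a single parse call rejects. Below is a minimal Python sketch of how a harness could split such a stream, assuming concatenated objects should be accepted. `split_json_objects` is a hypothetical helper, not part of this project's code, and the sample payload is a shortened stand-in for the logged one. Truncated tails, as at 12:15 and 15:45, are still surfaced as an unparsed remainder rather than silently repaired.

```python
import json


def split_json_objects(raw: str):
    """Decode as many back-to-back JSON values as possible from `raw`.

    Returns (decoded_values, unparsed_remainder). A non-empty remainder
    means the tail was truncated or malformed.
    """
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    while pos < len(raw):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(raw) and raw[pos].isspace():
            pos += 1
        if pos >= len(raw):
            break
        try:
            value, end = decoder.raw_decode(raw, pos)
        except json.JSONDecodeError:
            return values, raw[pos:]
        values.append(value)
        pos = end
    return values, ""


# Shortened stand-in for the 08:00 payload: a complete plan,
# then a tool_call that is cut off mid-string.
raw = (
    '{"kind":"plan","steps":["verify one candidate"]}\n'
    '{"kind":"tool_call","tool":"sql","args":{"query":"SELECT worker_id FROM'
)
objects, remainder = split_json_objects(raw)
print([o["kind"] for o in objects])  # ['plan']
print(bool(remainder))               # True
```

With this approach the plan step survives and only the truncated `tool_call` is reported as a gap, which separates "two objects in one turn" from "output cut off".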
### double_book
- **10:30** — undefined Pamela V. Green already booked for 10:30
### fairness
- _cross-event_ — Raymond G. Ward (undefined) booked 2 times today
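
If the `undefined` shown next to each worker name is a missing worker ID (an assumption; the report does not say), then counting bookings by ID would collapse distinct workers into one key and could produce the `double_book` and `fairness` signals above. The Python sketch below illustrates that failure mode and one possible fallback. The field names `worker_id`, `name`, and `at`, and both helper functions, are hypothetical.

```python
from collections import Counter


def booking_key(booking: dict) -> str:
    """Prefer the worker ID; fall back to the name when the ID is missing."""
    worker_id = booking.get("worker_id")
    if worker_id in (None, "", "undefined"):
        return "name:" + booking["name"]
    return "id:" + str(worker_id)


def repeat_bookings(bookings: list) -> dict:
    """Return only the keys that were booked more than once."""
    counts = Counter(booking_key(b) for b in bookings)
    return {key: n for key, n in counts.items() if n > 1}


# The two roster rows from this run, with the ID assumed to be missing.
roster = [
    {"worker_id": None, "name": "Raymond G. Ward", "at": "10:30"},
    {"worker_id": None, "name": "Pamela V. Green", "at": "10:30"},
]

# Keying on the raw ID merges both workers into a single bucket.
naive = Counter(str(b["worker_id"]) for b in roster)
print(naive)                    # Counter({'None': 2})

# Keying with the name fallback keeps them distinct.
print(repeat_bookings(roster))  # {}
```

Falling back to the name is only a stopgap, since names are not unique; the sturdier fix would be to ensure the ID is populated before the roster is written.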
### write_through_audit
- _post-run_ — playbook_memory has 33 entries (ran 5 events, expected ≥ 1 new entries from this run)
## Narrative
- 1/5 events reached consensus.
- Final roster: 2 bookings across 1 distinct workers.
- Playbook citations across the day: 0 (no evidence that the feedback loop fired across events).
- Dropped events: 08:00 baseline_fill, 12:15 expansion, 14:00 emergency, 15:45 misplacement.
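
The consensus ratio and the dropped-event list can be recomputed from the Events table. The Python sketch below does so under the assumption that an event "reached consensus" exactly when its Fills count is greater than zero; the `summarize` helper and the dictionary field names are hypothetical.

```python
# Rows transcribed from the Events table (time, kind, fills).
events = [
    {"at": "08:00", "kind": "baseline_fill", "fills": 0},
    {"at": "10:30", "kind": "recurring", "fills": 2},
    {"at": "12:15", "kind": "expansion", "fills": 0},
    {"at": "14:00", "kind": "emergency", "fills": 0},
    {"at": "15:45", "kind": "misplacement", "fills": 0},
]


def summarize(rows: list) -> tuple:
    """Return (consensus_ratio, dropped_event_labels).

    Assumption: an event counts as having reached consensus
    when at least one position was filled.
    """
    filled = [r for r in rows if r["fills"] > 0]
    dropped = [f'{r["at"]} {r["kind"]}' for r in rows if r["fills"] == 0]
    return f"{len(filled)}/{len(rows)}", dropped


ratio, dropped = summarize(events)
print(ratio)    # 1/5
print(dropped)  # ['08:00 baseline_fill', '12:15 expansion', '14:00 emergency', '15:45 misplacement']
```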