Decision A from reports/staffing/synthetic-data-gap-report.md §7.
Walks tests/multi-agent/scenarios/scen_*.json and
data/_playbook_lessons/*.json and normalizes them into a single Parquet
file at data/datasets/fill_events.parquet. One row per scenario event;
lesson outcomes are joined on (client, date) where a matching tuple exists.
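
A minimal sketch of the (client, date) join described above, not the actual
normalizer. The field names `client`, `date`, and `outcome`, and the
assumption of at most one lesson per (client, date), are hypothetical. Events
without a matching lesson are kept with an empty outcome, which is consistent
with the counts below (more rows than events with outcome data).

```python
def join_outcomes(events, lessons):
    """Left-join lesson outcomes onto scenario events by (client, date).

    Assumes (hypothetically) that each record is a dict with 'client' and
    'date' keys, and that lessons carry an 'outcome' key.
    """
    # Index lessons by their (client, date) tuple for O(1) lookup.
    by_key = {(l["client"], l["date"]): l.get("outcome") for l in lessons}

    joined = []
    for ev in events:
        row = dict(ev)  # copy so the input events are not mutated
        # None when no lesson matches; the event row is still emitted.
        row["outcome"] = by_key.get((ev["client"], ev["date"]))
        joined.append(row)
    return joined
```

Every event yields exactly one output row, so the row count depends only on
the scenarios, not on how many lessons happen to match.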
rows: 123
scenarios contributing: 40
events with outcome data: 62
unique (client, date) tuples: 40
Reproducibility: event_id is SHA1(client|date|role|at|city) truncated to
16 hex chars; rows are sorted by event_id before write, so re-runs produce
bit-identical output. Verified.
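
The ID scheme and sort can be sketched as follows. This is an illustration
under assumptions, not the shipped code: it assumes the five fields are
strings, joined with a literal "|" with no escaping, and UTF-8 encoded before
hashing. The sample values are made up.

```python
import hashlib


def event_id(client, date, role, at, city):
    """SHA1 of 'client|date|role|at|city', truncated to 16 hex chars.

    Assumption: fields are plain strings, joined with '|' and UTF-8 encoded.
    """
    key = "|".join([client, date, role, at, city])
    return hashlib.sha1(key.encode("utf-8")).hexdigest()[:16]


def sorted_rows(rows):
    """Attach event_id to each row and return rows ordered by it.

    Sorting on a content-derived key makes the output order independent of
    the order in which the source files were walked.
    """
    with_ids = [
        {**r, "event_id": event_id(r["client"], r["date"], r["role"], r["at"], r["city"])}
        for r in rows
    ]
    return sorted(with_ids, key=lambda r: r["event_id"])
```

Because the ID depends only on row content, feeding the same rows in any
order gives the same sequence. Bit-identical files additionally require the
Parquet writer itself to be deterministic for identical input.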
Pure normalization: no LLM, no new data, no mutation of the distillation
substrate.