First retrieval probe with non-synthetic query distribution. Pulls
N rows from /home/profit/lakehouse/data/datasets/fill_events.parquet
(real-shape demand data) and translates each to the natural language
a coordinator would type: "Need {count} {role}s in {city} {state}
starting at {at} for {client}".
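A minimal sketch of the row-to-query translation described above. The FillEvent struct and its field names are hypothetical (the real parquet columns are not shown here), and the singular/plural handling and optional deadline suffix are inferred from the generated queries rather than taken from gen_real_queries.go.

```go
package main

import (
	"fmt"
	"strings"
)

// FillEvent is a hypothetical stand-in for one row of fill_events.parquet.
// Field names are assumptions for illustration only.
type FillEvent struct {
	Client   string
	Count    int
	Role     string
	City     string
	State    string
	At       string // start time, e.g. "09:00"
	Deadline string // optional; empty when the row carries no deadline
}

// FormatQuery renders an event as the sentence a coordinator might type.
func FormatQuery(e FillEvent) string {
	role := e.Role
	if e.Count != 1 {
		role += "s" // naive pluralization, assumed from the sample queries
	}
	var b strings.Builder
	fmt.Fprintf(&b, "Need %d %s in %s %s starting at %s for %s",
		e.Count, role, e.City, e.State, e.At, e.Client)
	if e.Deadline != "" {
		fmt.Fprintf(&b, ", deadline %s", e.Deadline)
	}
	return b.String()
}

func main() {
	fmt.Println(FormatQuery(FillEvent{
		Client: "Beacon Freight", Count: 1, Role: "Forklift Operator",
		City: "Detroit", State: "MI", At: "15:00", Deadline: "2026-05-28",
	}))
}
```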
Headline: on the real distribution, cold-pass top-1 matched the judge-best result on 8 of 10 queries.
Substrate works on queries it was never trained for. v2-moe + workers
corpus carry the load.
Surfaced finding (the real value of running this): same-client+city
queries cluster, and Shape A's distance boost bleeds across roles
within the cluster. Q#2 (Forklift @ Beacon Freight Detroit) records
e-6193 in the playbook corpus. Q#5 (Pickers same client+city) and
Q#10 (CNC Operator same client+city) inherit e-6193 at warm top-1
even though:
- Neither query has its own recorded playbook.
- Neither warm pass triggers a Shape B inject (boosted=0).
- The roles are different staffing categories.
Q#10 specifically demoted w-3759, which was correct on the cold pass
(judge rating 4 at rank 0), in favor of a worker the judge had
approved for a different role on a different query.
Why the lift suite missed it: synthetic queries use 7 disjoint
scenario buckets (forklift+OSHA+WI / CDL+IL / etc.). Real demand
clusters on (client, city). The cluster doesn't exist in the
synthetic distribution.
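One way to see the cluster is to group the generated queries by (client, city). The sketch below uses a subset of the rows in real_coord_queries.txt; the Query and ClusterKey types are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// Query holds the parsed fields of one generated coordinator query.
// Hypothetical type, not part of the harness.
type Query struct {
	ID     int
	Role   string
	City   string
	Client string
}

// ClusterKey is the (client, city) pair that real demand groups on.
type ClusterKey struct {
	Client string
	City   string
}

// ClusterByClientCity groups query IDs by their (client, city) key,
// preserving input order within each group.
func ClusterByClientCity(qs []Query) map[ClusterKey][]int {
	out := make(map[ClusterKey][]int)
	for _, q := range qs {
		k := ClusterKey{Client: q.Client, City: q.City}
		out[k] = append(out[k], q.ID)
	}
	return out
}

func main() {
	qs := []Query{
		{ID: 1, Role: "Warehouse Associate", City: "Kansas City", Client: "Parallel Machining"},
		{ID: 2, Role: "Forklift Operator", City: "Detroit", Client: "Beacon Freight"},
		{ID: 5, Role: "Picker", City: "Detroit", Client: "Beacon Freight"},
		{ID: 8, Role: "Packer", City: "Flint", Client: "Parallel Machining"},
		{ID: 10, Role: "CNC Operator", City: "Detroit", Client: "Beacon Freight"},
	}
	// Report only keys shared by more than one query.
	for k, ids := range ClusterByClientCity(qs) {
		if len(ids) > 1 {
			fmt.Printf("%s / %s: queries %v\n", k.Client, k.City, ids)
		}
	}
}
```

On this subset the only multi-query key is Beacon Freight / Detroit (Q#2, Q#5, Q#10), each with a different role.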
Why the judge gate doesn't catch it: the gate (5a3364f) is
per-injection at record time. After approval the worker rides Shape A
distance boosts on all later same-cluster queries with no second
gate call.
Becomes new OPEN #1. Fix candidate: role-scoped playbook corpus
metadata + Shape A boost gate on role match. Cheap; doesn't need
new judge calls.
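A rough sketch of what the role gate could look like. Everything here is hypothetical: the PlaybookEntry fields, the GatedBoost name, and the 0.15 boost value are placeholders, and how Shape A actually applies its boost to the distance is not shown in this commit. The only idea taken from the fix candidate is that the boost is withheld when the recorded role differs from the query role.

```go
package main

import (
	"fmt"
	"strings"
)

// PlaybookEntry is a hypothetical playbook record that carries the role
// it was approved for (the proposed role-scoped metadata).
type PlaybookEntry struct {
	ID     string
	Client string
	City   string
	Role   string
}

// sameRole compares roles case-insensitively, ignoring surrounding spaces.
func sameRole(a, b string) bool {
	return strings.EqualFold(strings.TrimSpace(a), strings.TrimSpace(b))
}

// GatedBoost returns the full boost when the entry's role matches the
// query's role, and zero otherwise. No judge call is involved.
func GatedBoost(e PlaybookEntry, queryRole string, boost float64) float64 {
	if sameRole(e.Role, queryRole) {
		return boost
	}
	return 0
}

func main() {
	e := PlaybookEntry{ID: "e-6193", Client: "Beacon Freight", City: "Detroit", Role: "Forklift Operator"}
	// 0.15 is an illustrative boost value, not a harness setting.
	for _, role := range []string{"Forklift Operator", "Picker", "CNC Operator"} {
		fmt.Printf("%-18s boost=%.2f\n", role, GatedBoost(e, role, 0.15))
	}
}
```

Under this gate the Q#5 and Q#10 roles would receive no boost from an entry recorded for Forklift Operator, while a repeat of Q#2 would still benefit.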
Files:
- scripts/cutover/gen_real_queries.go: parquet → coordinator NL
- tests/reality/real_coord_queries.txt: 10 generated queries
- reports/reality-tests/playbook_lift_real_001.md: harness output
- reports/reality-tests/real_001_findings.md: interpretation of the results
Repro:
    go run scripts/cutover/gen_real_queries.go -limit 10 > tests/reality/real_coord_queries.txt
    QUERIES_FILE=tests/reality/real_coord_queries.txt RUN_ID=real_001 \
      WITH_PARAPHRASE=0 WITH_REJUDGE=0 ./scripts/playbook_lift.sh
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
19 lines
1.2 KiB
Plaintext
# Real-shape coordinator queries — generated from fill_events.parquet
# (real-shape demand data; queries built mechanically from event rows).
# Source: /home/profit/lakehouse/data/datasets/fill_events.parquet (123 rows total, 10 emitted)
#
# Format: client + count + role + city/state + start time +
# (optional deadline). Mimics the natural language a coordinator would
# type into a dispatch tool when triaging the next-up demand.

Need 5 Warehouse Associates in Kansas City MO starting at 09:00 for Parallel Machining
Need 1 Forklift Operator in Detroit MI starting at 15:00 for Beacon Freight, deadline 2026-05-28
Need 4 Loaders in Indianapolis IN starting at 12:00 for Midway Distribution
Need 3 Warehouse Associates in Fort Wayne IN starting at 17:30 for Cornerstone Fabrication, deadline 2026-05-17
Need 4 Pickers in Detroit MI starting at 13:30 for Beacon Freight, deadline 2026-05-28
Need 2 Packers in Joliet IL starting at 09:30 for Parallel Machining
Need 3 Assemblers in Flint MI starting at 08:30 for Heritage Foods
Need 3 Packers in Flint MI starting at 12:30 for Parallel Machining
Need 1 Shipping Clerk in Flint MI starting at 17:00 for Pioneer Assembly
Need 1 CNC Operator in Detroit MI starting at 17:30 for Beacon Freight, deadline 2026-05-28