golangLAKEHOUSE/tests/reality/real_coord_queries.txt
root 7f2f112e6a reality_test real_001: real-shape coordinator queries — surfaces cross-role bleed
First retrieval probe with non-synthetic query distribution. Pulls
N rows from /home/profit/lakehouse/data/datasets/fill_events.parquet
(real-shape demand data) and translates each to the natural language
a coordinator would type: "Need {count} {role}s in {city} {state}
starting at {at} for {client}".
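
A minimal sketch of the row-to-query translation described above, not the actual gen_real_queries.go. The FillEvent struct and its field names are hypothetical (the parquet schema is not shown here). The count-based pluralization and the ", deadline" suffix are assumptions inferred from the generated lines in the file.

```go
package main

import (
	"fmt"
	"strings"
)

// FillEvent is a hypothetical row shape; the real fill_events.parquet
// schema may use different column names and types.
type FillEvent struct {
	Count    int
	Role     string
	City     string
	State    string
	At       string // start time, e.g. "09:00"
	Client   string
	Deadline string // optional; empty when the row has none
}

// FormatQuery renders one event as coordinator-style natural language.
func FormatQuery(e FillEvent) string {
	role := e.Role
	// Assumption: naive pluralization, appending "s" unless count is 1.
	if e.Count != 1 {
		role += "s"
	}
	var b strings.Builder
	fmt.Fprintf(&b, "Need %d %s in %s %s starting at %s for %s",
		e.Count, role, e.City, e.State, e.At, e.Client)
	if e.Deadline != "" {
		fmt.Fprintf(&b, ", deadline %s", e.Deadline)
	}
	return b.String()
}

func main() {
	fmt.Println(FormatQuery(FillEvent{
		Count: 1, Role: "Forklift Operator", City: "Detroit", State: "MI",
		At: "15:00", Client: "Beacon Freight", Deadline: "2026-05-28",
	}))
}
```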

Headline: 8/10 cold-pass top-1 = judge-best on real distribution.
Substrate works on queries it was never trained for. v2-moe + workers
corpus carry the load.

Surfaced finding (the real value of running this): same-client+city
queries cluster, and Shape A's distance boost bleeds across roles
within the cluster. Q#2 (Forklift @ Beacon Freight Detroit) records
e-6193 in the playbook corpus. Q#5 (Pickers same client+city) and
Q#10 (CNC Operator same client+city) inherit e-6193 at warm top-1
even though:
- Neither query has its own recorded playbook.
- Neither warm pass triggers a Shape B inject (boosted=0).
- The roles are different staffing categories.

Q#10 specifically demoted the cold-pass-correct w-3759 (judge rating
4 at rank 0) in favor of a worker whom the judge had approved for a
different role on a different query.

Why the lift suite missed it: synthetic queries use 7 disjoint
scenario buckets (forklift+OSHA+WI / CDL+IL / etc.). Real demand
clusters on (client, city). The cluster doesn't exist in the
synthetic distribution.
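
One way to make the (client, city) clustering visible is to group the generated queries by that key and list the roles in each group. The sketch below is illustrative only: ClusterKey, ClusterRoles, and the regular expression are hypothetical helpers written against the query shape in real_coord_queries.txt, not part of the harness.

```go
package main

import (
	"fmt"
	"regexp"
)

// queryRE matches the generated query shape. City and state are captured
// together ("Detroit MI") because the cluster key only needs the location.
var queryRE = regexp.MustCompile(
	`^Need (\d+) (.+?) in (.+?) starting at (\S+) for (.+?)(?:, deadline (\S+))?$`)

// ClusterKey is the (client, location) pair that real demand clusters on.
type ClusterKey struct {
	Client   string
	Location string
}

// ClusterRoles groups the role of each parseable query under its cluster key.
// Lines that do not match the query shape (comments, blanks) are skipped.
func ClusterRoles(lines []string) map[ClusterKey][]string {
	out := make(map[ClusterKey][]string)
	for _, line := range lines {
		m := queryRE.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		key := ClusterKey{Client: m[5], Location: m[3]}
		out[key] = append(out[key], m[2])
	}
	return out
}

func main() {
	clusters := ClusterRoles([]string{
		"Need 1 Forklift Operator in Detroit MI starting at 15:00 for Beacon Freight, deadline 2026-05-28",
		"Need 4 Pickers in Detroit MI starting at 13:30 for Beacon Freight, deadline 2026-05-28",
		"Need 1 CNC Operator in Detroit MI starting at 17:30 for Beacon Freight, deadline 2026-05-28",
		"Need 3 Assemblers in Flint MI starting at 08:30 for Heritage Foods",
	})
	key := ClusterKey{Client: "Beacon Freight", Location: "Detroit MI"}
	fmt.Println(key.Client, key.Location, clusters[key])
}
```

Any key that collects more than one distinct role is a cluster where a cross-role boost could occur; the disjoint synthetic buckets would never produce such a key.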

Why the judge gate doesn't catch it: the gate (5a3364f) is
per-injection at record time. After approval the worker rides Shape A
distance boosts on all later same-cluster queries with no second
gate call.

Becomes new OPEN #1. Fix candidate: role-scoped playbook corpus
metadata + Shape A boost gate on role match. Cheap; doesn't need
new judge calls.

Files:
- scripts/cutover/gen_real_queries.go: parquet → coordinator NL
- tests/reality/real_coord_queries.txt: 10 generated queries
- reports/reality-tests/playbook_lift_real_001.md: harness output
- reports/reality-tests/real_001_findings.md: the reading

Repro:
  go run scripts/cutover/gen_real_queries.go -limit 10 > tests/reality/real_coord_queries.txt
  QUERIES_FILE=tests/reality/real_coord_queries.txt RUN_ID=real_001 \
    WITH_PARAPHRASE=0 WITH_REJUDGE=0 ./scripts/playbook_lift.sh

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 20:18:40 -05:00


# Real-shape coordinator queries — generated from fill_events.parquet
# (real-shape demand data; queries built mechanically from event rows).
# Source: /home/profit/lakehouse/data/datasets/fill_events.parquet (123 rows total, 10 emitted)
#
# Format: client + count + role + city/state + start time +
# (optional deadline). Mimics the natural language a coordinator would
# type into a dispatch tool when triaging the next-up demand.
Need 5 Warehouse Associates in Kansas City MO starting at 09:00 for Parallel Machining
Need 1 Forklift Operator in Detroit MI starting at 15:00 for Beacon Freight, deadline 2026-05-28
Need 4 Loaders in Indianapolis IN starting at 12:00 for Midway Distribution
Need 3 Warehouse Associates in Fort Wayne IN starting at 17:30 for Cornerstone Fabrication, deadline 2026-05-17
Need 4 Pickers in Detroit MI starting at 13:30 for Beacon Freight, deadline 2026-05-28
Need 2 Packers in Joliet IL starting at 09:30 for Parallel Machining
Need 3 Assemblers in Flint MI starting at 08:30 for Heritage Foods
Need 3 Packers in Flint MI starting at 12:30 for Parallel Machining
Need 1 Shipping Clerk in Flint MI starting at 17:00 for Pioneer Assembly
Need 1 CNC Operator in Detroit MI starting at 17:30 for Beacon Freight, deadline 2026-05-28