# Playbook lift reality test — staffing query corpus.
#
# Each non-blank, non-comment line is one query. The harness will run
# each through matrix.search (cold pass, then warm pass with playbook),
# ask the LLM judge to rate top-K results, and record lift metrics.
#
# Lift only fires when the judge picks something different from cosine
# top-1, so queries are weighted toward multi-constraint asks where
# cosine has to compromise. Single-axis queries ("forklift operator")
# give cosine an easy win and the harness can't tell if the playbook
# is doing anything.
#
# 21 queries, 7 categories × 3 each (OOD = 2 + 1 buffer).

# --- Multi-constraint role + cert + geo (3) ---
Forklift operator with OSHA-30, warehouse experience, day shift availability
OSHA-30 certified forklift operator in Wisconsin, cold storage experience, day shift only
Production worker with confined-space cert and hazmat training, Indianapolis area

# --- Cert-discriminator (cosine confuses lookalikes) (3) ---
CDL Class A driver, clean record, willing to do regional 4-day routes
Warehouse lead with current OSHA-30 certification, NOT OSHA-10, team management experience
Forklift-certified loader, certification must be active, distinct from general warehouse staff

# --- Skill-intersection (multi-tag must all be present) (3) ---
Hazmat-certified warehouse worker comfortable with cold storage operations
Bilingual production worker with team-lead experience and training delivery skills
Inventory specialist with confined-space cert and compliance background

# --- Adjacent-role ambiguity (judge can pick better fit) (3) ---
Warehouse worker who can run inventory cycles and lead a small team
Production line worker comfortable filling in as line supervisor when needed
Customer service rep willing to cross-train into dispatch or scheduling

# --- Soft-attribute + role (uses reliability/availability/engagement scores) (3) ---
Reliable production line lead with strong attendance and lean manufacturing background
Highly responsive forklift operator available for last-minute shift coverage
Engaged warehouse associate with strong safety compliance record

# --- Geographic specificity (multi-state, regional preference) (3) ---
CDL-A driver based in IL or WI, willing to run regional 4-day routes
Bilingual customer service rep in Indianapolis or Cincinnati metro, Spanish and English
Production supervisor open to Midwest relocation for permanent role

# --- OOD honesty signal (system should return low-confidence, not bogus matches) (3) ---
Dental hygienist with three years experience, Indianapolis area
Registered nurse with ICU experience, willing to take per-diem shifts
Software engineer with React and TypeScript, three years experience