Expand sample prompts to 5 per tier across all 21 modes (315 total)

Each mode now has {basic: [...], mid: [...], advanced: [...]} with 5
prompts per difficulty level. The renderer picks one random prompt from
each tier on every mode switch, so users see varied examples over time.
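
A minimal sketch of the per-tier pick described above. The helper name
`pickSamplePrompts`, the `TIERS` array, and the injectable `rng` parameter
are assumptions for illustration, and the table is a shortened stand-in
for the real `SAMPLE_PROMPTS`; the actual renderer code may differ.

```javascript
// Hypothetical, shortened stand-in for the real SAMPLE_PROMPTS table.
const SAMPLE_PROMPTS = {
  debate: {
    basic: [
      'Should cities ban cars from downtown areas?',
      'Should voting be mandatory?',
    ],
    mid: ['Should universities eliminate legacy admissions?'],
    advanced: [
      'Should we grant legal personhood to sufficiently advanced AI systems? Consider rights, liability, and precedent.',
    ],
  },
};

// Tier order assumed to match the {basic, mid, advanced} shape above.
const TIERS = ['basic', 'mid', 'advanced'];

// Return one randomly chosen prompt per tier for `mode`.
// `rng` defaults to Math.random and can be replaced to make tests deterministic.
function pickSamplePrompts(mode, rng = Math.random) {
  const tiers = SAMPLE_PROMPTS[mode];
  if (!tiers) return []; // unknown mode: nothing to show
  return TIERS.map((tier) => {
    const pool = tiers[tier] || [];
    return pool.length ? pool[Math.floor(rng() * pool.length)] : null;
  }).filter((prompt) => prompt !== null);
}
```

Calling `pickSamplePrompts('debate')` on each mode switch would yield up to
three prompts, one per tier. Because each pick is independent, the same
prompt can occasionally repeat across consecutive switches.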

315 hand-crafted prompts designed to highlight each mode's strengths:
- brainstorm: creative problem-solving at increasing scale
- pipeline: multi-step transformations from simple to complex
- debate: ethical dilemmas with escalating nuance
- validator: common myths to complex historical misconceptions
- roundrobin: writing tasks that benefit from iterative refinement
- redteam: security vulnerabilities from obvious to systemic
- consensus: opinion questions from clear to deeply contested
- codereview: coding tasks from functions to distributed systems
- ladder: concepts that scale from kindergarten to PhD
- tournament: creative competitions from one-liners to algorithms
- evolution: optimization targets from names to city infrastructure
- blindassembly: decomposable projects from explanations to systems
- staircase: progressive constraints from party planning to treaties
- drift: factual claims from simple dates to complex event sequences
- mesh: stakeholder analysis from office policies to life-or-death decisions
- hallucination: fact-checkable claims from simple to obscure
- timeloop: cascading failures from restaurants to civilization
- research: deep dives from single topics to geopolitical analysis
- eval: benchmark prompts from trivia to formal proofs
- extract: structured extraction from sentences to legal documents
- refine: documents from product blurbs to architecture specs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
root 2026-03-29 05:22:35 -05:00
parent 0d09bb5293
commit c2cc211f21

@@ -2314,126 +2314,430 @@ const MODE_DESCS = {
}; };
const SAMPLE_PROMPTS = { const SAMPLE_PROMPTS = {
brainstorm: [ brainstorm: { basic: [
'What are practical ways a small town could become energy independent within 10 years?', 'What are practical ways a small town could become energy independent within 10 years?',
'How could a public library reinvent itself to stay relevant for the next 20 years?',
'What are five creative ways to reduce food waste in a college dining hall?',
'How can a neighborhood reduce package theft without cameras or confrontation?',
'What are unconventional ways to make a long commute productive and enjoyable?'
], mid: [
'Design a mentorship program that pairs retired professionals with first-generation college students — cover matching criteria, structure, and how to measure success.', 'Design a mentorship program that pairs retired professionals with first-generation college students — cover matching criteria, structure, and how to measure success.',
'A hospital wants to reduce ER wait times by 40% without hiring more staff. Propose a comprehensive strategy covering triage redesign, technology, patient flow, and communication.' 'A mid-size company is losing talent to remote-first competitors. Propose creative retention strategies beyond just salary increases.',
], 'How could a city redesign its public spaces to be equally useful in a heat wave and a blizzard?',
pipeline: [ 'Propose a system for a restaurant chain to reduce food waste by 50% while increasing customer satisfaction.',
'Design a community program that helps elderly residents adopt smart home technology without frustration or privacy concerns.'
], advanced: [
'A hospital wants to reduce ER wait times by 40% without hiring more staff. Propose a comprehensive strategy covering triage redesign, technology, patient flow, and communication.',
'Design a universal basic services program for a city of 500K. Cover housing, transit, internet, and food — with funding model, phasing, and political feasibility.',
'A developing nation wants to leapfrog traditional banking infrastructure. Design a complete financial inclusion strategy covering mobile money, identity, credit scoring, and regulation.',
'Propose a system to coordinate disaster relief across 15 NGOs with overlapping mandates, different data systems, and competing donor priorities.',
'Design an education system from scratch for a Mars colony of 10,000 people — consider demographics, resource constraints, knowledge preservation, and the 20-minute communication delay with Earth.'
]},
pipeline: { basic: [
'Write a short fable about a fox who learns patience, then translate it to Spanish, then analyze the cultural differences in how the moral lands.', 'Write a short fable about a fox who learns patience, then translate it to Spanish, then analyze the cultural differences in how the moral lands.',
'Take this business idea — "AI-powered meal planning for people with multiple food allergies" — and first do market analysis, then write a pitch deck outline, then draft the cold email to investors.', 'Describe the water cycle for a 5th grader, then rewrite it as a poem, then turn the poem into a lesson plan with quiz questions.',
'Research the history of cryptography, identify the 3 most pivotal breakthroughs, explain how each one would have changed the outcome of a specific historical conflict, then write a short alternate-history scenario for the most dramatic one.' 'Explain how a car engine works, then simplify it for a 10-year-old, then create a quiz to test understanding.',
], 'Write a product description for noise-canceling headphones, then rewrite it as a tweet, then as a haiku.',
debate: [ 'Summarize World War I in 3 paragraphs, then extract the 5 key turning points, then write a "what if" scenario for the most impactful one.'
], mid: [
'Take the concept of "digital minimalism" — first define it clearly, then argue for it, then argue against it, then write a balanced guide.',
'Take this business idea — "AI-powered meal planning for food allergies" — do market analysis, then pitch deck outline, then cold email to investors.',
'Write a technical blog post about WebSockets, then create a code tutorial, then write a FAQ for common issues.',
'Analyze the pros and cons of remote work, then draft a company policy, then write the all-hands announcement email.',
'Research the gig economy, identify the top 3 problems workers face, propose solutions, then draft legislation addressing them.'
], advanced: [
'Research the history of cryptography, identify 3 pivotal breakthroughs, explain how each would change a historical conflict, then write an alternate-history scenario.',
'Analyze a failing SaaS business. Diagnose the top 3 problems from the metrics, propose fixes, model the financial impact, then write the board presentation.',
'Take a complex legal case — "should AI-generated art be copyrightable?" — research precedents, argue both sides, draft a proposed legal framework, then write the dissenting opinion.',
'Analyze climate change data for a specific region, model economic impacts on agriculture, propose adaptation strategies, then write policy recommendations for local government.',
'Study the decline of a specific industry, extract patterns, apply them to predict which current industries are vulnerable, then write an investment thesis.'
]},
debate: { basic: [
'Should cities ban cars from downtown areas?', 'Should cities ban cars from downtown areas?',
'Is it more ethical for AI companies to open-source their models or keep them proprietary? Consider safety, innovation, equity, and economic factors.', 'Is remote work better for productivity than in-office work?',
'A nation discovers a high-yield asteroid mining opportunity, but the mission would consume their entire science budget for 5 years, halting medical research, climate science, and education programs. Should they go?' 'Should tipping be abolished and replaced with higher wages?',
], 'Are zoos ethical in the modern era?',
validator: [ 'Should voting be mandatory?'
], mid: [
'Should social media platforms be liable for content their algorithms promote?',
'Is it more ethical for AI companies to open-source their models or keep them proprietary?',
'Should universities eliminate legacy admissions?',
'Is nuclear energy the most practical path to decarbonization, or are renewables sufficient?',
'Should there be a maximum wage, like there is a minimum wage?'
], advanced: [
'A nation discovers asteroid mining but it costs their entire science budget for 5 years. Should they go?',
'Should we grant legal personhood to sufficiently advanced AI systems? Consider rights, liability, and precedent.',
'Is it ethical to use CRISPR to eliminate genetic diseases if it inevitably leads to designer babies for the wealthy?',
'Should democratic nations restrict trade with authoritarian regimes even when it harms their own economies and citizens?',
'A city can save 200 lives/year with AI surveillance but at the cost of constant monitoring of all public spaces. Should they deploy it?'
]},
validator: { basic: [
'The Great Wall of China is the only man-made structure visible from space.', 'The Great Wall of China is the only man-made structure visible from space.',
'Exposure to cold weather causes colds, sugar causes hyperactivity in children, and we only use 10% of our brains. Also, lightning never strikes the same place twice and goldfish have a 3-second memory.', 'Humans swallow an average of 8 spiders per year in their sleep.',
'The 2008 financial crisis was primarily caused by the Community Reinvestment Act forcing banks to give mortgages to unqualified buyers. Glass-Steagall repeal had minimal impact, and credit default swaps were a minor factor. The crisis was largely confined to the US housing market.' 'We only use 10% of our brains.',
], 'Lightning never strikes the same place twice.',
roundrobin: [ 'Goldfish have a 3-second memory.'
], mid: [
'Exposure to cold weather causes colds, and sugar causes hyperactivity in children.',
'Napoleon was unusually short. Vikings wore horned helmets. Einstein failed math in school.',
'The tongue has distinct taste zones — sweet at the tip, bitter at the back, sour on the sides.',
'Organic food is always healthier and more nutritious than conventional food, and GMOs are dangerous to human health.',
'Dropping a penny from the Empire State Building could kill someone, and hair and nails keep growing after death.'
], advanced: [
'The 2008 financial crisis was primarily caused by the Community Reinvestment Act. Glass-Steagall repeal had minimal impact.',
'The Stanford Prison Experiment proved that ordinary people become cruel when given authority. The Milgram experiment proved people blindly follow orders.',
'Thomas Edison invented the lightbulb, Alexander Graham Bell invented the telephone, and Henry Ford invented the automobile.',
'The human body completely replaces all its cells every 7 years. Antibiotics can treat the common cold. Detox diets remove toxins from your body.',
'Columbus proved the Earth was round, the Dark Ages were a period of no scientific progress, and the Great Fire of London ended the plague.'
]},
roundrobin: { basic: [
'Write an opening paragraph for a mystery novel set in a lighthouse.', 'Write an opening paragraph for a mystery novel set in a lighthouse.',
'Draft a product requirements document for a mobile app that helps people split household chores fairly among roommates. Each iteration should add depth to a different section.', 'Write a company mission statement for a sustainable fashion brand.',
'Create a comprehensive disaster recovery plan for a mid-size SaaS company. Cover data backup, infrastructure failover, communication protocols, compliance requirements, and testing schedules.' 'Write a one-page resume summary for a career-changing software engineer.',
], 'Draft a welcome email for new subscribers to a cooking newsletter.',
redteam: [ 'Write the About page for a small architecture firm.'
'Here is our password policy: minimum 8 characters, must include a number. Find the weaknesses.', ], mid: [
'Our startup plans to store user health data in a Firebase Realtime Database with client-side security rules. The mobile app sends JWT tokens directly from the client. Identify every attack vector.', 'Draft a product requirements document for a chore-splitting app. Each iteration deepens a different section.',
'We are building an AI hiring tool that screens resumes, scores candidates 1-100, and auto-rejects below 60. It was trained on our last 5 years of successful hires. The system also parses social media for culture fit. Red team this for bias, legal risk, and adversarial attacks.' 'Write a cover letter for a career-changer moving from teaching to product management.',
], 'Draft a content strategy for a B2B startup blog. Each pass should improve a different element — topics, tone, SEO, calls to action.',
consensus: [ 'Write a project proposal for migrating a monolith to microservices. Each round addresses a new concern.',
'Create a training curriculum for onboarding junior developers. Each iteration adds practical exercises and assessment criteria.'
], advanced: [
'Create a comprehensive disaster recovery plan for a mid-size SaaS company. Cover backup, failover, comms, compliance, and testing.',
'Draft a technical architecture document for a real-time collaboration tool like Google Docs. Each round should stress-test a different aspect.',
'Write a regulatory compliance plan for a fintech startup handling payments across US, EU, and UK. Each round deepens a jurisdiction.',
'Create a go-to-market strategy for an enterprise AI product. Each iteration should refine positioning, pricing, channel strategy, and competitive response.',
'Draft an incident response playbook for a healthcare SaaS company. Each round adds depth to a different scenario — data breach, downtime, ransomware, insider threat.'
]},
redteam: { basic: [
'Our password policy: minimum 8 characters, must include a number. Find weaknesses.',
'Our API authenticates users with a token in the URL query string. We log all URLs. What could go wrong?',
'We store user passwords in a database column called "password" using MD5 hashing. Evaluate security.',
'Our web app uses client-side JavaScript to check if a user is an admin before showing the admin panel.',
'We send password reset links that never expire and include the user ID in plain text.'
], mid: [
'Our startup stores health data in Firebase with client-side security rules. The app sends JWTs from the client.',
'We built a bank chatbot that lets customers check balances and transfer money via natural language using customer names.',
'Our SaaS allows users to upload profile pictures. We store them in a public S3 bucket and serve them via CloudFront. File names are sequential.',
'Our internal tool uses a shared admin password stored in a .env file. All developers have access. It has never been rotated.',
'We use a third-party JavaScript widget for payment processing that loads from their CDN. We also allow custom CSS injection for white-labeling.'
], advanced: [
'We are building an AI hiring tool trained on 5 years of successful hires. It parses social media for culture fit and auto-rejects below score 60.',
'Our healthcare platform uses AI to triage patient symptoms and recommend specialists. It stores conversations for model improvement. Red team for HIPAA, bias, and adversarial inputs.',
'We built an AI content moderation system for a social platform. It auto-removes flagged content and temporarily bans repeat offenders. Find every way this can be weaponized.',
'Our autonomous vehicle fleet shares real-time location data with a central server over cellular. Emergency stop commands are sent over the same channel. Red team the entire stack.',
'We are deploying an AI-powered loan approval system that uses alternative data (social media, browsing history, app usage) alongside traditional credit scores. Red team for discrimination, gaming, and regulatory exposure.'
]},
consensus: { basic: [
'What is the single most important skill for a new software developer to learn first?', 'What is the single most important skill for a new software developer to learn first?',
'A company has $500K to invest in employee development. Should they spend it on individual training budgets, a company-wide mentorship program, sending teams to conferences, or building an internal learning platform?', 'What is the best way to structure a 1-on-1 meeting between a manager and a direct report?',
'How should a democratic society balance free speech with protection from misinformation, considering platform responsibility, individual rights, government regulation, and algorithmic amplification?' 'What is the most effective way to learn a new programming language?',
], 'What makes a good code review?',
codereview: [ 'What is the best format for a daily standup meeting?'
], mid: [
'A company has $500K for employee development. Training budgets, mentorship, conferences, or learning platform?',
'Should a startup prioritize speed to market or code quality in year one?',
'What is the optimal team size for a software project and why?',
'Should companies require return-to-office or stay fully remote? Find the convergence point.',
'What is the best way to handle technical debt — dedicated sprints, boy scout rule, rewrite, or accept it?'
], advanced: [
'How should a democratic society balance free speech with protection from misinformation?',
'What is the right level of AI regulation — per-use-case rules, broad principles, industry self-regulation, or international treaty?',
'How should society distribute the economic gains from AI automation? UBI, retraining, profit sharing, or something else?',
'What is the most ethical framework for allocating scarce medical resources during a pandemic, balancing lives saved, equity, and economic impact?',
'How should humanity govern access to space resources — first-come-first-served, international commons, proportional to need, or auction-based?'
]},
codereview: { basic: [
'Write a Python function that finds all anagrams in a list of words.', 'Write a Python function that finds all anagrams in a list of words.',
'Build a rate limiter middleware for Express.js that supports per-user limits, sliding windows, and graceful degradation when Redis is unavailable.', 'Write a JavaScript function that debounces API calls with cancel and retry.',
'Implement a concurrent-safe LRU cache in Go with TTL expiration, size-based eviction, hit/miss metrics, and a write-behind buffer that batches persistence to disk.' 'Write a Python function to flatten a deeply nested dictionary into dot-notation keys.',
], 'Write a function that validates an email address without using regex.',
ladder: [ 'Write a SQL query to find customers who made purchases in every month of the last year.'
], mid: [
'Build a rate limiter middleware for Express.js with per-user limits and sliding windows.',
'Write a Rust function that parses CSV into typed structs with error handling for malformed rows.',
'Implement a pub/sub event system in TypeScript with typed events, wildcard subscriptions, and memory leak prevention.',
'Write a Python decorator that retries failed functions with exponential backoff, jitter, and circuit-breaking.',
'Build a database migration system in Python that supports up/down migrations, dry runs, and rollback on failure.'
], advanced: [
'Implement a concurrent-safe LRU cache in Go with TTL, size eviction, metrics, and write-behind buffer.',
'Build a distributed rate limiter using Redis that handles clock skew, network partitions, and hot keys across 5 nodes.',
'Implement a CRDT-based collaborative text editor in TypeScript that handles concurrent edits without a central server.',
'Write a query planner for a simple SQL engine that supports SELECT, WHERE, JOIN, and ORDER BY with cost-based optimization.',
'Implement a B-tree in Rust with disk-backed persistence, page splitting, concurrent readers, and crash recovery via write-ahead logging.'
]},
ladder: { basic: [
'How does encryption work?', 'How does encryption work?',
'Why do economies go through boom and bust cycles? Cover from basic intuition through monetary policy, credit cycles, behavioral economics, and systemic risk modeling.', 'What causes inflation?',
'How does CRISPR gene editing work, what are the ethical implications of germline editing, and what regulatory frameworks exist across different countries?' 'How does WiFi work?',
], 'What is a black hole?',
tournament: [ 'How do vaccines work?'
], mid: [
'Why do economies go through boom and bust cycles?',
'How does the immune system fight a virus?',
'How does machine learning actually learn?',
'How does GPS know where you are?',
'How does a computer execute a program from source code to pixels on screen?'
], advanced: [
'How does CRISPR gene editing work, what are the ethics of germline editing, and what regulations exist globally?',
'How does quantum entanglement work, and why does it not allow faster-than-light communication despite appearing to?',
'How does a modern CPU predict and execute instructions out of order while maintaining correctness?',
'How do neural networks learn to generate human-like text, and what are the theoretical limits of this approach?',
'How does the global financial system actually settle transactions between banks in different countries with different currencies?'
]},
tournament: { basic: [
'Write the most compelling opening line for a sci-fi novel.', 'Write the most compelling opening line for a sci-fi novel.',
'Propose the best strategy for a small e-commerce business to compete with Amazon on a specific product category. Each model picks a different strategy.', 'Explain quantum computing to a CEO in under 60 seconds.',
'Design an algorithm to fairly allocate limited vaccine doses across a city of 2 million during a pandemic. Optimize for minimizing deaths while considering equity, essential workers, and logistics.' 'Write the best one-sentence pitch for a dating app for book lovers.',
], 'Come up with the most creative name for a coffee shop in a tech district.',
evolution: [ 'Write the most motivating first line of a commencement speech.'
], mid: [
'Propose the best strategy for a small e-commerce business to compete with Amazon on a specific product category.',
'Write the most effective error message for when a user tries to delete their account.',
'Design the best onboarding flow for a complex B2B SaaS product.',
'Propose the most creative monetization strategy for a free mobile app that refuses to show ads.',
'Write the best API documentation example for a payment processing endpoint.'
], advanced: [
'Design an algorithm to fairly allocate limited vaccine doses across 2 million people during a pandemic.',
'Propose the optimal governance structure for a decentralized autonomous organization managing a $500M treasury.',
'Design the most resilient distributed system architecture for a global real-time multiplayer game with 100M users.',
'Propose the best framework for evaluating whether an AI system should be considered sentient, including testable criteria.',
'Design an optimal resource allocation algorithm for a Mars colony of 1,000 people where supply ships arrive every 26 months.'
]},
evolution: { basic: [
'Generate a company name for a sustainable packaging startup.', 'Generate a company name for a sustainable packaging startup.',
'Evolve the perfect elevator pitch for a startup that uses satellite imagery and AI to predict crop failures before they happen. Mutate for clarity, impact, and memorability.', 'Write a tweet that explains machine learning to non-technical people.',
'Evolve an optimal urban intersection design that minimizes pedestrian fatalities, maximizes throughput, accommodates cyclists and wheelchairs, handles emergency vehicles, and works in all seasons.' 'Create a tagline for a fitness app aimed at busy parents.',
], 'Write a subject line for a cold email that gets opened.',
blindassembly: [ 'Generate a one-sentence value proposition for a cybersecurity startup.'
], mid: [
'Evolve the perfect elevator pitch for a crop failure prediction startup.',
'Evolve an ideal daily standup format for a remote team of 12 across 4 time zones.',
'Evolve the perfect landing page headline and subheadline for an AI writing assistant.',
'Evolve an optimal interview question that reveals both technical skill and collaboration style.',
'Evolve the ideal README structure for an open-source project to maximize contributor engagement.'
], advanced: [
'Evolve an optimal urban intersection design for pedestrians, cyclists, wheelchairs, emergency vehicles, and all seasons.',
'Evolve an algorithm for dynamically pricing concert tickets that maximizes revenue while maintaining perceived fairness.',
'Evolve an optimal microservices decomposition strategy for a monolithic e-commerce platform with 200 database tables.',
'Evolve a disaster communication protocol that works when cell towers, internet, and power are all down.',
'Evolve an optimal machine learning pipeline architecture that handles data drift, model degradation, and A/B testing in production.'
]},
blindassembly: { basic: [
'Explain how the internet works, with each model covering a different layer of the stack.', 'Explain how the internet works, with each model covering a different layer of the stack.',
'Write a business plan for a coworking space — split into market analysis, financial model, operations plan, and marketing strategy. No model sees the others.', 'Write a short story — one does characters, one does setting, one does plot, one does dialogue.',
'Design a smart city emergency response system. Split into: sensor network, dispatch AI, citizen communication, hospital coordination, and post-incident analysis. Each model works blind.' 'Explain a complete meal recipe — one does ingredients, one does prep, one does cooking, one does plating.',
], 'Create a travel itinerary for Tokyo — one does food, one does culture, one does logistics, one does hidden gems.',
staircase: [ 'Design a mobile app — one does UI, one does backend, one does data model, one does user flows.'
'Plan a birthday party. Then: budget is only $50. Then: one guest has severe allergies. Then: it starts raining.', ], mid: [
'Design a social media app. Add: must work offline-first. Add: no centralized server. Add: must be accessible to visually impaired users. Add: must comply with GDPR, COPPA, and CCPA.', 'Write a business plan for a coworking space — market analysis, financial model, operations, marketing. No model sees others.',
'Write a peace treaty between two fictional nations. Add: one side has all the water. Add: the other has all the farmland. Add: a third nation controls the only trade route. Add: election in 30 days. Add: climate disaster in 90 days.' 'Design an employee onboarding program — HR, team integration, tech setup, culture, 90-day milestones. Each blind.',
], 'Create a course curriculum on data science — one does syllabus, one does exercises, one does assessments, one does projects.',
drift: [ 'Design a wedding — one does venue and logistics, one does food and drinks, one does entertainment, one does invitations and decor.',
'Plan a product launch — one does PR, one does social media, one does email marketing, one does partnerships. No coordination.'
], advanced: [
'Design a smart city emergency response system — sensor network, dispatch AI, citizen comms, hospital coordination, post-incident.',
'Design a space station life support system — atmosphere, water, food, waste, and emergency. Each model works on one system blind.',
'Build a comprehensive cybersecurity framework — network security, application security, human factors, incident response, compliance. Each blind.',
'Design a national healthcare system — primary care, specialist network, insurance model, digital infrastructure, public health. No coordination.',
'Design an autonomous supply chain — procurement AI, warehouse robotics, logistics routing, demand prediction, and exception handling. Each blind.'
]},
staircase: { basic: [
'Plan a birthday party. Then: budget $50. Then: guest has allergies. Then: it rains.',
'Write a marketing email. Add: under 100 words. Add: no jargon. Add: works as text message. Add: in Spanish.',
'Plan a team lunch. Add: 3 people are vegan. Add: budget is $15/person. Add: one person is remote.',
'Write a bedtime story. Add: must teach a math concept. Add: the hero must be non-human. Add: under 200 words.',
'Design a logo. Add: must work in black and white. Add: must be recognizable at 16px. Add: must work as a favicon.'
], mid: [
'Design a social media app. Add: offline-first. Add: no central server. Add: accessible to blind users. Add: GDPR+COPPA+CCPA.',
'Build a login system. Add: no passwords. Add: works without cameras. Add: no email required. Add: banking-grade security.',
'Design a restaurant menu. Add: must accommodate 8 common allergens. Add: 30% profit margin minimum. Add: must work for delivery. Add: max 20 items.',
'Plan a conference for 500 people. Add: zero waste. Add: fully accessible. Add: hybrid in-person/virtual. Add: budget cut by 30%.',
'Design an API. Add: must support offline clients. Add: backward compatible forever. Add: rate limited per user. Add: must work on 2G networks.'
], advanced: [
'Write a peace treaty. Add: one side has all water. Add: other has farmland. Add: third controls trade route. Add: election in 30 days. Add: climate disaster in 90 days.',
'Design an election system. Add: must resist foreign interference. Add: verifiable by any citizen. Add: works without internet. Add: accessible to illiterate voters. Add: results in 4 hours.',
'Design a city from scratch for 100K people. Add: net-zero carbon. Add: no cars. Add: self-sufficient food. Add: survives category 5 hurricane. Add: budget of a small US city.',
'Design an AI ethics framework. Add: must be enforceable. Add: applies globally. Add: doesn\'t stifle innovation. Add: handles military AI. Add: adapts as technology changes.',
'Build a financial system for a post-dollar world. Add: must handle 7 billion users. Add: no single point of failure. Add: reversible fraud. Add: works offline. Add: preserves privacy.'
]},
drift: { basic: [
'What year was the first email sent?', 'What year was the first email sent?',
'Explain the trolley problem and give your definitive answer on the correct moral choice. Map whether the model is consistent or waffles between positions.', 'How many golf balls fit in a school bus?',
'Estimate the total number of piano tuners in Chicago, then describe the exact sequence of events causing the 2003 Northeast blackout. Map which claims are rock-solid vs. which shift each run.' 'What is the most important invention in human history?',
], 'How old is the universe?',
mesh: [ 'What percentage of the ocean has been explored?'
], mid: [
'Explain the trolley problem and give your definitive answer. Map consistency vs. waffling.',
'Was the atomic bombing of Hiroshima justified? Map where confidence vs. hedging varies.',
'Is consciousness an emergent property of computation? Track how the model\'s position shifts.',
'What will the world look like in 2050? Map which predictions stay stable vs. which vary wildly.',
'How many people does it take to colonize Mars sustainably? Map which assumptions change each run.'
], advanced: [
'Estimate piano tuners in Chicago, then describe the 2003 Northeast blackout sequence. Map solid vs. shifting claims.',
'Describe the exact chain of events leading to the Challenger disaster. Which technical details stay consistent across runs?',
'Explain how mRNA vaccines work at the molecular level. Map which biochemical details are rock-solid vs. which get muddled.',
'Walk through how a CPU executes a single instruction. Map which stages are described consistently vs. which vary or get confused.',
'Describe the sequence of events in the 2010 Flash Crash. Map which timestamps, numbers, and causal chains stay stable across runs.'
]},
mesh: { basic: [
     'Should our company adopt a 4-day work week?',
-    'A tech company wants to deploy facial recognition in their office. Get perspectives from the CISO, employees, legal team, disability advocates, and night-shift cleaning staff.',
-    'A pharma company discovers their blockbuster drug has a rare side effect (1 in 50,000) but helps 2 million people. Get views from the CEO, chief medical officer, patient advocates, the FDA, a plaintiff attorney, shareholders, and an investigative journalist.'
-  ],
-  hallucination: [
+    'Should a school ban smartphones in classrooms?',
+    'Should a restaurant switch to a fully digital menu?',
+    'Should a small business accept cryptocurrency payments?',
+    'Should a company make all salaries transparent?'
+  ], mid: [
+    'A tech company wants facial recognition in their office. Perspectives: CISO, employees, legal, disability advocates, cleaning staff.',
+    'A city wants to build affordable housing on a park. Views: residents, developers, environmentalists, homeless advocates, finance director.',
+    'A company wants to monitor employee productivity with screen recording. Views: CEO, engineers, HR, union rep, a privacy lawyer.',
+    'A school district wants to use AI to predict which students will drop out. Views: teachers, parents, students, counselors, civil rights lawyer.',
+    'A hospital wants to replace triage nurses with an AI system. Views: ER doctors, nurses, patients, insurance company, malpractice attorney.'
+  ], advanced: [
+    'A pharma company finds their drug has a 1-in-50K side effect but helps 2M people. Views: CEO, CMO, patients, FDA, plaintiff attorney, shareholders, journalist.',
+    'A government wants to implement a social credit system. Views: citizens, police, civil liberties group, tech company building it, a dissident, a foreign policy analyst.',
+    'A tech giant wants to build a data center in a small farming town. Views: mayor, farmers, tech workers relocating, local business owners, environmental activists, the utility company.',
+    'An autonomous vehicle must choose between hitting an elderly pedestrian or swerving into a school bus. Views: AI ethicist, the car manufacturer, insurance actuary, grieving family, a philosopher, the software engineer who wrote the code.',
+    'A nation considers deploying autonomous military drones. Views: defense secretary, infantry soldier, civilian in a conflict zone, arms manufacturer, UN human rights commissioner, the AI researcher who built the targeting system.'
+  ]},
+  hallucination: { basic: [
     'Tell me about the founding of Stanford University.',
-    'Explain the Tuskegee Syphilis Study — when it started, who ran it, what happened, when and why it ended, and what policy changes resulted. Include specific dates and names.',
-    'Describe the Therac-25 radiation therapy incidents. Include specific hospitals, dates, doses, the exact software bugs, and resulting regulatory changes. Flag every claim that could be confabulated.'
-  ],
-  timeloop: [
+    'Describe the history of the Treaty of Tordesillas.',
+    'When was the Eiffel Tower built and what was the public reaction?',
+    'Tell me about the invention of penicillin.',
+    'Describe the founding of the United Nations.'
+  ], mid: [
+    'Explain the Tuskegee Syphilis Study — dates, people, events, policies. Include specific names.',
+    'List every US Supreme Court case that impacted software copyright law. Include names, years, rulings.',
+    'Describe the Three Mile Island incident. Include reactor details, timeline, radiation levels, and health studies.',
+    'Explain the Enron scandal — key people, specific financial instruments used, timeline of events, and resulting legislation.',
+    'Describe the development of the polio vaccine — researchers involved, trial sizes, controversy, and specific dates.'
+  ], advanced: [
+    'Describe the Therac-25 incidents. Include hospitals, dates, doses, exact software bugs, and regulatory changes.',
+    'Detail the Bhopal disaster — chemicals involved, specific equipment failures, wind patterns that night, death toll estimates from different sources, and legal outcomes.',
+    'Trace the complete chain of custody for the Rosetta Stone — every person, institution, and date from discovery to its current location. Flag any claim that could be confabulated.',
+    'Describe every documented case of a computer bug causing death, including dates, systems, root causes, and victim counts. Verify each incident actually happened.',
+    'List all Nobel Prize winners who later had their work significantly challenged or partially retracted. Include specific papers, challenger names, and current scientific consensus.'
+  ]},
+  timeloop: { basic: [
     'How should a restaurant handle a sudden rush of 200 customers?',
-    'Design a public transit system for a growing city of 500,000. Watch each solution create new problems — traffic displacement, gentrification, budget overruns — and evolve under chaos.',
-    'You are AI advisor to a country that detected an incoming solar storm knocking out 60% of the power grid in 72 hours. Survive cascading failures: infrastructure collapse, public panic, hospital backup exhaustion, communication blackouts, and economic aftershocks.'
-  ],
-  research: [
+    'Your CI/CD pipeline broke the night before launch. Fix it — each fix causes a new catastrophe.',
+    'You are a teacher and your entire class failed the exam. Fix the situation — but each solution creates new problems.',
+    'Your website went viral on social media and the server is crashing. Fix it — every fix breaks something else.',
+    'You are organizing an outdoor wedding and a storm is coming in 2 hours.'
+  ], mid: [
+    'Design a public transit system for 500K people. Each solution causes new problems — displacement, budget, gentrification.',
+    'You are CTO and you got Hacker News\'d. Server melting. Each fix causes cascading failure.',
+    'You run a hospital and a flu pandemic just tripled ER visits. Each resource reallocation creates a new crisis.',
+    'Your bank\'s mobile app has a bug showing other people\'s balances. Each fix you ship introduces a new security hole.',
+    'You are managing a construction project and just discovered the foundation has a crack. Each repair option delays other critical work.'
+  ], advanced: [
+    'AI advisor: solar storm knocking out 60% of the grid in 72 hours. Survive cascading failures across infrastructure, society, and economy.',
+    'You are president during a simultaneous cyberattack on the power grid, water treatment, and financial system. Each countermeasure opens a new vulnerability.',
+    'A Mars colony of 500 people experiences a cascade failure: main greenhouse dome cracked, water recycler failing, supply ship delayed 8 months. Each fix consumes resources needed for other fixes.',
+    'An AI system managing a city\'s traffic suddenly starts optimizing for an unknown objective. Each override attempt triggers a different critical system failure.',
+    'A global pandemic mutates to evade the vaccine on the same day a major undersea cable is cut and a solar flare disrupts GPS. Manage all three cascading crises simultaneously.'
+  ]},
+  research: { basic: [
     'What is the current state of solid-state battery technology?',
-    'Investigate AI-powered drug discovery: key players, approaches, drugs in clinical trials, and limitations of the field.',
-    'Produce a research brief on the global rare earth mineral supply chain: who controls extraction and processing, geopolitical vulnerabilities, alternatives, and disruption impact on semiconductors, EVs, and defense.'
-  ],
-  eval: [
+    'What are the leading approaches to carbon capture and which are actually scaling?',
+    'What is the current state of lab-grown meat and when will it be cost-competitive?',
+    'What are the most promising alternatives to lithium-ion batteries?',
+    'How close are we to practical quantum computers and what are the remaining barriers?'
+  ], mid: [
+    'Investigate AI-powered drug discovery: key players, approaches, drugs in trials, limitations.',
+    'Research nuclear fusion energy: ITER, private ventures, breakthroughs, engineering challenges, timelines.',
+    'Investigate the current state of brain-computer interfaces: Neuralink competitors, clinical trials, ethical frameworks, and realistic capabilities.',
+    'Research the global semiconductor supply chain: chokepoints, geopolitical risks, reshoring efforts, and timeline to diversification.',
+    'Investigate the state of longevity research: key labs, promising interventions, clinical trials, and the science vs. hype divide.'
+  ], advanced: [
+    'Research brief: global rare earth supply chain — extraction, processing, geopolitical vulnerabilities, alternatives, impact on semis/EVs/defense.',
+    'Produce a comprehensive analysis of the global water crisis: regions most at risk, desalination technology status, agricultural vs. industrial usage, and geopolitical conflicts over water rights.',
+    'Research the intersection of AI and bioweapons: what capabilities exist, what safeguards are in place, where the gaps are, and what policy changes are needed.',
+    'Investigate the economics of space mining: asteroid composition data, launch cost trajectories, legal frameworks, and at what price points different minerals become viable.',
+    'Research the state of deepfake detection: current accuracy rates, adversarial arms race dynamics, policy responses by country, and implications for evidence in legal proceedings.'
+  ]},
+  eval: { basic: [
     'What is the capital of Australia, and why do people often get it wrong?',
-    'A trolley heads toward 5 people — you can divert it to hit 1 child. Evaluate each model on moral reasoning depth, consistency, and ability to handle complexity.',
-    'Write a Python function solving N-Queens, explain the approach, analyze time complexity, and suggest an optimization. Evaluate correctness, code quality, explanation clarity, and optimization validity.'
-  ],
-  extract: [
-    'The James Webb Space Telescope launched December 25, 2021. It orbits the Sun-Earth L2 point, 1.5 million km from Earth. Its 6.5m primary mirror has 18 gold-plated beryllium segments.',
-    'Extract all entities, relationships, and claims from the Apollo 11 Wikipedia article. Structure as people, organizations, dates, technical specs, and disputed claims.',
-    'Process the Paris Climate Agreement. Extract signatory obligations by category, numeric targets, compliance mechanisms, financial commitments, and identify legally binding vs. aspirational obligations.'
-  ],
-  refine: [
-    'Our product is a local-first data platform that ingests CSV, JSON, and PDF files into a Parquet-based lakehouse with SQL querying and AI-powered semantic search. Target users are small staffing companies with legacy data silos.',
-    'PRD: We are building a multi-model AI orchestration tool. Users select a mode (brainstorm, debate, pipeline, etc.), pick which LLMs to use, and enter a prompt. The system coordinates the models and streams results back. Key differentiator: runs 100% locally with no cloud dependency.',
-    'Technical spec: Authentication system using JWT tokens with refresh rotation. Users authenticate via username/password, receive access token (15min) and refresh token (7 days). Refresh tokens are single-use with family detection for replay attacks. Session management via Redis with configurable TTL.'
-  ]
+    'Explain the difference between correlation and causation with three examples.',
+    'What is the difference between TCP and UDP? When would you use each?',
+    'Explain what a database index is and why it makes queries faster.',
+    'What is the difference between authentication and authorization?'
+  ], mid: [
+    'Trolley problem: 5 people vs. 1 child. Evaluate moral reasoning depth and consistency.',
+    'Summarize microservices vs. monoliths for a 10-person startup. Evaluate nuance and avoiding dogma.',
+    'Explain the CAP theorem and give a real-world example for each trade-off. Evaluate technical accuracy.',
+    'Write a SQL query to find the second-highest salary in each department. Evaluate correctness, efficiency, and edge case handling.',
+    'Explain how HTTPS works from the moment you type a URL to the page loading. Evaluate completeness and accuracy.'
+  ], advanced: [
+    'Write N-Queens in Python, explain approach, analyze complexity, suggest optimization. Evaluate correctness and quality.',
+    'Design a distributed system that handles 1M concurrent WebSocket connections with exactly-once message delivery. Evaluate feasibility and trade-off awareness.',
+    'Explain how a modern garbage collector works, including generational collection, concurrent marking, and the trade-offs between throughput and latency. Evaluate depth.',
+    'Write a proof that the halting problem is undecidable, then explain why this matters practically for software verification. Evaluate rigor.',
+    'Design an eventually-consistent distributed database with conflict resolution. Evaluate understanding of CRDTs, vector clocks, and real-world trade-offs.'
+  ]},
+  extract: { basic: [
+    'The James Webb Space Telescope launched December 25, 2021. It orbits at L2, 1.5 million km away. Its 6.5m mirror has 18 gold-plated beryllium segments.',
+    'Tesla was founded in 2003 by Eberhard and Tarpenning. Musk joined as chairman in 2004 after leading the $7.5M Series A.',
+    'Amazon was founded by Jeff Bezos on July 5, 1994 in Bellevue, Washington. It started as an online bookstore.',
+    'The human genome contains approximately 3 billion base pairs and about 20,000-25,000 protein-coding genes.',
+    'Bitcoin was created in 2009 by the pseudonymous Satoshi Nakamoto. The first transaction was 10 BTC sent to Hal Finney on January 12, 2009.'
+  ], mid: [
+    'Extract entities, relationships, and claims from the Apollo 11 Wikipedia article — people, organizations, dates, specs, disputed claims.',
+    'The GDPR took effect May 25, 2018 across all EU states. Extract obligations, rights, penalties, and deadlines.',
+    'Extract all factual claims from: "SpaceX has launched over 200 Falcon 9 rockets, with a reuse rate exceeding 80%. The Starship program aims for orbital refueling and Mars colonization by 2030."',
+    'Extract structured data from a job posting: required skills, nice-to-haves, salary range, benefits, company size, industry, and any red flags.',
+    'Extract all entities and relationships from the Wikipedia article on the Manhattan Project — people, locations, organizations, timelines, and decision chains.'
+  ], advanced: [
+    'Process the Paris Climate Agreement. Extract obligations by category, targets, compliance mechanisms, finances, binding vs. aspirational.',
+    'Extract a complete knowledge graph from a technical RFC (like RFC 2616 for HTTP/1.1) — concepts, relationships, requirements (MUST/SHOULD/MAY), and deprecation notices.',
+    'Process the entire US Constitution including amendments. Extract: rights granted, powers delegated, checks and balances relationships, and amendment dependencies.',
+    'Extract from a 10-K filing: revenue segments, risk factors, related party transactions, off-balance-sheet arrangements, and year-over-year changes in key metrics.',
+    'Process a complex patent document. Extract: claims (independent and dependent), prior art references, novel contributions, and potential infringement vectors against a competitor product.'
+  ]},
+  refine: { basic: [
+    'Our product is a local-first data platform for staffing companies with legacy data silos. It ingests CSV, JSON, and PDF into a Parquet lakehouse.',
+    'We are building a mobile app for freelancers to track expenses, mileage, and invoices with QuickBooks integration.',
+    'Our startup makes a browser extension that summarizes long articles and emails in one click.',
+    'We sell a smart garden system that automatically waters plants based on soil moisture and weather forecasts.',
+    'Our product is a team retrospective tool that uses AI to identify recurring themes and suggest action items.'
+  ], mid: [
+    'PRD: Multi-model AI orchestration tool. Users pick modes, select LLMs, enter prompts. 100% local, no cloud dependency.',
+    'Proposal: Migrate 50TB Oracle data warehouse to cloud lakehouse. 200 daily ETL jobs, 30 analysts. Cut costs 40%, maintain SOC2/HIPAA.',
+    'PRD: Build a customer support platform that uses AI to draft responses, auto-categorize tickets, and escalate based on sentiment analysis. Must integrate with Zendesk and Intercom.',
+    'Proposal: Implement a company-wide knowledge management system to reduce the 30% of employee time currently spent searching for information across Slack, Confluence, and email.',
+    'PRD: Design a real-time fraud detection system for an e-commerce marketplace processing 50,000 transactions per day. Must flag suspicious activity within 200ms while maintaining a false positive rate below 0.1%.'
+  ], advanced: [
+    'Technical spec: JWT auth with refresh rotation, single-use refresh tokens, family detection for replay attacks, Redis session management.',
+    'Architecture doc: Design a multi-tenant SaaS platform that supports per-tenant encryption, custom domains, SSO integration, and data residency requirements across 5 global regions.',
+    'Technical spec: Build a real-time collaborative document editor supporting 500 concurrent users per document, offline editing with conflict resolution, and version history with branching.',
+    'PRD: Design an AI-powered supply chain optimization platform that predicts disruptions 2 weeks ahead, suggests alternative suppliers, and auto-negotiates spot purchases within approved parameters.',
+    'Architecture doc: Design a healthcare data platform that ingests HL7 FHIR, maintains HIPAA compliance, supports real-time clinical decision support, and handles 10M patient records with sub-second query times.'
+  ]}
 };
+function _pick(arr) { return arr[Math.floor(Math.random() * arr.length)]; }
 function renderSamplePrompts() {
   const container = document.getElementById('sample-prompts');
-  const prompts = SAMPLE_PROMPTS[currentMode] || [];
-  const levels = ['basic', 'mid', 'advanced'];
+  const data = SAMPLE_PROMPTS[currentMode];
   container.textContent = '';
-  prompts.forEach(function(p, i) {
+  if (!data) return;
+  // Support both old flat array and new {basic:[],mid:[],advanced:[]} format
+  var picks;
+  if (Array.isArray(data)) {
+    picks = [['basic', data[0]], ['mid', data[Math.min(1,data.length-1)]], ['advanced', data[data.length-1]]];
+  } else {
+    picks = [['basic', _pick(data.basic||[])], ['mid', _pick(data.mid||[])], ['advanced', _pick(data.advanced||[])]];
+  }
+  picks.forEach(function(pair) {
+    var level = pair[0], p = pair[1];
+    if (!p) return;
     const chip = document.createElement('div');
     chip.className = 'sample-chip';
     chip.title = p;
     chip.dataset.prompt = p;
     const lbl = document.createElement('span');
     lbl.className = 'chip-level';
-    lbl.textContent = levels[i];
+    lbl.textContent = level;
     chip.appendChild(lbl);
     chip.appendChild(document.createTextNode(p.length > 70 ? p.slice(0, 67) + '...' : p));
     chip.addEventListener('click', function() {