Weaker Models, Stronger Results: How Prompt Adaptation Levels the Field, Part 2

October 7, 2025

This post is part of an ongoing series where we share examples of Prompt Adaptation in practice. The goal is to highlight real scenarios, real results, and the insights that emerge from them. Want us to cover your scenario? Share more about what you’re evaluating and we’ll run Prompt Adaptation on your use case.

When comparing a premium model to a lighter or faster one, the tradeoff is usually clear: the premium model delivers higher accuracy but is slower and more expensive, while the lighter one is faster and cheaper but sacrifices some performance.

Prompt Adaptation levels the field. Optimized prompts enable smaller models to reach or even exceed the performance of larger ones at lower cost and latency.

We tested this on clinc150, a public dataset for evaluating how well models classify user intents in conversational assistants, including their ability to handle out-of-scope queries.
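To make the setup concrete, here is a minimal sketch of an evaluation loop of this kind. It assumes the Hugging Face datasets copy of clinc150 (published as clinc_oos, with text/intent fields) and a hypothetical classify_intent() helper that sends the prompt plus an utterance to the model and returns an intent label; it is an illustration of the general procedure, not the exact harness we ran.

```python
# Minimal sketch of an intent-classification eval on clinc150.
# Assumes the Hugging Face "clinc_oos" dataset and a user-supplied
# classify_intent(prompt, utterance) helper (e.g. a Gemini API call).
import random
from datasets import load_dataset

def evaluate(prompt: str, classify_intent, n_samples: int = 200, seed: int = 0) -> float:
    """Accuracy of classify_intent(prompt, utterance) on a random test subset."""
    ds = load_dataset("clinc_oos", "plus", split="test")
    label_names = ds.features["intent"].names   # 150 in-scope intents plus "oos"
    random.seed(seed)
    indices = random.sample(range(len(ds)), n_samples)
    correct = 0
    for i in indices:
        utterance = ds[i]["text"]
        gold = label_names[ds[i]["intent"]]
        pred = classify_intent(prompt, utterance)  # hypothetical model call
        correct += int(pred.strip().lower() == gold)
    return correct / n_samples
```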

A prompt originally written for Gemini 2.5 Pro (a stronger model) scored 93% accuracy on the evaluation. When we ran the same prompt on Gemini 2.5 Flash (a weaker, faster, cheaper model), accuracy dropped to 86.75%.

This regression is expected; smaller models are often dismissed outright on the assumption that they won't perform as well out of the box. But with Prompt Adaptation, we optimized the prompt for Gemini 2.5 Flash in ~30 minutes of background processing.

Results

  • Gemini 2.5 Pro (original prompt): 93%
  • Gemini 2.5 Flash (original prompt): 86.75%
  • Gemini 2.5 Flash (adapted prompt): 97.5%

The adapted prompt on Gemini 2.5 Flash didn't just close the gap: it outperformed the stronger Gemini 2.5 Pro baseline by 4.5 percentage points.

Why it matters: Manually rewriting and testing prompts for smaller models is time-consuming, and there's no guarantee the effort will pay off. Prompt Adaptation automates that process, surfacing optimized prompts that make smaller models competitive with premium ones, often matching or even outperforming them.

Optimizing prompts for weaker models is especially important in applications built from multiple chained prompts, where cost and latency compound at every step. Prompt Adaptation reduces both while maintaining or even improving accuracy.
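As a rough illustration of that compounding, per-call savings multiply across a chain. The figures below are hypothetical placeholders, not measured values.

```python
# Back-of-the-envelope: per-call savings multiply across a chained pipeline.
# All figures are hypothetical placeholders, not measured values.
steps = 4                                  # prompts chained per user request
latency_pro_s, latency_flash_s = 2.0, 0.8  # seconds per call
cost_pro, cost_flash = 0.010, 0.002        # dollars per call

print(f"latency per request: {steps * latency_pro_s:.1f}s -> {steps * latency_flash_s:.1f}s")
print(f"cost per request:    ${steps * cost_pro:.3f} -> ${steps * cost_flash:.3f}")
```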

If you’re evaluating models and want to explore this tradeoff, reach out for access or book time here.

Evaluation based on a 200-sample subset of clinc150, a conversational intent classification dataset spanning 150 intents across 10 domains plus out-of-scope queries.