Different versions create separate cache entries, allowing you to roll back without losing savings
You must version prompts because caching requires unique identifiers
What would be the worst approach for maximizing prompt caching savings?
Using the same cache breakpoint markers consistently
Keeping system prompts consistent across requests
Placing user questions before system prompts
Caching large reference documents
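The question above turns on prefix matching: prompt caches reuse the longest leading portion of a request that matches a previous one, so placing the changing user question before the stable system prompt invalidates the match immediately. A toy sketch of that prefix check (the token lists and the `cached_prefix_tokens` helper are illustrative, not a provider API):

```python
# Toy model: prompt caches match on a leading prefix of the request,
# so any difference early in the request breaks the match.

def cached_prefix_tokens(previous: list[str], current: list[str]) -> int:
    """Count leading tokens shared with the previous request (toy model)."""
    n = 0
    for a, b in zip(previous, current):
        if a != b:
            break
        n += 1
    return n

system = ["You", "are", "a", "helpful", "assistant."]
q1, q2 = ["Explain", "RAM."], ["Define", "SSD."]

# System prompt first: the whole 5-token system prefix is reusable.
good = cached_prefix_tokens(system + q1, system + q2)
# User question first: the very first token differs, so nothing is reused.
bad = cached_prefix_tokens(q1 + system, q2 + system)
```

With the stable content first, `good` is 5 (the full system prompt); with the variable content first, `bad` is 0.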
What is the relationship between cache breakpoints and provider documentation?
Cache breakpoints are automatically detected by all AI providers
There is no such thing as cache breakpoints in prompt caching
Cache breakpoints must be explicitly marked according to each provider's documentation
Cache breakpoints are only relevant for free-tier users
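As the correct option notes, breakpoints are not auto-detected; you mark them in the request per the provider's documentation. A minimal sketch of one documented convention, Anthropic's `cache_control` block (field names follow that provider's pattern; the model name is a placeholder, and other providers use different markers):

```python
# Sketch: explicitly marking a cache breakpoint in a request payload,
# following Anthropic's documented "cache_control" convention. Other
# providers mark breakpoints differently -- check each provider's docs.

def build_request(system_prompt: str, user_question: str) -> dict:
    """Place the stable system prompt first and mark it as cacheable."""
    return {
        "model": "example-model",  # placeholder model name
        "system": [
            {
                "type": "text",
                "text": system_prompt,
                # Explicit breakpoint: the prefix up to and including this
                # block becomes eligible for caching on later requests.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_question}],
    }

request = build_request("You are a helpful assistant.", "What is caching?")
```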
What occurs when the provider-defined TTL (time-to-live) for a cached prompt expires?
The cached content must be reprocessed on the next request, at full price
The user receives a notification that caching has expired
The system deletes the prompt and sends an error notification
The cache automatically moves to long-term storage
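The expiry behavior in the correct option can be modeled with a tiny in-memory sketch (purely illustrative; real caching happens server-side on the provider's infrastructure, and the refresh-on-hit behavior assumed here matches some providers' documented TTL handling but not necessarily all):

```python
class PromptCache:
    """Toy model of provider-side prompt caching with a TTL."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.entries = {}  # prompt -> timestamp of last write/refresh

    def process(self, prompt: str, now: float) -> str:
        """Return 'cached' on a live hit; otherwise reprocess at full price."""
        written = self.entries.get(prompt)
        if written is not None and now - written <= self.ttl:
            self.entries[prompt] = now  # assume the TTL refreshes on a hit
            return "cached"
        self.entries[prompt] = now  # cold or expired: full reprocessing
        return "full_price"

cache = PromptCache(ttl_seconds=300)  # e.g. a 5-minute TTL
first = cache.process("SYSTEM PROMPT", now=0)    # cold cache: full price
second = cache.process("SYSTEM PROMPT", now=60)  # within TTL: cached
third = cache.process("SYSTEM PROMPT", now=400)  # TTL lapsed: full price again
```

Once the TTL lapses, there is no error, notification, or long-term tier; the next request simply pays full price and re-establishes the cache entry.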
A developer sends the same system prompt followed by a different user question every minute for 10 minutes. How many times will the system prompt portion be charged at full price?
Only the first time
Once at the beginning and once at the end
Never, because questions are different
Every time, because the questions differ
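The counting behind the correct answer can be sketched directly (this assumes every request lands within the provider's cache TTL, which a one-minute interval comfortably satisfies for typical TTLs; some providers also bill the first cache write at a small premium over the base rate rather than exactly full price):

```python
# Count how many of the requests pay full (uncached) price for the
# system-prompt portion, assuming all requests fall within the TTL.

def full_price_charges(num_requests: int) -> int:
    system_prompt_cached = False
    charges = 0
    for _ in range(num_requests):
        if not system_prompt_cached:
            charges += 1  # cold cache: full processing of the system prompt
            system_prompt_cached = True
        # The differing user question sits after the cached prefix, so it
        # does not invalidate the cached system prompt.
    return charges

full_price_charges(10)  # -> 1: only the first request pays full price
```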
What does the '90% discount' specifically apply to in prompt caching?
The number of requests you can make
The total API bill including non-cached tokens
Only the cached tokens themselves, not the entire request cost
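The arithmetic behind the correct option is worth making concrete: because only the cached tokens are discounted, the overall bill shrinks by much less than 90% whenever a request also contains uncached tokens (the token counts and uniform price below are illustrative assumptions):

```python
# The '90% discount' applies per cached token, not to the whole bill.
# Illustrative figures: 1,000 cached + 1,000 uncached tokens per request.

CACHED_TOKENS = 1_000
UNCACHED_TOKENS = 1_000
PRICE = 1.0  # assumed price per token, arbitrary units

full_cost = (CACHED_TOKENS + UNCACHED_TOKENS) * PRICE
hit_cost = CACHED_TOKENS * PRICE * 0.1 + UNCACHED_TOKENS * PRICE

savings_pct = 100 * (1 - hit_cost / full_cost)
# Half the tokens are discounted 90%, so the overall bill drops only 45%.
```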