Claude Pricing Calculator
Estimate Anthropic Claude API cost across Opus 4.7, Sonnet 4.6, Haiku 4.5, and the Claude 3 line. Includes prompt-cache pricing (90% discount once your cacheable prefix is ≥1024 tokens for Sonnet/Haiku, 2048 for Opus). Updated against Anthropic's list prices.
Cost across all Claude models for this workload
How to use the Claude Pricing Calculator
Pick a Claude model. "Cached prefix tokens" is the portion of your input that's stable across calls (system prompt, tool definitions, large RAG context). Anthropic's prompt caching gives a 90% discount on those tokens after the first call within the cache window (5 min idle TTL by default, up to 1 hour with the longer-TTL header).
Claude caching minimums and ergonomics
Cache minimums: 1024 tokens for Sonnet 4.6 and Haiku 4.5; 2048 tokens for Opus 4.7. Below those, caching is silently skipped — no discount, no error.
Cache TTL is 5 minutes idle by default (refreshed on each cache hit). For longer pipelines where calls cluster but cache may expire between batches, the extended-cache-ttl-2025-04-11 beta header extends to 1 hour at a small premium.
Practical pattern: put the stable bits (system prompt, tool defs, large context) first in the message, then the per-call user message last. Anthropic caches contiguous prefixes — anything that varies per call belongs at the end.