Anthropic gives you three tiers of Claude, ranging from $5 to $25 per million output tokens ($1 to $5 per million input tokens). Picking the wrong one means either burning money on overkill or getting subpar results that cost you in other ways.
Here's how to choose — with real pricing data, cost calculations for common workloads, and a model routing strategy that can cut your Anthropic bill by 50–70%.
[stat] 5× The cost difference between Claude Haiku 4.5 ($5/M output) and Claude Opus 4.6 ($25/M output) — for the same number of tokens
The lineup at a glance
Claude Opus 4.6 — $5.00 input / $25.00 output per million tokens. 200K context. The flagship. Best reasoning, best writing, most expensive.
Claude Sonnet 4.6 — $3.00 input / $15.00 output per million tokens. 1M context. The workhorse. Strong performance at 60% of Opus output cost. The only Claude model with computer use capabilities.
Claude Haiku 4.5 — $1.00 input / $5.00 output per million tokens. 200K context. The budget pick. Fast, cheap, good enough for most structured tasks.
For reference, the older Claude 3 Opus sits at $15/$75 — so the current Opus 4.6 is already a 3× price cut from the original flagship. And Claude 3.5 Haiku at $0.80/$4.00 is even cheaper if you don't need the latest capabilities.
💡 Key Takeaway: Anthropic's pricing has dropped dramatically. Claude Opus 4.6 at $5/$25 delivers better performance than Claude 3 Opus did at $15/$75. If you haven't re-evaluated your model choices recently, you're probably overpaying.
Price per request
Token counts vary by task, but let's use realistic averages. A typical API call sends ~800 input tokens and receives ~400 output tokens.
| Model | Cost per request | Monthly (10K requests) | Monthly (100K requests) |
|---|---|---|---|
| Claude Opus 4.6 | $0.014 | $140 | $1,400 |
| Claude Sonnet 4.6 | $0.0084 | $84 | $840 |
| Claude Haiku 4.5 | $0.0028 | $28 | $280 |
| Claude 3.5 Haiku | $0.0022 | $22.40 | $224 |
Opus costs 5× more than Haiku per request. At 100K requests/month, that's a $1,120 difference. The question is whether that gap buys you anything your application actually needs.
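These per-request figures are easy to reproduce. A minimal sketch in Python, using the rates quoted above (the model keys and `PRICES` table are hand-copied from this article, not fetched from any API):

```python
# Per-million-token prices from the comparison above (USD).
PRICES = {
    "opus-4.6":   {"input": 5.00, "output": 25.00},
    "sonnet-4.6": {"input": 3.00, "output": 15.00},
    "haiku-4.5":  {"input": 1.00, "output": 5.00},
    "haiku-3.5":  {"input": 0.80, "output": 4.00},
}

def cost_per_request(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one API call at the listed per-million-token rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# The 800-in / 400-out request shape from the table:
print(cost_per_request("opus-4.6", 800, 400))   # 0.014
print(cost_per_request("haiku-4.5", 800, 400))  # 0.0028
```

Multiply by monthly request volume to get the table's right-hand columns.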
When to use each tier
Claude Opus 4.6: Complex reasoning and high-stakes output
Opus earns its price tag on tasks where reasoning depth matters. Multi-step analysis, nuanced writing, complex code generation, and anything where a wrong answer is costly.
Pick Opus when:
- You're building an AI coding assistant that needs to handle architectural decisions
- Legal or medical analysis where precision is non-negotiable
- Creative writing that needs to feel genuinely good, not just grammatically correct
- Multi-step reasoning chains where each step builds on the last
- Agent workflows that require careful planning and tool orchestration
Skip Opus when the task has a clear, structured answer. It's overkill for extraction, classification, or templated responses. You're paying 5× more for reasoning capabilities you aren't using.
The 200K context window is sufficient for most use cases, but if you're processing very long documents, Sonnet's 1M context may force the choice regardless of quality preference.
Claude Sonnet 4.6: The default choice
Sonnet is where most production workloads should land. It handles summarization, Q&A, content generation, and moderate reasoning tasks without the Opus premium. The 1M context window is a major differentiator — it's the only Claude tier that goes that high.
Pick Sonnet when:
- You need to process very long documents (Sonnet's 1M context is 5× larger than the others)
- General-purpose chatbots and assistants
- Code generation for standard tasks (plus Sonnet 4.6 has computer use capabilities)
- Content drafting and editing
- RAG pipelines where quality matters but the retrieval does most of the heavy lifting
- You need the best balance of cost and capability
Sonnet is 40% cheaper than Opus on both input and output. For most applications, the quality difference is marginal — particularly on structured tasks where clear instructions guide the output.
📊 Quick Math: Switching from Opus to Sonnet on a 50K request/month workload (2,000 input + 500 output tokens each) saves $450/month — from $1,125 to $675. That's $5,400/year for a quality difference most users won't notice.
Claude Haiku 4.5: Volume and speed
Haiku is built for scale. Classification, entity extraction, simple Q&A, routing, and any task where you're making thousands of calls and the individual response doesn't need to be brilliant.
Pick Haiku when:
- Classifying support tickets or emails
- Extracting structured data from text (JSON output, entity recognition)
- Building a routing layer that decides which model handles a request
- High-volume, low-complexity pipelines
- Prototyping and development (why pay Opus prices while debugging prompts?)
- Preprocessing documents for search and retrieval (summarizing, tagging, and cleaning chunks before indexing)
At $1.00/$5.00 per million tokens, Haiku is competitive within the Claude ecosystem but not the cheapest option overall. GPT-5 Nano at $0.05/$0.40 and Mistral Small 3.2 at $0.06/$0.18 are significantly cheaper — see our budget model comparison for the full picture.
⚠️ Warning: Don't default to Opus "just to be safe." At 100K requests/month, the Opus-to-Haiku cost difference is $1,120/month — that's $13,440/year. Start with Haiku, measure quality, and upgrade only the tasks that genuinely need it.
Three real scenarios with detailed calculations
Scenario 1: Customer support chatbot
A SaaS company handles 50,000 support conversations per month. Average conversation: 2,000 input tokens (context + history), 500 output tokens.
| Model | Input Cost | Output Cost | Monthly Total |
|---|---|---|---|
| Opus 4.6 | $500 | $625 | $1,125 |
| Sonnet 4.6 | $300 | $375 | $675 |
| Haiku 4.5 | $100 | $125 | $225 |
Most support queries follow patterns — password resets, billing questions, feature explanations. Haiku handles 80% of them perfectly. Route the remaining 20% (complex billing disputes, technical debugging) to Sonnet.
Blended cost with routing: (40K × Haiku) + (10K × Sonnet) = $180 + $135 = $315/month. That's 72% less than running everything on Opus and only $90 more than pure Haiku — for noticeably better handling of complex cases.
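That blended figure is worth sanity-checking in code. A sketch using the traffic split and rates from this scenario (the `monthly_cost` helper is my own naming, not part of any SDK):

```python
def monthly_cost(requests, in_tokens, out_tokens, in_rate, out_rate):
    """Monthly spend for a fixed request shape at per-million-token rates."""
    return requests * (in_tokens * in_rate + out_tokens * out_rate) / 1_000_000

# Scenario 1: 2,000 input / 500 output tokens per conversation.
haiku  = monthly_cost(40_000, 2_000, 500, 1.00, 5.00)   # 80 + 100 = 180.0
sonnet = monthly_cost(10_000, 2_000, 500, 3.00, 15.00)  # 60 + 75  = 135.0
opus   = monthly_cost(50_000, 2_000, 500, 5.00, 25.00)  # all-Opus baseline: 1125.0

print(haiku + sonnet)               # 315.0
print(1 - (haiku + sonnet) / opus)  # ≈ 0.72, i.e. 72% saved vs pure Opus
```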
Scenario 2: Legal document analysis
A law firm processes 500 contracts per month. Each contract: 15,000 input tokens, 2,000 output tokens. Accuracy is critical — a missed clause costs real money.
| Model | Input Cost | Output Cost | Monthly Total |
|---|---|---|---|
| Opus 4.6 | $37.50 | $25.00 | $62.50 |
| Sonnet 4.6 | $22.50 | $15.00 | $37.50 |
| Haiku 4.5 | $7.50 | $5.00 | $12.50 |
At only 500 requests/month, the absolute dollar difference is small. The cost of a missed liability clause dwarfs the $50 gap between Haiku and Opus. Use Opus. This is exactly what it's for — low-volume, high-stakes work where reasoning quality directly impacts business outcomes.
Scenario 3: Content pipeline
A media company generates 200 article drafts per day (6,000/month). Each draft: 1,000 input tokens, 1,500 output tokens.
| Model | Input Cost | Output Cost | Monthly Total |
|---|---|---|---|
| Opus 4.6 | $30 | $225 | $255 |
| Sonnet 4.6 | $18 | $135 | $153 |
| Haiku 4.5 | $6 | $45 | $51 |
Content quality matters, but Sonnet writes nearly as well as Opus for standard articles. Use Sonnet for first drafts, then have a human editor polish. Save Opus for thought leadership pieces where the writing needs to genuinely impress — maybe 10% of output.
Blended approach: (5,400 × Sonnet) + (600 × Opus) = $138 + $26 = $164/month. Only $11 more than pure Sonnet, but your best content gets the Opus treatment.
[stat] $810/month The savings from using a Haiku+Sonnet routing strategy versus pure Opus for a 50K request/month support chatbot
How Claude compares to alternatives
Anthropic's tiers aren't the only option. Here's how they stack up against the competition at each tier:
| Tier | Claude | OpenAI Alternative | Google Alternative |
|---|---|---|---|
| Budget | Haiku 4.5 ($1/$5) | GPT-5 nano ($0.05/$0.40) | Gemini 2.5 Flash-Lite ($0.10/$0.40) |
| Mid | Sonnet 4.6 ($3/$15) | GPT-5 ($1.25/$10) | Gemini 2.5 Pro ($1.25/$10) |
| Premium | Opus 4.6 ($5/$25) | GPT-5.2 ($1.75/$14) | Gemini 3 Pro ($2/$12) |
The pricing gap is stark. GPT-5 nano is 20× cheaper on input and 12.5× cheaper on output than Haiku 4.5. If you're pure cost-optimizing, Claude loses at every tier. But pricing isn't everything — read our OpenAI vs Anthropic deep-dive for quality trade-offs.
If you want the cheapest option regardless of provider, check our cheapest AI APIs ranking.
The smart strategy: model routing
Don't pick one tier. Use all three.
Build a simple classifier (Haiku is great for this) that examines each incoming request and routes it:
- Haiku handles simple, structured tasks — 60–70% of traffic
- Sonnet handles moderate complexity — 25–30% of traffic
- Opus handles the hard stuff — 5–10% of traffic
This blended approach typically cuts costs by 50–70% compared to running everything on Opus, with minimal quality impact. The router itself costs almost nothing — a Haiku classification call on a short input is fractions of a cent.
Implementation tip: Start by routing everything to Sonnet. Log requests and responses. After a week, identify patterns: which requests produce identical quality on Haiku? Which ones clearly need Opus? Use those patterns to build your routing rules. You don't need an ML classifier — simple keyword matching and request-length heuristics work surprisingly well.
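A heuristic router along those lines can be a few lines of Python. In this sketch the keyword lists and the length threshold are illustrative placeholders (derive real ones from your logged traffic, as described above):

```python
# Route a request to a pricing tier with cheap heuristics.
# Keyword lists and the 200-char threshold are placeholders; tune from your logs.
COMPLEX_HINTS = ("architecture", "legal", "contract", "debug", "multi-step")
SIMPLE_HINTS  = ("classify", "extract", "label", "categorize")

def route(request: str) -> str:
    text = request.lower()
    if any(k in text for k in COMPLEX_HINTS):
        return "opus"    # hard stuff: ~5-10% of traffic
    if any(k in text for k in SIMPLE_HINTS) or len(text) < 200:
        return "haiku"   # short or clearly structured: ~60-70%
    return "sonnet"      # everything else lands on the middle tier

print(route("Classify this support ticket: billing question"))  # haiku
print(route("Review this contract for liability clauses"))      # opus
```

The point is that the router itself is nearly free: a string scan, not a model call, and you only pay for a Haiku classification on the ambiguous remainder.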
For teams already using prompt caching, route cached requests to cheaper models. The cache reduces input cost; combining it with model routing reduces output cost. The savings compound.
✅ TL;DR: Start with Sonnet as your default — it's the safest bet at $3/$15 with a 1M context window. Drop to Haiku ($1/$5) for high-volume structured tasks. Upgrade to Opus ($5/$25) only for complex reasoning and high-stakes output. Model routing across all three tiers saves 50–70% versus pure Opus.
Bottom line
Start with Sonnet. It's the safest default. Good quality, reasonable price, massive context window.
Drop to Haiku for high-volume, structured tasks where speed and cost matter more than nuance.
Upgrade to Opus only for tasks where reasoning depth demonstrably improves outcomes — and where the volume is low enough that the premium doesn't compound into a painful bill.
Run the numbers for your specific workload. Try the AI Cost Check calculator to compare exact costs across all Claude tiers and their competitors. For a head-to-head between Anthropic and OpenAI, check our full pricing comparison.
Frequently asked questions
Which Claude model is best for most developers?
Claude Sonnet 4.6 at $3.00/$15.00 per million tokens is the best default choice. It offers strong performance on coding, writing, and analysis tasks at 40% less than Opus. The 1M context window is the largest in the Claude lineup, making it versatile for both short and long-context workloads. Start here and only move to Opus if you can measure a quality improvement that justifies the roughly 1.7× cost increase.
How much cheaper is Claude Haiku than Opus?
Claude Haiku 4.5 costs $1.00/$5.00 per million tokens, while Claude Opus 4.6 costs $5.00/$25.00. That's 5× cheaper on both input and output. At scale, the difference is dramatic: 100K requests/month costs $280 on Haiku versus $1,400 on Opus — a savings of $1,120/month or $13,440/year. Use our calculator to model your exact workload.
Can I mix Claude models in the same application?
Yes, and you should. Model routing — sending different request types to different tiers — is the most effective cost optimization strategy. Use Haiku for simple tasks (classification, extraction), Sonnet for standard work (chat, summarization), and Opus for complex reasoning. A basic routing strategy cuts costs by 50–70% versus using a single tier for everything.
Is Claude Opus 4.6 worth the premium over Sonnet?
For most workloads, no. Sonnet 4.6 handles 90%+ of tasks at comparable quality for 40% less cost. Opus justifies its premium in specific scenarios: complex multi-step reasoning, nuanced creative writing, high-stakes analysis where errors are expensive, and sophisticated agent workflows. If your volume is low (under 1,000 requests/month), the absolute cost difference is small enough that Opus is fine as a default.
How does Claude Haiku compare to GPT-5 Nano?
Claude Haiku 4.5 ($1.00/$5.00) is significantly more expensive than GPT-5 Nano ($0.05/$0.40) — 20× on input and 12.5× on output. However, Haiku produces higher-quality output for tasks requiring nuance, tone, and reasoning. For pure extraction and classification, GPT-5 Nano is the better deal. For customer-facing responses where quality matters, Haiku's premium may be justified. See our budget model roundup for a full comparison.
