Anthropic Claude API Pricing Guide 2026: Opus, Sonnet & Haiku Costs Compared
Anthropic runs three model tiers — Opus for maximum intelligence, Sonnet for the sweet spot, and Haiku for speed and cost. But the pricing landscape has shifted dramatically over the past year. Opus used to cost $15/$75 per million tokens. The latest Opus 4.6 costs $5/$25. That's a 67% price drop for a model that's significantly smarter than its predecessors.
This guide breaks down every Claude model's pricing, shows you what real-world tasks actually cost, and helps you pick the right tier for your workload. All numbers come directly from Anthropic's current API pricing, verified against our AI API pricing database.
✅ TL;DR: Claude Opus 4.6 at $5/$25/M is the best value flagship model Anthropic has ever released. Sonnet 4.6 at $3/$15/M remains the go-to for most production workloads. Haiku 4.5 at $1/$5/M handles high-volume tasks where speed matters more than reasoning depth.
Current Claude model pricing at a glance
Here's every actively available Claude model with current pricing as of March 2026:
| Model | Input $/M tokens | Output $/M tokens | Context window | Max output | Release date |
|---|---|---|---|---|---|
| Claude Opus 4.6 | $5.00 | $25.00 | 1,000,000 | 128,000 | Feb 2026 |
| Claude Sonnet 4.6 | $3.00 | $15.00 | 1,000,000 | 65,536 | Feb 2026 |
| Claude Haiku 4.5 | $1.00 | $5.00 | 200,000 | 64,000 | Oct 2025 |
| Claude Opus 4.5 | $5.00 | $25.00 | 200,000 | 64,000 | Nov 2025 |
| Claude Sonnet 4.5 | $3.00 | $15.00 | 200,000 | 64,000 | Sep 2025 |
| Claude 3.5 Haiku | $0.80 | $4.00 | 200,000 | 8,192 | Nov 2024 |
| Claude 3.5 Sonnet | $3.00 | $15.00 | 200,000 | 8,192 | Oct 2024 |
| Claude 3 Opus | $15.00 | $75.00 | 200,000 | 4,096 | Mar 2024 |
The first three models — Opus 4.6, Sonnet 4.6, and Haiku 4.5 — are where you should focus. Everything else is either legacy or niche.
💡 Key Takeaway: Anthropic's pricing structure is simple compared to OpenAI's sprawling model lineup. Three tiers, clear use cases. Opus for hard problems, Sonnet for everything else, Haiku for bulk work.
Claude Opus 4.6: The flagship at a fraction of its former price
Claude Opus 4.6 is Anthropic's most intelligent model, designed for complex reasoning, agentic workflows, and coding tasks that require deep understanding of large codebases. At $5.00 input / $25.00 output per million tokens, it's remarkably affordable for a flagship.
📊 Stat: Opus 4.6 is 67% cheaper than the original Claude 3 Opus ($15/$75 → $5/$25)
What Opus 4.6 gives you
- 1 million token context window — process entire codebases, legal documents, or research papers in a single call
- 128,000 token max output — generate long-form content, complete implementations, or detailed analyses without hitting output limits
- Vision capabilities — analyze images, diagrams, screenshots alongside text
- Advanced reasoning — multi-step problem solving, mathematical proofs, strategic planning
Real-world cost per task with Opus 4.6
Let's calculate what common tasks actually cost:
| Task | Input tokens | Output tokens | Cost per task |
|---|---|---|---|
| Code review (500-line file + context) | ~8,000 | ~2,000 | $0.09 |
| Long document analysis (50 pages) | ~40,000 | ~3,000 | $0.28 |
| Complex coding task (multi-file) | ~15,000 | ~8,000 | $0.28 |
| Research synthesis (10 papers) | ~80,000 | ~5,000 | $0.53 |
| Full codebase analysis (100K context) | ~100,000 | ~10,000 | $0.75 |
At these prices, a developer making 50 complex coding requests per day spends roughly $14/day — about $280 over a 20-workday month. That's premium-tier intelligence at a cost that won't wreck a startup's budget.
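The per-task figures above follow directly from the price sheet. A minimal Python helper makes the arithmetic explicit — token counts are the rough estimates from the table, and prices are per million tokens:

```python
def task_cost(input_tokens: int, output_tokens: int,
              input_price: float, output_price: float) -> float:
    """Dollar cost of one API call; prices are $ per million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

OPUS_IN, OPUS_OUT = 5.00, 25.00  # Opus 4.6 rates from the table above

# Code review: ~8K input + ~2K output
print(round(task_cost(8_000, 2_000, OPUS_IN, OPUS_OUT), 2))     # 0.09
# Full codebase analysis: ~100K input + ~10K output
print(round(task_cost(100_000, 10_000, OPUS_IN, OPUS_OUT), 2))  # 0.75
```

Swap in any model's rates from the pricing table to reproduce the other task tables in this guide.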
When to use Opus 4.6
Use Opus when the task requires genuine reasoning depth. If you're building an AI agent that needs to plan multi-step actions, reviewing a complex pull request with subtle bugs, or synthesizing information across a massive context window — Opus is worth the premium over Sonnet.
Don't use Opus for classification tasks, simple Q&A, or content formatting. You're paying 67% more than Sonnet for capabilities you won't use on those workloads.
Claude Sonnet 4.6: The production workhorse
Sonnet 4.6 sits at $3.00 input / $15.00 output per million tokens — the same price Anthropic has held for the Sonnet tier since Claude 3.5. But the model behind that price tag has gotten dramatically better. Sonnet 4.6 includes computer-use capabilities, frontier-level coding, and reasoning that would have been flagship-tier a year ago.
What Sonnet 4.6 gives you
- 1 million token context window — same as Opus, a major upgrade from Sonnet 4.5's 200K
- 65,536 token max output — sufficient for virtually all production tasks
- Computer use — the only Sonnet-tier model with native screen interaction capabilities
- Vision + code — strong multimodal performance at mid-tier pricing
Real-world cost per task with Sonnet 4.6
| Task | Input tokens | Output tokens | Cost per task |
|---|---|---|---|
| Customer support response | ~1,500 | ~500 | $0.01 |
| Content generation (blog post) | ~2,000 | ~4,000 | $0.07 |
| Code generation (single function) | ~3,000 | ~1,500 | $0.03 |
| Document summarization (10 pages) | ~8,000 | ~1,000 | $0.04 |
| Data extraction from PDFs | ~12,000 | ~2,000 | $0.07 |
A SaaS product handling 10,000 customer support interactions per day at the above token profile (about $0.012 per interaction) spends roughly $120/day or $3,600/month on Sonnet 4.6. For most companies, that's a rounding error compared to the value of automated, high-quality responses.
Sonnet vs Opus: When the extra $2/$10 per million matters
The decision between Sonnet and Opus comes down to task complexity. Run this mental test: if a junior developer could handle the task with clear instructions, Sonnet is fine. If you'd hand it to your most senior engineer because it requires nuanced judgment — reach for Opus.
Specific triggers to upgrade to Opus:
- Tasks requiring reasoning across 50K+ tokens of context
- Multi-step agentic workflows where a wrong intermediate step cascades into failure
- Code refactoring across multiple interdependent files
- Any task where a 5% quality improvement translates to meaningful business value
📊 Quick Math: Switching from Opus to Sonnet on 1 million output tokens saves you $10. At 100K output tokens per day, that's $1/day or $30/month. The savings are real but modest — don't sacrifice quality to save $30/month.
Claude Haiku 4.5: Speed and cost at scale
Haiku 4.5 is Anthropic's budget option at $1.00 input / $5.00 output per million tokens. It's built for high-volume, latency-sensitive workloads where you need a capable model at minimal cost.
What Haiku 4.5 gives you
- 200,000 token context window — smaller than the 4.6 models but still large enough for most tasks
- 64,000 token max output — generous for a budget model
- Vision capabilities — image understanding at budget pricing
- Fastest response times — noticeably quicker than Sonnet or Opus
Real-world cost per task with Haiku 4.5
| Task | Input tokens | Output tokens | Cost per task |
|---|---|---|---|
| Text classification | ~500 | ~50 | $0.001 |
| Sentiment analysis | ~800 | ~100 | $0.001 |
| Simple Q&A (RAG) | ~2,000 | ~300 | $0.004 |
| Email triage/routing | ~1,000 | ~200 | $0.002 |
| Data formatting/extraction | ~3,000 | ~1,000 | $0.008 |
At these prices, processing 1 million classification tasks (500 input + 50 output tokens each) costs approximately $750. That's $0.00075 per classification. For comparison, the same workload on Opus 4.6 would cost $3,750 — five times more for a task that doesn't benefit from superior reasoning.
💡 Key Takeaway: Haiku's sweet spot is high-volume tasks where the model needs to be good enough, not perfect. Classification, routing, extraction, formatting — these don't need Opus-level intelligence.
The legacy option: Claude 3.5 Haiku
Claude 3.5 Haiku still exists at $0.80/$4.00 per million tokens — 20% cheaper than Haiku 4.5. If you're running a pure cost-optimization play on simple tasks and the 8,192 max output limit doesn't bother you, 3.5 Haiku saves you a bit more. But the quality jump to Haiku 4.5 is substantial, and the extra $0.20/$1.00 per million tokens is worth it for most workloads.
How Claude pricing compares to competitors
The question everyone asks: how does Anthropic's pricing stack up against OpenAI, Google, and the open-source alternatives?
Flagship tier comparison
| Model | Input $/M | Output $/M | Context | Category |
|---|---|---|---|---|
| Claude Opus 4.6 | $5.00 | $25.00 | 1M | Flagship |
| GPT-5.4 | $2.50 | $15.00 | 272K | Flagship |
| GPT-5.4 Pro | $30.00 | $180.00 | 1M | Premium |
| Gemini 3.1 Pro | $2.00 | $12.00 | 1M | Balanced |
| GPT-5.2 Pro | $21.00 | $168.00 | 1M | Premium |
Claude Opus 4.6 lands between GPT-5.4 and GPT-5.4 Pro. It's 2x the input price of base GPT-5.4 ($5.00 vs $2.50) but offers a 1 million token context window versus GPT-5.4's 272K. If you need large context and top-tier intelligence, Opus 4.6 is actually cheaper than GPT-5.4 Pro by a factor of 6x on input and 7.2x on output.
Google's Gemini 3.1 Pro undercuts everyone at $2/$12 with 1M context, but benchmark comparisons consistently show Claude and GPT models outperforming Gemini on coding and complex reasoning tasks. You get what you pay for.
Mid-tier comparison
| Model | Input $/M | Output $/M | Context | Category |
|---|---|---|---|---|
| Claude Sonnet 4.6 | $3.00 | $15.00 | 1M | Balanced |
| GPT-5.4 | $2.50 | $15.00 | 272K | Flagship |
| GPT-5.1 | $1.25 | $10.00 | 1M | Standard |
| Gemini 3 Flash | $0.50 | $3.00 | 1M | Efficient |
Sonnet 4.6 and GPT-5.4 are nearly identical on price. The deciding factor is context window (1M vs 272K) and which model performs better on your specific task. For coding tasks, both are strong contenders.
Budget tier comparison
| Model | Input $/M | Output $/M | Context | Category |
|---|---|---|---|---|
| Claude Haiku 4.5 | $1.00 | $5.00 | 200K | Efficient |
| GPT-5 mini | $0.25 | $2.00 | 500K | Efficient |
| GPT-5 nano | $0.05 | $0.40 | 128K | Nano |
| DeepSeek V3.2 | $0.28 | $0.42 | 128K | Efficient |
| Mistral Small 3.2 | $0.06 | $0.18 | 128K | Efficient |
This is where Anthropic struggles on price. Haiku 4.5 is 4x more expensive than GPT-5 mini on input, and roughly 17x more than Mistral Small 3.2 on input ($1.00 vs $0.06) — the output gap is closer to 28x ($5.00 vs $0.18). DeepSeek V3.2 is absurdly cheap at $0.28/$0.42, making Haiku look premium by comparison.
If your workload is pure volume (millions of simple requests), Anthropic isn't the cheapest option. But Haiku 4.5's quality-per-dollar is competitive — it handles complex instructions better than most budget models, which means fewer retries and better results.
⚠️ Warning: Don't pick a model solely on price-per-token. A model that costs 5x less but requires 3 retries per task to get acceptable output is actually more expensive. Factor in quality-adjusted cost, not just raw token price.
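The quality-adjusted point can be made concrete with a small sketch. Assuming each attempt independently produces acceptable output with some fixed probability (the success rates below are illustrative assumptions, not benchmark numbers), the expected cost per successful task is the per-attempt cost divided by that rate:

```python
def effective_cost(cost_per_attempt: float, success_rate: float) -> float:
    """Expected cost per *successful* task, assuming independent retries
    (geometric distribution: 1/success_rate attempts on average)."""
    return cost_per_attempt / success_rate

# Illustrative numbers: a budget model 4x cheaper per attempt, but with a
# much lower acceptable-output rate on a hard task.
cheap = effective_cost(0.001, 0.20)   # $0.001/attempt, 20% acceptable
mid   = effective_cost(0.004, 0.95)   # $0.004/attempt, 95% acceptable

print(cheap)  # 0.005  -> the "4x cheaper" model is pricier per good result
print(mid)    # ~0.0042
```

The crossover depends entirely on the success rates for your task, which is why measuring acceptance rate per model matters more than reading a price sheet.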
Claude pricing history: The trend is your friend
One of the most remarkable stories in AI pricing is how Anthropic's Opus tier has evolved:
| Model | Release | Input $/M | Output $/M | Relative cost |
|---|---|---|---|---|
| Claude 3 Opus | Mar 2024 | $15.00 | $75.00 | Baseline |
| Claude Opus 4 | May 2025 | $15.00 | $75.00 | Same |
| Claude Opus 4.1 | Aug 2025 | $15.00 | $75.00 | Same |
| Claude Opus 4.5 | Nov 2025 | $5.00 | $25.00 | -67% |
| Claude Opus 4.6 | Feb 2026 | $5.00 | $25.00 | -67% |
The Opus line held at $15/$75 for over a year before dropping to $5/$25 with the 4.5 release. That price has held steady through 4.6. Meanwhile, context windows jumped from 200K to 1 million tokens and max output went from 4,096 to 128,000 tokens. You're getting a radically better model at a third of the original price.
Sonnet pricing has been remarkably stable — $3/$15 since Claude 3.5 Sonnet in October 2024. Anthropic appears to treat $3/$15 as their anchor price point for the mid-tier, improving quality within that price bracket rather than adjusting the price itself.
📊 Quick Math: If you were spending $1,000/month on Claude 3 Opus in early 2025, the same workload on Opus 4.6 costs $333/month — and you get a better model with 5x the context window and 32x the max output.
Cost optimization strategies for Claude
1. Use prompt caching
Anthropic supports prompt caching, which can cut input costs by up to 90% for repeated system prompts and context. If you're sending the same 10K-token system prompt with every request, caching means you only pay full price once — subsequent requests charge a fraction.
This is especially impactful for Opus 4.6 users with large system prompts or RAG applications that reuse the same document chunks across multiple queries.
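As a rough sketch, a cache-enabled request marks the reusable system prompt with a `cache_control` block, following the shape of Anthropic's prompt-caching API. The model id `claude-opus-4-6` and the prompt text are illustrative; check the official docs for minimum cacheable lengths and the exact cached-read discount:

```python
LARGE_SYSTEM_PROMPT = "You are a code-review assistant. ..."  # imagine ~10K tokens

request = {
    "model": "claude-opus-4-6",  # illustrative model id
    "max_tokens": 2048,
    "system": [
        {
            "type": "text",
            "text": LARGE_SYSTEM_PROMPT,
            # Marks this block as cacheable: the first call pays the full
            # input rate; later calls reusing the identical prefix pay the
            # discounted cached-read rate instead.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    "messages": [{"role": "user", "content": "Review this diff: ..."}],
}
```

With the official SDK, a dict like this maps onto `client.messages.create(**request)`; only the user message changes between calls, so the 10K-token prefix is paid for once.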
2. Route by complexity
Don't send everything to one model. Build a routing layer that classifies incoming requests and sends them to the appropriate tier:
- Simple queries (classification, extraction, formatting) → Haiku 4.5
- Standard tasks (content generation, code assistance, summarization) → Sonnet 4.6
- Complex reasoning (multi-step analysis, agentic workflows, large context) → Opus 4.6
A well-tuned router can cut your average cost per request by 40-60% while maintaining quality where it matters.
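A minimal router can be sketched like this. The tier labels mirror the list above; the model ids, the task-type classification, and the 200K-token cutoff are illustrative assumptions — in production you'd classify requests with a cheap model call or request metadata:

```python
ROUTES = {
    "simple":   "claude-haiku-4-5",   # classification, extraction, formatting
    "standard": "claude-sonnet-4-6",  # generation, code assistance, summaries
    "complex":  "claude-opus-4-6",    # multi-step reasoning, agentic workflows
}

def route(task_type: str, context_tokens: int) -> str:
    """Pick a model tier; oversized contexts force the top tier."""
    if context_tokens > 200_000:  # beyond Haiku 4.5's context window
        return ROUTES["complex"]
    return ROUTES.get(task_type, ROUTES["standard"])  # default to Sonnet

print(route("simple", 1_000))    # claude-haiku-4-5
print(route("simple", 500_000))  # claude-opus-4-6
```

Defaulting unknown task types to Sonnet keeps quality safe; the savings come from confidently identifying the simple traffic.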
3. Optimize your prompts
Output tokens cost 5x more than input tokens across all Claude models. Every unnecessary word the model generates costs you money. Be specific about desired output format and length:
- "Respond in 3 bullet points" vs. leaving it open-ended
- "Return only the JSON object" vs. "Explain your analysis and return the JSON"
- Set `max_tokens` appropriately — don't leave it at the default if you only need a short response
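Both levers fit in a single request sketch — an explicit length/format instruction plus a hard `max_tokens` cap. Parameter names follow the Messages API shape; the model id is illustrative:

```python
params = {
    "model": "claude-sonnet-4-6",  # illustrative model id
    "max_tokens": 300,  # hard cap on billable output, vs. a default like 4096
    "messages": [{
        "role": "user",
        "content": "Summarize the incident report in 3 bullet points. "
                   "Return only the bullets, no preamble.",
    }],
}

# Worst-case output cost with the cap, at Sonnet's $15/M output rate:
print(params["max_tokens"] * 15 / 1_000_000)  # 0.0045
```

The cap bounds your worst-case spend per request; the prompt instruction keeps the model from padding the answer in the first place.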
4. Batch API for non-urgent workloads
Anthropic's Batch API offers 50% off standard pricing for workloads that can tolerate up to 24 hours of latency. If you're doing nightly data processing, bulk content generation, or background analysis, batch pricing brings Sonnet output close to Haiku's standard rates:
| Model | Standard output $/M | Batch output $/M |
|---|---|---|
| Opus 4.6 | $25.00 | $12.50 |
| Sonnet 4.6 | $15.00 | $7.50 |
| Haiku 4.5 | $5.00 | $2.50 |
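Submitting a batch follows the shape of Anthropic's Message Batches API: a list of entries, each pairing a `custom_id` with ordinary per-request params. The model id, document contents, and ids below are illustrative:

```python
# Hypothetical nightly job: summarize a pile of documents at batch rates.
documents = {
    "doc-001": "First incident report text ...",
    "doc-002": "Second incident report text ...",
}

batch_requests = [
    {
        "custom_id": doc_id,  # ties each result back to its input document
        "params": {
            "model": "claude-sonnet-4-6",  # illustrative model id
            "max_tokens": 1024,
            "messages": [{"role": "user",
                          "content": f"Summarize:\n\n{text}"}],
        },
    }
    for doc_id, text in documents.items()
]

# Every output token in this batch bills at $7.50/M instead of $15/M.
print(len(batch_requests))  # 2
```

With the official SDK, a list like this would be passed to `client.messages.batches.create(requests=...)`; results arrive asynchronously, keyed by `custom_id`.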
💡 Key Takeaway: Combine routing + caching + batching and you can realistically cut your Claude API bill by 50-70% without changing models or sacrificing quality.
Which Claude model should you choose?
Here's the decision framework:
Choose Haiku 4.5 ($1/$5) when:
- Processing more than 100K requests per day
- Task doesn't require complex reasoning (classification, routing, formatting)
- Latency is critical (real-time user-facing features)
- Budget is extremely tight
Choose Sonnet 4.6 ($3/$15) when:
- Building a production application with diverse task types
- You need strong coding assistance, content generation, or analysis
- Context up to 1M tokens is valuable for your use case
- You want the best quality-to-cost ratio
Choose Opus 4.6 ($5/$25) when:
- Tasks involve multi-step reasoning or complex agentic workflows
- You're working with very large contexts (500K+ tokens)
- Quality difference between Sonnet and Opus materially impacts your outcomes
- The $2/$10 premium per million tokens over Sonnet is trivial relative to the value generated
For most teams, Sonnet 4.6 is the default choice. It handles 80% of workloads at a price that scales. Use Haiku for the high-volume simple stuff, Opus for the hard stuff, and you've got an efficient multi-tier setup.
Use our AI cost calculator to model your specific usage pattern and see exactly what your monthly bill will look like across different Claude models.
Frequently asked questions
How much does Claude API cost per message?
A typical chatbot message (1,500 input tokens, 500 output tokens) costs approximately $0.01 on Sonnet 4.6, $0.02 on Opus 4.6, and $0.004 on Haiku 4.5. Actual costs depend on message length and conversation history included in the context. Use our cost-per-task calculator for precise estimates.
Is Claude Opus 4.6 worth the price over Sonnet 4.6?
For most tasks, no — Sonnet 4.6 delivers 90%+ of Opus quality at 60% of the cost. Opus 4.6 justifies its premium on complex multi-step reasoning, large-context analysis (500K+ tokens), and agentic workflows where intermediate errors compound. If you're building a coding agent or doing deep research synthesis, the upgrade pays for itself.
Why is Claude more expensive than DeepSeek or Mistral?
Anthropic's models are proprietary and hosted on their infrastructure, which means higher operational costs. DeepSeek V3.2 at $0.28/$0.42 and Mistral Small 3.2 at $0.06/$0.18 are dramatically cheaper, but they don't match Claude on complex reasoning, instruction following, or coding tasks. The right comparison is quality-adjusted cost per successful task, not raw token price. See our cheapest AI APIs guide for the full breakdown.
Does Anthropic offer volume discounts?
Anthropic offers committed-use discounts for high-volume customers through their sales team. The Batch API provides an automatic 50% discount for workloads that can tolerate 24-hour latency, which is available to all API users without special arrangements. Prompt caching is another built-in cost reduction that doesn't require negotiation.
What's the cheapest way to use Claude in production?
Combine three strategies: (1) Route requests to the cheapest model that can handle each task — Haiku for simple work, Sonnet for standard, Opus for complex. (2) Enable prompt caching to avoid re-paying for repeated system prompts and context. (3) Use Batch API for any workload that isn't time-sensitive. Together, these can reduce your effective Claude costs by 50-70% compared to sending everything to a single model without optimization.
