March 17, 2026

Anthropic Claude API Pricing Guide 2026: Opus, Sonnet & Haiku Costs Compared

Complete guide to Anthropic Claude API pricing in 2026. Compare costs for Claude Opus 4.6, Sonnet 4.6, and Haiku 4.5 with per-task calculations, pricing history, and tips to cut your Claude bill.


Anthropic runs three model tiers — Opus for maximum intelligence, Sonnet for the sweet spot, and Haiku for speed and cost. But the pricing landscape has shifted dramatically over the past year. Opus used to cost $15/$75 per million tokens. The latest Opus 4.6 costs $5/$25. That's a 67% price drop for a model that's significantly smarter than its predecessors.

This guide breaks down every Claude model's pricing, shows you what real-world tasks actually cost, and helps you pick the right tier for your workload. All numbers come directly from Anthropic's current API pricing, verified against our AI API pricing database.

✅ TL;DR: Claude Opus 4.6 at $5/$25/M is the best value flagship model Anthropic has ever released. Sonnet 4.6 at $3/$15/M remains the go-to for most production workloads. Haiku 4.5 at $1/$5/M handles high-volume tasks where speed matters more than reasoning depth.


Current Claude model pricing at a glance

Here's every actively available Claude model with current pricing as of March 2026:

| Model | Input $/M tokens | Output $/M tokens | Context window | Max output | Release date |
|---|---|---|---|---|---|
| **Claude Opus 4.6** | $5.00 | $25.00 | 1,000,000 | 128,000 | Feb 2026 |
| **Claude Sonnet 4.6** | $3.00 | $15.00 | 1,000,000 | 65,536 | Feb 2026 |
| **Claude Haiku 4.5** | $1.00 | $5.00 | 200,000 | 64,000 | Oct 2025 |
| Claude Opus 4.5 | $5.00 | $25.00 | 200,000 | 64,000 | Nov 2025 |
| Claude Sonnet 4.5 | $3.00 | $15.00 | 200,000 | 64,000 | Sep 2025 |
| Claude 3.5 Haiku | $0.80 | $4.00 | 200,000 | 8,192 | Nov 2024 |
| Claude 3.5 Sonnet | $3.00 | $15.00 | 200,000 | 8,192 | Oct 2024 |
| Claude 3 Opus | $15.00 | $75.00 | 200,000 | 4,096 | Mar 2024 |

The three current-generation models (Opus 4.6, Sonnet 4.6, and Haiku 4.5) are where you should focus. Everything else is either legacy or niche.

💡 Key Takeaway: Anthropic's pricing structure is simple compared to OpenAI's sprawling model lineup. Three tiers, clear use cases. Opus for hard problems, Sonnet for everything else, Haiku for bulk work.


Claude Opus 4.6: The flagship at a fraction of its former price

Claude Opus 4.6 is Anthropic's most intelligent model, designed for complex reasoning, agentic workflows, and coding tasks that require deep understanding of large codebases. At $5.00 input / $25.00 output per million tokens, it's remarkably affordable for a flagship.

Stat: Opus 4.6 is 67% cheaper than the original Claude 3 Opus ($15/$75 → $5/$25).

What Opus 4.6 gives you

  • 1 million token context window — process entire codebases, legal documents, or research papers in a single call
  • 128,000 token max output — generate long-form content, complete implementations, or detailed analyses without hitting output limits
  • Vision capabilities — analyze images, diagrams, screenshots alongside text
  • Advanced reasoning — multi-step problem solving, mathematical proofs, strategic planning

Real-world cost per task with Opus 4.6

Let's calculate what common tasks actually cost:

| Task | Input tokens | Output tokens | Cost per task |
|---|---|---|---|
| Code review (500-line file + context) | ~8,000 | ~2,000 | $0.09 |
| Long document analysis (50 pages) | ~40,000 | ~3,000 | $0.28 |
| Complex coding task (multi-file) | ~15,000 | ~8,000 | $0.28 |
| Research synthesis (10 papers) | ~80,000 | ~5,000 | $0.53 |
| Full codebase analysis (100K context) | ~100,000 | ~10,000 | $0.75 |

At these prices, a developer making 50 complex coding requests per day spends roughly $14/day, or about $280 across a month of working days. That's premium-tier intelligence at a cost that won't wreck a startup's budget.
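
Every figure in the table comes from one formula: tokens times price per million. A minimal sketch in Python, using the Opus 4.6 rates quoted above:

```python
# Per-task cost: (input_tokens * input_rate + output_tokens * output_rate) / 1M.
# Rates are the Opus 4.6 prices quoted above: $5 input / $25 output per million.

OPUS_INPUT_PER_M = 5.00
OPUS_OUTPUT_PER_M = 25.00

def cost_per_task(input_tokens: int, output_tokens: int,
                  input_per_m: float = OPUS_INPUT_PER_M,
                  output_per_m: float = OPUS_OUTPUT_PER_M) -> float:
    """Dollar cost of one API call at the given per-million-token rates."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000

# Reproduces the table: a code review (~8K in, ~2K out) costs about $0.09.
print(round(cost_per_task(8_000, 2_000), 2))     # 0.09
print(round(cost_per_task(100_000, 10_000), 2))  # 0.75
```

Swap in the Sonnet or Haiku rates to reproduce the per-task tables in the later sections.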

When to use Opus 4.6

Use Opus when the task requires genuine reasoning depth. If you're building an AI agent that needs to plan multi-step actions, reviewing a complex pull request with subtle bugs, or synthesizing information across a massive context window — Opus is worth the premium over Sonnet.

Don't use Opus for classification tasks, simple Q&A, or content formatting. You're paying 67% more than Sonnet for capabilities you won't use on those workloads.


Claude Sonnet 4.6: The production workhorse

Sonnet 4.6 sits at $3.00 input / $15.00 output per million tokens — the same price Anthropic has held for the Sonnet tier since Claude 3.5. But the model behind that price tag has gotten dramatically better. Sonnet 4.6 includes computer-use capabilities, frontier-level coding, and reasoning that would have been flagship-tier a year ago.

Claude Sonnet 4.6: $3.00/$15.00 (1M context) vs. GPT-5.4: $2.50/$15.00 (272K context)

What Sonnet 4.6 gives you

  • 1 million token context window — same as Opus, a major upgrade from Sonnet 4.5's 200K
  • 65,536 token max output — sufficient for virtually all production tasks
  • Computer use — the only Sonnet-tier model with native screen interaction capabilities
  • Vision + code — strong multimodal performance at mid-tier pricing

Real-world cost per task with Sonnet 4.6

| Task | Input tokens | Output tokens | Cost per task |
|---|---|---|---|
| Customer support response | ~1,500 | ~500 | $0.01 |
| Content generation (blog post) | ~2,000 | ~4,000 | $0.07 |
| Code generation (single function) | ~3,000 | ~1,500 | $0.03 |
| Document summarization (10 pages) | ~8,000 | ~1,000 | $0.04 |
| Data extraction from PDFs | ~12,000 | ~2,000 | $0.07 |

A SaaS product handling 10,000 customer support interactions per day at the above token profile (about $0.012 per interaction) spends roughly $120/day, or about $3,600/month, on Sonnet 4.6. For most companies, that's a rounding error compared to the value of automated, high-quality responses.

Sonnet vs Opus: When the extra $2/$10 per million matters

The decision between Sonnet and Opus comes down to task complexity. Run this mental test: if a junior developer could handle the task with clear instructions, Sonnet is fine. If you'd hand it to your most senior engineer because it requires nuanced judgment — reach for Opus.

Specific triggers to upgrade to Opus:

  • Tasks requiring reasoning across 50K+ tokens of context
  • Multi-step agentic workflows where a wrong intermediate step cascades into failure
  • Code refactoring across multiple interdependent files
  • Any task where a 5% quality improvement translates to meaningful business value

📊 Quick Math: Switching from Opus to Sonnet on 1 million output tokens saves you $10. At 100K output tokens per day, that's $1/day or $30/month. The savings are real but modest — don't sacrifice quality to save $30/month.


Claude Haiku 4.5: Speed and cost at scale

Haiku 4.5 is Anthropic's budget option at $1.00 input / $5.00 output per million tokens. It's built for high-volume, latency-sensitive workloads where you need a capable model at minimal cost.

What Haiku 4.5 gives you

  • 200,000 token context window — smaller than the 4.6 models but still large enough for most tasks
  • 64,000 token max output — generous for a budget model
  • Vision capabilities — image understanding at budget pricing
  • Fastest response times — noticeably quicker than Sonnet or Opus

Real-world cost per task with Haiku 4.5

| Task | Input tokens | Output tokens | Cost per task |
|---|---|---|---|
| Text classification | ~500 | ~50 | $0.001 |
| Sentiment analysis | ~800 | ~100 | $0.001 |
| Simple Q&A (RAG) | ~2,000 | ~300 | $0.004 |
| Email triage/routing | ~1,000 | ~200 | $0.002 |
| Data formatting/extraction | ~3,000 | ~1,000 | $0.008 |

At these prices, processing 1 million classification tasks (500 input + 50 output tokens each) costs approximately $750. That's $0.00075 per classification. For comparison, the same workload on Opus 4.6 would cost $3,750 — five times more for a task that doesn't benefit from superior reasoning.
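
The volume arithmetic above can be sanity-checked with a short helper, using the Haiku 4.5 and Opus 4.6 rates quoted earlier:

```python
# Total cost of N identical tasks at given per-million-token rates.
def volume_cost(tasks: int, in_tok: int, out_tok: int,
                in_per_m: float, out_per_m: float) -> float:
    """Dollar cost of `tasks` calls, each with the same token profile."""
    return tasks * (in_tok * in_per_m + out_tok * out_per_m) / 1_000_000

# 1M classifications (500 input + 50 output tokens each):
haiku = volume_cost(1_000_000, 500, 50, 1.00, 5.00)   # Haiku 4.5: $1/$5
opus = volume_cost(1_000_000, 500, 50, 5.00, 25.00)   # Opus 4.6: $5/$25
print(haiku, opus, opus / haiku)  # 750.0 3750.0 5.0
```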

💡 Key Takeaway: Haiku's sweet spot is high-volume tasks where the model needs to be good enough, not perfect. Classification, routing, extraction, formatting — these don't need Opus-level intelligence.

The legacy option: Claude 3.5 Haiku

Claude 3.5 Haiku still exists at $0.80/$4.00 per million tokens — 20% cheaper than Haiku 4.5. If you're running a pure cost-optimization play on simple tasks and the 8,192 max output limit doesn't bother you, 3.5 Haiku saves you a bit more. But the quality jump to Haiku 4.5 is substantial, and the extra $0.20/$1.00 per million tokens is worth it for most workloads.


How Claude pricing compares to competitors

The question everyone asks: how does Anthropic's pricing stack up against OpenAI, Google, and the open-source alternatives?

Flagship tier comparison

| Model | Input $/M | Output $/M | Context | Category |
|---|---|---|---|---|
| Claude Opus 4.6 | $5.00 | $25.00 | 1M | Flagship |
| GPT-5.4 | $2.50 | $15.00 | 272K | Flagship |
| GPT-5.4 Pro | $30.00 | $180.00 | 1M | Premium |
| Gemini 3.1 Pro | $2.00 | $12.00 | 1M | Balanced |
| GPT-5.2 Pro | $21.00 | $168.00 | 1M | Premium |

Claude Opus 4.6 lands between GPT-5.4 and GPT-5.4 Pro. It's 2x the price of base GPT-5.4 on input, but offers a 1 million token context window versus GPT-5.4's 272K. If you need large context and top-tier intelligence, Opus 4.6 is actually cheaper than GPT-5.4 Pro by a factor of 6x on input and 7.2x on output.

Google's Gemini 3.1 Pro undercuts everyone at $2/$12 with 1M context, but benchmark comparisons consistently show Claude and GPT models outperforming Gemini on coding and complex reasoning tasks. You get what you pay for.

Claude Opus 4.6: $5.00/$25.00 (1M context, 128K output) vs. GPT-5.4 Pro: $30.00/$180.00 (1M context, 131K output)

Mid-tier comparison

| Model | Input $/M | Output $/M | Context | Category |
|---|---|---|---|---|
| Claude Sonnet 4.6 | $3.00 | $15.00 | 1M | Balanced |
| GPT-5.4 | $2.50 | $15.00 | 272K | Flagship |
| GPT-5.1 | $1.25 | $10.00 | 1M | Standard |
| Gemini 3 Flash | $0.50 | $3.00 | 1M | Efficient |

Sonnet 4.6 and GPT-5.4 are nearly identical on price. The deciding factor is context window (1M vs 272K) and which model performs better on your specific task. For coding tasks, both are strong contenders.

Budget tier comparison

| Model | Input $/M | Output $/M | Context | Category |
|---|---|---|---|---|
| Claude Haiku 4.5 | $1.00 | $5.00 | 200K | Efficient |
| GPT-5 mini | $0.25 | $2.00 | 500K | Efficient |
| GPT-5 nano | $0.05 | $0.40 | 128K | Nano |
| DeepSeek V3.2 | $0.28 | $0.42 | 128K | Efficient |
| Mistral Small 3.2 | $0.06 | $0.18 | 128K | Efficient |

This is where Anthropic struggles on price. Haiku 4.5 is 4x more expensive than GPT-5 mini on input and roughly 17x more expensive than Mistral Small 3.2 on input (nearly 28x on output). DeepSeek V3.2 is absurdly cheap at $0.28/$0.42, making Haiku look premium by comparison.

If your workload is pure volume (millions of simple requests), Anthropic isn't the cheapest option. But Haiku 4.5's quality-per-dollar is competitive — it handles complex instructions better than most budget models, which means fewer retries and better results.

⚠️ Warning: Don't pick a model solely on price-per-token. A model that costs 5x less but requires 3 retries per task to get acceptable output is actually more expensive. Factor in quality-adjusted cost, not just raw token price.


Claude pricing history: The trend is your friend

One of the most remarkable stories in AI pricing is how Anthropic's Opus tier has evolved:

| Model | Release | Input $/M | Output $/M | Relative cost |
|---|---|---|---|---|
| Claude 3 Opus | Mar 2024 | $15.00 | $75.00 | Baseline |
| Claude Opus 4 | May 2025 | $15.00 | $75.00 | Same |
| Claude Opus 4.1 | Aug 2025 | $15.00 | $75.00 | Same |
| Claude Opus 4.5 | Nov 2025 | $5.00 | $25.00 | -67% |
| Claude Opus 4.6 | Feb 2026 | $5.00 | $25.00 | -67% |

The Opus line held at $15/$75 for over a year before dropping to $5/$25 with the 4.5 release. That price has held steady through 4.6. Meanwhile, context windows jumped from 200K to 1 million tokens and max output went from 4,096 to 128,000 tokens. You're getting a radically better model at a third of the original price.

Sonnet pricing has been remarkably stable — $3/$15 since Claude 3.5 Sonnet in October 2024. Anthropic appears to treat $3/$15 as their anchor price point for the mid-tier, improving quality within that price bracket rather than adjusting the price itself.

📊 Quick Math: If you were spending $1,000/month on Claude 3 Opus in early 2025, the same workload on Opus 4.6 costs $333/month, and you get a better model with 5x the context window and over 31x the max output (4,096 → 128,000 tokens).


Cost optimization strategies for Claude

1. Use prompt caching

Anthropic supports prompt caching, which can cut input costs by up to 90% for repeated system prompts and context. If you're sending the same 10K-token system prompt with every request, caching means you only pay full price once — subsequent requests charge a fraction.

This is especially impactful for Opus 4.6 users with large system prompts or RAG applications that reuse the same document chunks across multiple queries.
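
In Anthropic's Messages API, caching is opt-in: you tag the content block whose prefix should be cached with a `cache_control` marker. A minimal sketch that builds the request payload without sending it (the model string is a placeholder, not an official ID):

```python
# Build a Messages API payload whose large system prompt is marked cacheable.
# "cache_control": {"type": "ephemeral"} asks Anthropic to cache the prompt
# prefix up to and including this block, so later requests with an identical
# prefix are billed at the discounted cached-input rate.

def cached_request(system_prompt: str, user_message: str) -> dict:
    return {
        "model": "claude-opus-4-6",  # placeholder model ID, not an official string
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": system_prompt,
                "cache_control": {"type": "ephemeral"},  # cache this prefix
            }
        ],
        "messages": [{"role": "user", "content": user_message}],
    }

payload = cached_request("You are a careful code reviewer.", "Review this diff.")
```

With the official `anthropic` SDK this dict can be passed as `client.messages.create(**payload)`. Anthropic's docs describe a small surcharge on the first cache write and a discounted rate on cached reads; check current documentation for the exact multipliers.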

2. Route by complexity

Don't send everything to one model. Build a routing layer that classifies incoming requests and sends them to the appropriate tier:

  • Simple queries (classification, extraction, formatting) → Haiku 4.5
  • Standard tasks (content generation, code assistance, summarization) → Sonnet 4.6
  • Complex reasoning (multi-step analysis, agentic workflows, large context) → Opus 4.6

A well-tuned router can cut your average cost per request by 40-60% while maintaining quality where it matters.
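
The routing layer above can be sketched as a small lookup with a context-size escape hatch. Task labels and model ID strings here are illustrative, not official identifiers:

```python
# Minimal complexity router: send each request to the cheapest adequate tier.
SIMPLE = {"classification", "extraction", "formatting", "routing"}
STANDARD = {"content_generation", "code_assist", "summarization"}

def pick_model(task_type: str, context_tokens: int = 0) -> str:
    """Return an (illustrative) model ID for a task type and context size."""
    if context_tokens > 200_000:
        # Beyond Haiku's window; large-context work goes to the top tier here.
        return "claude-opus-4-6"
    if task_type in SIMPLE:
        return "claude-haiku-4-5"
    if task_type in STANDARD:
        return "claude-sonnet-4-6"
    return "claude-opus-4-6"  # default: assume complex reasoning

print(pick_model("classification"))          # claude-haiku-4-5
print(pick_model("summarization", 300_000))  # claude-opus-4-6
```

In production the classification step itself is usually a cheap model call or a heuristic; the point is that the expensive tier only sees requests that need it.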

3. Optimize your prompts

Output tokens cost 5x more than input tokens across all Claude models. Every unnecessary word the model generates costs you money. Be specific about desired output format and length:

  • "Respond in 3 bullet points" vs. leaving it open-ended
  • "Return only the JSON object" vs. "Explain your analysis and return the JSON"
  • Set max_tokens appropriately — don't leave it at the default if you only need a short response
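
Because output is billed at 5x input on every Claude tier, constraining response length is the highest-leverage change. A quick sketch of the arithmetic at Sonnet 4.6's $15/M output rate:

```python
# Output-token spend for one response at Sonnet 4.6's output rate ($15/M).
OUTPUT_PER_M = 15.00

def output_cost(tokens: int) -> float:
    """Dollar cost of the output side of a single response."""
    return tokens * OUTPUT_PER_M / 1_000_000

verbose = output_cost(4_000)    # open-ended answer
constrained = output_cost(500)  # "respond in 3 bullet points"
print(verbose, constrained)     # 0.06 0.0075
```

Across 10,000 responses a day, that gap is $600 vs. $75 in output spend.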

4. Batch API for non-urgent workloads

Anthropic's Batch API offers 50% off standard pricing for workloads that can tolerate up to 24 hours of latency. If you're doing nightly data processing, bulk content generation, or background analysis — batch pricing turns Sonnet into Haiku-level costs:

| Model | Standard output $/M | Batch output $/M |
|---|---|---|
| Opus 4.6 | $25.00 | $12.50 |
| Sonnet 4.6 | $15.00 | $7.50 |
| Haiku 4.5 | $5.00 | $2.50 |
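
The batch discount is a flat 50% on both input and output, so the savings math is simple. A sketch for a nightly job at Sonnet 4.6 rates:

```python
# Cost of a workload at standard vs. Batch API pricing (token counts in millions).
def workload_cost(in_tokens_m: float, out_tokens_m: float,
                  in_per_m: float, out_per_m: float, batch: bool = False) -> float:
    """Dollar cost of a workload; Batch API halves both input and output rates."""
    cost = in_tokens_m * in_per_m + out_tokens_m * out_per_m
    return cost / 2 if batch else cost

# Nightly job: 20M input + 5M output tokens on Sonnet 4.6 ($3/$15).
standard = workload_cost(20, 5, 3.00, 15.00)             # 135.0
batched = workload_cost(20, 5, 3.00, 15.00, batch=True)  # 67.5
```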

💡 Key Takeaway: Combine routing + caching + batching and you can realistically cut your Claude API bill by 50-70% without changing models or sacrificing quality.


Which Claude model should you choose?

Here's the decision framework:

Choose Haiku 4.5 ($1/$5) when:

  • Processing more than 100K requests per day
  • Task doesn't require complex reasoning (classification, routing, formatting)
  • Latency is critical (real-time user-facing features)
  • Budget is extremely tight

Choose Sonnet 4.6 ($3/$15) when:

  • Building a production application with diverse task types
  • You need strong coding assistance, content generation, or analysis
  • Context up to 1M tokens is valuable for your use case
  • You want the best quality-to-cost ratio

Choose Opus 4.6 ($5/$25) when:

  • Tasks involve multi-step reasoning or complex agentic workflows
  • You're working with very large contexts (500K+ tokens)
  • Quality difference between Sonnet and Opus materially impacts your outcomes
  • The $2/$10 premium per million tokens over Sonnet is trivial relative to the value generated

For most teams, Sonnet 4.6 is the default choice. It handles 80% of workloads at a price that scales. Use Haiku for the high-volume simple stuff, Opus for the hard stuff, and you've got an efficient multi-tier setup.

Use our AI cost calculator to model your specific usage pattern and see exactly what your monthly bill will look like across different Claude models.


Frequently asked questions

How much does Claude API cost per message?

A typical chatbot message (1,500 input tokens, 500 output tokens) costs approximately $0.01 on Sonnet 4.6, $0.02 on Opus 4.6, and $0.004 on Haiku 4.5. Actual costs depend on message length and conversation history included in the context. Use our cost-per-task calculator for precise estimates.

Is Claude Opus 4.6 worth the price over Sonnet 4.6?

For most tasks, no — Sonnet 4.6 delivers 90%+ of Opus quality at 60% of the cost. Opus 4.6 justifies its premium on complex multi-step reasoning, large-context analysis (500K+ tokens), and agentic workflows where intermediate errors compound. If you're building a coding agent or doing deep research synthesis, the upgrade pays for itself.

Why is Claude more expensive than DeepSeek or Mistral?

Anthropic's models are proprietary and hosted on their infrastructure, which means higher operational costs. DeepSeek V3.2 at $0.28/$0.42 and Mistral Small 3.2 at $0.06/$0.18 are dramatically cheaper, but they don't match Claude on complex reasoning, instruction following, or coding tasks. The right comparison is quality-adjusted cost per successful task, not raw token price. See our cheapest AI APIs guide for the full breakdown.

Does Anthropic offer volume discounts?

Anthropic offers committed-use discounts for high-volume customers through their sales team. The Batch API provides an automatic 50% discount for workloads that can tolerate 24-hour latency, which is available to all API users without special arrangements. Prompt caching is another built-in cost reduction that doesn't require negotiation.

What's the cheapest way to use Claude in production?

Combine three strategies: (1) Route requests to the cheapest model that can handle each task — Haiku for simple work, Sonnet for standard, Opus for complex. (2) Enable prompt caching to avoid re-paying for repeated system prompts and context. (3) Use Batch API for any workload that isn't time-sensitive. Together, these can reduce your effective Claude costs by 50-70% compared to sending everything to a single model without optimization.