What is the cheapest AI API in 2026?

By input price, GPT-5 Nano is the cheapest at $0.05 per 1M input tokens. For combined low-cost general use, the post identifies Mistral Small 3.2 at $0.06 input and $0.18 output per 1M tokens.

How much does the cheapest AI model cost?

The lowest listed input rate is $0.05 per 1M tokens on GPT-5 Nano, while the lowest output rate is $0.18 on Mistral Small 3.2 and Llama 3.1 8B. In a sample 1M input + 200K output workload, Mistral Small 3.2 is estimated at $0.096 total.

Is DeepSeek cheaper than OpenAI?

Not at the budget end. DeepSeek V3.2 is $0.28/$0.42 per 1M tokens, while OpenAI's GPT-5 Nano is $0.05/$0.40 and GPT-4o mini is $0.15/$0.60, so both OpenAI budget models are cheaper on input, although DeepSeek's output rate undercuts GPT-4o mini's. Compared with OpenAI flagships like GPT-5.2 at $1.75/$14.00, DeepSeek is dramatically cheaper.

What is the best budget AI model for production?

The guide calls Mistral Small 3.2 the best budget model at $0.06 input and $0.18 output because it has the lowest combined general-purpose cost. In the 50K conversations/day chatbot example, it is estimated at $180/month versus $300 for GPT-5 Nano and $4,200 for Claude Haiku 4.5.

Published February 16, 2026 · Updated March 21, 2026

The Cheapest AI APIs in 2026: Every Model Ranked by Price

We ranked all 49 AI models across 8 providers by cost per million tokens. From $0.05 per million input tokens to $168 per million output tokens, here's exactly what you'll pay.

pricing, cost comparison, budget, api

If you're building with AI in 2026, cost matters. Whether you're prototyping a chatbot, scaling a production app, or just experimenting, the price difference between models is staggering. The cheapest option costs $0.05 per million input tokens. The most expensive? $168 per million output tokens. That's a 3,360× difference.

We pulled pricing data from all 8 major providers and ranked every model. No affiliate bias, no sponsored picks — just the numbers from official pricing pages, verified February 2026.

[stat] 3,360× The price gap between GPT-5 Nano ($0.05/M input) and GPT-5.2 pro ($168/M output) — the cheapest and most expensive AI APIs available today

The 10 Cheapest AI APIs Right Now

Here are the most affordable models available via API, ranked by input token price:

Rank	Model	Provider	Input $/1M	Output $/1M	Context
1	GPT-5 Nano	OpenAI	$0.05	$0.40	128K
2	Mistral Small 3.2	Mistral	$0.06	$0.18	128K
3	Gemini 2.0 Flash-Lite	Google	$0.07	$0.30	1M
4	GPT-4.1 nano	OpenAI	$0.10	$0.40	128K
5	Gemini 2.5 Flash-Lite	Google	$0.10	$0.40	1M
6	Gemini 2.5 Flash	Google	$0.15	$0.60	1M
7	GPT-4o mini	OpenAI	$0.15	$0.60	128K
8	Command R	Cohere	$0.15	$0.60	128K
9	Llama 3.1 8B	Meta	$0.18	$0.18	128K
10	Grok 4.1 Fast	xAI	$0.20	$0.50	2M

The standout? Mistral Small 3.2. At just $0.06 input and $0.18 output per million tokens, it has the lowest combined cost of any general-purpose model on the market. Llama 3.1 8B matches that $0.18 output rate with flat $0.18/$0.18 pricing, so input and output tokens cost the same.

💡 Key Takeaway: Don't just compare input prices. Mistral Small 3.2 is tied with Llama 3.1 8B for the lowest output price at $0.18/M, which is 3.3× cheaper than GPT-4o mini's $0.60/M output rate. That matters enormously for generation-heavy workloads.

Best Value by Category

Price isn't everything. Here's the best deal in each model tier, balancing cost against capability.

Best Budget Model: Mistral Small 3.2 ($0.06/$0.18)

Mistral's compact model offers the lowest combined cost of any general-purpose model. At $0.06 input / $0.18 output per million tokens, it's absurdly cheap for classification, extraction, and simple generation. The 128K context window handles most production workloads. The trade-off: it's less capable than GPT-4o mini on complex reasoning, but for structured tasks, the quality gap is minimal.

Best All-Rounder Under $1: DeepSeek V3.2 ($0.28/$0.42)

DeepSeek V3.2 continues to undercut everyone in its quality class. At $0.28 input / $0.42 output, you get a model with reasoning and coding capabilities that rival models costing 5–10× more. The 128K context window handles most production workloads. Availability and latency can vary by provider, but for batch processing or latency-tolerant applications, it's hard to beat.

Best Flagship Under $2: GPT-5.2 ($1.75/$14.00)

OpenAI's latest flagship is surprisingly competitive at the top end. With a 1M token context window plus vision, audio, and code capabilities, GPT-5.2 is the most capable model you can get for under $2/M input tokens. Compared with Claude Opus 4.6 at $5/$25, GPT-5.2 is 2.9× cheaper on input and 1.8× cheaper on output.

[stat] $0.18 vs. $25.00 Output price per 1M tokens: Mistral Small 3.2 vs. Claude Opus 4.6

Best Reasoning Model: o4-mini ($1.10/$4.40)

If you need chain-of-thought reasoning without the o3-pro price tag ($20/$80), o4-mini delivers. It shares the same price as o3-mini but with a massive 2M token context window — the largest of any reasoning model. Be aware that reasoning models generate hidden thinking tokens that increase your actual cost beyond the sticker price.

Best Context Window: Gemini 2.5 Pro ($1.25/$10.00 — 2M tokens)

Google wins the context war. At 2 million tokens, Gemini 2.5 Pro can process entire codebases, books, or document collections in a single call. And at $1.25/M input, it's cheaper than most flagships. If your workload involves very long documents that fit inside that window, this can remove the need for chunking and RAG pipelines.

The Real Cost: Input vs Output

Don't just look at input prices. In most AI workloads, output tokens account for a disproportionate share of the bill, because output tokens are typically 3–8× more expensive than input tokens. A model that's cheap on input but expensive on output can cost more overall.

Example: Processing 1M input tokens and generating 200K output tokens:

Model	Input Cost	Output Cost	Total
Mistral Small 3.2	$0.06	$0.036	$0.096
GPT-5 Nano	$0.05	$0.08	$0.13
Gemini 2.5 Flash-Lite	$0.10	$0.08	$0.18
DeepSeek V3.2	$0.28	$0.084	$0.36
GPT-5.2	$1.75	$2.80	$4.55
Claude Opus 4.6	$5.00	$5.00	$10.00

📊 Quick Math: The gap between budget and premium is roughly 100× for the same workload ($0.096 vs. $10.00). At moderate usage (processing 1M input + 200K output tokens daily), that's the difference between about $3/month and $300/month.
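To rerun the table above for your own token mix, a minimal Python sketch might look like this. The prices are copied from this article's tables; the function name `workload_cost` is just an illustrative choice, and the calculation assumes list prices with no caching or batch discounts.

```python
# Per-1M-token list prices in USD as (input, output), copied from this article.
PRICES = {
    "Mistral Small 3.2": (0.06, 0.18),
    "GPT-5 Nano": (0.05, 0.40),
    "Gemini 2.5 Flash-Lite": (0.10, 0.40),
    "DeepSeek V3.2": (0.28, 0.42),
    "GPT-5.2": (1.75, 14.00),
    "Claude Opus 4.6": (5.00, 25.00),
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one workload at list prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Same workload as the table: 1M input tokens + 200K output tokens.
    for name in sorted(PRICES, key=lambda m: workload_cost(m, 1_000_000, 200_000)):
        print(f"{name:<24} ${workload_cost(name, 1_000_000, 200_000):.3f}")
```

Swap in your own input and output token counts to see how the ranking shifts as a workload becomes more generation-heavy.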

Cheapest by Provider

Every provider has a budget option. Here's each one's most affordable model:

Provider	Cheapest Model	Input $/1M	Output $/1M
OpenAI	GPT-5 Nano	$0.05	$0.40
Mistral	Mistral Small 3.2	$0.06	$0.18
Google	Gemini 2.0 Flash-Lite	$0.07	$0.30
Meta	Llama 3.1 8B	$0.18	$0.18
xAI	Grok 4.1 Fast	$0.20	$0.50
DeepSeek	V3.2	$0.28	$0.42
Cohere	Command R	$0.15	$0.60
Anthropic	Claude 3.5 Haiku	$0.80	$4.00

Anthropic is the priciest at the budget end: their cheapest model (Claude 3.5 Haiku at $0.80/M) costs 16× more on input than the cheapest overall (GPT-5 Nano at $0.05/M). You're paying for the Claude quality floor. Whether that quality premium is worth it depends on your task. Read our OpenAI vs Anthropic comparison for a detailed analysis.

⚠️ Warning: Cheap per-token pricing doesn't account for hidden costs like retries, context window waste, and rate limit overhead. A model that's 10× cheaper but fails 20% of the time may cost more in practice. Read our hidden costs guide before committing to the cheapest option.

Real-World Cost at Scale

Abstract pricing means nothing without context. Here's what common workloads actually cost on the cheapest models:

High-Volume Chatbot (50K conversations/day)

800 input tokens, 400 output tokens per conversation. Monthly: 1.2B input, 600M output tokens.

Model	Monthly Cost
Mistral Small 3.2	$72 + $108 = $180
GPT-5 Nano	$60 + $240 = $300
DeepSeek V3.2	$336 + $252 = $588
Claude Haiku 4.5	$1,200 + $3,000 = $4,200

That's a 23× cost difference between Mistral Small and Claude Haiku for the same chatbot. At $180/month, Mistral Small 3.2 makes AI chatbots viable even for bootstrapped startups.
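The monthly figures in the chatbot table can be reproduced with a few lines of Python. This is a sketch under the table's own assumptions: a 30-day month and fixed token counts per conversation. The Claude Haiku 4.5 rates of $1.00/$5.00 are inferred from the table's $1,200 and $3,000 line items, and the function name `monthly_cost` is illustrative.

```python
def monthly_cost(
    requests_per_day: int,
    input_tokens: int,
    output_tokens: int,
    input_price: float,
    output_price: float,
    days: int = 30,
) -> float:
    """Estimate monthly USD spend from per-request token counts.

    Prices are USD per 1M tokens. Assumes every request uses the same
    number of tokens and ignores retries, caching, and batch discounts.
    """
    monthly_input = requests_per_day * input_tokens * days
    monthly_output = requests_per_day * output_tokens * days
    return (monthly_input * input_price + monthly_output * output_price) / 1_000_000


if __name__ == "__main__":
    # 50K conversations/day, 800 input + 400 output tokens each.
    print(monthly_cost(50_000, 800, 400, 0.06, 0.18))  # Mistral Small 3.2
    print(monthly_cost(50_000, 800, 400, 1.00, 5.00))  # Claude Haiku 4.5 (inferred rates)
```

Real traffic rarely has uniform conversation lengths, so treat the result as a planning estimate rather than a bill forecast.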

Document Processing Pipeline (10K documents/day)

4,000 input tokens, 500 output tokens per document. Monthly: 1.2B input, 150M output tokens.

Model	Monthly Cost
GPT-5 Nano	$60 + $60 = $120
Mistral Small 3.2	$72 + $27 = $99
Gemini 2.5 Flash-Lite	$120 + $60 = $180

For input-heavy workloads, GPT-5 Nano's $0.05/M input is the cost floor. But Mistral Small 3.2 wins on total cost because its output pricing ($0.18/M) is so low.
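The trade-off between a lower input price and a lower output price has a simple break-even point. The sketch below assumes linear per-token pricing and solves for the output-to-input token ratio at which the two models cost the same; the function name `break_even_output_ratio` is hypothetical.

```python
def break_even_output_ratio(
    a_input: float, a_output: float, b_input: float, b_output: float
) -> float:
    """Output/input token ratio above which model B is cheaper than model A.

    Model A has the lower input price; model B has the lower output price.
    Solves b_input + r * b_output = a_input + r * a_output for r.
    """
    if not (a_input < b_input and b_output < a_output):
        raise ValueError("expected A cheaper on input and B cheaper on output")
    return (b_input - a_input) / (a_output - b_output)


if __name__ == "__main__":
    # A = GPT-5 Nano ($0.05/$0.40), B = Mistral Small 3.2 ($0.06/$0.18).
    ratio = break_even_output_ratio(0.05, 0.40, 0.06, 0.18)
    print(f"Mistral Small 3.2 is cheaper once output exceeds {ratio:.1%} of input tokens")
```

The document pipeline above produces 500 output tokens per 4,000 input tokens, a ratio of 12.5%, which sits well above this break-even and is why Mistral Small 3.2 comes out ahead on total cost.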

[stat] $99/month The total cost to process 10,000 documents per day using Mistral Small 3.2 — less than most SaaS subscriptions

How to Save Even More

Already on the cheapest model? Here are five more ways to cut costs:

1. Prompt caching

OpenAI, Anthropic, and Google all offer cached input pricing at a 50–90% discount. If you're sending similar prompts repeatedly (same system prompt, shared context), this is free money. OpenAI applies it automatically; Anthropic requires explicit cache_control markers in the request.
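To get a feel for what caching is worth, here is a rough Python estimate. The share of input tokens served from cache (`cached_fraction`) is an assumption you would need to measure for your own traffic, and the sketch ignores any cache-write surcharge or minimum prompt length a provider may apply.

```python
def cached_input_cost(
    input_tokens: int,
    input_price: float,
    cached_fraction: float,
    cache_discount: float,
) -> float:
    """Estimate USD input cost when part of the input is billed at a cached rate.

    input_price is USD per 1M tokens. cached_fraction is the share of input
    tokens that hit the cache; cache_discount is the discount on those tokens
    (0.5 to 0.9 for the 50-90% range quoted above).
    """
    if not 0 <= cached_fraction <= 1 or not 0 <= cache_discount <= 1:
        raise ValueError("fractions must be between 0 and 1")
    uncached_tokens = input_tokens * (1 - cached_fraction)
    cached_tokens = input_tokens * cached_fraction
    cached_price = input_price * (1 - cache_discount)
    return (uncached_tokens * input_price + cached_tokens * cached_price) / 1_000_000


if __name__ == "__main__":
    # 1.2B monthly input tokens at GPT-5.2's $1.75/M, assuming 60% cache hits.
    for discount in (0.5, 0.9):
        print(discount, cached_input_cost(1_200_000_000, 1.75, 0.6, discount))
```

The higher the share of each prompt that is a stable prefix, the closer your effective input price gets to the cached rate.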

2. Batch API

OpenAI's batch endpoint gives 50% off all token costs. If your workload isn't real-time, batch everything. We wrote a complete guide on saving with the Batch API.

3. Shorter prompts

Every token costs money. Strip system prompts to essentials, use concise instructions, avoid redundant context. A 2,000-token system prompt that could be 500 tokens wastes 75% of its input cost on every single request.
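As a quick illustration of what trimming a system prompt is worth, the sketch below converts the token reduction into a monthly dollar figure. The request volume (50K/day) and the GPT-5 Nano input price are example inputs chosen for illustration, and `trimmed_prompt_savings` is a hypothetical helper name.

```python
def trimmed_prompt_savings(
    old_tokens: int,
    new_tokens: int,
    requests_per_day: int,
    input_price: float,
    days: int = 30,
) -> tuple[float, float]:
    """Return (fraction of prompt tokens removed, monthly USD saved).

    input_price is USD per 1M input tokens. Assumes the system prompt is
    sent in full on every request and billed at the uncached rate.
    """
    fraction_removed = (old_tokens - new_tokens) / old_tokens
    saved_tokens = (old_tokens - new_tokens) * requests_per_day * days
    return fraction_removed, saved_tokens * input_price / 1_000_000


if __name__ == "__main__":
    # 2,000-token system prompt cut to 500 tokens, 50K requests/day, $0.05/M input.
    fraction, dollars = trimmed_prompt_savings(2_000, 500, 50_000, 0.05)
    print(f"{fraction:.0%} fewer prompt tokens, about ${dollars:.2f}/month saved")
```

On a flagship model the same trim is worth proportionally more, since the saving scales directly with the input price.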

4. Model routing

Use a cheap model (GPT-5 Nano) for simple tasks and route complex ones to a flagship. Don't use a $14/M output model to format JSON. A basic router can cut costs 40–60% compared with using a single model for everything. Newer comparisons, such as the GPT-5.4 mini vs nano benchmarks, make the case even clearer.
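A router can start out as a simple lookup. The following is a minimal sketch, not a production design: the task labels in `SIMPLE_TASKS` are hypothetical, and a real router would need its own rules or classifier to decide what counts as simple. Prices are the GPT-5 Nano and GPT-5.2 rates quoted in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens


CHEAP = Model("GPT-5 Nano", 0.05, 0.40)
FLAGSHIP = Model("GPT-5.2", 1.75, 14.00)

# Hypothetical task labels; replace with whatever your application can detect.
SIMPLE_TASKS = {"classification", "extraction", "formatting"}


def route(task_type: str) -> Model:
    """Send simple tasks to the cheap model and everything else to the flagship."""
    return CHEAP if task_type in SIMPLE_TASKS else FLAGSHIP


def request_cost(model: Model, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request at list prices."""
    return (input_tokens * model.input_price + output_tokens * model.output_price) / 1_000_000


def routed_cost(requests: list[tuple[str, int, int]]) -> float:
    """Total cost when each (task_type, input_tokens, output_tokens) is routed."""
    return sum(request_cost(route(task), i, o) for task, i, o in requests)


def single_model_cost(requests: list[tuple[str, int, int]], model: Model) -> float:
    """Total cost when every request goes to the same model."""
    return sum(request_cost(model, i, o) for _, i, o in requests)


if __name__ == "__main__":
    # Example mix: 7 simple and 3 complex requests, 800 input + 400 output tokens each.
    traffic = [("classification", 800, 400)] * 7 + [("analysis", 800, 400)] * 3
    print("routed:       ", routed_cost(traffic))
    print("flagship only:", single_model_cost(traffic, FLAGSHIP))
```

The actual saving depends on your traffic mix and on how often the cheap model's answers need to be escalated, which this sketch does not model.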

5. Output limits

Set max_tokens to prevent runaway generation. A model generating 4K tokens when you needed 200 is pure waste. This single parameter can save 50%+ on output costs for applications with predictable response lengths.
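One way to reason about an output cap is to treat it as a ceiling on output spend. The sketch below computes that ceiling for a given max_tokens value; it assumes the cap is enforced on billed output tokens. The exact parameter name and how hidden reasoning tokens count toward it can differ between APIs, so check your provider's documentation.

```python
def worst_case_output_cost(max_tokens: int, output_price: float, requests: int = 1) -> float:
    """Upper bound on USD output cost if every response hits the max_tokens cap.

    output_price is USD per 1M output tokens.
    """
    return max_tokens * output_price * requests / 1_000_000


if __name__ == "__main__":
    # 1,000 requests on a $14/M output model, capped at 4K vs 200 tokens.
    print(worst_case_output_cost(4_000, 14.00, requests=1_000))
    print(worst_case_output_cost(200, 14.00, requests=1_000))
```

A cap that is too tight truncates useful answers, so it is worth sizing it from the response lengths you actually observe.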

For a complete optimization playbook, see our cost optimization strategies guide.

The Bottom Line

The AI API market in 2026 is a buyer's paradise. Models that would have cost $60/M tokens two years ago now have equivalents at $0.10. The key is matching your workload to the right price tier:

High volume, simple tasks → Mistral Small 3.2, GPT-5 Nano, Gemini Flash-Lite ($0.05–0.10/M input)
General purpose, good quality → DeepSeek V3.2, Gemini 2.5 Flash ($0.15–0.28/M input); see also the DeepSeek vs Mistral budget showdown
Production flagship → GPT-5.2, Claude Sonnet 4.6, Gemini 3 Pro ($1.75–3.00/M input)
Maximum capability → Claude Opus 4.6, GPT-5.2 pro, o3-pro ($5–21/M input)

✅ TL;DR: Mistral Small 3.2 ($0.06/$0.18) is the cheapest general-purpose model. GPT-5 Nano ($0.05/$0.40) has the cheapest input. Llama 3.1 8B ($0.18/$0.18) has flat pricing. The budget-to-premium gap is 100×. Use model routing and prompt caching to squeeze even more savings.

Use our cost calculator to estimate your exact monthly spend, or check model comparisons to see how any two models stack up side by side. For a detailed look at the best budget options, read our budget model roundup.

Frequently asked questions

What is the cheapest AI API available in 2026?

By input price, GPT-5 Nano at $0.05 per million tokens. By output price, Mistral Small 3.2 at $0.18 per million tokens. By combined cost for a balanced workload, Mistral Small 3.2 ($0.06/$0.18) is the cheapest general-purpose option. For flat input/output pricing, Llama 3.1 8B at $0.18/$0.18 through Together AI is the simplest to budget for.

Are the cheapest AI models good enough for production?

Yes, for structured and well-defined tasks. GPT-5 Nano and Mistral Small 3.2 handle classification, extraction, summarization, and simple Q&A reliably. They struggle with complex reasoning, creative writing, and nuanced analysis. The strategy is model routing: use cheap models for 60–70% of requests and route the rest to mid-tier models.

How much does it cost to run an AI chatbot in 2026?

A chatbot handling 1,000 conversations/day costs roughly $6–180/month depending on the model. On GPT-5 Nano: ~$6/month. On DeepSeek V3.2: ~$20/month. On Claude Sonnet 4.6: ~$180/month. The biggest variable is model choice, not volume. Use our chatbot cost breakdown for detailed calculations.

Why is Anthropic so much more expensive than other providers?

Anthropic's cheapest model (Claude 3.5 Haiku at $0.80/$4.00) is 16× more expensive on input than GPT-5 Nano ($0.05). This reflects Anthropic's focus on quality and safety rather than price competition. Claude models consistently score higher on nuanced tasks, instruction-following, and writing quality. You pay more per token but may need fewer retries and less prompt engineering. See our OpenAI vs Anthropic comparison.

What hidden costs should I watch for with cheap AI APIs?

The four biggest hidden costs: 1) Failed requests that still bill tokens. 2) Retries that multiply your spend (5% error rate with 2 retries = $6K/month waste at scale). 3) Context window waste from oversized prompts. 4) Reasoning model thinking tokens that don't appear in output but get billed. Budget an extra 30–50% beyond your per-token estimate. Read our hidden costs guide for the full breakdown.

Prices verified February 2026. We update pricing weekly via automated scraping of official provider pages.

Related Cost Guides

Keep going with the closest pricing and optimization guides in this cluster.

Every AI Model Under $1 Per Million Tokens (May 2026)

xAI Grok Pricing Guide 2026: Every Model, Cost & How to Save

AI Invoice Processing Costs in 2026: Cost Per 1,000 Invoices and the Cheapest Models for AP Automation