A dollar buys you 20 million input tokens on one model and fewer than 48,000 on another. That's a 420x difference in raw throughput for the same budget. If you're not checking the tokens-per-dollar math before picking a model, you're probably overspending relative to today's cheapest AI APIs.
This guide calculates exactly how many tokens each major AI model gives you for $1 — both input and output — using current 2026 pricing from our broader AI API pricing guide. We'll cover the budget champions, the premium flagships, and the sweet-spot models that balance cost with capability.
✅ TL;DR: GPT-5 Nano gives you 20M input tokens per dollar. GPT-5.2 Pro gives you 47,619. The cheapest model isn't always the best value — capability per dollar matters more than raw token count.
How we calculated tokens per dollar
The math is straightforward: divide 1,000,000 by the price per million tokens. That gives you tokens per dollar.
For a model charging $0.10 per million input tokens:
- 1,000,000 ÷ 0.10 = 10,000,000 tokens per dollar
For output tokens at $0.40 per million:
- 1,000,000 ÷ 0.40 = 2,500,000 tokens per dollar
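The two divisions above are easy to wrap in a helper if you want to run the math yourself (a minimal sketch using the example prices from this section):

```python
def tokens_per_dollar(price_per_million: float) -> float:
    """Convert a per-1M-token price into the number of tokens $1 buys."""
    return 1_000_000 / price_per_million

# The two worked examples above:
print(f"{tokens_per_dollar(0.10):,.0f}")  # 10,000,000 input tokens per $1
print(f"{tokens_per_dollar(0.40):,.0f}")  # 2,500,000 output tokens per $1
```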
Every number in this article uses live pricing from our AI cost calculator. We update pricing weekly as providers change rates.
The budget tier: 10M+ input tokens per dollar
These models give you the most raw throughput. They're ideal for high-volume tasks where you need to process massive amounts of text cheaply — classification, extraction, summarization, embeddings preprocessing.
[stat] 20,000,000 Input tokens per $1 with GPT-5 Nano — the most tokens-per-dollar of any major model
GPT-5 Nano — $0.05 input / $0.40 output per 1M tokens
- Input: 20,000,000 tokens per $1
- Output: 2,500,000 tokens per $1
- OpenAI's smallest model. Good for classification, routing, and simple extraction. Not suitable for complex reasoning.
Mistral Small 3.2 — $0.06 input / $0.18 output per 1M tokens
- Input: 16,666,667 tokens per $1
- Output: 5,555,556 tokens per $1
- Best output-tokens-per-dollar in this tier. If your workload is output-heavy (generation, rewriting), Mistral Small punches above its weight.
Gemini 2.0 Flash Lite — $0.075 input / $0.30 output per 1M tokens
- Input: 13,333,333 tokens per $1
- Output: 3,333,333 tokens per $1
- Google's lightweight option with a massive 1M context window. Ideal for processing long documents on a budget.
GPT-4.1 Nano — $0.10 input / $0.40 output per 1M tokens
- Input: 10,000,000 tokens per $1
- Output: 2,500,000 tokens per $1
- Previous-gen nano model. Still solid for simple tasks but GPT-5 Nano is cheaper and newer.
💡 Key Takeaway: Budget models give you 10-20 million input tokens per dollar. The tradeoff is capability — these models struggle with nuanced reasoning, complex code generation, and multi-step planning. Use them for the 80% of tasks that don't need a premium brain.
The mid-range tier: 1M–10M input tokens per dollar
This is where most production workloads live. These models balance cost with genuine capability — they can handle customer support, content generation, code assistance, and structured data extraction without breaking the bank.
Gemini 2.5 Flash — $0.15 input / $0.60 output per 1M tokens
- Input: 6,666,667 tokens per $1
- Output: 1,666,667 tokens per $1
- Google's workhorse. 1M context window and strong reasoning for the price. Hard to beat for document-heavy workloads.
GPT-4o Mini — $0.15 input / $0.60 output per 1M tokens
- Input: 6,666,667 tokens per $1
- Output: 1,666,667 tokens per $1
- Same tokens-per-dollar as Gemini 2.5 Flash but with a 128K context limit. Excellent for chat and short-form tasks.
GPT-5 Mini — $0.25 input / $2.00 output per 1M tokens
- Input: 4,000,000 tokens per $1
- Output: 500,000 tokens per $1
- Notice the gap: input is cheap but output is 8x more expensive per token. If your use case generates long responses, GPT-5 Mini gets expensive fast.
[vs] 6.67M tokens | Gemini 2.5 Flash (input) || 500K tokens | GPT-5 Mini (output) Same dollar, wildly different value depending on whether you're reading or writing
DeepSeek V3.2 — $0.28 input / $0.42 output per 1M tokens
- Input: 3,571,429 tokens per $1
- Output: 2,380,952 tokens per $1
- DeepSeek's balanced pricing means the input/output gap is small. Great for conversational workloads where you're both reading and generating roughly equally.
Llama 4 Maverick — $0.27 input / $0.85 output per 1M tokens
- Input: 3,703,704 tokens per $1
- Output: 1,176,471 tokens per $1
- Meta's open model via Together AI. Self-hosting drops costs further, but you take on infrastructure complexity.
GPT-4.1 Mini — $0.40 input / $1.60 output per 1M tokens
- Input: 2,500,000 tokens per $1
- Output: 625,000 tokens per $1
- Solid all-rounder with 200K context. The 4:1 input/output price ratio is typical for OpenAI models.
Claude Haiku 4.5 — $1.00 input / $5.00 output per 1M tokens
- Input: 1,000,000 tokens per $1
- Output: 200,000 tokens per $1
- Anthropic's cheapest current model. One million tokens per dollar sounds reasonable until you compare it to the budget tier above. You're paying for Anthropic's safety training and instruction-following quality.
💡 Key Takeaway: Mid-range models vary 6x in tokens-per-dollar (6.67M down to 1M for input). The input/output price ratio matters more than the headline price — a model that's cheap on input but expensive on output will surprise you on generation-heavy workloads, which is why we track both views in our cost-per-million ranking.
The premium tier: 1M input tokens per dollar or fewer
These are the flagship models. You're not buying tokens — you're buying intelligence. Complex reasoning, nuanced writing, advanced code generation, and multi-step problem solving. Use them strategically on tasks that actually need the capability.
Claude Sonnet 4.6 — $3.00 input / $15.00 output per 1M tokens
- Input: 333,333 tokens per $1
- Output: 66,667 tokens per $1
- Anthropic's balanced flagship. 1M context window. The go-to for production applications that need quality without Opus pricing.
GPT-5 — $1.00 input / $8.00 output per 1M tokens
- Input: 1,000,000 tokens per $1
- Output: 125,000 tokens per $1
- OpenAI's flagship is actually competitive on input pricing. The output cost is where it gets expensive. Compare carefully with Claude Sonnet 4.6.
Claude Opus 4.6 — $5.00 input / $25.00 output per 1M tokens
- Input: 200,000 tokens per $1
- Output: 40,000 tokens per $1
- The premium thinking model. Best-in-class for complex analysis and creative work. Roughly 1.7x more expensive than Sonnet per token.
Grok 4 — $3.00 input / $15.00 output per 1M tokens
- Input: 333,333 tokens per $1
- Output: 66,667 tokens per $1
- xAI's flagship matches Claude Sonnet on pricing. The differentiator is real-time data access and a different personality profile.
[stat] 40,000 Output tokens per $1 with Claude Opus 4.6 — that's roughly 30,000 words of generated text for a dollar
The reasoning tier: the most expensive tokens
Reasoning models (o-series, DeepSeek R1) generate internal "thinking" tokens that you pay for. These models solve harder problems but the token economics are different — a single complex query can consume 50,000+ thinking tokens.
o4-mini — $1.10 input / $4.40 output per 1M tokens
- Input: 909,091 tokens per $1
- Output: 227,273 tokens per $1
- The budget reasoning option. Thinking tokens are billed at the output rate, so a reasoning-heavy query might generate 20K thinking tokens + 2K visible output = 22K tokens billed at $4.40/M.
o3 — $2.00 input / $8.00 output per 1M tokens
- Input: 500,000 tokens per $1
- Output: 125,000 tokens per $1
- Full reasoning model. Heavy thinking tasks (math proofs, complex code debugging) can cost $0.10-0.50 per query.
o3-pro — $20.00 input / $80.00 output per 1M tokens
- Input: 50,000 tokens per $1
- Output: 12,500 tokens per $1
- The most expensive mainstream model. A single complex reasoning query can cost $1-5. Reserve this for tasks where correctness is worth any price — medical analysis, legal reasoning, complex financial modeling.
GPT-5.2 Pro — $21.00 input / $168.00 output per 1M tokens
- Input: 47,619 tokens per $1
- Output: 5,952 tokens per $1
- The priciest model in our database. Output tokens cost 11x more than Claude Sonnet's. This is for mission-critical reasoning where the margin of error must be near zero.
⚠️ Warning: Reasoning model pricing is deceptive. The per-token rate looks manageable until you realize a single query can generate 10-100K thinking tokens. Always estimate total tokens (input + thinking + output) before committing to a reasoning model in production.
Input vs output: why the ratio matters
Most developers focus on the input price. That's a mistake. The input/output price ratio varies wildly across models and determines which workloads are actually cheap.
Output-heavy workloads (content generation, code writing, long-form answers):
- Best value: Mistral Small 3.2 — $0.18/M output, a modest 3:1 output-to-input ratio
- Worst value: GPT-5.2 Pro — $168/M output, an 8:1 output-to-input ratio
Input-heavy workloads (classification, extraction, summarization of long docs):
- Best value: GPT-5 Nano — 20M tokens per dollar
- Context window matters: Gemini 2.0 Flash Lite handles 1M tokens natively
Balanced workloads (chatbots, Q&A, customer support):
- Best value: DeepSeek V3.2 — only a 1.5x gap between input and output rates
- Runner-up: Llama 3.1 8B — flat $0.18 both ways
📊 Quick Math: A chatbot processing 1,000 conversations/day averaging 2,000 input + 500 output tokens each:
- DeepSeek V3.2: (2M × $0.28 + 0.5M × $0.42) / 1M = $0.77/day ($23/month)
- Claude Sonnet 4.6: (2M × $3.00 + 0.5M × $15.00) / 1M = $13.50/day ($405/month)
- Same workload, 17x cost difference.
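The quick math above is reproducible in a few lines (prices as listed in this article; the 2,000-input / 500-output split is the scenario's assumption):

```python
def daily_cost(conversations, in_tokens, out_tokens, in_price, out_price):
    """Daily spend in dollars; prices are per 1M tokens."""
    return conversations * (in_tokens * in_price + out_tokens * out_price) / 1_000_000

deepseek = daily_cost(1_000, 2_000, 500, 0.28, 0.42)   # $0.77/day
sonnet = daily_cost(1_000, 2_000, 500, 3.00, 15.00)    # $13.50/day
print(round(sonnet / deepseek, 1))                     # ~17.5x for the same workload
```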
Real-world scenarios: what $100/month buys you
Let's make this concrete. Here's what $100 per month gets you across different models and use cases.
Customer support chatbot
Average conversation: 3,000 input tokens, 800 output tokens
| Model | Conversations per $100 | Cost per conversation |
|---|---|---|
| GPT-5 Nano | ~212,000 | $0.00047 |
| Gemini 2.5 Flash | ~107,000 | $0.00093 |
| GPT-5 Mini | ~42,500 | $0.00235 |
| Claude Sonnet 4.6 | ~4,800 | $0.021 |
| Claude Opus 4.6 | ~2,850 | $0.035 |
Blog post generation
Average post: 500 input tokens (prompt), 4,000 output tokens
| Model | Posts per $100 | Cost per post |
|---|---|---|
| Mistral Small 3.2 | ~133,000 | $0.00075 |
| GPT-5 Mini | ~12,300 | $0.0081 |
| Claude Sonnet 4.6 | ~1,630 | $0.062 |
| GPT-5.2 Pro | ~146 | $0.68 |
Code review agent
Average task: 15,000 input tokens (code + context), 3,000 output tokens
| Model | Reviews per $100 | Cost per review |
|---|---|---|
| DeepSeek V3.2 | ~18,300 | $0.0055 |
| GPT-4.1 | ~2,900 | $0.034 |
| Claude Opus 4.6 | ~667 | $0.15 |
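All three scenario tables follow the same formula, sketched here (the workload sizes are the scenario assumptions above; rounding explains the ~ in the tables):

```python
def tasks_per_budget(budget, in_tokens, out_tokens, in_price, out_price):
    """Return (task count, cost per task) for a given dollar budget."""
    cost = (in_tokens * in_price + out_tokens * out_price) / 1_000_000
    return budget / cost, cost

# DeepSeek V3.2 on the code-review workload (15K input, 3K output)
n, per = tasks_per_budget(100, 15_000, 3_000, 0.28, 0.42)
print(int(n), round(per, 4))  # 18315 0.0055
```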
💡 Key Takeaway: The gap between cheapest and most expensive is 100-1,000x for the same task. Start with the cheapest model that produces acceptable quality, then upgrade only the tasks that need it; our best budget AI models list is a good starting point. Most production systems should use 2-3 models at different tiers.
The strategy: tiered model routing
The smartest teams don't pick one model. They route requests to the cheapest model that can handle each task.
Tier 1 — Bulk processing (GPT-5 Nano, Mistral Small 3.2)
- Classification, tagging, simple extraction
- 10-20M tokens per dollar
- Handle 80% of requests
Tier 2 — Standard tasks (Gemini 2.5 Flash, GPT-5 Mini, DeepSeek V3.2)
- Customer support, content generation, code assistance
- 1-6M tokens per dollar
- Handle 15% of requests
Tier 3 — Complex reasoning (Claude Sonnet 4.6, GPT-5, o3)
- Multi-step analysis, creative writing, hard debugging
- 100K-500K tokens per dollar
- Handle 5% of requests
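A first-cut router can be a plain lookup from task category to the cheapest capable tier (model names and prices are from this article; the task taxonomy and the safe default are illustrative assumptions, not a provider API):

```python
# Illustrative tier map: task category -> (model, $/M input, $/M output).
TIERS = {
    "classify": ("gpt-5-nano", 0.05, 0.40),        # Tier 1: bulk processing
    "support": ("deepseek-v3.2", 0.28, 0.42),      # Tier 2: standard tasks
    "reason": ("claude-sonnet-4.6", 3.00, 15.00),  # Tier 3: complex reasoning
}

def route(task_type: str):
    """Pick the tier for a task; unknown tasks default to the capable tier."""
    return TIERS.get(task_type, TIERS["reason"])

print(route("classify")[0])  # gpt-5-nano
print(route("unknown")[0])   # claude-sonnet-4.6 (safe default)
```

Real routers usually add a confidence check — if the cheap model's answer fails validation, retry on the tier above.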
A well-designed routing system cuts costs 60-80% compared to running everything through a flagship model. Use our multi-model comparison tool to find the right mix for your workload.
Frequently asked questions
How many tokens is a typical word? One token is roughly 0.75 words in English. So 1,000 tokens ≈ 750 words. A 2,000-word blog post is about 2,700 tokens. A full novel (~80,000 words) is roughly 107,000 tokens.
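The conversions in this answer all use the same 0.75 words-per-token ratio (an English-text rule of thumb; actual tokenizer counts vary by model and content):

```python
WORDS_PER_TOKEN = 0.75  # rough rule of thumb for English text

def words_to_tokens(words: int) -> int:
    return round(words / WORDS_PER_TOKEN)

print(words_to_tokens(2_000))   # 2667 tokens for a 2,000-word post
print(words_to_tokens(80_000))  # 106667 tokens for a full novel
```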
Do cached/prompt-cached tokens change the math? Yes, significantly. OpenAI and Anthropic offer 50-90% discounts on cached input tokens. If you're sending the same system prompt repeatedly (common in production), your effective input cost drops dramatically. DeepSeek offers cache hits at $0.07/M — that's 14.3M cached tokens per dollar.
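The effect of caching on your effective input price is a simple blend (the DeepSeek rates are from this FAQ; the 80% hit rate is an assumption about a workload that resends a large system prompt):

```python
def effective_input_price(full_price, cached_price, hit_rate):
    """Blend full and cached per-1M input prices by the cache-hit fraction."""
    return hit_rate * cached_price + (1 - hit_rate) * full_price

# DeepSeek V3.2: $0.28/M full, $0.07/M on cache hits, assumed 80% hit rate
blended = effective_input_price(0.28, 0.07, 0.8)
print(round(blended, 3))           # 0.112 -> effective $/M input
print(round(1_000_000 / blended))  # ~8.93M tokens per $1
```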
Are output tokens always more expensive than input? Almost always. The exception is Llama 3.1 models via Together AI, which charge flat rates for both. For most providers, output costs 3-8x more than input because generation requires sequential computation while input processing can be parallelized.
Should I switch models to save money? Only if quality stays acceptable. Run an eval first: take 100 representative queries, run them through both models, and compare output quality. A 10x cheaper model that produces 20% worse results might cost you more in user churn than you save on API bills.
What's the cheapest way to process millions of documents? Use OpenAI's Batch API for 50% off, or self-host Llama 4 Maverick. For extraction tasks, GPT-5 Nano at $0.05/M input tokens processes 1 million 1,000-token documents for $50.
Pricing data current as of February 2026. Check our AI Cost Calculator for real-time pricing and run your own cost estimates. Prices change frequently — we update weekly.
