OpenAI launched GPT-5.4 mini and nano on March 18, 2026, and these models rewrite the rules on what "small" AI can do. GPT-5.4 mini nearly matches the full GPT-5.4 on coding benchmarks while costing 70% less. GPT-5.4 nano undercuts almost every model on the market at $0.20 per million input tokens — cheaper than a rounding error.
But cheap means nothing if the model can't perform. So let's break down exactly what you get for the money: real pricing math, benchmark comparisons against every major competitor, and specific recommendations for when each model makes sense.
💡 Key Takeaway: GPT-5.4 mini is the new default for production AI workloads that need reasoning without flagship pricing. GPT-5.4 nano is the new floor for classification, extraction, and lightweight agent tasks.
GPT-5.4 mini and nano pricing at a glance
Here's what OpenAI is charging for the new models, alongside the full GPT-5.4 family:
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context Window | Max Output |
|---|---|---|---|---|
| GPT-5.4 Pro | $30.00 | $180.00 | 1,050K | 131,072 |
| GPT-5.4 | $2.50 | $15.00 | 1,050K | 131,072 |
| GPT-5.4 mini | $0.75 | $4.50 | 400K | 32,768 |
| GPT-5.4 nano | $0.20 | $1.25 | 128K | 8,192 |
The pricing tells a clear story. GPT-5.4 mini is 70% cheaper on input and 70% cheaper on output compared to the full GPT-5.4. Nano goes even further — 92% cheaper on input and 91.7% cheaper on output.
For context, GPT-5 mini (the predecessor) cost $0.25 input / $2.00 output. GPT-5.4 mini is 3x more expensive on input but delivers dramatically better performance. Whether that tradeoff makes sense depends entirely on your workload.
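Want to sanity-check these deltas yourself? Here's a minimal Python sketch with the table's prices hardcoded. The dictionary keys are labels for illustration, not confirmed API model IDs.

```python
# Per-million-token prices from the table above: (input, output).
PRICES = {
    "gpt-5.4": (2.50, 15.00),
    "gpt-5.4-mini": (0.75, 4.50),
    "gpt-5.4-nano": (0.20, 1.25),
    "gpt-5-mini": (0.25, 2.00),  # predecessor, for comparison
}

def savings_vs(model: str, baseline: str) -> tuple[float, float]:
    """Percent saved on input and output tokens versus a baseline model."""
    (m_in, m_out), (b_in, b_out) = PRICES[model], PRICES[baseline]
    return (1 - m_in / b_in) * 100, (1 - m_out / b_out) * 100

print(savings_vs("gpt-5.4-mini", "gpt-5.4"))  # (70.0, 70.0)
print(savings_vs("gpt-5.4-nano", "gpt-5.4"))  # (92.0, 91.66...)
```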
[stat] 70% Cost reduction from GPT-5.4 to GPT-5.4 mini with near-flagship coding performance
How GPT-5.4 mini and nano compare to their predecessors
The generational leap here is significant. GPT-5.4 mini doesn't just iterate on GPT-5 mini — it closes the gap with the full GPT-5.4 flagship:
| Benchmark | GPT-5.4 | GPT-5.4 mini | GPT-5.4 nano | GPT-5 mini |
|---|---|---|---|---|
| SWE-Bench Pro | 57.7% | 54.4% | 52.4% | 45.7% |
| Terminal-Bench 2.0 | 75.1% | 60.0% | 46.3% | 38.2% |
| Toolathlon | 54.6% | 42.9% | 35.5% | 26.9% |
| GPQA Diamond | 93.0% | 88.0% | 82.8% | 81.6% |
| OSWorld-Verified | 75.0% | 72.1% | 39.0% | 42.0% |
GPT-5.4 mini scores 54.4% on SWE-Bench Pro — within 3.3 points of the full GPT-5.4. That's a coding agent that handles real-world software engineering at a fraction of the cost. On OSWorld-Verified (computer use tasks), it stays within three points of the flagship at 72.1% vs 75.0%.
GPT-5.4 nano beats GPT-5 mini on SWE-Bench Pro (52.4% vs 45.7%) despite being a tier lower and far cheaper. On GPQA Diamond (graduate-level reasoning), nano scores 82.8% — edging past GPT-5 mini's 81.6%.
📊 Quick Math: If you're running GPT-5 mini today at $0.25/$2.00, switching to GPT-5.4 nano at $0.20/$1.25 gets you better coding performance (52.4% vs 45.7% SWE-Bench) while saving 20% on input and 37.5% on output.
GPT-5.4 mini vs the competition: full pricing comparison
The budget AI space is crowded. Here's how GPT-5.4 mini stacks up against every relevant competitor:
| Model | Provider | Input/1M | Output/1M | Context | Category |
|---|---|---|---|---|---|
| GPT-5.4 mini | OpenAI | $0.75 | $4.50 | 400K | Efficient |
| Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | 200K | Efficient |
| Gemini 3 Flash | Google | $0.50 | $3.00 | 1,000K | Efficient |
| Gemini 2.5 Flash | Google | $0.30 | $2.50 | 1,000K | Efficient |
| o4-mini | OpenAI | $1.10 | $4.40 | 200K | Reasoning |
| Mistral Large 3 | Mistral AI | $0.50 | $1.50 | 256K | Flagship |
| DeepSeek V3.2 | DeepSeek | $0.28 | $0.42 | 128K | Efficient |
GPT-5.4 mini undercuts Claude Haiku 4.5 by 25% on input and 10% on output. Against Gemini 3 Flash, it's more expensive ($0.75 vs $0.50 input, $4.50 vs $3.00 output) but the benchmarks tell a different story — GPT-5.4 mini's SWE-Bench Pro score of 54.4% likely outpaces Flash models on complex coding.
DeepSeek V3.2 remains the price king at $0.28/$0.42, but it operates at a different capability tier with a 128K context window vs GPT-5.4 mini's 400K.
GPT-5.4 nano vs ultra-budget models
Nano plays in an even more aggressive price bracket:
| Model | Provider | Input/1M | Output/1M | Context |
|---|---|---|---|---|
| GPT-5.4 nano | OpenAI | $0.20 | $1.25 | 128K |
| GPT-5 nano | OpenAI | $0.05 | $0.40 | 128K |
| GPT-4.1 nano | OpenAI | $0.10 | $0.40 | 128K |
| Gemini 2.5 Flash-Lite | Google | $0.10 | $0.40 | 1,000K |
| Gemini 2.0 Flash-Lite | Google | $0.075 | $0.30 | 1,000K |
| Mistral Small 3.2 | Mistral AI | $0.06 | $0.18 | 128K |
| DeepSeek V3.2 | DeepSeek | $0.28 | $0.42 | 128K |
GPT-5.4 nano is not the cheapest option. Mistral Small 3.2 at $0.06/$0.18 and the Gemini Flash-Lite models at $0.075-$0.10 input are significantly cheaper, and GPT-5 nano itself costs $0.05/$0.40 — a quarter of the new nano's input price.
But GPT-5.4 nano's value proposition isn't raw price — it's performance-per-dollar. On SWE-Bench Pro, nano scores 52.4% compared to GPT-5 mini's 45.7%. You're paying 4x more than GPT-5 nano but getting a model that can actually handle coding and reasoning tasks.
⚠️ Warning: If your use case is pure classification or data extraction where quality differences are minimal, the Gemini Flash-Lite models at $0.075-$0.10/M input are still the most cost-effective option. GPT-5.4 nano makes sense when you need better reasoning at a budget price point.
Real-world cost calculations
Let's put these prices into concrete scenarios developers actually face.
Coding agent (500 tasks/day)
A typical coding agent task uses ~4,000 input tokens (system prompt + code context) and ~2,000 output tokens (code changes + explanation):
| Model | Daily Input Cost | Daily Output Cost | Monthly Total |
|---|---|---|---|
| GPT-5.4 | $5.00 | $15.00 | $600.00 |
| GPT-5.4 mini | $1.50 | $4.50 | $180.00 |
| GPT-5.4 nano | $0.40 | $1.25 | $49.50 |
| Claude Haiku 4.5 | $2.00 | $5.00 | $210.00 |
| Gemini 3 Flash | $1.00 | $3.00 | $120.00 |
GPT-5.4 mini saves $420/month over the full GPT-5.4 while retaining 94% of its SWE-Bench performance. For most coding workflows, that's the sweet spot.
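Every number in these scenario tables falls out of the same arithmetic: tokens per task, times task volume, times price per million. A minimal helper, assuming a 30-day month:

```python
def monthly_cost(in_price: float, out_price: float, tasks_per_day: int,
                 in_tokens: int, out_tokens: int, days: int = 30) -> float:
    """Monthly spend given per-million-token prices and a per-task token profile."""
    daily = (tasks_per_day * in_tokens / 1e6) * in_price \
          + (tasks_per_day * out_tokens / 1e6) * out_price
    return daily * days

# Coding agent: 500 tasks/day, 4K input / 2K output tokens per task.
print(monthly_cost(2.50, 15.00, 500, 4_000, 2_000))  # 600.0 (GPT-5.4)
print(monthly_cost(0.75, 4.50, 500, 4_000, 2_000))   # 180.0 (GPT-5.4 mini)
print(monthly_cost(0.20, 1.25, 500, 4_000, 2_000))   # 49.5  (GPT-5.4 nano)
```

The chatbot and RAG tables below use the same formula with different volumes and token profiles.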
Customer support chatbot (10,000 conversations/day)
Average conversation: ~1,500 input tokens, ~500 output tokens:
| Model | Daily Cost | Monthly Cost |
|---|---|---|
| GPT-5.4 mini | $33.75 | $1,012.50 |
| GPT-5.4 nano | $9.25 | $277.50 |
| Claude Haiku 4.5 | $40.00 | $1,200.00 |
| Gemini 2.5 Flash | $17.00 | $510.00 |
| Mistral Small 3.2 | $1.80 | $54.00 |
For high-volume chatbots where reasoning quality matters, GPT-5.4 nano at $277.50/month offers a compelling balance. But if quality requirements are modest, Mistral Small 3.2 at $54/month is hard to argue with.
📊 Quick Math: Switching a 10,000-conversation/day chatbot from Claude Haiku 4.5 to GPT-5.4 mini saves $187.50/month (15.6% reduction). Switching to GPT-5.4 nano saves $922.50/month (76.9% reduction).
RAG pipeline (1M queries/month)
Typical RAG query: ~3,000 input tokens (query + retrieved chunks), ~800 output tokens:
| Model | Monthly Input | Monthly Output | Total |
|---|---|---|---|
| GPT-5.4 mini | $2,250 | $3,600 | $5,850 |
| GPT-5.4 nano | $600 | $1,000 | $1,600 |
| Gemini 2.5 Flash | $900 | $2,000 | $2,900 |
| DeepSeek V3.2 | $840 | $336 | $1,176 |
At scale, the differences become massive. DeepSeek V3.2 and GPT-5.4 nano compete for the budget RAG crown, with DeepSeek winning on raw cost but GPT-5.4 nano offering stronger reasoning for complex queries.
[stat] $5,850/mo Cost of running 1M RAG queries on GPT-5.4 mini — down from $19,500 on GPT-5.4 full
The subagent architecture play
OpenAI explicitly positions GPT-5.4 mini for subagent workflows — and this is where the model really shines. The idea: use GPT-5.4 (the full model) as an orchestrator that delegates specific subtasks to cheaper, faster mini instances.
In Codex, GPT-5.4 mini runs at 30% of the GPT-5.4 quota, meaning each orchestrator task buys roughly three mini subagents at equivalent cost. The pattern fits workflows like:
- Codebase search: Mini scans files and returns relevant snippets
- Code review: Mini checks individual files while the orchestrator plans the overall change
- Document processing: Mini handles each document while the orchestrator synthesizes
This pattern lets you keep flagship quality for decision-making while running the bulk of work on efficient models. The math works out to roughly 50-60% cost savings versus running everything on GPT-5.4, with minimal quality loss on the subtasks.
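Here's what the delegation pattern looks like in practice. This is a hedged sketch using the OpenAI Python SDK's standard chat completions call; the model IDs follow the article's naming and are assumptions, not confirmed API identifiers.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(model: str, prompt: str) -> str:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def review_change(files: dict[str, str]) -> str:
    # The flagship orchestrator plans the review once...
    plan = ask("gpt-5.4", "Plan a code review covering: " + ", ".join(files))
    # ...mini subagents execute the cheap per-file checks...
    notes = [
        ask("gpt-5.4-mini", f"Following this plan:\n{plan}\n\nReview {name}:\n{body}")
        for name, body in files.items()
    ]
    # ...and the flagship synthesizes a final verdict from their output.
    return ask("gpt-5.4", "Combine these review notes:\n\n" + "\n\n".join(notes))
```

Two flagship calls bracket any number of cheap mini calls, which is where the 50-60% savings comes from.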
💡 Key Takeaway: Don't think of GPT-5.4 mini as a standalone model — think of it as the execution tier in a multi-model architecture. Pair it with GPT-5.4 for orchestration and nano for classification/routing, and you've built a cost-optimized pipeline.
Multimodal and computer use performance
One surprise in the benchmarks: GPT-5.4 mini is exceptional at computer use tasks. On OSWorld-Verified, it scores 72.1% — only 2.9 points behind the full GPT-5.4 and dramatically ahead of GPT-5 mini (42.0%).
This matters for anyone building:
- Browser automation agents that need to interpret screenshots
- Desktop automation workflows
- Quality assurance tools that visually verify UI states
- Accessibility testing that reads and interprets interfaces
On MMMU-Pro (multimodal understanding), GPT-5.4 mini scores 76.6% vs GPT-5.4's 81.2%. For vision tasks that don't need maximum precision, mini handles them at 30% of the cost.
GPT-5.4 nano drops significantly on these tasks — 39.0% on OSWorld — making it unsuitable for computer use workflows. Stick with mini or the full model for anything visual.
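For reference, here's roughly what a screenshot-interpretation call looks like. The image-as-data-URL content part is the standard chat completions shape; the model ID is an assumption per the article's naming.

```python
import base64
from pathlib import Path
from openai import OpenAI

client = OpenAI()

def describe_screenshot(path: str, question: str) -> str:
    """Send a local PNG screenshot plus a question to the model."""
    b64 = base64.b64encode(Path(path).read_bytes()).decode()
    resp = client.chat.completions.create(
        model="gpt-5.4-mini",  # assumed model ID, per the article's naming
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    )
    return resp.choices[0].message.content

# e.g. describe_screenshot("checkout.png", "Which element submits the form?")
```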
Tool calling and agentic workflows
GPT-5.4 mini shows strong tool-calling capabilities:
| Benchmark | GPT-5.4 | GPT-5.4 mini | GPT-5.4 nano | GPT-5 mini |
|---|---|---|---|---|
| MCP Atlas | 67.2% | 57.7% | 56.1% | 47.6% |
| Toolathlon | 54.6% | 42.9% | 35.5% | 26.9% |
| τ2-bench (telecom) | 98.9% | 93.4% | 92.5% | 74.1% |
On τ2-bench (domain-specific tool use), both mini (93.4%) and nano (92.5%) approach the flagship model's 98.9%. This means for structured, domain-specific agent workflows, even nano can reliably call tools.
The MCP Atlas benchmark is particularly relevant — it tests Model Context Protocol integration, which is becoming the standard for AI tool calling. GPT-5.4 mini's 57.7% score represents a 21% improvement over GPT-5 mini's 47.6%.
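To ground those tool-use numbers, here's the shape of a domain-specific tool call. The tool itself (lookup_data_plan) is hypothetical; the tools schema and tool_calls response fields are the standard OpenAI chat completions format, and the model ID is assumed.

```python
from openai import OpenAI

client = OpenAI()

# A hypothetical telecom-domain tool, declared as a JSON-schema function.
tools = [{
    "type": "function",
    "function": {
        "name": "lookup_data_plan",
        "description": "Fetch a customer's current mobile data plan and usage.",
        "parameters": {
            "type": "object",
            "properties": {"customer_id": {"type": "string"}},
            "required": ["customer_id"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-5.4-mini",  # assumed model ID, per the article's naming
    messages=[{"role": "user", "content": "Why is my data so slow this month?"}],
    tools=tools,
)

# The model may answer directly or request the tool; handle both.
for call in resp.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```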
When to use each GPT-5.4 variant
Here's the decision framework:
GPT-5.4 ($2.50/$15.00) — Use when:
- You need maximum quality on novel, complex tasks
- Orchestrating multi-agent workflows
- Tasks where a 3-5% quality difference matters (legal, medical, financial)
- Long-context reasoning over 400K tokens
GPT-5.4 mini ($0.75/$4.50) — Use when:
- Coding agents and code review
- Computer use and screenshot interpretation
- Production chatbots requiring strong reasoning
- Subagent execution in multi-model architectures
- Context needs up to 400K tokens
GPT-5.4 nano ($0.20/$1.25) — Use when:
- Classification and routing (see the sketch after this list)
- Data extraction and parsing
- Simple coding subagents
- High-volume, latency-sensitive workloads
- Budget-constrained projects where DeepSeek/Mistral aren't an option
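A minimal sketch of nano as a ticket router, assuming the article's model naming. Pinning the label set and validating the reply keeps a cheap model's output predictable:

```python
from openai import OpenAI

client = OpenAI()

ROUTES = ("billing", "technical", "account", "other")

def route_ticket(ticket: str) -> str:
    """Classify a support ticket into one of a fixed set of queues."""
    resp = client.chat.completions.create(
        model="gpt-5.4-nano",  # assumed model ID, per the article's naming
        messages=[
            {"role": "system",
             "content": "Classify the ticket as one of: "
                        f"{', '.join(ROUTES)}. Reply with the label only."},
            {"role": "user", "content": ticket},
        ],
    )
    label = resp.choices[0].message.content.strip().lower()
    return label if label in ROUTES else "other"  # guard against label drift
```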
Skip GPT-5.4 mini/nano when:
- Pure cost optimization is the goal (Gemini Flash-Lite and Mistral Small are cheaper)
- You need massive context windows (Gemini offers 1M-2M tokens)
- You're already on DeepSeek V3.2 and quality is sufficient
✅ TL;DR: GPT-5.4 mini is the new production workhorse for teams on OpenAI. It delivers 94% of GPT-5.4's coding ability at 30% of the price. Nano is the budget tier for structured tasks. Neither replaces DeepSeek or Gemini Flash-Lite for pure cost optimization.
Should you update your models.json pricing?
If you're tracking AI costs (and you should be — try our AI cost calculator), here are the entries to add:
- GPT-5.4 mini: $0.75 input, $4.50 output, 400K context
- GPT-5.4 nano: $0.20 input, $1.25 output, 128K context
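If your tracker is a flat JSON file keyed by model name (the schema here is an assumption; adapt it to your own), the update is a few lines:

```python
import json
from pathlib import Path

# Assumed schema: a JSON object keyed by model name. Adjust to your file.
path = Path("models.json")
models = json.loads(path.read_text()) if path.exists() else {}
models.update({
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50, "context": 400_000},
    "gpt-5.4-nano": {"input": 0.20, "output": 1.25, "context": 128_000},
})
path.write_text(json.dumps(models, indent=2))
```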
Both models are available now via the OpenAI API. GPT-5.4 mini also works in ChatGPT (Free/Go users via Thinking mode) and Codex.
Use our token pricing comparison to see exactly how many tokens you get per dollar across all providers, or check the complete AI API pricing guide for the full landscape.
Frequently asked questions
How much does GPT-5.4 mini cost?
GPT-5.4 mini costs $0.75 per million input tokens and $4.50 per million output tokens. That's 70% cheaper than the full GPT-5.4 model ($2.50/$15.00). For a typical coding task using 4,000 input and 2,000 output tokens, each request costs approximately $0.012.
Is GPT-5.4 mini better than Claude Haiku 4.5?
GPT-5.4 mini is 25% cheaper on input ($0.75 vs $1.00) and 10% cheaper on output ($4.50 vs $5.00) compared to Claude Haiku 4.5. On coding benchmarks like SWE-Bench Pro, GPT-5.4 mini scores 54.4% — likely competitive with Haiku 4.5. GPT-5.4 mini also has a larger context window (400K vs 200K). For most workloads, GPT-5.4 mini offers better value.
What's the difference between GPT-5.4 mini and GPT-5.4 nano?
GPT-5.4 mini ($0.75/$4.50) is designed for production workloads requiring strong reasoning — coding agents, chatbots, and computer use. GPT-5.4 nano ($0.20/$1.25) is stripped down for speed and cost — classification, data extraction, and simple subagent tasks. Mini has a 400K context window and 32K max output; nano has 128K context and 8K max output. On SWE-Bench Pro, mini scores 54.4% vs nano's 52.4%.
Should I switch from GPT-5 mini to GPT-5.4 mini?
If quality matters, yes. GPT-5.4 mini scores 54.4% on SWE-Bench Pro vs GPT-5 mini's 45.7% — a 19% improvement. However, GPT-5.4 mini costs 3x more on input ($0.75 vs $0.25) and 2.25x more on output ($4.50 vs $2.00). For cost-sensitive workloads where GPT-5 mini's quality is adequate, sticking with the older model saves money. Use our calculator to compare costs for your specific usage pattern.
Is GPT-5.4 nano cheaper than DeepSeek V3.2?
On input tokens, yes — GPT-5.4 nano is 28.6% cheaper ($0.20 vs $0.28). On output tokens, no — nano costs $1.25 vs DeepSeek's $0.42, making DeepSeek nearly 3x cheaper on output. For output-heavy workloads (generation, writing), DeepSeek V3.2 is still more economical. For input-heavy workloads (classification, extraction with short outputs), GPT-5.4 nano can be competitive.
