xAI Grok Pricing Guide 2026: Live API Prices, Voice Costs, Rate Limits & Retired Models
If you're still budgeting Grok with the old $0.20/$0.50 Grok 4.1 Fast numbers, you're budgeting a ghost.
xAI's public docs changed materially in May 2026. The live API lineup now centers on Grok 4.3 and Grok 4.20, and xAI explicitly says several older text model slugs (including Grok 4.1 Fast, Grok 4, and Grok Code Fast 1) were retired on May 15, 2026. Requests to those slugs now redirect to Grok 4.3 and are billed at Grok 4.3 pricing.
That matters because the old cheap Grok story was a click magnet. The new reality is different: current xAI text pricing is $1.25 per million input tokens and $2.50 per million output tokens across the live Grok 4.3 / 4.20 family. That's still usable, but it is nowhere near the old bargain-bin pitch.
This guide is the freshness fix. We'll cover the live xAI token prices, what the retired model redirect means for your bill, voice API pricing, and which Grok route makes sense for real workloads. If you want the side-by-side landscape, also check our broader AI API pricing guide and the live xAI provider page.
What xAI's public docs show right now
Here is the current xAI API pricing picture reflected in the live May 2026 docs.
| Model | Context Window | Input (per 1M tokens) | Output (per 1M tokens) | Best Use |
|---|---|---|---|---|
| Grok 4.3 | 1M | $1.25 | $2.50 | Default text, coding, chat |
| Grok 4.20 Reasoning | 1M | $1.25 | $2.50 | Structured reasoning workflows |
| Grok 4.20 Non-Reasoning | 1M | $1.25 | $2.50 | Faster text tasks without explicit reasoning mode |
| Grok 4.20 Multi-Agent | 2M | $1.25 | $2.50 | Multi-agent orchestration, long-context runs |
💡 Key takeaway: xAI no longer has a clear “cheap Grok text model” in the public lineup. The live pricing has collapsed into one token-price band: $1.25 input / $2.50 output.
That simplifies planning, but it also kills the old “Grok 4.1 Fast is cheaper than everyone” angle. If you see content still leaning on that story, it is stale.
The biggest pricing gotcha: Grok 4.1 Fast is no longer the planning baseline
The old version of this page led with Grok 4.1 Fast because it was a real edge: $0.20 input / $0.50 output with a giant context window. That was compelling. It's also no longer the number you should trust for forward budgeting.
xAI's retirement notice says deprecated text slugs redirect to Grok 4.3 and are billed at standard Grok 4.3 pricing. If your internal spreadsheet still assumes Grok 4.1 Fast costs, your forecast is too low.
Old baseline vs live redirected baseline
| Model Assumption | Input / 1M | Output / 1M | What it means now |
|---|---|---|---|
| Old Grok 4.1 Fast assumption | $0.20 | $0.50 | Historical reference only |
| Live Grok 4.3 billing reality | $1.25 | $2.50 | Use this for current budgeting |
That is a 6.25x jump on input and a 5x jump on output versus the old Grok 4.1 Fast story.
6.25x higher input cost: that's the budgeting shock if an old Grok 4.1 Fast spreadsheet silently becomes a Grok 4.3 bill.
For anyone doing cost-sensitive support, extraction, or batch workflows, that is not a rounding error. It's the whole ballgame.
The practical takeaway is simple: if you inherited an older xAI integration, audit the model name before you audit your prompts. A lot of teams try to optimize token counts first, but if the slug itself has moved to a more expensive billing tier, prompt trimming is cleanup after the real leak.
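As a rough illustration of that audit, the sketch below scans a model inventory for retired names. The display names come from xAI's retirement notice as described above. The slug spellings (`grok-4-1-fast`, `grok-4`, `grok-code-fast-1`, `grok-4.3`) and the `inventory` dictionary are assumptions made for this example, so check the exact strings against xAI's docs before relying on them.

```python
# Display names are from the article; the slug spellings are ASSUMED for
# illustration and should be verified against xAI's current model list.
RETIRED_SLUGS = {
    "grok-4-1-fast": "Grok 4.1 Fast",
    "grok-4": "Grok 4",
    "grok-code-fast-1": "Grok Code Fast 1",
}


def find_retired_models(configured_models):
    """Return (service, slug, display_name) for each entry using a retired slug."""
    findings = []
    for service, slug in sorted(configured_models.items()):
        normalized = slug.strip().lower()
        if normalized in RETIRED_SLUGS:
            findings.append((service, normalized, RETIRED_SLUGS[normalized]))
    return findings


if __name__ == "__main__":
    # Hypothetical inventory: which internal service calls which model slug.
    inventory = {
        "support-bot": "grok-4-1-fast",
        "code-review": "grok-4.3",
        "extractor": "Grok-4",
    }
    for service, slug, name in find_retired_models(inventory):
        print(f"{service}: {slug} ({name}) now bills at Grok 4.3 rates")
```

Running a check like this before any prompt-trimming work tells you whether the forecast is wrong at the model level first.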
Example: 100,000 small assistant requests per day
Assume each request uses 500 input tokens and 300 output tokens.
- Old Grok 4.1 Fast math: about $25/day
- Live Grok 4.3 math: about $137.50/day
Same workload. Very different invoice.
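The arithmetic behind those two figures can be reproduced with a short sketch. The function name `daily_cost` is made up for this example; the prices and workload are the ones quoted above.

```python
def daily_cost(requests_per_day, input_tokens, output_tokens,
               input_price, output_price):
    """Daily cost in USD. Prices are USD per 1M tokens."""
    total_input = requests_per_day * input_tokens
    total_output = requests_per_day * output_tokens
    return (total_input * input_price + total_output * output_price) / 1_000_000


# 100,000 requests/day at 500 input / 300 output tokens each.
old = daily_cost(100_000, 500, 300, input_price=0.20, output_price=0.50)
new = daily_cost(100_000, 500, 300, input_price=1.25, output_price=2.50)

print(f"old: ${old:.2f}/day, new: ${new:.2f}/day, multiplier: {new / old:.2f}x")
# prints: old: $25.00/day, new: $137.50/day, multiplier: 5.50x
```

For this particular input/output mix, the blended increase works out to 5.5x, between the 5x output jump and the 6.25x input jump.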
So the first job of any Grok pricing page in May 2026 is not persuasion — it's telling the truth.
Real per-request Grok costs
Since the live text family shares the same token rates, the main cost difference between Grok 4.3 and Grok 4.20 variants is not price. It is workflow fit.
Here are the real per-request costs at xAI's live text rates.
Simple assistant reply (500 input / 300 output tokens)
| Pricing Basis | Cost per Request |
|---|---|
| Grok 4.3 / Grok 4.20 family | $0.001375 |
Document summarization (3,000 input / 800 output tokens)
| Pricing Basis | Cost per Request |
|---|---|
| Grok 4.3 / Grok 4.20 family | $0.00575 |
Deep analysis or code review (10,000 input / 2,000 output tokens)
| Pricing Basis | Cost per Request |
|---|---|
| Grok 4.3 / Grok 4.20 family | $0.0175 |
📊 Quick math: one million assistant-style requests at 500 input / 300 output tokens would cost roughly $1,375 at live xAI text rates.
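If you want to rerun these per-request numbers with your own token counts, a minimal sketch looks like this. The `PROFILES` names and the `request_cost` helper are illustrative, not part of any xAI SDK.

```python
# Live text rates quoted in this guide, in USD per 1M tokens.
INPUT_PRICE_PER_M = 1.25
OUTPUT_PRICE_PER_M = 2.50


def request_cost(input_tokens, output_tokens):
    """Cost in USD of a single request at the live Grok 4.3 / 4.20 rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


# The three workload profiles used in the tables above: (input, output) tokens.
PROFILES = {
    "simple assistant reply": (500, 300),
    "document summarization": (3_000, 800),
    "deep analysis / code review": (10_000, 2_000),
}

for name, (tokens_in, tokens_out) in PROFILES.items():
    print(f"{name}: ${request_cost(tokens_in, tokens_out):.6f} per request")

# One million assistant-style requests.
print(f"1M assistant requests: ${1_000_000 * request_cost(500, 300):,.2f}")
```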
That is not outrageous for production AI. But it is expensive enough that model routing suddenly matters again. If you only need a lightweight classifier or cheap summarizer, there are lower-cost options elsewhere in the market.
Grok 4.3 vs Grok 4.20: how to choose when the price is identical
Because xAI's live text lineup is priced the same, your choice is about execution style.
Use Grok 4.3 when:
- you want the default current Grok model
- you need coding or chat without extra orchestration overhead
- you want the safest choice for general-purpose API usage
- you're replacing an older Grok text slug and just need the current landing point
Use Grok 4.20 Reasoning when:
- your workflow explicitly benefits from a reasoning-oriented route
- you're running structured multi-step analysis
- you want to test reasoning behavior without paying a separate premium tier
Use Grok 4.20 Non-Reasoning when:
- you want the same price band without reasoning mode overhead
- your workload is straightforward generation, extraction, or transformation
- latency matters more than chain-of-thought style behavior
Use Grok 4.20 Multi-Agent when:
- you need the 2M-token context window
- you're orchestrating agents across a long working memory
- you're processing big repositories, large reports, or long-running task state
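One way to encode the checklist above is a small routing helper. This is only a sketch of the decision order suggested in this guide, not an xAI feature: the function name and flags are invented for illustration, and it returns display names rather than API slugs. The 1M and 2M context limits are taken from the pricing table earlier in the article.

```python
# Context limits from the pricing table in this guide.
STANDARD_CONTEXT = 1_000_000
MULTI_AGENT_CONTEXT = 2_000_000


def pick_grok_route(context_tokens, needs_reasoning=False,
                    multi_agent=False, latency_sensitive=False):
    """Return a display name (not an API slug) for the route that fits."""
    if context_tokens > MULTI_AGENT_CONTEXT:
        raise ValueError("workload exceeds the 2M-token Multi-Agent window")
    # Long context or agent orchestration pushes you to Multi-Agent.
    if multi_agent or context_tokens > STANDARD_CONTEXT:
        return "Grok 4.20 Multi-Agent"
    if needs_reasoning:
        return "Grok 4.20 Reasoning"
    if latency_sensitive:
        return "Grok 4.20 Non-Reasoning"
    # Default landing point, including for replaced legacy slugs.
    return "Grok 4.3"


print(pick_grok_route(50_000))
print(pick_grok_route(1_500_000))
print(pick_grok_route(20_000, needs_reasoning=True))
```

Since every route shares the same token price, none of these branches changes the bill; they only change behavior and available context.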
If you want a direct side-by-side breakdown, our live compare page for Grok 4.20 vs Grok 4.3 is the fastest way to see the tradeoff.
Voice mode pricing: what xAI actually publishes
Some searchers landing on Grok pricing pages are really trying to answer a voice question: What does Grok voice mode cost now?
For API budgeting, what xAI currently publishes is a voice rate card, not a consumer-plan "voice mode limit" cheat sheet. The public API numbers are:
| Voice Product | Price |
|---|---|
| Realtime | $0.05/minute ($3.00/hour) |
| Text to Speech | $15.00 / 1M characters |
| Speech to Text (REST) | $0.10/hour |
| Speech to Text (Streaming) | $0.20/hour |
That means a 10-minute realtime voice session costs about $0.50, while an hour costs $3.00.
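To budget a mixed voice workload, the rate card can be folded into one helper. The sketch below uses only the published rates from the table above; the `voice_cost` function and its parameter names are hypothetical.

```python
# Published API voice rates from the table above (USD).
REALTIME_PER_MINUTE = 0.05
TTS_PER_MILLION_CHARS = 15.00
STT_REST_PER_HOUR = 0.10
STT_STREAMING_PER_HOUR = 0.20


def voice_cost(realtime_minutes=0.0, tts_characters=0,
               stt_rest_hours=0.0, stt_streaming_hours=0.0):
    """Total voice cost in USD for the given usage across all four products."""
    return (realtime_minutes * REALTIME_PER_MINUTE
            + tts_characters * TTS_PER_MILLION_CHARS / 1_000_000
            + stt_rest_hours * STT_REST_PER_HOUR
            + stt_streaming_hours * STT_STREAMING_PER_HOUR)


print(f"10-minute realtime session: ${voice_cost(realtime_minutes=10):.2f}")
print(f"1-hour realtime session: ${voice_cost(realtime_minutes=60):.2f}")
print(f"200,000 TTS characters: ${voice_cost(tts_characters=200_000):.2f}")
```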
⚠️ Important: API voice pricing is not the same thing as a consumer Grok subscription. If you're costing an application, use the published API rate card above — not screenshots of app plans floating around social posts.
Grok voice mode rate limits in 2026: what is public vs what is not
This is where a lot of Grok searches get messy. People searching "xAI Grok voice mode rate limits 2026" are often mixing together two different questions:
- API quotas and throughput for a product built on xAI voice endpoints
- Consumer app usage limits inside Grok subscriptions or app plans
xAI's public docs are much clearer on voice pricing than on voice rate limits: there is no single universal rate-limit table to quote. The durable numbers xAI publishes are the rate-card numbers above. The actual caps you can hit in production are better thought of as account-level quotas, the kind of limits you verify in the xAI dashboard or current docs before launch, not static SEO-table facts you should copy forever.
So the practical rule is:
- use $0.05/minute realtime and the published TTS/STT rates for budgeting
- verify current API quotas in xAI's docs or dashboard for your specific account
- do not assume consumer app voice caps are the same thing as API rate limits
- if voice is mission-critical, build retries, queuing, and fallback handling instead of assuming infinite realtime capacity
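The last rule can be sketched generically. The example below shows exponential backoff with jitter plus an optional fallback. It deliberately makes no real xAI calls: `ThrottledError`, `call_with_backoff`, and the `primary` / `fallback` callables are all hypothetical stand-ins for whatever your client raises and does when it is rate-limited.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical error your client code raises when the provider throttles."""


def call_with_backoff(primary, fallback=None, max_attempts=4,
                      base_delay=0.5, sleep=time.sleep):
    """Call primary(); on throttling, retry with jittered exponential backoff.

    If every attempt is throttled, use fallback() when provided, else re-raise.
    """
    for attempt in range(max_attempts):
        try:
            return primary()
        except ThrottledError:
            if attempt == max_attempts - 1:
                break  # out of retries
            # Full jitter: wait a random time up to base_delay * 2^attempt.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
    if fallback is not None:
        return fallback()
    raise ThrottledError(f"still throttled after {max_attempts} attempts")


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_voice_call():
        # Simulated endpoint: throttled twice, then succeeds.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise ThrottledError("simulated throttle")
        return "session started"

    print(call_with_backoff(flaky_voice_call, base_delay=0.01))
```

The fallback could be a queued retry, a text-only path, or another provider; which one fits depends on how degraded an experience your product can tolerate.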
If you are comparing providers for a production voice workflow, the safer move is to treat price and rate limits as separate checks. Pricing tells you what success costs. Quotas tell you whether success arrives all at once or gets throttled.
Where xAI is still strong
Even with the cheap legacy text slugs gone, xAI still has a few legitimate hooks.
1. One price band makes planning easy
You don't need a messy matrix of cheap / mid / premium Grok text models anymore. If you're choosing between Grok 4.3 and Grok 4.20 variants, the token price is the same.
2. Multi-Agent keeps the long-context story alive
The 2M-token context window on Grok 4.20 Multi-Agent is still interesting for repository analysis, long-document workflows, and agent stacks that need a fat working memory.
3. The voice pricing is clear
The realtime $0.05/minute rate is easy to model. That's refreshing compared with providers that bury audio pricing across multiple docs.
Where xAI lost its old pricing edge
Let's be blunt: the retired Grok 4.1 Fast era was a stronger SEO story than the current lineup.
The old pitch was: huge context, modern model, suspiciously cheap. That attracts clicks.
The current pitch is more sober:
- good modern Grok models
- one unified price band
- strong long-context option
- but no obvious “why is this so much cheaper than everyone else?” hook
That's why freshness matters here. Searchers looking for Grok pricing are often trying to reconcile old claims with current bills. The winning page is the one that explains the change clearly instead of pretending nothing happened.
If you specifically want to see how the old cheap route compares with the live default, our Grok 4.1 Fast vs Grok 4.3 comparison shows the pricing reset in one screen.
When xAI still makes sense despite the reset
xAI is still a rational choice when the product constraint is about context shape or Grok-specific behavior, not just raw price. If you need the 2M-token Multi-Agent window, want one flat pricing band across your live Grok routes, or you're already standardized on xAI tooling, the current lineup can still work fine.
Where I would be more skeptical is high-volume commodity work: spam filtering, basic extraction, cheap summarization, or any workload where the model is mostly acting like plumbing. In those cases, the disappearance of Grok 4.1 Fast as a true live budget default means you should compare xAI much harder against lower-cost alternatives before you lock in.
Monthly cost examples at live xAI rates
Here are a few rough planning scenarios using the current $1.25 / $2.50 token pricing. Monthly figures assume a 30-day month.
Customer support assistant
Assume 20,000 requests/day at 500 input / 300 output
- Daily cost: about $27.50
- Monthly cost: about $825
Internal research copilot
Assume 5,000 requests/day at 3,000 input / 800 output
- Daily cost: about $28.75
- Monthly cost: about $862.50
Code review or analysis workload
Assume 2,000 requests/day at 10,000 input / 2,000 output
- Daily cost: about $35.00
- Monthly cost: about $1,050
Those numbers are still workable for teams with real leverage. They just no longer justify calling Grok the budget king.
Frequently asked questions
How much does the Grok API cost now?
For the live public text lineup, xAI currently shows $1.25 per million input tokens and $2.50 per million output tokens across Grok 4.3 and the Grok 4.20 family.
What happened to Grok 4.1 Fast and Grok 4?
xAI's May 15, 2026 retirement notice says those older text slugs were deprecated. Requests to deprecated text slugs now redirect to Grok 4.3 and are billed at Grok 4.3 pricing.
Is Grok 4.20 cheaper than Grok 4.3?
No. Based on xAI's current public pricing, they cost the same. Pick based on workflow style and context requirements, not token price.
How much does Grok voice mode cost?
For API use, xAI currently lists Realtime voice at $0.05/minute, TTS at $15 per million characters, and STT at $0.10/hour REST or $0.20/hour streaming.
What are Grok voice mode rate limits in 2026?
xAI publicly documents voice pricing, but it does not publish one universal public rate-limit table that reliably answers every "Grok voice mode limits" query. For API work, treat quotas as account-level limits you confirm in current xAI docs or your dashboard, and do not assume consumer app plan caps match API throughput.
Should I still trust old Grok pricing blog posts?
Only if they were updated after the May 2026 retirement changes. Anything still anchored on Grok 4.1 Fast as the live cheap default is behind reality.
Start with the live xAI numbers, not the nostalgic ones
The cleanest way to think about Grok pricing in May 2026 is this:
- Live text baseline: $1.25 input / $2.50 output
- Voice baseline: $0.05/min realtime
- Old cheap slugs: useful for historical context, not forward budgeting
That doesn't make xAI bad. It just makes sloppy Grok pricing pages worse.
If you want to compare xAI against the rest of the field before committing, start here:
- xAI provider pricing
- Grok 4.20 vs Grok 4.3
- Cheapest AI APIs in 2026
- Best value AI models by price-performance
