If you searched for Gemini 2.5 Flash pricing, Gemini 2.5 Pro pricing, or Gemini API free tier limits, here's the clean answer.
Updated May 2026: Google's live Gemini pricing moved, and the stale numbers floating around search are now bad budgeting advice — especially for Gemini 2.5 Flash.
Gemini API pricing quick answer (May 2026)
- Gemini 2.5 Flash: $0.30 input / $2.50 output per 1M tokens
- Gemini 2.5 Pro: $1.25 input / $10.00 output per 1M tokens
- Gemini 3 Pro: $2.00 input / $12.00 output per 1M tokens
- Cheapest current Gemini text model: Gemini 2.0 Flash-Lite at $0.075 / $0.30
- Free tier: available through Google's Gemini / AI Studio flow, but live limits vary by model and usage tier
- Monthly fee: none on the standard API path — Gemini is still pay-as-you-go
The biggest change worth saying bluntly: if you still see Gemini 2.5 Flash quoted at $0.15 / $0.60, that is not the current planning number.
📊 Quick Math: 100M Gemini 2.5 Flash output tokens now cost $250, not $60. That one stale assumption alone can wreck a real production budget.
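If you want to sanity-check that gap yourself, the arithmetic fits in a few lines. This is a minimal sketch using only the prices quoted above; the `token_cost` helper is an illustrative name, not part of any Google SDK.

```python
def token_cost(tokens: int, price_per_million: float) -> float:
    """Dollar cost of `tokens` at a given price per 1M tokens."""
    return tokens / 1_000_000 * price_per_million


output_tokens = 100_000_000  # 100M output tokens

stale = token_cost(output_tokens, 0.60)    # older Gemini 2.5 Flash output price
current = token_cost(output_tokens, 2.50)  # current Gemini 2.5 Flash output price

print(f"stale: ${stale:,.2f}  current: ${current:,.2f}  ratio: {current / stale:.2f}x")
```

The ratio is the useful part: a budget built on the old output price is off by more than 4x on output spend.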
For the broader landscape across OpenAI, Anthropic, xAI, Mistral, DeepSeek, and Meta, use our AI API pricing guide.
Current Google Gemini pricing table (2026)
This table focuses on the live generative lineup developers actually use for chat, extraction, reasoning, and multimodal work.
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context Window | Best fit |
|---|---|---|---|---|
| Gemini 3 Pro | $2.00 | $12.00 | 2,000,000 | Highest-end long-context Google workloads |
| Gemini 3.1 Pro | $2.00 | $12.00 | 1,000,000 | Current Pro preview lane |
| Gemini 2.5 Pro | $1.25 | $10.00 | 2,000,000 | Premium reasoning with giant context |
| Gemini 3 Flash | $0.50 | $3.00 | 1,000,000 | Faster production general-purpose work |
| Gemini 2.5 Flash | $0.30 | $2.50 | 1,000,000 | Balanced multimodal / long-context production |
| Gemini 3.1 Flash-Lite Preview | $0.25 | $1.50 | 1,000,000 | Cheap preview-tier high-volume tasks |
| Gemini 2.0 Flash | $0.10 | $0.40 | 1,000,000 | Cheap reliable mainstream lane |
| Gemini 2.5 Flash-Lite | $0.10 | $0.40 | 1,000,000 | Low-cost lightweight production |
| Gemini 2.0 Flash-Lite | $0.075 | $0.30 | 1,000,000 | Cheapest current Gemini text route |
💡 Key takeaway: the live Gemini lineup is still attractive for context-heavy work, but the old story that Gemini 2.5 Flash is absurdly cheap is stale. The true low-cost lanes now sit lower in the stack: 2.0 Flash, 2.5 Flash-Lite, and 2.0 Flash-Lite.
I am leaving Gemini Embedding 2 out of the main table because most searches for “Gemini API pricing” mean the generative models, not the embedding endpoint.
Exact answers to the high-intent Gemini pricing searches
Gemini 2.5 Flash pricing per million tokens (2026)
The current number is $0.30 input / $2.50 output per 1M tokens.
That matters because a lot of ranking pages were built around an older $0.15 / $0.60 framing. If you budget from that older number, you will understate real output cost badly.
$2.50 per 1M tokens: the current Gemini 2.5 Flash output price, and the number most likely to surprise teams using older comparison posts.
When Gemini 2.5 Flash still makes sense:
- you want Google's 1M context window
- you need a more capable middle tier than Flash-Lite
- your workload benefits from Gemini's multimodal stack
When it does not:
- you are optimizing for raw cheapest text generation
- your app is output-heavy and you assumed the old bargain pricing still held
Gemini 2.5 Pro API pricing per million tokens
Gemini 2.5 Pro currently costs $1.25 input / $10.00 output per 1M tokens.
That makes it a more interesting premium value lane than a lot of teams expect. You get a 2M-token context window without paying Gemini 3 Pro rates.
If your workload is:
- large document review
- long-context research synthesis
- big codebase analysis
- premium reasoning where context size matters
...then 2.5 Pro is often a more rational starting point than jumping straight to 3 Pro.
Gemini 2.0 Flash API pricing
Gemini 2.0 Flash currently sits at $0.10 / $0.40 per 1M tokens.
This is the number many budget-minded teams should care about more than 2.5 Flash. It is not the fanciest Gemini model, but it is one of the cleanest low-cost production defaults in Google's stack.
Gemini API free tier limits and rate limits
Google's free Gemini API access is real, but the annoying truth is that there is no single, timeless, screenshot-friendly answer.
What matters in practice:
- limits vary by model and usage tier
- Google measures requests per minute (RPM), tokens per minute (TPM), and requests per day (RPD)
- your live limits are best checked in AI Studio
- paid tiers unlock higher throughput and broader production usage
So if you searched “Gemini API free tier limits 2026”, the honest answer is:
Check AI Studio for the exact live project limits, then use the pricing table here for the paid planning baseline.
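Once you have your real numbers from AI Studio, it helps to know which of the three limits bites first. The sketch below is one way to estimate that. The limit values in the example call are placeholders for illustration only, not Google's actual limits, and `binding_limit` is a hypothetical helper.

```python
def binding_limit(rpm: int, tpm: int, rpd: int, tokens_per_request: int):
    """Return (limit_name, max_requests_per_day) for the tightest limit."""
    minutes_per_day = 60 * 24
    per_day = {
        # Request-rate ceiling if you ran flat out all day.
        "RPM": rpm * minutes_per_day,
        # Token-rate ceiling, converted to whole requests per minute first.
        "TPM": (tpm // tokens_per_request) * minutes_per_day,
        # Explicit daily request cap.
        "RPD": rpd,
    }
    name = min(per_day, key=per_day.get)
    return name, per_day[name]


# Placeholder limits; replace with the live values shown for your project.
print(binding_limit(rpm=10, tpm=250_000, rpd=500, tokens_per_request=1_200))
```

On low tiers the daily cap is often what you hit first, but that depends entirely on your own limits and request size.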
Older Gemini 1.5 Flash / 1.5 Pro / 2.0 Pro pricing searches
These searches still show up because search demand always lags product reality.
The practical rule:
- if you are planning new work, budget from the live lineup above
- if you are auditing an older app, verify the exact model slug before trusting an old spreadsheet
- if a blog post still centers an older Gemini family without updating the current equivalents, it is probably stale
Why Gemini pricing guides keep disagreeing
Most disagreement is not fraud. It is timeline drift.
Google has changed the active Gemini lineup fast enough that older pages can still rank while being numerically wrong. A guide published around a previous Gemini 2.5 Flash pricing snapshot can look polished, earn links, and still be useless for 2026 planning. The same thing happens when older model families like Gemini 1.5 Pro or Gemini 2.0 Pro keep pulling search demand after the commercial reality has moved on.
That is why the safest workflow is boring but effective:
- check the live model lineup
- verify the current per-million-token prices
- verify the context window
- only then estimate cost
✅ Practical rule: if a Gemini guide does not clearly say it was refreshed for the current pricing table, do not trust it with real budget numbers.
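The same check can be run against a budget spreadsheet. This is a rough sketch, assuming you keep assumed prices as (input, output) pairs per model; the `budget` data is a hypothetical example and the `CURRENT` prices are copied from the table in this guide, so refresh them before relying on the result.

```python
# Current prices (USD per 1M tokens: input, output), copied from this guide's table.
CURRENT = {
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "Gemini 2.0 Flash": (0.10, 0.40),
}


def stale_entries(budget, current=CURRENT):
    """Return models whose budgeted prices differ from the current table."""
    stale = {}
    for model, assumed in budget.items():
        live = current.get(model)
        if live is None:
            stale[model] = (assumed, None)  # model not in the current table
        elif assumed != live:
            stale[model] = (assumed, live)
    return stale


# Hypothetical spreadsheet that still carries the older 2.5 Flash numbers.
budget = {"Gemini 2.5 Flash": (0.15, 0.60), "Gemini 2.0 Flash": (0.10, 0.40)}

for model, (assumed, live) in stale_entries(budget).items():
    print(f"{model}: budgeted {assumed}, current {live}")
```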
Flash vs Pro: which Gemini tier is actually worth paying for?
Use Flash or Flash-Lite when:
- you are doing classification, extraction, summarization, routing, or cheap chat
- you need very large context without premium-model costs
- you care more about volume economics than bragging rights
Use Pro when:
- answer quality directly affects revenue, legal risk, or customer trust
- you are analyzing huge source sets in one pass
- you need a serious long-context reasoning lane rather than a cheap default
My blunt take: Gemini 2.0 Flash is the underrated budget workhorse, and Gemini 2.5 Pro is a smarter premium default than most teams realize.
Real workload math with the current Gemini prices
Scenario 1: Customer support chatbot (300,000 conversations/month)
Assumptions: 800 input tokens and 400 output tokens per conversation.
| Model | Monthly input cost | Monthly output cost | Total/month |
|---|---|---|---|
| Gemini 3 Pro | $480 | $1,440 | $1,920 |
| Gemini 3 Flash | $120 | $360 | $480 |
| Gemini 2.5 Flash | $72 | $300 | $372 |
| Gemini 2.0 Flash | $24 | $48 | $72 |
| Gemini 2.0 Flash-Lite | $18 | $36 | $54 |
The old version of this discussion used to make 2.5 Flash look like the obvious cheap default. With the current prices, that is no longer true. It is still useful — just no longer the budget king.
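To rerun this scenario with your own traffic, a small calculator is enough. The sketch below uses the Scenario 1 assumptions and the per-1M prices from the pricing table in this guide; `PRICES` and `monthly_cost` are illustrative names, and the dictionary keys are display names rather than API model slugs.

```python
# Prices in USD per 1M tokens (input, output), copied from this guide's pricing table.
PRICES = {
    "Gemini 3 Pro": (2.00, 12.00),
    "Gemini 3 Flash": (0.50, 3.00),
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Gemini 2.0 Flash": (0.10, 0.40),
    "Gemini 2.0 Flash-Lite": (0.075, 0.30),
}


def monthly_cost(model, requests, input_tokens, output_tokens):
    """Return (input_cost, output_cost, total) in USD for one month."""
    in_price, out_price = PRICES[model]
    input_cost = requests * input_tokens / 1_000_000 * in_price
    output_cost = requests * output_tokens / 1_000_000 * out_price
    return input_cost, output_cost, input_cost + output_cost


# Scenario 1: 300,000 conversations, 800 input and 400 output tokens each.
for model in PRICES:
    i, o, t = monthly_cost(model, 300_000, 800, 400)
    print(f"{model:<22} ${i:>8,.2f} ${o:>9,.2f} ${t:>9,.2f}")
```

Swap in your own request count and token sizes; the per-request token averages usually matter more than the model choice people argue about.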
Scenario 2: Large-document review (50,000 jobs/month)
Assumptions: 2,000 input tokens and 500 output tokens per job.
| Model | Total/month |
|---|---|
| Gemini 3 Pro | $500 |
| Gemini 2.5 Pro | $375 |
| Gemini 3 Flash | $125 |
| Gemini 2.5 Flash | $92.50 |
| Gemini 2.0 Flash | $20.00 |
This is where the context story matters. If you genuinely need long-context quality, 2.5 Pro is compelling. If you mostly need cheap processing with acceptable quality, 2.0 Flash is comically cheaper.
Scenario 3: 1M search-style answers per month
Assumptions: 300 input tokens and 200 output tokens per request.
| Model | Total/month |
|---|---|
| Gemini 3 Pro | $3,000 |
| Gemini 3 Flash | $750 |
| Gemini 2.5 Flash | $590 |
| Gemini 2.0 Flash | $110 |
If your workload is output-heavy, output pricing dominates the story fast. That is why stale Gemini 2.5 Flash numbers are dangerous.
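A quick way to see how much output pricing dominates is to compute the output share of per-request cost. This is a minimal sketch using the Scenario 3 token shape and the Gemini 2.5 Flash prices quoted in this guide; `output_share` is an illustrative helper.

```python
def output_share(input_tokens, output_tokens, in_price, out_price):
    """Fraction of per-request cost that comes from output tokens.

    Prices are per 1M tokens; the per-million factor cancels in the ratio.
    """
    in_cost = input_tokens * in_price
    out_cost = output_tokens * out_price
    return out_cost / (in_cost + out_cost)


# Scenario 3 shape: 300 input / 200 output tokens, Gemini 2.5 Flash prices.
share = output_share(300, 200, 0.30, 2.50)
print(f"{share:.0%} of spend is output")
```

With that shape, roughly 85% of the bill is output, even though output is only 40% of the tokens.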
Does Gemini still win on value in 2026?
Yes — but only in the lanes where its strengths actually matter.
Where Gemini is still excellent
- Huge context at sane prices, especially on 2.5 Pro and 3 Pro
- Cheap large-context production with 2.0 Flash or 2.5 Flash-Lite
- Multimodal work if you want one provider for text + image-heavy flows
Where the old Gemini narrative broke
- Gemini 2.5 Flash is no longer the obvious bargain headline
- output-heavy workloads can get expensive faster than old comparison posts suggest
- “Gemini is always the cheapest good default” is not a serious statement anymore
That does not make Gemini bad. It just means the lazy version of the pitch expired.
Gemini free tier vs paid Gemini API: the useful version
| Option | What you get | Best for | Catch |
|---|---|---|---|
| Free access / trial path | Limited access, testing, experimentation | Prototypes and evals | Lower live limits |
| Paid API | Usage-based billing, higher throughput, production use | Real apps | You pay per token |
| Higher usage tiers | Bigger ceilings and more operational room | Scaling teams | Qualification depends on billing history and usage |
There is still no flat monthly Gemini API subscription fee on the standard API path. That's the key answer for searches like “Gemini API monthly cost” or “Gemini API subscription fee.”
Bottom line: which Gemini model should you start with?
- Cheapest current Gemini text lane: Gemini 2.0 Flash-Lite
- Best low-cost practical default: Gemini 2.0 Flash
- Best premium long-context value: Gemini 2.5 Pro
- Best absolute Google flagship: Gemini 3 Pro
- Model most likely to be mis-budgeted from stale search results: Gemini 2.5 Flash
If you want the short recommendation: start with 2.0 Flash for cost-sensitive production, escalate to 2.5 Pro when quality and context really matter, and stop trusting old 2.5 Flash $0.15/$0.60 comparisons.
Frequently asked questions
Is Gemini 2.5 Flash still a cheap default in 2026?
Not really. It is still useful, but the cheap-default story now belongs more to Gemini 2.0 Flash and Gemini 2.0 Flash-Lite. The current 2.5 Flash output price is high enough that output-heavy apps need to model cost carefully.
What is the cheapest live Gemini API model?
For standard text-style planning, Gemini 2.0 Flash-Lite is the cheapest live Gemini route in the current lineup at $0.075 input / $0.30 output per 1M tokens.
Which Gemini model is best for long documents?
If quality matters, start with Gemini 2.5 Pro. It gives you a 2M-token context window and a better premium-value profile than jumping straight to Gemini 3 Pro for every workload.
Does Gemini API billing require a monthly subscription?
No. Normal Gemini API billing is still pay-as-you-go. You pay for usage rather than a flat subscription fee.
Should I trust old Gemini 1.5 or early 2.5 pricing screenshots?
No. Treat them as historical references only. For current budgeting, use the live pricing table and current model lineup instead of recycled screenshots, old tweet threads, or unchanged blog posts.
