Google Gemini API Pricing 2026: Gemini 3.5 Flash, 3.1 Pro & 2.5 Models
Updated May 2026. Current Gemini API pricing for Gemini 3.5 Flash, 3.1 Pro, 3.1 Flash-Lite, 2.5 Pro, 2.5 Flash, caching, Batch/Flex, and DeepSeek V4 comparison.
Google’s Gemini API is a strong option when you need long context, multimodal input, Google AI Studio workflow support, search grounding, Batch/Flex discounts, and free development access. This guide is refreshed against src/data/models.ts and Google’s official Gemini API pricing and model docs.
The newest pricing change to account for is Gemini 3.5 Flash. Google lists gemini-3.5-flash as a stable model with 1,048,576 input tokens, 65,536 output tokens, multimodal inputs, standard pricing at $1.50/M input and $9.00/M output, and cached input at $0.15/M.
The older stale claim to avoid remains the same: Gemini 2.5 Flash standard interactive pricing is not $0.15/M input and $0.60/M output. Google currently lists standard Gemini 2.5 Flash text/image/video pricing at $0.30/M input and $2.50/M output. The $0.15/M input tier applies to Batch/Flex, not standard interactive usage.
Gemini 2026 Model Lineup & Pricing
| Model | Standard Input | Cached Input | Standard Output | Context | Max Output | Best For |
|---|---|---|---|---|---|---|
| Gemini 3.5 Flash | $1.50/M | $0.15/M | $9.00/M | 1,048,576 | 65,536 | Fast frontier-level agents, coding loops, multimodal workloads |
| Gemini 3.1 Pro Preview | $2.00/M <=200K, $4.00/M >200K | $0.20/M <=200K, $0.40/M >200K | $12.00/M <=200K, $18.00/M >200K | 2M | 16K | Highest Gemini Pro tier in canonical data |
| Gemini 3.1 Flash-Lite Preview | $0.25/M | $0.025/M | $1.50/M | 1M | 64K | Fast, high-volume agentic and data tasks |
| Gemini 2.5 Pro | $1.25/M <=200K, $2.50/M >200K | $0.125/M <=200K, $0.25/M >200K | $10.00/M <=200K, $15.00/M >200K | 2M | 64K | Complex reasoning, coding, long documents |
| Gemini 2.5 Flash | $0.30/M text/image/video | $0.03/M text/image/video | $2.50/M | 1M | 64K | Balanced cost and capability |
| Gemini 2.5 Flash-Lite | $0.10/M text/image/video | $0.01/M text/image/video | $0.40/M | 1M | 64K | Lowest-cost current Gemini text route |
| Gemini 2.0 Flash | $0.10/M | n/a | $0.40/M | 1M | 8K | Deprecated legacy apps |
All prices are USD per 1 million tokens for Google Gemini Developer API standard paid usage. Google lists separate Batch, Flex, and Priority pricing for some models, and project-specific rate limits can differ from public examples.
What Is New in Gemini 3.5 Flash
Gemini 3.5 Flash is not just a price-table update. Google’s model page positions it for sustained frontier-level intelligence at higher speed and lower cost, especially for sub-agent deployment, multi-step workflows, long-horizon tasks, and coding cycles.
Operationally, the important details are:
| Field | Gemini 3.5 Flash |
|---|---|
| API model ID | gemini-3.5-flash |
| Status | Stable |
| Input types | Text, image, video, audio, PDF |
| Output type | Text |
| Input token limit | 1,048,576 |
| Output token limit | 65,536 |
| Supported capabilities | Batch API, caching, code execution, file search, Flex, function calling, Google Maps grounding, priority inference, search grounding, structured outputs, thinking, URL context |
| Not supported | Audio generation, computer use, image generation, Live API |
For DevTk.AI routing guidance, this makes Gemini 3.5 Flash a good premium Flash route, not the cheapest Gemini route.
Batch, Flex, Priority, and Context Caching
Google’s paid tier includes access to Batch API, Flex inference, and context caching. Batch API is listed as a 50% cost reduction on the pricing overview, and the detailed tables show lower Batch/Flex token rates for many models.
| Model | Standard Input | Batch/Flex Input | Standard Output | Batch/Flex Output |
|---|---|---|---|---|
| Gemini 3.5 Flash | $1.50/M | $0.75/M | $9.00/M | $4.50/M |
| Gemini 3.1 Pro Preview <=200K | $2.00/M | $1.00/M | $12.00/M | $6.00/M |
| Gemini 3.1 Flash-Lite Preview | $0.25/M | $0.125/M | $1.50/M | $0.75/M |
| Gemini 2.5 Pro <=200K | $1.25/M | $0.625/M | $10.00/M | $5.00/M |
| Gemini 2.5 Flash | $0.30/M | $0.15/M | $2.50/M | $1.25/M |
| Gemini 2.5 Flash-Lite | $0.10/M | $0.05/M | $0.40/M | $0.20/M |
This is the source of the older $0.15 Gemini 2.5 Flash number: it is valid for Batch/Flex input, not standard interactive input.
Gemini vs GPT-5 vs Claude vs DeepSeek
| Model | Provider | Input | Cached Input | Output | Notes |
|---|---|---|---|---|---|
| Gemini 3.5 Flash | $1.50 | $0.15 | $9.00 | Stable 1M-context multimodal Flash route | |
| Gemini 3.1 Pro Preview | $2.00 | $0.20 | $12.00 | $4/$18 above 200K prompt tokens | |
| Gemini 2.5 Pro | $1.25 | $0.125 | $10.00 | $2.50/$15 above 200K prompt tokens | |
| GPT-5 | OpenAI | $1.25 | varies | $10.00 | Same base price as Gemini 2.5 Pro |
| Claude Sonnet 4.6 | Anthropic | $3.00 | varies | $15.00 | Higher price, 1M canonical context |
| Gemini 2.5 Flash | $0.30 | $0.03 | $2.50 | Standard interactive price | |
| Gemini 2.5 Flash-Lite | $0.10 | $0.01 | $0.40 | Lowest-cost current Gemini route | |
| DeepSeek V4 Flash | DeepSeek | $0.14 | $0.0028 | $0.28 | Much cheaper text alternative |
| DeepSeek V4 Pro | DeepSeek | $0.435 | $0.003625 | $0.87 | Official 1/4-of-original price after permanent cut |
Key takeaways:
- Gemini 3.5 Flash is a premium Flash route, not a budget route. It costs 5x Gemini 2.5 Flash on input and 3.6x on output.
- DeepSeek V4 Flash is dramatically cheaper for text agents at $0.14/$0.28, especially when cache hits are high.
- Gemini 3.5 Flash is stronger for multimodal and Google ecosystem workloads because it supports text, image, video, audio, PDF input, search grounding, Maps grounding, URL context, and Google-native tooling.
- Gemini 2.5 Flash-Lite remains the lowest-cost Gemini route at $0.10/$0.40 for standard interactive text/image/video input.
Monthly Cost Estimates
These examples assume standard interactive pricing and no cache hits.
Scenario 1: Solo Developer
100K input + 50K output tokens per day:
| Model | Monthly Cost |
|---|---|
| Gemini 2.5 Flash-Lite | $0.90 |
| DeepSeek V4 Flash | $0.84 |
| Gemini 3.1 Flash-Lite Preview | $3.00 |
| Gemini 2.5 Flash | $4.65 |
| Gemini 3.5 Flash | $18.00 |
| Gemini 2.5 Pro | $18.75 |
| Gemini 3.1 Pro Preview | $24.00 |
| Claude Sonnet 4.6 | $31.50 |
Scenario 2: Startup
1M input + 500K output tokens per day:
| Model | Monthly Cost |
|---|---|
| DeepSeek V4 Flash | $8.40 |
| Gemini 2.5 Flash-Lite | $9.00 |
| Gemini 3.1 Flash-Lite Preview | $30.00 |
| Gemini 2.5 Flash | $46.50 |
| Gemini 3.5 Flash | $180.00 |
| Gemini 2.5 Pro | $187.50 |
| Gemini 3.1 Pro Preview | $240.00 |
| Claude Sonnet 4.6 | $315.00 |
Scenario 3: Enterprise / Production
10M input + 5M output tokens per day:
| Model | Monthly Cost |
|---|---|
| DeepSeek V4 Flash | $84 |
| Gemini 2.5 Flash-Lite | $90 |
| Gemini 3.1 Flash-Lite Preview | $300 |
| Gemini 2.5 Flash | $465 |
| Gemini 3.5 Flash | $1,800 |
| Gemini 2.5 Pro | $1,875 |
| Gemini 3.1 Pro Preview | $2,400 |
| Claude Sonnet 4.6 | $3,150 |
Want exact numbers for your usage? Try our AI Model Pricing Calculator.
When to Choose Gemini
1. Multimodal Agent Workloads
Choose Gemini 3.5 Flash when a single model needs to ingest text, images, video, audio, and PDFs while still supporting function calling, structured outputs, URL context, search grounding, and Batch/Flex options.
2. Long Document Processing
Gemini remains strong when you need to process long documents, large codebases, long videos, or multi-file context in a single request. Use Gemini 3.5 Flash or Pro when quality matters most, and use Flash-Lite or 2.5 Flash when the task is extractive or routing-friendly.
3. Development and Evaluation
Gemini’s free tier is useful for prototyping and testing. For production planning, check active quotas in AI Studio because Google says rate limits are project-specific and not guaranteed.
4. Google Cloud Integration
Teams already using Google Cloud may prefer Gemini through AI Studio or Vertex AI for billing, identity, governance, and integration with Cloud Storage, BigQuery, and other Google services.
Cost Optimization Tips
- Do not default to Gemini 3.5 Flash for cheap text tasks. Use Gemini 2.5 Flash-Lite, DeepSeek V4 Flash, or GPT-4.1 Nano when the quality bar allows it.
- Use Gemini 3.5 Flash when multimodal and grounding features matter. The premium only makes sense if you use the capabilities.
- Use Batch or Flex for offline work. Batch/Flex prices are materially lower for many Gemini models.
- Use context caching for repeated large context. Google lists cached input pricing for Gemini 3.5 Flash and other current models.
- Count tokens before sending. Use our AI Token Counter before sending long prompts or media-derived transcripts.
Bottom Line
Gemini 3.5 Flash is the new stable premium Flash route in Google’s lineup: $1.50/M input, $0.15/M cached input, $9.00/M output, 1,048,576 input tokens, and 65,536 output tokens. It is compelling for multimodal agents, search-grounded workflows, coding loops, and Google ecosystem integrations.
For cost-sensitive text workloads, Gemini 3.5 Flash is not the default budget choice. Start with Gemini 2.5 Flash-Lite, Gemini 2.5 Flash, or DeepSeek V4 Flash, then route harder or more multimodal work to Gemini 3.5 Flash, Gemini 2.5 Pro, or Gemini 3.1 Pro Preview.
Official sources checked: Google Gemini API pricing, Gemini 3.5 Flash model docs, Google Gemini API rate limits, and DeepSeek API pricing.
Related tools and guides:
- AI Model Pricing Calculator - Compare monthly costs across models
- AI Token Counter - Count tokens before making API calls
- Gemini 3.5 Flash vs DeepSeek V4 - Cost and agent-routing comparison
- AI API Pricing Comparison 2026 - Full provider pricing table
- DeepSeek API Pricing Guide 2026 - Budget text alternative
Related Posts
Gemini 3.5 Flash vs DeepSeek V4: API Price, Agents, and When to Use Each
2026-05-24
Gemini 3.1 Pro Pricing: $2.00/$12 per M — 1M Context, Video
2026-02-26
Mistral API Pricing 2026: Small $0.20/M, Large $2/M + Free Tier
2026-02-24