Google Gemini API Pricing 2026: Gemini 3.5 Flash, 3.1 Pro & 2.5 Models

Google’s Gemini API is a strong option when you need long context, multimodal input, Google AI Studio workflow support, search grounding, Batch/Flex discounts, or free development access. This guide has been refreshed against our canonical model data (src/data/models.ts) and Google’s official Gemini API pricing and model docs.

The newest pricing change to account for is Gemini 3.5 Flash. Google lists gemini-3.5-flash as a stable model with a 1,048,576-token input limit, a 65,536-token output limit, multimodal inputs, standard pricing of $1.50/M input and $9.00/M output, and cached input at $0.15/M.

One stale claim still needs correcting: Gemini 2.5 Flash standard interactive pricing is not $0.15/M input and $0.60/M output. Google currently lists standard Gemini 2.5 Flash text/image/video pricing at $0.30/M input and $2.50/M output. The $0.15/M input rate applies to Batch/Flex, not standard interactive usage.

Gemini 2026 Model Lineup & Pricing

Model	Standard Input	Cached Input	Standard Output	Context	Max Output	Best For
Gemini 3.5 Flash	$1.50/M	$0.15/M	$9.00/M	1,048,576	65,536	Fast frontier-level agents, coding loops, multimodal workloads
Gemini 3.1 Pro Preview	$2.00/M <=200K, $4.00/M >200K	$0.20/M <=200K, $0.40/M >200K	$12.00/M <=200K, $18.00/M >200K	1,048,576	65,536	Highest Gemini Pro tier in canonical data
Gemini 3.1 Flash-Lite Preview	$0.25/M	$0.025/M	$1.50/M	1M	64K	Fast, high-volume agentic and data tasks
Gemini 2.5 Pro	$1.25/M <=200K, $2.50/M >200K	$0.125/M <=200K, $0.25/M >200K	$10.00/M <=200K, $15.00/M >200K	1,048,576	65,536	Complex reasoning, coding, long documents
Gemini 2.5 Flash	$0.30/M text/image/video	$0.03/M text/image/video	$2.50/M	1M	64K	Balanced cost and capability
Gemini 2.5 Flash-Lite	$0.10/M text/image/video	$0.01/M text/image/video	$0.40/M	1M	64K	Lowest-cost current Gemini text route

All prices are USD per 1 million tokens for Google Gemini Developer API standard paid usage. For the Pro models, the <=200K and >200K rates are selected by prompt size in tokens. Google lists separate Batch, Flex, and Priority pricing for some models, and project-specific rate limits can differ from public examples.
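To make the two-tier Pro pricing concrete, here is a minimal Python sketch of a per-request cost estimate. The function name and dictionary layout are illustrative, and the sketch assumes the whole request (input and output) is billed at the tier selected by prompt size; check Google's pricing page before relying on that. The rates are the ones in the table above.

```python
# Standard paid rates in USD per 1M tokens, copied from the table above.
# Each entry maps a tier to (input_rate, output_rate).
TIERED_RATES = {
    "Gemini 3.1 Pro Preview": {"short": (2.00, 12.00), "long": (4.00, 18.00)},
    "Gemini 2.5 Pro": {"short": (1.25, 10.00), "long": (2.50, 15.00)},
}
TIER_THRESHOLD = 200_000  # prompt tokens


def request_cost(model: str, prompt_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request.

    Assumption: the tier is chosen by prompt size and applies to both
    input and output tokens of that request.
    """
    tier = "short" if prompt_tokens <= TIER_THRESHOLD else "long"
    input_rate, output_rate = TIERED_RATES[model][tier]
    return (prompt_tokens * input_rate + output_tokens * output_rate) / 1_000_000
```

Under these assumptions, a 300,000-token prompt with 10,000 output tokens on Gemini 2.5 Pro comes to about $0.90, versus about $0.225 for a 100,000-token prompt with the same output.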

What Is New in Gemini 3.5 Flash

Gemini 3.5 Flash is not just a price-table update. Google’s model page positions it for sustained frontier-level intelligence at higher speed and lower cost, especially for sub-agent deployment, multi-step workflows, long-horizon tasks, and coding cycles.

Operationally, the important details are:

Field	Gemini 3.5 Flash
API model ID	`gemini-3.5-flash`
Status	Stable
Input types	Text, image, video, audio, PDF
Output type	Text
Input token limit	1,048,576
Output token limit	65,536
Supported capabilities	Batch API, caching, code execution, file search, Flex, function calling, Google Maps grounding, priority inference, search grounding, structured outputs, thinking, URL context
Not supported	Audio generation, computer use, image generation, Live API
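As a quick pre-flight check, the token limits above can be encoded directly. This is a minimal sketch: the function name is hypothetical, and it assumes the input and output limits are enforced independently of each other.

```python
# Token limits for gemini-3.5-flash, copied from the table above.
INPUT_TOKEN_LIMIT = 1_048_576   # equals 2**20
OUTPUT_TOKEN_LIMIT = 65_536     # equals 2**16


def fits_gemini_35_flash(prompt_tokens: int, max_output_tokens: int) -> bool:
    """Return True if a request stays within both listed limits.

    Assumption: the input and output limits are checked independently.
    """
    return (
        0 <= prompt_tokens <= INPUT_TOKEN_LIMIT
        and 0 < max_output_tokens <= OUTPUT_TOKEN_LIMIT
    )
```

A check like this is only as good as the token count you feed it, so pair it with a real token count of the prompt rather than a character-based guess.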

For DevTk.AI routing guidance, this makes Gemini 3.5 Flash a good premium Flash route, not the cheapest Gemini route.

Batch, Flex, Priority, and Context Caching

Google’s paid tier includes access to Batch API, Flex inference, and context caching. Batch API is listed as a 50% cost reduction on the pricing overview, and the detailed tables show lower Batch/Flex token rates for many models.

Model	Standard Input	Batch/Flex Input	Standard Output	Batch/Flex Output
Gemini 3.5 Flash	$1.50/M	$0.75/M	$9.00/M	$4.50/M
Gemini 3.1 Pro Preview <=200K	$2.00/M	$1.00/M	$12.00/M	$6.00/M
Gemini 3.1 Flash-Lite Preview	$0.25/M	$0.125/M	$1.50/M	$0.75/M
Gemini 2.5 Pro <=200K	$1.25/M	$0.625/M	$10.00/M	$5.00/M
Gemini 2.5 Flash	$0.30/M	$0.15/M	$2.50/M	$1.25/M
Gemini 2.5 Flash-Lite	$0.10/M	$0.05/M	$0.40/M	$0.20/M

This is the source of the older $0.15 Gemini 2.5 Flash number: it is valid for Batch/Flex input, not standard interactive input.
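The Batch/Flex table can be turned into a small comparison helper. The sketch below is illustrative only: the function name and data layout are assumptions, and the rates are copied from the table above. It also checks that every Batch/Flex rate listed here is half the standard rate.

```python
import math

# (standard_input, batch_flex_input, standard_output, batch_flex_output)
# in USD per 1M tokens, copied from the table above.
RATES = {
    "Gemini 3.5 Flash": (1.50, 0.75, 9.00, 4.50),
    "Gemini 3.1 Pro Preview <=200K": (2.00, 1.00, 12.00, 6.00),
    "Gemini 3.1 Flash-Lite Preview": (0.25, 0.125, 1.50, 0.75),
    "Gemini 2.5 Pro <=200K": (1.25, 0.625, 10.00, 5.00),
    "Gemini 2.5 Flash": (0.30, 0.15, 2.50, 1.25),
    "Gemini 2.5 Flash-Lite": (0.10, 0.05, 0.40, 0.20),
}


def compare_costs(model: str, input_millions: float, output_millions: float):
    """Return (standard_cost, batch_flex_cost) in USD for a workload
    measured in millions of tokens."""
    std_in, batch_in, std_out, batch_out = RATES[model]
    standard = input_millions * std_in + output_millions * std_out
    batch_flex = input_millions * batch_in + output_millions * batch_out
    return standard, batch_flex


# Sanity check: each Batch/Flex rate in the table is half the standard rate.
for std_in, batch_in, std_out, batch_out in RATES.values():
    assert math.isclose(batch_in, std_in / 2)
    assert math.isclose(batch_out, std_out / 2)
```

For example, 10M input and 2M output tokens on Gemini 2.5 Flash come to about $8.00 at standard rates and about $4.00 at Batch/Flex rates.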

Gemini vs GPT-5 vs Claude vs DeepSeek

Model	Provider	Input ($/M)	Cached Input ($/M)	Output ($/M)	Notes
Gemini 3.5 Flash	Google	$1.50	$0.15	$9.00	Stable 1M-context multimodal Flash route
Gemini 3.1 Pro Preview	Google	$2.00	$0.20	$12.00	$4/$18 above 200K prompt tokens
Gemini 2.5 Pro	Google	$1.25	$0.125	$10.00	$2.50/$15 above 200K prompt tokens
GPT-5	OpenAI	$1.25	varies	$10.00	Same base price as Gemini 2.5 Pro
Claude Sonnet 4.6	Anthropic	$3.00	varies	$15.00	Higher price, 1M canonical context
Gemini 2.5 Flash	Google	$0.30	$0.03	$2.50	Standard interactive price
Gemini 2.5 Flash-Lite	Google	$0.10	$0.01	$0.40	Lowest-cost current Gemini route
DeepSeek V4 Flash	DeepSeek	$0.14	$0.0028	$0.28	Much cheaper text alternative
DeepSeek V4 Pro	DeepSeek	$0.435	$0.003625	$0.87	Official price after a permanent cut to one quarter of the original

Key takeaways:

Gemini 3.5 Flash is a premium Flash route, not a budget route. It costs 5x Gemini 2.5 Flash on input and 3.6x on output.
DeepSeek V4 Flash is dramatically cheaper for text agents at $0.14/$0.28, especially when cache hits are high.
Gemini 3.5 Flash is stronger for multimodal and Google ecosystem workloads because it supports text, image, video, audio, and PDF input, plus search grounding, Maps grounding, URL context, and Google-native tooling.
Gemini 2.5 Flash-Lite remains the lowest-cost Gemini route at $0.10/$0.40 for standard interactive text/image/video input.

Monthly Cost Estimates

These examples assume standard interactive pricing, no cache hits, a 30-day month, and the <=200K prompt tier for the Pro models.

Scenario 1: Solo Developer

100K input + 50K output tokens per day:

Model	Monthly Cost
DeepSeek V4 Flash	$0.84
Gemini 2.5 Flash-Lite	$0.90
Gemini 3.1 Flash-Lite Preview	$3.00
Gemini 2.5 Flash	$4.65
Gemini 3.5 Flash	$18.00
Gemini 2.5 Pro	$18.75
Gemini 3.1 Pro Preview	$24.00
Claude Sonnet 4.6	$31.50

Scenario 2: Startup

1M input + 500K output tokens per day:

Model	Monthly Cost
DeepSeek V4 Flash	$8.40
Gemini 2.5 Flash-Lite	$9.00
Gemini 3.1 Flash-Lite Preview	$30.00
Gemini 2.5 Flash	$46.50
Gemini 3.5 Flash	$180.00
Gemini 2.5 Pro	$187.50
Gemini 3.1 Pro Preview	$240.00
Claude Sonnet 4.6	$315.00

Scenario 3: Enterprise / Production

10M input + 5M output tokens per day:

Model	Monthly Cost
DeepSeek V4 Flash	$84
Gemini 2.5 Flash-Lite	$90
Gemini 3.1 Flash-Lite Preview	$300
Gemini 2.5 Flash	$465
Gemini 3.5 Flash	$1,800
Gemini 2.5 Pro	$1,875
Gemini 3.1 Pro Preview	$2,400
Claude Sonnet 4.6	$3,150
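The figures in all three scenarios follow from one formula: daily tokens times days in the month, divided by one million, times the per-million rate. The sketch below reproduces them under the assumption of a 30-day month, standard rates, and the <=200K prompt tier for the Pro models; the function name and data layout are illustrative.

```python
DAYS_PER_MONTH = 30  # assumption that reproduces the scenario tables

# (input_rate, output_rate) in USD per 1M tokens, standard pricing.
# Pro models use the <=200K prompt tier.
PRICES = {
    "DeepSeek V4 Flash": (0.14, 0.28),
    "Gemini 2.5 Flash-Lite": (0.10, 0.40),
    "Gemini 3.1 Flash-Lite Preview": (0.25, 1.50),
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Gemini 3.5 Flash": (1.50, 9.00),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "Gemini 3.1 Pro Preview": (2.00, 12.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
}


def monthly_cost(model: str, daily_input_tokens: int, daily_output_tokens: int,
                 days: int = DAYS_PER_MONTH) -> float:
    """Estimate monthly USD cost with no cache hits, rounded to cents."""
    input_rate, output_rate = PRICES[model]
    input_millions = daily_input_tokens * days / 1_000_000
    output_millions = daily_output_tokens * days / 1_000_000
    return round(input_millions * input_rate + output_millions * output_rate, 2)


# Print the Solo Developer scenario, cheapest model first.
solo = {name: monthly_cost(name, 100_000, 50_000) for name in PRICES}
for name, cost in sorted(solo.items(), key=lambda item: item[1]):
    print(f"{name}: ${cost:.2f}")
```

Changing the two token arguments to 1,000,000 and 500,000, or to 10,000,000 and 5,000,000, reproduces the Startup and Enterprise scenarios.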

Want exact numbers for your usage? Try our AI Model Pricing Calculator.

When to Choose Gemini

1. Multimodal Agent Workloads

Choose Gemini 3.5 Flash when a single model needs to ingest text, images, video, audio, and PDFs while still supporting function calling, structured outputs, URL context, search grounding, and Batch/Flex options.

2. Long Document Processing

Gemini remains strong when you need to process long documents, large codebases, long videos, or multi-file context in a single request. Use Gemini 3.5 Flash or a Pro model when quality matters most, and use Flash-Lite or Gemini 2.5 Flash when the task is extractive or easy to route to a cheaper model.

3. Development and Evaluation

Gemini’s free tier is useful for prototyping and testing. For production planning, check active quotas in AI Studio because Google says rate limits are project-specific and not guaranteed.

4. Google Cloud Integration

Teams already using Google Cloud may prefer Gemini through AI Studio or Vertex AI for billing, identity, governance, and integration with Cloud Storage, BigQuery, and other Google services.

Cost Optimization Tips

Do not default to Gemini 3.5 Flash for cheap text tasks. Use Gemini 2.5 Flash-Lite, DeepSeek V4 Flash, or GPT-4.1 Nano when the quality bar allows it.
Use Gemini 3.5 Flash when multimodal and grounding features matter. The premium only makes sense if you use the capabilities.
Use Batch or Flex for offline work. Batch/Flex prices are materially lower for many Gemini models.
Use context caching for repeated large context. Google lists cached input pricing for Gemini 3.5 Flash and other current models.
Count tokens before sending. Use our AI Token Counter before sending long prompts or media-derived transcripts.
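To see how much context caching can matter, the effective input rate can be estimated as a weighted average of the cached and standard rates. This is a simplified sketch: the function name is hypothetical, the cache hit ratio is something you would measure yourself, and it ignores any cache storage charges that may apply.

```python
def blended_input_rate(standard_rate: float, cached_rate: float,
                       cache_hit_ratio: float) -> float:
    """Effective USD per 1M input tokens when a fraction of input tokens
    is billed at the cached rate.

    Assumption: no cache storage fees are included in this estimate.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return cache_hit_ratio * cached_rate + (1.0 - cache_hit_ratio) * standard_rate
```

With the Gemini 3.5 Flash rates listed in this guide ($1.50/M standard, $0.15/M cached) and 80% of input tokens served from cache, the effective input rate works out to about $0.42/M.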

Bottom Line

Gemini 3.5 Flash is the new stable premium Flash route in Google’s lineup: $1.50/M input, $0.15/M cached input, $9.00/M output, a 1,048,576-token input limit, and a 65,536-token output limit. It is compelling for multimodal agents, search-grounded workflows, coding loops, and Google ecosystem integrations.

For cost-sensitive text workloads, Gemini 3.5 Flash is not the default budget choice. Start with Gemini 2.5 Flash-Lite, Gemini 2.5 Flash, or DeepSeek V4 Flash, then route harder or more multimodal work to Gemini 3.5 Flash, Gemini 2.5 Pro, or Gemini 3.1 Pro Preview.

Official sources checked: Google Gemini API pricing, Gemini 3.5 Flash model docs, Google Gemini API rate limits, and DeepSeek API pricing.

Related tools and guides:

AI Model Pricing Calculator - Compare monthly costs across models
AI Token Counter - Count tokens before making API calls
Gemini 3.5 Flash vs DeepSeek V4 - Cost and agent-routing comparison
AI API Pricing Comparison 2026 - Full provider pricing table
DeepSeek API Pricing Guide 2026 - Budget text alternative