DevTk.AI
Gemini API PricingGemini 3.5 FlashGoogle AIGemini 2.5 FlashLLM Pricing

Google Gemini API Pricing 2026: Gemini 3.5 Flash, 3.1 Pro & 2.5 Models

Updated May 2026. Current Gemini API pricing for Gemini 3.5 Flash, 3.1 Pro, 3.1 Flash-Lite, 2.5 Pro, 2.5 Flash, caching, Batch/Flex, and DeepSeek V4 comparison.

DevTk.AI 2026-02-24 Updated 2026-05-24 9 min read

Google’s Gemini API is a strong option when you need long context, multimodal input, Google AI Studio workflow support, search grounding, Batch/Flex discounts, and free development access. This guide is refreshed against src/data/models.ts and Google’s official Gemini API pricing and model docs.

The newest pricing change to account for is Gemini 3.5 Flash. Google lists gemini-3.5-flash as a stable model with 1,048,576 input tokens, 65,536 output tokens, multimodal inputs, standard pricing at $1.50/M input and $9.00/M output, and cached input at $0.15/M.

The older stale claim to avoid remains the same: Gemini 2.5 Flash standard interactive pricing is not $0.15/M input and $0.60/M output. Google currently lists standard Gemini 2.5 Flash text/image/video pricing at $0.30/M input and $2.50/M output. The $0.15/M input tier applies to Batch/Flex, not standard interactive usage.

Gemini 2026 Model Lineup & Pricing

ModelStandard InputCached InputStandard OutputContextMax OutputBest For
Gemini 3.5 Flash$1.50/M$0.15/M$9.00/M1,048,57665,536Fast frontier-level agents, coding loops, multimodal workloads
Gemini 3.1 Pro Preview$2.00/M <=200K, $4.00/M >200K$0.20/M <=200K, $0.40/M >200K$12.00/M <=200K, $18.00/M >200K2M16KHighest Gemini Pro tier in canonical data
Gemini 3.1 Flash-Lite Preview$0.25/M$0.025/M$1.50/M1M64KFast, high-volume agentic and data tasks
Gemini 2.5 Pro$1.25/M <=200K, $2.50/M >200K$0.125/M <=200K, $0.25/M >200K$10.00/M <=200K, $15.00/M >200K2M64KComplex reasoning, coding, long documents
Gemini 2.5 Flash$0.30/M text/image/video$0.03/M text/image/video$2.50/M1M64KBalanced cost and capability
Gemini 2.5 Flash-Lite$0.10/M text/image/video$0.01/M text/image/video$0.40/M1M64KLowest-cost current Gemini text route
Gemini 2.0 Flash$0.10/Mn/a$0.40/M1M8KDeprecated legacy apps

All prices are USD per 1 million tokens for Google Gemini Developer API standard paid usage. Google lists separate Batch, Flex, and Priority pricing for some models, and project-specific rate limits can differ from public examples.

What Is New in Gemini 3.5 Flash

Gemini 3.5 Flash is not just a price-table update. Google’s model page positions it for sustained frontier-level intelligence at higher speed and lower cost, especially for sub-agent deployment, multi-step workflows, long-horizon tasks, and coding cycles.

Operationally, the important details are:

FieldGemini 3.5 Flash
API model IDgemini-3.5-flash
StatusStable
Input typesText, image, video, audio, PDF
Output typeText
Input token limit1,048,576
Output token limit65,536
Supported capabilitiesBatch API, caching, code execution, file search, Flex, function calling, Google Maps grounding, priority inference, search grounding, structured outputs, thinking, URL context
Not supportedAudio generation, computer use, image generation, Live API

For DevTk.AI routing guidance, this makes Gemini 3.5 Flash a good premium Flash route, not the cheapest Gemini route.

Batch, Flex, Priority, and Context Caching

Google’s paid tier includes access to Batch API, Flex inference, and context caching. Batch API is listed as a 50% cost reduction on the pricing overview, and the detailed tables show lower Batch/Flex token rates for many models.

ModelStandard InputBatch/Flex InputStandard OutputBatch/Flex Output
Gemini 3.5 Flash$1.50/M$0.75/M$9.00/M$4.50/M
Gemini 3.1 Pro Preview <=200K$2.00/M$1.00/M$12.00/M$6.00/M
Gemini 3.1 Flash-Lite Preview$0.25/M$0.125/M$1.50/M$0.75/M
Gemini 2.5 Pro <=200K$1.25/M$0.625/M$10.00/M$5.00/M
Gemini 2.5 Flash$0.30/M$0.15/M$2.50/M$1.25/M
Gemini 2.5 Flash-Lite$0.10/M$0.05/M$0.40/M$0.20/M

This is the source of the older $0.15 Gemini 2.5 Flash number: it is valid for Batch/Flex input, not standard interactive input.

Gemini vs GPT-5 vs Claude vs DeepSeek

ModelProviderInputCached InputOutputNotes
Gemini 3.5 FlashGoogle$1.50$0.15$9.00Stable 1M-context multimodal Flash route
Gemini 3.1 Pro PreviewGoogle$2.00$0.20$12.00$4/$18 above 200K prompt tokens
Gemini 2.5 ProGoogle$1.25$0.125$10.00$2.50/$15 above 200K prompt tokens
GPT-5OpenAI$1.25varies$10.00Same base price as Gemini 2.5 Pro
Claude Sonnet 4.6Anthropic$3.00varies$15.00Higher price, 1M canonical context
Gemini 2.5 FlashGoogle$0.30$0.03$2.50Standard interactive price
Gemini 2.5 Flash-LiteGoogle$0.10$0.01$0.40Lowest-cost current Gemini route
DeepSeek V4 FlashDeepSeek$0.14$0.0028$0.28Much cheaper text alternative
DeepSeek V4 ProDeepSeek$0.435$0.003625$0.87Official 1/4-of-original price after permanent cut

Key takeaways:

  • Gemini 3.5 Flash is a premium Flash route, not a budget route. It costs 5x Gemini 2.5 Flash on input and 3.6x on output.
  • DeepSeek V4 Flash is dramatically cheaper for text agents at $0.14/$0.28, especially when cache hits are high.
  • Gemini 3.5 Flash is stronger for multimodal and Google ecosystem workloads because it supports text, image, video, audio, PDF input, search grounding, Maps grounding, URL context, and Google-native tooling.
  • Gemini 2.5 Flash-Lite remains the lowest-cost Gemini route at $0.10/$0.40 for standard interactive text/image/video input.

Monthly Cost Estimates

These examples assume standard interactive pricing and no cache hits.

Scenario 1: Solo Developer

100K input + 50K output tokens per day:

ModelMonthly Cost
Gemini 2.5 Flash-Lite$0.90
DeepSeek V4 Flash$0.84
Gemini 3.1 Flash-Lite Preview$3.00
Gemini 2.5 Flash$4.65
Gemini 3.5 Flash$18.00
Gemini 2.5 Pro$18.75
Gemini 3.1 Pro Preview$24.00
Claude Sonnet 4.6$31.50

Scenario 2: Startup

1M input + 500K output tokens per day:

ModelMonthly Cost
DeepSeek V4 Flash$8.40
Gemini 2.5 Flash-Lite$9.00
Gemini 3.1 Flash-Lite Preview$30.00
Gemini 2.5 Flash$46.50
Gemini 3.5 Flash$180.00
Gemini 2.5 Pro$187.50
Gemini 3.1 Pro Preview$240.00
Claude Sonnet 4.6$315.00

Scenario 3: Enterprise / Production

10M input + 5M output tokens per day:

ModelMonthly Cost
DeepSeek V4 Flash$84
Gemini 2.5 Flash-Lite$90
Gemini 3.1 Flash-Lite Preview$300
Gemini 2.5 Flash$465
Gemini 3.5 Flash$1,800
Gemini 2.5 Pro$1,875
Gemini 3.1 Pro Preview$2,400
Claude Sonnet 4.6$3,150

Want exact numbers for your usage? Try our AI Model Pricing Calculator.

When to Choose Gemini

1. Multimodal Agent Workloads

Choose Gemini 3.5 Flash when a single model needs to ingest text, images, video, audio, and PDFs while still supporting function calling, structured outputs, URL context, search grounding, and Batch/Flex options.

2. Long Document Processing

Gemini remains strong when you need to process long documents, large codebases, long videos, or multi-file context in a single request. Use Gemini 3.5 Flash or Pro when quality matters most, and use Flash-Lite or 2.5 Flash when the task is extractive or routing-friendly.

3. Development and Evaluation

Gemini’s free tier is useful for prototyping and testing. For production planning, check active quotas in AI Studio because Google says rate limits are project-specific and not guaranteed.

4. Google Cloud Integration

Teams already using Google Cloud may prefer Gemini through AI Studio or Vertex AI for billing, identity, governance, and integration with Cloud Storage, BigQuery, and other Google services.

Cost Optimization Tips

  1. Do not default to Gemini 3.5 Flash for cheap text tasks. Use Gemini 2.5 Flash-Lite, DeepSeek V4 Flash, or GPT-4.1 Nano when the quality bar allows it.
  2. Use Gemini 3.5 Flash when multimodal and grounding features matter. The premium only makes sense if you use the capabilities.
  3. Use Batch or Flex for offline work. Batch/Flex prices are materially lower for many Gemini models.
  4. Use context caching for repeated large context. Google lists cached input pricing for Gemini 3.5 Flash and other current models.
  5. Count tokens before sending. Use our AI Token Counter before sending long prompts or media-derived transcripts.

Bottom Line

Gemini 3.5 Flash is the new stable premium Flash route in Google’s lineup: $1.50/M input, $0.15/M cached input, $9.00/M output, 1,048,576 input tokens, and 65,536 output tokens. It is compelling for multimodal agents, search-grounded workflows, coding loops, and Google ecosystem integrations.

For cost-sensitive text workloads, Gemini 3.5 Flash is not the default budget choice. Start with Gemini 2.5 Flash-Lite, Gemini 2.5 Flash, or DeepSeek V4 Flash, then route harder or more multimodal work to Gemini 3.5 Flash, Gemini 2.5 Pro, or Gemini 3.1 Pro Preview.

Official sources checked: Google Gemini API pricing, Gemini 3.5 Flash model docs, Google Gemini API rate limits, and DeepSeek API pricing.

Related tools and guides:

Related Posts