Gemini 3.5 Flash
Google · Updated May 2026. Gemini 3.5 Flash by Google: $1.50/M cache-miss input tokens, $0.15/M cached input tokens, $9.00/M output tokens. 1.0M-token context window, 66K max output tokens. Vision and video supported.
- Input Price: $1.50 per 1M tokens (cache miss)
- Cached Input: $0.15 per 1M tokens
- Output Price: $9.00 per 1M tokens
- Context Window: 1.0M tokens
Specifications
| Field | Value |
|---|---|
| Provider | Google |
| Model ID | gemini-3.5-flash |
| Input Price | $1.50 / 1M cache-miss tokens |
| Cached Input Price | $0.15 / 1M tokens |
| Output Price | $9.00 / 1M tokens |
| Context Window | 1.0M tokens |
| Max Output | 66K tokens |
| Capabilities | text, vision, video, audio, function_calling, structured_output |
| Release Date | 2026-05 |
| Pricing Source | Official Google pricing |
| Price Verified | 2026-05-24 · Gemini 3.5 Flash pricing and limits were refreshed from Google AI docs. Active quotas are project-specific; check AI Studio before production planning. |
| Notes | Stable Gemini 3.5 Flash model for agentic loops, coding cycles, long-horizon tasks, search grounding, Batch API, Flex, context caching, and multimodal inputs. |
Monthly Cost Estimates
Estimated monthly costs at different daily usage levels, assuming a 50% input / 50% output token split, a 30-day month, and an annual cost equal to 12 times the monthly cost. Input estimates use cache-miss pricing, so cache-heavy workloads can cost less.
| Daily Tokens | Monthly Cost | Annual Cost |
|---|---|---|
| 10K | $1.58 | $18.90 |
| 50K | $7.88 | $94.50 |
| 100K | $15.75 | $189.00 |
| 500K | $78.75 | $945.00 |
| 1.0M | $157.50 | $1,890.00 |
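The table values can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name `monthly_cost` and the constants are hypothetical, the 30-day month and 12-month year are assumptions inferred from the table's numbers, and the prices are the cache-miss rates listed above.

```python
# Prices in USD per 1M tokens, copied from the specification table.
INPUT_PRICE_PER_M = 1.50   # cache-miss input
OUTPUT_PRICE_PER_M = 9.00

# Assumptions inferred from the table values, not stated by the provider.
DAYS_PER_MONTH = 30
MONTHS_PER_YEAR = 12


def monthly_cost(daily_tokens: int, input_share: float = 0.5) -> float:
    """Estimate the monthly USD cost for a total daily token volume.

    input_share is the fraction of tokens billed as input; the rest
    is billed as output.
    """
    monthly_tokens = daily_tokens * DAYS_PER_MONTH
    input_tokens = monthly_tokens * input_share
    output_tokens = monthly_tokens * (1 - input_share)
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    for daily in (10_000, 50_000, 100_000, 500_000, 1_000_000):
        month = monthly_cost(daily)
        year = month * MONTHS_PER_YEAR
        print(f"{daily:>9,} tokens/day: ${month:,.2f}/month, ${year:,.2f}/year")
```

Changing `input_share` shows how sensitive the estimate is to the split: output tokens cost six times as much as cache-miss input tokens, so output-heavy workloads land well above the table's figures.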
About Gemini 3.5 Flash
Gemini 3.5 Flash is a large language model from Google. It has a 1.0M-token context window and can produce up to 66K tokens of output per request. The model supports six capabilities: text, vision, video, audio, function_calling, and structured_output.
At $1.50 per million cache-miss input tokens and $9.00 per million output tokens, Gemini 3.5 Flash is positioned as a mid-range option in the Google lineup. Repeated prefix input can be charged at the cached rate of $0.15 per million tokens. Use our Token Counter to estimate how many tokens your prompts use, and our Pricing Calculator to compare costs across all models.
Gemini 3.5 Flash Key Details
- Pricing: $1.50/M cache-miss input tokens, $0.15/M cached input tokens, $9.00/M output tokens
- Context window: 1.0M tokens (one of the largest available)
- Max output: 66K tokens per response
- Capabilities: text, vision, video, audio, function_calling, structured_output
- Highlights: Stable Gemini 3.5 Flash model for agentic loops, coding cycles, long-horizon tasks, search grounding, Batch API, Flex, context caching, and multimodal inputs.
- Released: 2026-05
FAQ
How much does Gemini 3.5 Flash cost?
Gemini 3.5 Flash costs $1.50 per million cache-miss input tokens and $9.00 per million output tokens. Cached input costs $0.15 per million tokens. For a workload of 100K input tokens/day and 50K output tokens/day over a 30-day month, expect approximately $18.00/month before cache-hit savings.
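To see where the $18.00 figure comes from and how cache hits change it, the workload can be priced with input and output tracked separately. This is a rough sketch: `workload_monthly_cost` and `cache_hit_ratio` are hypothetical names, a 30-day month is assumed, and any cache storage or other fees are ignored because this page does not list them.

```python
# Prices in USD per 1M tokens, as listed on this page.
INPUT_PRICE_PER_M = 1.50          # cache-miss input
CACHED_INPUT_PRICE_PER_M = 0.15   # cached input
OUTPUT_PRICE_PER_M = 9.00

DAYS_PER_MONTH = 30  # assumption


def workload_monthly_cost(
    input_per_day: int,
    output_per_day: int,
    cache_hit_ratio: float = 0.0,
) -> float:
    """Estimate monthly USD cost for separate input and output volumes.

    cache_hit_ratio is the fraction of input tokens billed at the
    cached rate (0.0 means every input token is a cache miss).
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")

    monthly_input = input_per_day * DAYS_PER_MONTH
    monthly_output = output_per_day * DAYS_PER_MONTH

    cached_input = monthly_input * cache_hit_ratio
    uncached_input = monthly_input - cached_input

    total = (
        uncached_input * INPUT_PRICE_PER_M
        + cached_input * CACHED_INPUT_PRICE_PER_M
        + monthly_output * OUTPUT_PRICE_PER_M
    )
    return total / 1_000_000


if __name__ == "__main__":
    for ratio in (0.0, 0.5, 0.9):
        cost = workload_monthly_cost(100_000, 50_000, cache_hit_ratio=ratio)
        print(f"cache hit ratio {ratio:.0%}: ${cost:,.2f}/month")
```

With no cache hits the input side contributes $4.50 and the output side $13.50, so in this workload caching can only reduce the smaller part of the bill.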
What is Gemini 3.5 Flash's context window?
Gemini 3.5 Flash supports a context window of 1.0M tokens. This means your combined input prompt and output response can be up to 1.0M tokens. The maximum output per response is 66K tokens.
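The two limits interact: a long prompt leaves less room for the response, and the response is capped regardless of how short the prompt is. A minimal sketch of that check follows. The helper names are hypothetical, and the constants use the rounded "1.0M" and "66K" figures from this page, so the exact limits enforced by the API may differ slightly.

```python
# Rounded limits as listed on this page; the exact API values may differ.
CONTEXT_WINDOW_TOKENS = 1_000_000
MAX_OUTPUT_TOKENS = 66_000


def max_output_budget(input_tokens: int) -> int:
    """Return the largest output size that still fits after the prompt."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    remaining = CONTEXT_WINDOW_TOKENS - input_tokens
    # The output is limited by both the per-response cap and the space left.
    return max(0, min(MAX_OUTPUT_TOKENS, remaining))


def fits(input_tokens: int, requested_output_tokens: int) -> bool:
    """Check whether a prompt plus a requested output stays within limits."""
    return 0 <= requested_output_tokens <= max_output_budget(input_tokens)


if __name__ == "__main__":
    for prompt_size in (100_000, 980_000, 1_200_000):
        print(f"{prompt_size:>9,} input tokens: up to {max_output_budget(prompt_size):,} output tokens")
```

Token counts for real prompts should come from the provider's tokenizer or a token counter, since character-based estimates can be off by a wide margin.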
Is Gemini 3.5 Flash good for my use case?
Gemini 3.5 Flash supports text, vision, video, audio, function_calling, and structured_output. As a mid-range model, it balances capability and cost for most production use cases. Use our Pricing Calculator to compare it with alternatives.