Gemini 2.5 Flash-Lite

Updated June 2026. Gemini 2.5 Flash-Lite by Google costs $0.1 per 1M cache-miss input tokens, $0.01 per 1M cached input tokens, and $0.4 per 1M output tokens. It has a 1.0M-token context window and a 66K-token maximum output, and it supports vision and audio.

Input Price: $0.1 per 1M cache-miss tokens
Cached Input: $0.01 per 1M tokens
Output Price: $0.4 per 1M tokens
Context Window: 1.0M tokens

Specifications

Provider	Google
Model ID	gemini-2-5-flash-lite
Input Price	$0.1 / 1M cache-miss tokens
Cached Input Price	$0.01 / 1M tokens
Output Price	$0.4 / 1M tokens
Context Window	1.0M tokens
Max Output	66K tokens
Capabilities	text, vision, audio, function_calling, structured_output
Release Date	2025-09
Pricing Source	Official Google pricing
Price Verified	2026-06-14 · Gemini 3.5 Flash pricing and limits were refreshed from Google AI docs. Active quotas are project-specific; check AI Studio before production planning.

Monthly Cost Estimates

Estimated monthly costs based on different daily usage levels (assuming 50% input / 50% output split). Input estimates use cache-miss pricing, so cache-heavy workloads can be lower.

Daily Tokens	Monthly Cost	Annual Cost
10K	$0.07	$0.90
50K	$0.38	$4.50
100K	$0.75	$9.00
500K	$3.75	$45.00
1.0M	$7.50	$90.00
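
The figures in the table can be reproduced with a short calculation. The sketch below is illustrative only: the function name `monthly_cost` is hypothetical, and the 30-day month and the "annual = 12 × monthly" rule are assumptions inferred from the table's numbers, not stated on the page. Prices are the cache-miss input and output rates listed above.

```python
# Prices in USD per 1M tokens, taken from the specifications table.
INPUT_PRICE_PER_M = 0.10   # cache-miss input
OUTPUT_PRICE_PER_M = 0.40  # output


def monthly_cost(daily_tokens: int, input_share: float = 0.5, days: int = 30) -> float:
    """Estimate monthly cost in USD for a given total daily token volume.

    input_share is the fraction of daily tokens that are input (0.5 = the
    50/50 split the table assumes). days=30 is an assumption.
    """
    input_tokens = daily_tokens * input_share * days
    output_tokens = daily_tokens * (1 - input_share) * days
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


for daily in (10_000, 50_000, 100_000, 500_000, 1_000_000):
    month = monthly_cost(daily)
    # Annual cost is assumed to be 12 times the monthly estimate.
    print(f"{daily:>9,} tokens/day  ${month:.3f}/month  ${month * 12:.2f}/year")
```

With a 50/50 split the blended rate works out to $0.25 per 1M tokens, which is why 1.0M tokens per day comes to about $7.50 per month.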

About Gemini 2.5 Flash-Lite

Gemini 2.5 Flash-Lite is a large language model by Google. It has a 1.0M-token context window and can produce up to 66K tokens of output per request. The model supports five capabilities: text, vision, audio, function calling, and structured output.

At $0.1 per million cache-miss input tokens and $0.4 per million output tokens, Gemini 2.5 Flash-Lite is positioned as a cost-effective option in the Google lineup. Repeated prefix input can be charged at $0.01 per million cached tokens. Use our Token Counter to estimate how many tokens your prompts use, and our Pricing Calculator to compare costs across all models.

Gemini 2.5 Flash-Lite Key Details

Pricing: $0.1/M cache-miss input tokens, $0.01/M cached input tokens, $0.4/M output tokens
Context window: 1.0M tokens, shared between the input prompt and the output response
Max output: 66K tokens per response
Capabilities: text, vision, audio, function_calling, structured_output
Released: 2025-09

Other Google Models

Gemini 3.5 Flash: $1.5 / $9 per 1M · cached input $0.15
Gemini 3.1 Pro Preview: $2 / $12 per 1M · cached input $0.2
Gemini 3.1 Flash-Lite: $0.25 / $1.5 per 1M · cached input $0.025

Similar Price Range

Mistral Small 4 (Mistral): $0.1 / $0.3 per 1M
DeepSeek V4 Flash (DeepSeek): $0.14 / $0.28 per 1M · cached input $0.0028
Xiaomi MiMo-V2.5 (Xiaomi MiMo): $0.14 / $0.28 per 1M · cached input $0.0028

Related Tools

AI Token Counter: Count tokens for Gemini 2.5 Flash-Lite
Pricing Calculator: Compare all model prices
Throughput Planner: Plan RPM, TPM, and monthly cost for Gemini 2.5 Flash-Lite

FAQ

How much does Gemini 2.5 Flash-Lite cost?

Gemini 2.5 Flash-Lite costs $0.1 per million cache-miss input tokens and $0.4 per million output tokens. Cached input costs $0.01 per million tokens. For a typical workload of 100K input tokens/day and 50K output tokens/day, expect approximately $0.90/month before cache-hit savings.
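
To show where the $0.90 figure comes from and how cache hits would lower it, here is a minimal sketch. The function name `workload_monthly_cost` and the `cache_hit_fraction` parameter are hypothetical, and the 30-day month is an assumption. Actual cache billing depends on how your requests reuse prefixes, so treat the cached case as a rough estimate.

```python
# USD per 1M tokens, as listed on this page.
CACHE_MISS_INPUT = 0.10
CACHED_INPUT = 0.01
OUTPUT = 0.40


def workload_monthly_cost(
    input_per_day: int,
    output_per_day: int,
    cache_hit_fraction: float = 0.0,
    days: int = 30,  # assumed month length
) -> float:
    """Monthly cost in USD with input and output volumes given separately.

    cache_hit_fraction is the share of input tokens billed at the cached
    rate (0.0 reproduces the "before cache-hit savings" figure).
    """
    cached_tokens = input_per_day * cache_hit_fraction
    uncached_tokens = input_per_day - cached_tokens
    daily_usd = (
        uncached_tokens * CACHE_MISS_INPUT
        + cached_tokens * CACHED_INPUT
        + output_per_day * OUTPUT
    ) / 1_000_000
    return daily_usd * days


# 100K input + 50K output per day, no cache hits, then half the input cached.
print(f"${workload_monthly_cost(100_000, 50_000):.2f}")
print(f"${workload_monthly_cost(100_000, 50_000, cache_hit_fraction=0.5):.3f}")
```

In this workload the output tokens account for $0.60 of the $0.90, so caching input only affects the remaining $0.30.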

What is Gemini 2.5 Flash-Lite's context window?

Gemini 2.5 Flash-Lite supports a context window of 1.0M tokens. This means your combined input prompt and output response can be up to 1.0M tokens. The maximum output per response is 66K tokens.
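
The interaction between the two limits can be sketched as a simple budget check. This is an illustration, not an official rule: the helper `max_output_budget` is hypothetical, and the constants use the rounded figures shown on this page (1.0M and 66K), so the exact limits enforced by the API may differ.

```python
# Rounded limits as listed on this page; exact API limits may differ.
CONTEXT_WINDOW = 1_000_000
MAX_OUTPUT = 66_000


def max_output_budget(prompt_tokens: int) -> int:
    """Return how many output tokens remain available for a given prompt size.

    The response is capped both by the per-response maximum and by whatever
    space is left in the context window after the prompt.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    if prompt_tokens >= CONTEXT_WINDOW:
        raise ValueError("prompt does not fit in the context window")
    return min(MAX_OUTPUT, CONTEXT_WINDOW - prompt_tokens)


print(max_output_budget(200_000))  # limited by the per-response maximum
print(max_output_budget(980_000))  # limited by the remaining context
```

For most prompts the per-response maximum is the binding limit. The context window only constrains output once the prompt comes within 66K tokens of 1.0M.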

Is Gemini 2.5 Flash-Lite good for my use case?

Gemini 2.5 Flash-Lite supports text, vision, audio, function_calling, structured_output. As a budget-friendly model, it works well for high-volume tasks like classification, summarization, and simple generation. Use our Pricing Calculator to compare with alternatives.