AI Model Pricing Calculator
Compare costs across 40+ AI models, including DeepSeek V4 cache-hit pricing. Estimate your monthly spend instantly.
Workload presets
Cache-hit share is applied only to models with published cached-input pricing.
100,000 tokens/day = 3,000,000 tokens/month
50,000 tokens/day = 1,500,000 tokens/month
0% of input tokens use cached pricing when available
| Model↑↓ | Provider | Input $/1M↑↓ | Output $/1M↑↓ | Context Window | Monthly Cost↑ |
|---|---|---|---|---|---|
Amazon Nova MicroCheapest | Amazon | $0.035 | $0.14 | 128,000 | $0.3150 |
Amazon Nova Lite | Amazon | $0.06 | $0.24 | 300,000 | $0.5400 |
Xiaomi MiMo-V2.5-Flash | Xiaomi MiMo | $0.1 cached $0.01 | $0.3 | 256,000 | $0.7500 |
GPT-5 Nano | OpenAI | $0.05 | $0.4 | 128,000 | $0.7500 |
DeepSeek V4 Flash | DeepSeek | $0.14 cached $0.0028 | $0.28 | 1,000,000 | $0.8400 |
Gemini 2.5 Flash-Lite | $0.1 | $0.4 | 1,000,000 | $0.9000 | |
Gemini 2.0 Flash | $0.1 | $0.4 | 1,000,000 | $0.9000 | |
Jamba 1.5 Mini | AI21 Labs | $0.2 | $0.4 | 256,000 | $1.20 |
GPT-4o mini | OpenAI | $0.15 | $0.6 | 128,000 | $1.35 |
Command R | Cohere | $0.15 | $0.6 | 128,000 | $1.35 |
Grok 4 Fast | xAI | $0.2 | $0.5 | 2,000,000 | $1.35 |
Mistral Small 3.1 | Mistral | $0.2 | $0.6 | 128,000 | $1.50 |
Qwen 2.5 Coder 32B | Alibaba | $0.2 | $0.6 | 128,000 | $1.50 |
Grok 3 Mini | xAI | $0.3 | $0.5 | 131,072 | $1.65 |
DeepSeek V3.2 (legacy) | DeepSeek | $0.27 | $1.10 | 128,000 | $2.46 |
DeepSeek V4 Pro | DeepSeek | $0.435 cached $0.003625 | $0.87 | 1,000,000 | $2.61 |
Gemini 3.1 Flash-Lite | $0.25 | $1.50 | 1,000,000 | $3.00 | |
Qwen 2.5 72B | Alibaba | $0.4 | $1.20 | 128,000 | $3.00 |
GPT-5 Mini | OpenAI | $0.25 | $2.00 | 400,000 | $3.75 |
Llama 3.3 70B | Meta (via providers) | $0.88 | $0.88 | 128,000 | $3.96 |
Xiaomi MiMo-V2.5 | Xiaomi MiMo | $0.4 cached $0.08 | $2.00 | 1,000,000 | $4.20 |
Gemini 2.5 Flash | $0.3 | $2.50 | 1,000,000 | $4.65 | |
Kimi K2.5 | Moonshot AI | $0.6 | $2.00 | 128,000 | $4.80 |
DeepSeek R1 (legacy) | DeepSeek | $0.55 | $2.19 | 128,000 | $4.94 |
Amazon Nova Pro | Amazon | $0.8 | $3.20 | 300,000 | $7.20 |
Mistral Medium 3 | Mistral | $1.00 | $3.00 | 128,000 | $7.50 |
Xiaomi MiMo-V2.5-Pro | Xiaomi MiMo | $1.00 cached $0.2 | $3.00 | 1,000,000 | $7.50 |
o3-mini | OpenAI | $1.10 | $4.40 | 200,000 | $9.90 |
Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | 200,000 | $10.50 |
Mistral Large 3 | Mistral | $2.00 | $6.00 | 128,000 | $15.00 |
Llama 3.1 405B | Meta (via providers) | $3.50 | $3.50 | 128,000 | $15.75 |
o3 | OpenAI | $2.00 | $8.00 | 200,000 | $18.00 |
Jamba 1.5 Large | AI21 Labs | $2.00 | $8.00 | 256,000 | $18.00 |
GPT-5 | OpenAI | $1.25 | $10.00 | 400,000 | $18.75 |
Gemini 2.5 Pro | $1.25 | $10.00 | 2,000,000 | $18.75 | |
GPT-4o | OpenAI | $2.50 | $10.00 | 128,000 | $22.50 |
Command R+ | Cohere | $2.50 | $10.00 | 128,000 | $22.50 |
Gemini 3.1 Pro | $2.00 | $12.00 | 2,000,000 | $24.00 | |
GPT-5.2-Codex | OpenAI | $1.75 cached $0.175 | $14.00 | 400,000 | $26.25 |
GPT-5.4 | OpenAI | $2.50 cached $0.25 | $15.00 | 1,050,000 | $30.00 |
Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 | 1,000,000 | $31.50 |
Grok 3 | xAI | $3.00 | $15.00 | 131,072 | $31.50 |
Grok 4 | xAI | $3.00 | $15.00 | 256,000 | $31.50 |
Claude Opus 4.6 | Anthropic | $5.00 | $25.00 | 1,000,000 | $52.50 |
GPT-5.5 | OpenAI | $5.00 cached $0.5 | $30.00 | 1,050,000 | $60.00 |
o1 | OpenAI | $15.00 | $60.00 | 200,000 | $135.00 |
o3-pro | OpenAI | $20.00 | $80.00 | 200,000 | $180.00 |
GPT-5.5 Pro | OpenAI | $30.00 | $180.00 | 1,050,000 | $360.00 |
How to Use This Tool
- Enter your estimated daily input tokens (the text you send to the AI) and daily output tokens (the AI's response length).
- Use the provider filter to narrow results to specific providers like OpenAI, Anthropic, Google, or others.
- Sort by monthly cost, input price, or output price to find the most cost-effective model for your use case.
- Click on a provider name to visit their official pricing page and sign up for API access.
- Compare multiple models side-by-side to find the best price-to-performance ratio for your specific workload.
Understanding AI API Pricing in 2026
AI API pricing is based on tokens processed, with separate rates for input tokens (your prompts) and output tokens (the model's responses). Prices are typically quoted per million tokens. For example, GPT-4o charges $2.50 per million input tokens and $10.00 per million output tokens, while Claude Sonnet 4 charges $3.00 and $15.00 respectively.
The AI pricing landscape has become increasingly competitive in 2026. DeepSeek V4 Flash is especially aggressive for cache-heavy agent workloads, while premium models like GPT-5 and Claude Opus 4 offer stronger reasoning at higher price points. The right choice depends on task complexity, cache-hit rate, and output length.
Several factors beyond per-token pricing affect your total cost: Batch API discounts, prompt caching or provider-side context caching, and context window usage. DeepSeek V4, Anthropic, OpenAI, and Xiaomi MiMo all expose cached-input economics in different ways, so a naive cache-miss estimate can overstate real agent bills.
For cost optimization, consider these strategies: Use smaller models for simple tasks (GPT-4o-mini or Claude Haiku for classification, summarization). Use prompt caching for system prompts that don't change. Batch non-urgent requests for 50% savings. Monitor token usage with observability tools like Helicone or Langfuse.
Last updated: April 2026
FAQ
How is the monthly cost calculated?
Monthly cost = (daily input tokens × input price per token × 30) + (daily output tokens × output price per token × 30). Prices are based on the latest published API pricing.
How often are prices updated?
We update pricing data regularly. The last update date is shown on the page. AI model prices change frequently, so always verify with the provider's official pricing page.
Which model is cheapest?
It depends on your use case. For cache-heavy agent traffic, DeepSeek V4 Flash can be extremely cheap. For simple tasks, GPT-4o-mini and Claude Haiku offer strong value. For complex reasoning, larger models like GPT-5, Claude Sonnet, or DeepSeek V4 Pro may be more cost-effective despite higher per-token costs.
What is Batch API pricing?
Several providers offer Batch API pricing at ~50% discount for requests that don't need real-time responses. OpenAI's Batch API, Anthropic's Message Batches, and Google's batch endpoints all offer significant savings for bulk processing like document analysis, data extraction, or content generation jobs.
How does prompt caching reduce costs?
Prompt caching and context caching store repeated prompt prefixes on the provider side. DeepSeek V4 Flash is the clearest example today: cache-hit input is $0.0028/M versus $0.14/M for cache misses. This is especially valuable for coding agents that repeatedly send repository rules, system prompts, and stable context.
Which model offers the best value in 2026?
It depends on your use case. DeepSeek V4 Flash is strong for cache-heavy agent workloads and high-volume text tasks. Smaller models are better for simple extraction, while GPT-5, Claude Sonnet, or DeepSeek V4 Pro can be worth the extra cost for harder reasoning and coding.
Related Blog Posts
Official DeepSeek V4 Flash and V4 Pro pricing, cache-hit math, free tier notes, and agent cost examples.
Gemini 3.1 Pro Pricing: $2.00/M Input — ARC-AGI-2 77.1%, 1M Context (2026)Google's latest flagship at $2.00/M input. 77.1% ARC-AGI-2, native video, 1M context window.
GPT-5.5 in Codex Pricing: API Costs, Model IDs, and DeepSeek RoutingOpenAI GPT-5.5, GPT-5.2-Codex, and DeepSeek V4 Flash cost comparison for agent workflows.
AI API Pricing 2026: DeepSeek V4, GPT-5.5, Gemini 3.1, Claude 4.6Side-by-side pricing comparison of all major AI API providers.
How to Reduce AI API Costs: 10 Proven StrategiesPractical tips to cut your AI API spend by 50-90%.