AI Model Pricing Calculator
Compare costs across 30+ AI models. Estimate your monthly spend instantly.
Daily input tokens: 100,000 tokens/day = 3,000,000 tokens/month
Daily output tokens: 50,000 tokens/day = 1,500,000 tokens/month
| Model | Provider | Input $/1M | Output $/1M | Context Window (tokens) | Monthly Cost |
|---|---|---|---|---|---|
| Amazon Nova Micro (Cheapest) | Amazon | $0.04 | $0.14 | 128,000 | $0.3150 |
| Amazon Nova Lite | Amazon | $0.06 | $0.24 | 300,000 | $0.5400 |
| GPT-5 Nano | OpenAI | $0.05 | $0.40 | 128,000 | $0.7500 |
| Gemini 2.5 Flash-Lite | Google | $0.10 | $0.40 | 1,000,000 | $0.9000 |
| Gemini 2.0 Flash | Google | $0.10 | $0.40 | 1,000,000 | $0.9000 |
| Jamba 1.5 Mini | AI21 Labs | $0.20 | $0.40 | 256,000 | $1.20 |
| GPT-4o mini | OpenAI | $0.15 | $0.60 | 128,000 | $1.35 |
| Command R | Cohere | $0.15 | $0.60 | 128,000 | $1.35 |
| Grok 4 Fast | xAI | $0.20 | $0.50 | 2,000,000 | $1.35 |
| Mistral Small 3.1 | Mistral | $0.20 | $0.60 | 128,000 | $1.50 |
| Qwen 2.5 Coder 32B | Alibaba | $0.20 | $0.60 | 128,000 | $1.50 |
| Grok 3 Mini | xAI | $0.30 | $0.50 | 131,072 | $1.65 |
| DeepSeek V4 | DeepSeek | $0.30 | $0.50 | 1,000,000 | $1.65 |
| DeepSeek V3.2 | DeepSeek | $0.27 | $1.10 | 128,000 | $2.46 |
| Gemini 3.1 Flash-Lite | Google | $0.25 | $1.50 | 1,000,000 | $3.00 |
| Qwen 2.5 72B | Alibaba | $0.40 | $1.20 | 128,000 | $3.00 |
| GPT-5 Mini | OpenAI | $0.25 | $2.00 | 400,000 | $3.75 |
| Llama 3.3 70B | Meta (via providers) | $0.88 | $0.88 | 128,000 | $3.96 |
| Gemini 2.5 Flash | Google | $0.30 | $2.50 | 1,000,000 | $4.65 |
| Kimi K2.5 | Moonshot AI | $0.60 | $2.00 | 128,000 | $4.80 |
| DeepSeek R1 | DeepSeek | $0.55 | $2.19 | 128,000 | $4.94 |
| Amazon Nova Pro | Amazon | $0.80 | $3.20 | 300,000 | $7.20 |
| Mistral Medium 3 | Mistral | $1.00 | $3.00 | 128,000 | $7.50 |
| o3-mini | OpenAI | $1.10 | $4.40 | 200,000 | $9.90 |
| Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | 200,000 | $10.50 |
| Mistral Large 3 | Mistral | $2.00 | $6.00 | 128,000 | $15.00 |
| Llama 3.1 405B | Meta (via providers) | $3.50 | $3.50 | 128,000 | $15.75 |
| o3 | OpenAI | $2.00 | $8.00 | 200,000 | $18.00 |
| Jamba 1.5 Large | AI21 Labs | $2.00 | $8.00 | 256,000 | $18.00 |
| GPT-5 | OpenAI | $1.25 | $10.00 | 400,000 | $18.75 |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | 2,000,000 | $18.75 |
| GPT-5.3-Codex | OpenAI | $2.00 | $10.00 | 200,000 | $21.00 |
| GPT-4o | OpenAI | $2.50 | $10.00 | 128,000 | $22.50 |
| Command R+ | Cohere | $2.50 | $10.00 | 128,000 | $22.50 |
| Gemini 3.1 Pro | Google | $2.00 | $12.00 | 2,000,000 | $24.00 |
| GPT-5.4 | OpenAI | $2.50 | $15.00 | 1,100,000 | $30.00 |
| Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 | 1,000,000 | $31.50 |
| Grok 3 | xAI | $3.00 | $15.00 | 131,072 | $31.50 |
| Grok 4 | xAI | $3.00 | $15.00 | 256,000 | $31.50 |
| Claude Opus 4.6 | Anthropic | $5.00 | $25.00 | 1,000,000 | $52.50 |
| o1 | OpenAI | $15.00 | $60.00 | 200,000 | $135.00 |
| o3-pro | OpenAI | $20.00 | $80.00 | 200,000 | $180.00 |
How to Use This Tool
- Enter your estimated daily input tokens (the text you send to the AI) and daily output tokens (the AI's response length).
- Use the provider filter to narrow results to specific providers like OpenAI, Anthropic, Google, or others.
- Sort by monthly cost, input price, or output price to find the most cost-effective model for your use case.
- Click on a provider name to visit their official pricing page and sign up for API access.
- Compare multiple models side-by-side to find the best price-to-performance ratio for your specific workload.
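The filter-and-sort steps above can be mimicked offline with a few lines of Python. This is only a sketch, not the calculator's own code: the `ROWS` list holds four rows copied from the table, and the `filter_and_sort` helper and its parameter names are hypothetical.

```python
# A few rows copied from the pricing table above (not a complete list).
ROWS = [
    {"model": "GPT-4o", "provider": "OpenAI", "input": 2.50, "output": 10.00, "monthly": 22.50},
    {"model": "GPT-4o mini", "provider": "OpenAI", "input": 0.15, "output": 0.60, "monthly": 1.35},
    {"model": "Claude Haiku 4.5", "provider": "Anthropic", "input": 1.00, "output": 5.00, "monthly": 10.50},
    {"model": "Mistral Small 3.1", "provider": "Mistral", "input": 0.20, "output": 0.60, "monthly": 1.50},
]


def filter_and_sort(rows, providers=None, sort_key="monthly"):
    """Keep rows whose provider is in `providers` (all rows if None),
    then sort ascending by `sort_key` ("monthly", "input", or "output")."""
    kept = [r for r in rows if providers is None or r["provider"] in providers]
    return sorted(kept, key=lambda r: r[sort_key])


# Cheapest-first view restricted to two providers.
for row in filter_and_sort(ROWS, providers={"OpenAI", "Anthropic"}):
    print(f'{row["model"]:<18} ${row["monthly"]:.2f}/month')
```

Passing `sort_key="input"` or `sort_key="output"` reorders the same rows by per-token price instead of monthly cost.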
Understanding AI API Pricing in 2026
AI API pricing is based on tokens processed, with separate rates for input tokens (your prompts) and output tokens (the model's responses). Prices are typically quoted per million tokens. For example, GPT-4o charges $2.50 per million input tokens and $10.00 per million output tokens, while Claude Sonnet 4 charges $3.00 and $15.00 respectively.
The AI pricing landscape has become increasingly competitive in 2026. Open-source models like DeepSeek V3 and Llama 3.3 are available at significantly lower costs through providers like Together.ai and Groq. Meanwhile, premium models like GPT-5 and Claude Opus 4 offer superior reasoning at higher price points. The right choice depends on your task complexity and budget.
Several factors beyond per-token pricing affect your total cost: Batch API discounts (typically 50% off for non-real-time processing), prompt caching (reduced costs for repeated prompt prefixes), and context window usage (longer conversations cost more, because the conversation history is typically resent as input tokens with each request). Some providers also charge differently for cached inputs versus fresh inputs.
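To illustrate why longer conversations cost more, the sketch below assumes a chat application that resends the full conversation history as input on every turn. That is common for stateless chat APIs but not universal, so treat it as an assumption. The function name `conversation_input_tokens` and the per-turn token counts are hypothetical.

```python
def conversation_input_tokens(turns, user_tokens, reply_tokens, system_tokens=0):
    """Total input tokens billed over one conversation, assuming every
    request resends the system prompt plus all earlier messages."""
    total = 0
    history = system_tokens
    for _ in range(turns):
        history += user_tokens   # the new user message joins the history
        total += history         # the whole history is sent as input
        history += reply_tokens  # the model's reply joins the history
    return total


# Assumed sizes: 200-token user messages, 300-token replies.
for n in (1, 5, 10, 20):
    print(n, conversation_input_tokens(n, user_tokens=200, reply_tokens=300))
```

Under this assumption, billed input tokens grow roughly with the square of the number of turns, which is why trimming or summarizing history can matter more than the per-token price for long chats.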
For cost optimization, consider these strategies:
- Use smaller models for simple tasks (GPT-4o-mini or Claude Haiku for classification, summarization).
- Use prompt caching for system prompts that don't change.
- Batch non-urgent requests for 50% savings.
- Monitor token usage with observability tools like Helicone or Langfuse.
Last updated: February 2026
FAQ
How is the monthly cost calculated?
Monthly cost = (daily input tokens × input price per token × 30) + (daily output tokens × output price per token × 30). Prices are based on the latest published API pricing.
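As a worked check of the formula, the sketch below converts per-million prices to per-token prices and reproduces the GPT-4o row of the table. The function name `monthly_cost` is hypothetical, and the daily volumes (100,000 and 50,000 tokens) are the figures shown at the top of the page, taken here as input and output respectively.

```python
def monthly_cost(daily_input_tokens, daily_output_tokens,
                 input_price_per_m, output_price_per_m, days=30):
    """Monthly cost in USD, given prices quoted per 1M tokens."""
    # Prices are quoted per million tokens, so divide to get a per-token price.
    input_price_per_token = input_price_per_m / 1_000_000
    output_price_per_token = output_price_per_m / 1_000_000
    return (daily_input_tokens * input_price_per_token * days
            + daily_output_tokens * output_price_per_token * days)


# GPT-4o at $2.50 input and $10.00 output per 1M tokens.
cost = monthly_cost(100_000, 50_000, 2.50, 10.00)
print(f"${cost:.2f}")  # $22.50, matching the GPT-4o row
```

Here the input side contributes 3,000,000 × $2.50 / 1,000,000 = $7.50 and the output side 1,500,000 × $10.00 / 1,000,000 = $15.00.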
How often are prices updated?
We update pricing data regularly. The last update date is shown on the page. AI model prices change frequently, so always verify with the provider's official pricing page.
Which model is cheapest?
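By monthly cost in the table above, Amazon Nova Micro is the cheapest model listed. Whether it is the right choice depends on your use case. For simple tasks, GPT-4o-mini and Claude 3.5 Haiku offer excellent price/performance. For complex reasoning, larger models like GPT-4o or Claude 3.5 Sonnet may be more cost-effective despite higher per-token costs, for example if they complete a task in fewer attempts.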
What is Batch API pricing?
Several providers offer Batch API pricing at ~50% discount for requests that don't need real-time responses. OpenAI's Batch API, Anthropic's Message Batches, and Google's batch endpoints all offer significant savings for bulk processing like document analysis, data extraction, or content generation jobs.
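A rough way to estimate the effect of batching is to apply the discount only to the share of traffic that can tolerate delayed responses. In the sketch below, the function name `cost_with_batching`, the `batch_share` parameter, and the flat 50% `batch_discount` are illustrative assumptions; actual discounts and eligibility rules vary by provider.

```python
def cost_with_batching(monthly_cost, batch_share, batch_discount=0.50):
    """Estimated monthly cost when `batch_share` (0..1) of the workload
    is sent through a discounted batch endpoint."""
    if not 0.0 <= batch_share <= 1.0:
        raise ValueError("batch_share must be between 0 and 1")
    realtime = monthly_cost * (1.0 - batch_share)
    batched = monthly_cost * batch_share * (1.0 - batch_discount)
    return realtime + batched


# Assumed scenario: a $22.50/month workload with 60% of requests batchable.
print(f"${cost_with_batching(22.50, batch_share=0.6):.2f}")  # $15.75
```

With nothing batched the cost is unchanged, and with everything batched it drops by the full discount, so the estimate always falls between those two bounds.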
How does prompt caching reduce costs?
Prompt caching (available from Anthropic and OpenAI) stores frequently used prompt prefixes on the provider's servers. When you send a request with a cached prefix, you pay a reduced rate (typically 75-90% less) for those cached tokens. This is especially valuable for applications with long system prompts or few-shot examples that repeat across requests.
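The saving can be estimated by pricing the cached prefix at the reduced rate and the rest of the prompt at the full rate. The sketch below is an approximation under stated assumptions: the function name `input_cost` is hypothetical, the 90% `cached_discount` is the upper end of the range quoted above, every request is assumed to hit the cache, and any surcharge a provider may apply when first writing to the cache is ignored.

```python
def input_cost(requests, prefix_tokens, fresh_tokens,
               input_price_per_m, cached_discount=0.0):
    """Input-token cost in USD for `requests` calls that share a common
    prefix. `cached_discount` is the fraction taken off cached tokens
    (0.0 means no caching)."""
    per_token = input_price_per_m / 1_000_000
    cached_part = requests * prefix_tokens * per_token * (1.0 - cached_discount)
    fresh_part = requests * fresh_tokens * per_token
    return cached_part + fresh_part


# Assumed workload: 10,000 requests, a 2,000-token system prompt,
# 500 fresh tokens per request, at $3.00 per 1M input tokens.
without_cache = input_cost(10_000, 2_000, 500, 3.00)
with_cache = input_cost(10_000, 2_000, 500, 3.00, cached_discount=0.90)
print(f"without caching: ${without_cache:.2f}")  # $75.00
print(f"with caching:    ${with_cache:.2f}")     # $21.00
```

The longer the shared prefix is relative to the fresh part of each request, the closer the overall saving gets to the cached-token discount itself.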
Which model offers the best value in 2026?
It depends on your use case. For simple tasks (classification, extraction), GPT-4o-mini and Claude Haiku 3.5 offer excellent value, with input prices under $1 per million tokens. For complex reasoning, Claude Sonnet 4 and GPT-4o balance capability with cost. For maximum quality regardless of price, GPT-5 and Claude Opus 4 lead the market.
Related Blog Posts
- Google's latest flagship at $2.00/M input. 77.1% ARC-AGI-2, native video, 1M context window.
- GPT-5.3 Codex Pricing: $2/M Input - OpenAI's Agentic Coding Model (2026)
  OpenAI's coding-optimized model at $2/$10 per 1M tokens. 200K context, 32K max output.
- AI API Pricing 2026: GPT-5.3 Codex, Gemini 3.1, Claude 4.6 - Full Price Table
  Side-by-side pricing comparison of all major AI API providers.
- How to Reduce AI API Costs: 10 Proven Strategies
  Practical tips to cut your AI API spend by 50-90%.