DevTk.AI

AI Model Pricing Calculator

Compare costs across 40+ AI models, including DeepSeek V4 cache-hit pricing. Estimate your monthly spend instantly.

Workload presets

Cache-hit share is applied only to models with published cached-input pricing.

Input: 100,000 tokens/day = 3,000,000 tokens/month

Output: 50,000 tokens/day = 1,500,000 tokens/month

Cached input share: 0% of input tokens use cached pricing when available

Filter by Provider
Showing 48 of 48 models · Prices updated: 2026-05-06 · cached input share: 0%
Model                         Provider              $/month   Input $/1M  Cached $/1M  Output $/1M  Context
Amazon Nova Micro (cheapest)  Amazon                $0.3150   $0.035      -            $0.14        128,000
Amazon Nova Lite              Amazon                $0.5400   $0.06       -            $0.24        300,000
Xiaomi MiMo-V2.5-Flash        Xiaomi MiMo           $0.7500   $0.1        $0.01        $0.3         256,000
GPT-5 Nano                    OpenAI                $0.7500   $0.05       -            $0.4         128,000
DeepSeek V4 Flash             DeepSeek              $0.8400   $0.14       $0.0028      $0.28        1,000,000
Gemini 2.5 Flash-Lite         Google                $0.9000   $0.1        -            $0.4         1,000,000
Gemini 2.0 Flash              Google                $0.9000   $0.1        -            $0.4         1,000,000
Jamba 1.5 Mini                AI21 Labs             $1.20     $0.2        -            $0.4         256,000
GPT-4o mini                   OpenAI                $1.35     $0.15       -            $0.6         128,000
Command R                     Cohere                $1.35     $0.15       -            $0.6         128,000
Grok 4 Fast                   xAI                   $1.35     $0.2        -            $0.5         2,000,000
Mistral Small 3.1             Mistral               $1.50     $0.2        -            $0.6         128,000
Qwen 2.5 Coder 32B            Alibaba               $1.50     $0.2        -            $0.6         128,000
Grok 3 Mini                   xAI                   $1.65     $0.3        -            $0.5         131,072
DeepSeek V3.2 (legacy)        DeepSeek              $2.46     $0.27       -            $1.10        128,000
DeepSeek V4 Pro               DeepSeek              $2.61     $0.435      $0.003625    $0.87        1,000,000
Gemini 3.1 Flash-Lite         Google                $3.00     $0.25       -            $1.50        1,000,000
Qwen 2.5 72B                  Alibaba               $3.00     $0.4        -            $1.20        128,000
GPT-5 Mini                    OpenAI                $3.75     $0.25       -            $2.00        400,000
-                             -                     $3.96     $0.88       -            $0.88        128,000
Xiaomi MiMo-V2.5              Xiaomi MiMo           $4.20     $0.4        $0.08        $2.00        1,000,000
Gemini 2.5 Flash              Google                $4.65     $0.3        -            $2.50        1,000,000
Kimi K2.5                     Moonshot AI           $4.80     $0.6        -            $2.00        128,000
DeepSeek R1 (legacy)          DeepSeek              $4.94     $0.55       -            $2.19        128,000
Amazon Nova Pro               Amazon                $7.20     $0.8        -            $3.20        300,000
Mistral Medium 3              Mistral               $7.50     $1.00       -            $3.00        128,000
Xiaomi MiMo-V2.5-Pro          Xiaomi MiMo           $7.50     $1.00       $0.2         $3.00        1,000,000
o3-mini                       OpenAI                $9.90     $1.10       -            $4.40        200,000
Claude Haiku 4.5              Anthropic             $10.50    $1.00       -            $5.00        200,000
Mistral Large 3               Mistral               $15.00    $2.00       -            $6.00        128,000
Llama 3.1 405B                Meta (via providers)  $15.75    $3.50       -            $3.50        128,000
-                             -                     $18.00    $2.00       -            $8.00        200,000
Jamba 1.5 Large               AI21 Labs             $18.00    $2.00       -            $8.00        256,000
GPT-5                         OpenAI                $18.75    $1.25       -            $10.00       400,000
Gemini 2.5 Pro                Google                $18.75    $1.25       -            $10.00       2,000,000
GPT-4o                        OpenAI                $22.50    $2.50       -            $10.00       128,000
Command R+                    Cohere                $22.50    $2.50       -            $10.00       128,000
Gemini 3.1 Pro                Google                $24.00    $2.00       -            $12.00       2,000,000
GPT-5.2-Codex                 OpenAI                $26.25    $1.75       $0.175       $14.00       400,000
GPT-5.4                       OpenAI                $30.00    $2.50       $0.25        $15.00       1,050,000
Claude Sonnet 4.6             Anthropic             $31.50    $3.00       -            $15.00       1,000,000
Grok 3                        xAI                   $31.50    $3.00       -            $15.00       131,072
Grok 4                        xAI                   $31.50    $3.00       -            $15.00       256,000
Claude Opus 4.6               Anthropic             $52.50    $5.00       -            $25.00       1,000,000
GPT-5.5                       OpenAI                $60.00    $5.00       $0.5         $30.00       1,050,000
-                             -                     $135.00   $15.00      -            $60.00       200,000
o3-pro                        OpenAI                $180.00   $20.00      -            $80.00       200,000
GPT-5.5 Pro                   OpenAI                $360.00   $30.00      -            $180.00      1,050,000

How to Use This Tool

  1. Enter your estimated daily input tokens (the text you send to the AI) and daily output tokens (the AI's response length).
  2. Use the provider filter to narrow results to specific providers like OpenAI, Anthropic, Google, or others.
  3. Sort by monthly cost, input price, or output price to find the most cost-effective model for your use case.
  4. Click on a provider name to visit their official pricing page and sign up for API access.
  5. Compare multiple models side-by-side to find the best price-to-performance ratio for your specific workload.

Understanding AI API Pricing in 2026

AI API pricing is based on tokens processed, with separate rates for input tokens (your prompts) and output tokens (the model's responses). Prices are typically quoted per million tokens. For example, GPT-4o charges $2.50 per million input tokens and $10.00 per million output tokens, while Claude Sonnet 4 charges $3.00 and $15.00 respectively.
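As a rough illustration of per-million pricing, the sketch below prices a single request at the two rates quoted above. The function name and the request size (2,000 input tokens, 500 output tokens) are hypothetical and chosen only for the arithmetic.

```python
def request_cost(input_tokens, output_tokens, input_per_m, output_per_m):
    """Cost in USD of one request, given prices quoted per 1M tokens."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


# Hypothetical request: 2,000 input tokens and 500 output tokens.
gpt4o_cost = request_cost(2_000, 500, 2.50, 10.00)    # rates quoted for GPT-4o
sonnet_cost = request_cost(2_000, 500, 3.00, 15.00)   # rates quoted for Claude Sonnet 4

print(f"GPT-4o:          ${gpt4o_cost:.4f}")
print(f"Claude Sonnet 4: ${sonnet_cost:.4f}")
```

Because output tokens cost several times more than input tokens at these rates, response length often matters as much as prompt length.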

The AI pricing landscape has become increasingly competitive in 2026. DeepSeek V4 Flash is especially aggressive for cache-heavy agent workloads, while premium models like GPT-5 and Claude Opus 4 offer stronger reasoning at higher price points. The right choice depends on task complexity, cache-hit rate, and output length.

Several factors beyond per-token pricing affect your total cost: Batch API discounts, prompt caching or provider-side context caching, and context window usage. DeepSeek V4, Anthropic, OpenAI, and Xiaomi MiMo all expose cached-input economics in different ways, so a naive cache-miss estimate can overstate real agent bills.

For cost optimization, consider these strategies: use smaller models such as GPT-4o mini or Claude Haiku for simple tasks like classification and summarization; use prompt caching for system prompts that don't change; batch non-urgent requests for savings of roughly 50%; and monitor token usage with observability tools like Helicone or Langfuse.

Last updated: April 2026

FAQ

How is the monthly cost calculated?

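Monthly cost = (daily input tokens × input price per token × 30) + (daily output tokens × output price per token × 30), where the price per token is the listed per-million rate divided by 1,000,000. When a cached input share is set, that share of input tokens is billed at the cached rate for models that publish one. Prices are based on the latest published API pricing.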
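The formula can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function name is hypothetical, and the inputs are the default workload of 100,000 input and 50,000 output tokens per day with a 0% cached share.

```python
DAYS_PER_MONTH = 30  # the 30-day month used in the formula above


def monthly_cost(daily_input_tokens, daily_output_tokens, input_per_m, output_per_m):
    """Estimated monthly cost in USD; prices are quoted per 1M tokens."""
    input_cost = daily_input_tokens * (input_per_m / 1_000_000) * DAYS_PER_MONTH
    output_cost = daily_output_tokens * (output_per_m / 1_000_000) * DAYS_PER_MONTH
    return input_cost + output_cost


# Default workload with two sets of listed prices.
nova_micro = monthly_cost(100_000, 50_000, 0.035, 0.14)   # Amazon Nova Micro
gpt5 = monthly_cost(100_000, 50_000, 1.25, 10.00)         # GPT-5

print(f"{nova_micro:.4f}")  # 0.3150
print(f"{gpt5:.2f}")        # 18.75
```

Both results match the monthly figures listed for those models.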

How often are prices updated?

We update pricing data regularly. The last update date is shown on the page. AI model prices change frequently, so always verify with the provider's official pricing page.

Which model is cheapest?

It depends on your use case. For cache-heavy agent traffic, DeepSeek V4 Flash can be extremely cheap. For simple tasks, GPT-4o mini and Claude Haiku offer strong value. For complex reasoning, larger models like GPT-5, Claude Sonnet, or DeepSeek V4 Pro may be more cost-effective despite higher per-token costs.

What is Batch API pricing?

Several providers offer Batch API pricing at a discount of roughly 50% for requests that don't need real-time responses. OpenAI's Batch API, Anthropic's Message Batches, and Google's batch endpoints all offer significant savings for bulk processing like document analysis, data extraction, or content generation jobs.
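To see what a batch discount does to a bill, here is a small sketch. It assumes a flat 50% discount applied to whatever share of traffic can wait for batch processing; the function name and the 60% batchable share are hypothetical, and actual discounts and eligibility vary by provider.

```python
def cost_with_batch(realtime_cost, batch_share, batch_discount=0.5):
    """Monthly cost when `batch_share` of the workload moves to a batch endpoint.

    realtime_cost:  cost in USD if every request were real-time
    batch_share:    fraction of the workload that can be batched (0.0 to 1.0)
    batch_discount: assumed discount on batched traffic (0.5 means 50% off)
    """
    if not 0.0 <= batch_share <= 1.0:
        raise ValueError("batch_share must be between 0 and 1")
    return realtime_cost * (1 - batch_share * batch_discount)


# Example: a $22.50/month real-time estimate with 60% of requests batchable.
estimate = cost_with_batch(22.50, 0.6)
print(f"${estimate:.2f}")  # $15.75
```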

How does prompt caching reduce costs?

Prompt caching and context caching store repeated prompt prefixes on the provider side. DeepSeek V4 Flash is the clearest example today: cache-hit input is $0.0028/M versus $0.14/M for cache misses. This is especially valuable for coding agents that repeatedly send repository rules, system prompts, and stable context.
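The effect of the cache-hit rate can be sketched as a weighted average of the two input prices. This assumes a simple linear blend and ignores provider-specific details such as cache-write charges or minimum cacheable prefix lengths; the function name is hypothetical.

```python
def blended_input_price(input_per_m, cached_per_m, cache_hit_share):
    """Effective input price per 1M tokens for a given cache-hit share (0.0 to 1.0)."""
    if not 0.0 <= cache_hit_share <= 1.0:
        raise ValueError("cache_hit_share must be between 0 and 1")
    return (1 - cache_hit_share) * input_per_m + cache_hit_share * cached_per_m


# DeepSeek V4 Flash prices quoted above: $0.14/M on a cache miss, $0.0028/M on a hit.
MISS_PER_M, HIT_PER_M = 0.14, 0.0028

for share in (0.0, 0.5, 0.9):
    price = blended_input_price(MISS_PER_M, HIT_PER_M, share)
    print(f"{share:.0%} cache hits -> ${price:.5f} per 1M input tokens")
```

At a 90% hit share the effective input price is about $0.0165 per million tokens, roughly one eighth of the cache-miss rate.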

Which model offers the best value in 2026?

It depends on your use case. DeepSeek V4 Flash is strong for cache-heavy agent workloads and high-volume text tasks. Smaller models are better for simple extraction, while GPT-5, Claude Sonnet, or DeepSeek V4 Pro can be worth the extra cost for harder reasoning and coding.
