DevTk.AI

AI Model Pricing Directory

Browse pricing for 40+ AI models including GPT-5, Claude, Gemini, DeepSeek, Llama & more. Compare input/output costs, context windows, and capabilities.

Last updated: 2026-06-14 · 47 models from 14 providers

7 provider sources need re-verification: Meta, Meta (via providers), Alibaba Cloud, Cohere, AI21 Labs, AWS Bedrock, Amazon.

Models: 47 · Providers: 14 · Cheapest input: $0.035/M · Max context: 1.1M

Need to estimate costs?

Use our interactive Pricing Calculator to compare models side-by-side and estimate monthly costs.
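
As a rough illustration of the arithmetic behind such an estimate, the sketch below multiplies token volume by the per-million rates listed in this directory. The function name and the 50M/10M monthly workload are hypothetical, and the sketch ignores caching, batch discounts, and long-context tiers.

```python
def estimate_monthly_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Return the cost of a token volume at per-million-token rates."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_m
    output_cost = (output_tokens / 1_000_000) * output_price_per_m
    return input_cost + output_cost


# Rates copied from this directory: (input $/M, output $/M).
PRICES = {
    "GPT-5 Mini": (0.25, 2.00),
    "Claude Haiku 4.5": (1.00, 5.00),
    "Gemini 2.5 Flash": (0.30, 2.50),
}

# Hypothetical workload: 50M input tokens and 10M output tokens per month.
for model, (input_rate, output_rate) in PRICES.items():
    cost = estimate_monthly_cost(50_000_000, 10_000_000, input_rate, output_rate)
    print(f"{model}: ${cost:.2f}/month")
```

Output tokens are priced several times higher than input tokens for most models here, so the input/output mix of a workload can matter as much as the headline input rate.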


OpenAI

12 models Verified 2026-06-14
Model Input $/M Output $/M Context
GPT-5.5

Current OpenAI frontier model for complex reasoning, coding, and professional work. GPT-5.5 is also available in Codex; prompts above 272K input tokens use higher long-context pricing.

$5.00 $30.00 1.1M
GPT-5.5 Pro

Higher-compute GPT-5.5 variant for the hardest professional tasks. No cached-input discount is listed on the official model page.

$30.00 $180.00 1.1M
GPT-5.4

More affordable OpenAI model for coding and professional work. GPT-5.5 is now the frontier option.

$2.50 $15.00 1.1M
GPT-5 $1.25 $10.00 400K
GPT-5 Mini $0.25 $2.00 400K
GPT-5 Nano $0.05 $0.40 400K
GPT-4o mini $0.15 $0.60 128K
o3-pro

Highest reasoning capability for elite tasks.

$20.00 $80.00 200K
o3

Standard reasoning model.

$2.00 $8.00 200K
GPT-5.4 Mini

Current lower-cost GPT-5.4 production model.

$0.75 $4.50 400K
GPT-5.4 Nano

Current high-volume GPT-5.4 routing model.

$0.20 $1.25 400K
GPT-5.3-Codex

Current dedicated Codex API model for long-horizon agentic coding.

$1.75 $14.00 400K

Anthropic

4 models Verified 2026-06-14
Model Input $/M Output $/M Context
Claude Opus 4.8

Latest generally usable Opus model. Full 1M context at standard pricing. Fast mode costs $10/$50 per million tokens and remains a research preview.

$5.00 $25.00 1.0M
Claude Opus 4.6

Previous-generation Opus model. Full 1M context is available at standard pricing.

$5.00 $25.00 1.0M
Claude Sonnet 4.6

Current balanced Claude production model. Full 1M context is available at standard pricing.

$3.00 $15.00 1.0M
Claude Haiku 4.5

Fastest Claude model. 200K context, 64K max output.

$1.00 $5.00 200K

Google

6 models Verified 2026-06-14
Model Input $/M Output $/M Context
Gemini 3.5 Flash

Stable Gemini 3.5 Flash model for agentic loops, coding cycles, long-horizon tasks, search grounding, Batch API, Flex, context caching, and multimodal inputs.

$1.50 $9.00 1.0M
Gemini 3.1 Pro Preview

Preview Gemini Pro model. Inputs above 200K tokens use higher long-context pricing.

$2.00 $12.00 1.0M
Gemini 3.1 Flash-Lite

Fastest Gemini model. Optimized for high-throughput, multimodal tasks.

$0.25 $1.50 1.0M
Gemini 2.5 Pro

Long context >200K: $2.50 input, $15.00 output per 1M.

$1.25 $10.00 1.0M
Gemini 2.5 Flash $0.30 $2.50 1.0M
Gemini 2.5 Flash-Lite $0.10 $0.40 1.0M
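
Several entries above note that prompts past a token threshold are billed at higher long-context rates. The sketch below works through the Gemini 2.5 Pro figures from this table. It assumes the higher rates apply to the whole request once the prompt exceeds 200K tokens. That billing rule is an assumption to check against the provider's pricing page, and the function name is hypothetical.

```python
# Gemini 2.5 Pro rates from the table above, in $ per 1M tokens.
STANDARD_RATES = (1.25, 10.00)      # (input, output) for prompts up to 200K tokens
LONG_CONTEXT_RATES = (2.50, 15.00)  # (input, output) for prompts above 200K tokens
LONG_CONTEXT_THRESHOLD = 200_000


def tiered_request_cost(prompt_tokens, output_tokens):
    """Cost of one request under two-tier long-context pricing.

    Assumption: when the prompt exceeds the threshold, the higher rates
    apply to every token in the request, not only to the excess.
    """
    if prompt_tokens > LONG_CONTEXT_THRESHOLD:
        input_rate, output_rate = LONG_CONTEXT_RATES
    else:
        input_rate, output_rate = STANDARD_RATES
    return prompt_tokens / 1_000_000 * input_rate + output_tokens / 1_000_000 * output_rate


# Hypothetical requests on either side of the threshold.
print(f"100K prompt: ${tiered_request_cost(100_000, 10_000):.3f}")
print(f"300K prompt: ${tiered_request_cost(300_000, 10_000):.3f}")
```

Under this assumption, a prompt just over the threshold costs roughly twice as much for input as one just under it, so trimming context to stay below 200K tokens can be worthwhile.
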

xAI

2 models
Model Input $/M Output $/M Context
Grok 4.3

Current general-purpose xAI API model. Older Grok 3 and Grok 4 aliases route to the current model family.

$1.25 $2.50 1.0M
Grok Build 0.1

Current xAI model optimized for coding and software-building workflows.

$1.00 $2.00 256K

Mistral

3 models Verified 2026-06-14
Model Input $/M Output $/M Context
Mistral Large 3 $0.50 $1.50 128K
Mistral Medium 3.5

Current balanced Mistral production model.

$1.50 $7.50 131K
Mistral Small 4

Current low-cost Mistral model for high-volume production routing.

$0.10 $0.30 131K

DeepSeek

2 models Verified 2026-05-24
Model Input $/M Output $/M Context
DeepSeek V4 Flash

Current default V4 model. deepseek-chat and deepseek-reasoner are compatibility aliases for V4 Flash non-thinking/thinking modes and are scheduled for deprecation on 2026-07-24. Cache-hit input is $0.0028/M.

$0.14 $0.28 1.0M
DeepSeek V4 Pro

The current 75%-off API price becomes the official price, one quarter of the original, after 2026-05-31 15:59 UTC. The original reference prices were $0.0145/M for cached input, $1.74/M for cache-miss input, and $3.48/M for output.

$0.435 $0.87 1.0M
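
The DeepSeek notes above involve two small calculations: applying the 75% discount to the V4 Pro reference prices, and blending cache-hit and cache-miss input rates for V4 Flash. A minimal sketch of both follows. The helper names and the 50% cache-hit ratio are hypothetical, and actual hit ratios depend on how much of each prompt repeats.

```python
def discounted(price, discount):
    """Apply a fractional discount (0.75 means 75% off) to a price."""
    return price * (1 - discount)


def blended_input_price(miss_price, hit_price, hit_ratio):
    """Average input price per 1M tokens for a given cache-hit ratio."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * hit_price + (1 - hit_ratio) * miss_price


# V4 Pro: original reference prices from the note above, at 75% off.
print(f"V4 Pro input:  ${discounted(1.74, 0.75):.3f}/M")
print(f"V4 Pro output: ${discounted(3.48, 0.75):.2f}/M")

# V4 Flash: $0.14/M cache-miss and $0.0028/M cache-hit input,
# with a hypothetical 50% of input tokens served from cache.
print(f"V4 Flash blended input: ${blended_input_price(0.14, 0.0028, 0.5):.4f}/M")
```

The discounted V4 Pro figures match the $0.435 and $0.87 listed in the table.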

Xiaomi MiMo

2 models Verified 2026-05-29
Model Input $/M Output $/M Context
Xiaomi MiMo-V2.5-Pro

Open-sourced under MIT. New pay-as-you-go pricing took effect on 2026-05-27 00:00 Beijing time. Domestic pricing is ¥3.00/M cache-miss input, ¥0.025/M cached input, and ¥6.00/M output. Cache writing is currently free for a limited time.

$0.435 $0.87 1.0M
Xiaomi MiMo-V2.5

Native full-modal model with text, image, video, and audio understanding. New pay-as-you-go pricing took effect on 2026-05-27 00:00 Beijing time. Domestic pricing is ¥1.00/M cache-miss input, ¥0.02/M cached input, and ¥2.00/M output.

$0.14 $0.28 1.0M

MiniMax

1 model Verified 2026-06-14
Model Input $/M Output $/M Context
MiniMax M3

Latest MiniMax frontier coding and agent model. Supports Adaptive Thinking, tool use, and native image/video input. Inputs above 512K tokens use long-context pricing and currently require limited-access availability.

$0.30 $1.20 1.0M

Z.AI

3 models Verified 2026-06-14
Model Input $/M Output $/M Context
GLM-5.1

Latest generally documented GLM flagship API model for long-horizon coding and agent tasks. Z.AI says it can work autonomously on a single task for up to eight hours.

$1.40 $4.40 200K
GLM-5 Turbo

Current faster GLM model optimized for tool use and agent workflows.

$1.20 $4.00 200K
GLM-5V Turbo

Current multimodal GLM agent model for text, image, and video input.

$1.20 $4.00 200K

Moonshot AI

3 models Verified 2026-06-14
Model Input ¥/M Output ¥/M Context
Kimi K2.5

China API pricing in CNY. Current K2.5 multimodal model with text, image, and video input.

¥4.00 ¥21.00 262K
Kimi K2.6

Current general Kimi multimodal model. China API pricing in CNY.

¥6.50 ¥27.00 262K
Kimi K2.7 Code

Current Kimi coding model. Thinking-only mode; China API pricing in CNY.

¥6.50 ¥27.00 262K

Alibaba

2 models Verified 2026-06-14
Model Input ¥/M Output ¥/M Context
Qwen3.7 Plus

Current balanced Qwen agent model. China API pricing in CNY; inputs above 256K use long-context pricing.

¥2.00 ¥8.00 1.0M
Qwen3.7 Max

Current flagship Qwen API model. China API pricing in CNY.

¥12.00 ¥36.00 1.0M
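
The Moonshot AI and Alibaba prices above are quoted in CNY, so comparing them with USD-priced models needs a currency conversion. The sketch below shows one way to do that. The exchange rate is a placeholder assumption rather than a quoted market rate, and the function name is hypothetical. Substitute a current rate before relying on the output.

```python
# Placeholder exchange rate (USD per 1 CNY). This is an assumption for
# illustration only; replace it with a current rate.
ASSUMED_USD_PER_CNY = 0.14

# CNY prices copied from this directory: (input ¥/M, output ¥/M).
CNY_PRICES = {
    "Kimi K2.5": (4.00, 21.00),
    "Kimi K2.6": (6.50, 27.00),
    "Qwen3.7 Plus": (2.00, 8.00),
    "Qwen3.7 Max": (12.00, 36.00),
}


def cny_to_usd(price_cny, usd_per_cny):
    """Convert a CNY price to USD at the given exchange rate."""
    return price_cny * usd_per_cny


for model, (input_cny, output_cny) in CNY_PRICES.items():
    input_usd = cny_to_usd(input_cny, ASSUMED_USD_PER_CNY)
    output_usd = cny_to_usd(output_cny, ASSUMED_USD_PER_CNY)
    print(f"{model}: ${input_usd:.2f} input / ${output_usd:.2f} output per 1M tokens")
```

China API pricing may also differ from a provider's international pricing, so a converted figure is only an approximation for cross-provider comparison.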

Cohere

2 models Verified 2026-02-26
Model Input $/M Output $/M Context
Command R+

Enterprise RAG-optimized with citation grounding.

$2.50 $10.00 128K
Command R $0.15 $0.60 128K

AI21 Labs

2 models Verified 2026-02-26
Model Input $/M Output $/M Context
Jamba 1.5 Large

Hybrid Transformer-Mamba (SSM) architecture. 256K context.

$2.00 $8.00 256K
Jamba 1.5 Mini $0.20 $0.40 256K

Amazon

3 models Verified 2026-02-26
Model Input $/M Output $/M Context
Amazon Nova Pro

Via AWS Bedrock. 300K context.

$0.80 $3.20 300K
Amazon Nova Lite $0.06 $0.24 300K
Amazon Nova Micro

Text-only. Lowest cost option on Bedrock.

$0.035 $0.14 128K

Related Resources