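AI Model Pricing Directory
Browse pricing for 40+ AI models, including GPT-5, Claude, Gemini, DeepSeek, Llama, and more. Compare input and output prices, context windows, and capabilities.
Last updated: 2026-05-06 · 48 models from 13 providers
6 providers need re-verification: Mistral, Meta, Alibaba Cloud, Cohere, AI21 Labs, AWS Bedrock.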
48 models · 13 providers · $0.035 lowest input price per 1M tokens · 2.0M-token largest context window
Need to estimate costs?
Use our pricing calculator to compare models interactively and estimate your monthly spend.
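
For a quick estimate without the calculator, the arithmetic is simply tokens times the per-million price. The sketch below is illustrative only: `ModelPrice` and `monthly_cost` are hypothetical helper names, the prices are copied from the tables on this page, and the workload (1,000 requests per day, 2,000 input and 500 output tokens each, 30 days) is an assumed example. It ignores caching, long-context surcharges, and discounts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    name: str
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


def monthly_cost(price: ModelPrice, requests_per_day: int,
                 input_tokens: int, output_tokens: int, days: int = 30) -> float:
    """Estimate monthly spend in USD for a fixed per-request token profile."""
    total_in = requests_per_day * days * input_tokens
    total_out = requests_per_day * days * output_tokens
    return (total_in * price.input_per_m + total_out * price.output_per_m) / 1_000_000


# Prices as listed on this page; the workload numbers are assumptions.
GPT5_MINI = ModelPrice("GPT-5 Mini", 0.25, 2.00)
HAIKU_45 = ModelPrice("Claude Haiku 4.5", 1.00, 5.00)

for p in (GPT5_MINI, HAIKU_45):
    print(f"{p.name}: ${monthly_cost(p, 1000, 2000, 500):.2f}/month")
```

With this assumed workload the estimate comes to $45.00 per month for GPT-5 Mini and $135.00 per month for Claude Haiku 4.5.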

OpenAI
13 models

| Model | Input ($/1M tokens) | Output ($/1M tokens) | Context (tokens) | Notes |
|---|---|---|---|---|
| GPT-5.5 | $5.00 | $30.00 | 1.1M | Current OpenAI frontier model for complex reasoning, coding, and professional work. Also available in Codex. Prompts above 272K input tokens use higher long-context pricing. |
| GPT-5.5 Pro | $30.00 | $180.00 | 1.1M | Higher-compute GPT-5.5 variant for the hardest professional tasks. No cached-input discount is listed on the official model page. |
| GPT-5.4 | $2.50 | $15.00 | 1.1M | More affordable OpenAI model for coding and professional work. GPT-5.5 is now the frontier option. |
| GPT-5 | $1.25 | $10.00 | 400K | |
| GPT-5 Mini | $0.25 | $2.00 | 400K | |
| GPT-5 Nano | $0.05 | $0.40 | 128K | |
| GPT-4o | $2.50 | $10.00 | 128K | |
| GPT-4o mini | $0.15 | $0.60 | 128K | |
| o3-pro | $20.00 | $80.00 | 200K | Highest reasoning capability, for the most demanding tasks. |
| o3 | $2.00 | $8.00 | 200K | Standard reasoning model. |
| o3-mini | $1.10 | $4.40 | 200K | Fast reasoning model for coding. |
| o1 | $15.00 | $60.00 | 200K | Legacy reasoning model. |
| GPT-5.2-Codex | $1.75 | $14.00 | 400K | Current dedicated Codex API model for long-horizon agentic coding. GPT-5.5 is also available in the Codex product, but the separate Codex API model ID is gpt-5.2-codex. |

Anthropic
3 models

| Model | Input ($/1M tokens) | Output ($/1M tokens) | Context (tokens) | Notes |
|---|---|---|---|---|
| Claude Opus 4.6 | $5.00 | $25.00 | 1.0M | Most capable Claude model. The 1M context window is currently in beta for eligible organizations. 80.8% on SWE-bench. |
| Claude Sonnet 4.6 | $3.00 | $15.00 | 1.0M | Best-value flagship. The 1M context window is currently in beta for eligible organizations. 79.6% on SWE-bench. |
| Claude Haiku 4.5 | $1.00 | $5.00 | 200K | Fastest Claude model. 200K context, 64K max output. |

Google
6 models

| Model | Input ($/1M tokens) | Output ($/1M tokens) | Context (tokens) | Notes |
|---|---|---|---|---|
| Gemini 3.1 Pro | $2.00 | $12.00 | 2.0M | Highest reasoning score among Google models. 2M+ token context. Prompts above 200K tokens: $4.00 input, $18.00 output per 1M. |
| Gemini 3.1 Flash-Lite | $0.25 | $1.50 | 1.0M | Fastest Gemini model. Optimized for high-throughput, multimodal tasks. |
| Gemini 2.5 Pro | $1.25 | $10.00 | 2.0M | Prompts above 200K tokens: $2.50 input, $15.00 output per 1M. |
| Gemini 2.5 Flash | $0.30 | $2.50 | 1.0M | |
| Gemini 2.5 Flash-Lite | $0.10 | $0.40 | 1.0M | |
| Gemini 2.0 Flash | $0.10 | $0.40 | 1.0M | Deprecated. Scheduled for shutdown on June 1, 2026. |
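
The long-context notes for the Gemini Pro models describe a two-tier price that switches at 200K prompt tokens. A minimal sketch of that calculation follows. `tiered_cost` is a hypothetical helper, and it assumes the whole request is billed at the higher rate once the prompt exceeds the threshold. The table does not say whether only the excess tokens are billed at that rate, so check the provider's billing documentation before relying on this.

```python
def tiered_cost(prompt_tokens: int, output_tokens: int,
                base: tuple[float, float], long_ctx: tuple[float, float],
                threshold: int = 200_000) -> float:
    """Cost in USD of one request under two-tier long-context pricing.

    base and long_ctx are (input, output) prices in USD per 1M tokens.
    Assumption: a prompt above the threshold moves the entire request,
    input and output, to the long-context rates.
    """
    input_rate, output_rate = long_ctx if prompt_tokens > threshold else base
    return (prompt_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Gemini 2.5 Pro prices as listed in the table above.
GEMINI_25_PRO_BASE = (1.25, 10.00)
GEMINI_25_PRO_LONG = (2.50, 15.00)

short = tiered_cost(100_000, 2_000, GEMINI_25_PRO_BASE, GEMINI_25_PRO_LONG)
long_ = tiered_cost(300_000, 2_000, GEMINI_25_PRO_BASE, GEMINI_25_PRO_LONG)
print(f"100K-token prompt: ${short:.4f}")
print(f"300K-token prompt: ${long_:.4f}")
```

Under this assumption a 100K-token prompt with 2K output tokens costs about $0.145, while a 300K-token prompt with the same output costs about $0.78, so tripling the prompt raises the cost more than fivefold.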

xAI
4 models

| Model | Input ($/1M tokens) | Output ($/1M tokens) | Context (tokens) | Notes |
|---|---|---|---|---|
| Grok 3 | $3.00 | $15.00 | 131K | |
| Grok 3 Mini | $0.30 | $0.50 | 131K | |
| Grok 4 | $3.00 | $15.00 | 256K | xAI flagship reasoning model. Strong benchmark performance with a 256K context window. |
| Grok 4 Fast | $0.20 | $0.50 | 2.0M | Fast reasoning variant of Grok 4 with a 2M context window. Extremely cost-efficient at $0.20/M input. |

Meta (via providers)
2 models

| Model | Input ($/1M tokens) | Output ($/1M tokens) | Context (tokens) | Notes |
|---|---|---|---|---|
| Llama 3.3 70B | $0.88 | $0.88 | 128K | Open-source. Pricing varies by hosting provider. Shown: typical cloud API price. |
| Llama 3.1 405B | $3.50 | $3.50 | 128K | Open-source. Pricing varies by hosting provider. |

Mistral
3 models

| Model | Input ($/1M tokens) | Output ($/1M tokens) | Context (tokens) | Notes |
|---|---|---|---|---|
| Mistral Large 3 | $2.00 | $6.00 | 128K | |
| Mistral Medium 3 | $1.00 | $3.00 | 128K | |
| Mistral Small 3.1 | $0.20 | $0.60 | 128K | |

DeepSeek
4 models

| Model | Input ($/1M tokens) | Output ($/1M tokens) | Context (tokens) | Notes |
|---|---|---|---|---|
| DeepSeek V4 Flash | $0.14 | $0.28 | 1.0M | Current default V4 model. deepseek-chat and deepseek-reasoner are compatibility aliases for the V4 Flash non-thinking and thinking modes, respectively, and are scheduled for deprecation on 2026-07-24. Cache-hit input is $0.0028/M. |
| DeepSeek V3.2 (legacy) | $0.27 | $1.10 | 128K | Legacy model. Prefer DeepSeek V4 Flash for current API pricing and 1M context. |
| DeepSeek R1 (legacy) | $0.55 | $2.19 | 128K | Legacy reasoning model. The current deepseek-reasoner alias maps to DeepSeek V4 Flash thinking mode. |
| DeepSeek V4 Pro | $0.435 | $0.87 | 1.0M | Price shown reflects a 75% discount valid through 2026-05-31 15:59 UTC. List price is $1.74/M cache-miss input and $3.48/M output unless the discount is extended. Cache-hit input is $0.003625/M during the discount. |
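
The DeepSeek notes combine two pricing mechanics: a lower rate for cache-hit input and a temporary percentage discount. The sketch below works through both using the figures listed above. `blended_input_price` and `apply_discount` are hypothetical helpers, and the 50% cache-hit ratio is an assumed example, not a measured value.

```python
def blended_input_price(miss_price: float, hit_price: float, hit_ratio: float) -> float:
    """Average input price per 1M tokens when a share of input tokens hits the cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * hit_price + (1.0 - hit_ratio) * miss_price


def apply_discount(list_price: float, discount: float) -> float:
    """Price after a fractional discount, e.g. 0.75 for 75% off."""
    return list_price * (1.0 - discount)


# DeepSeek V4 Flash: $0.14/M cache-miss input, $0.0028/M cache-hit input.
flash_blended = blended_input_price(0.14, 0.0028, hit_ratio=0.5)

# DeepSeek V4 Pro: list prices of $1.74/M input and $3.48/M output, 75% off.
pro_input = apply_discount(1.74, 0.75)
pro_output = apply_discount(3.48, 0.75)

print(f"V4 Flash blended input at 50% cache hits: ${flash_blended:.4f}/M")
print(f"V4 Pro discounted: ${pro_input:.3f}/M input, ${pro_output:.2f}/M output")
```

The discounted V4 Pro figures reproduce the $0.435 and $0.87 shown in the table, which confirms that the listed prices are the list prices reduced by 75%.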

Xiaomi
3 models

| Model | Input ($/1M tokens) | Output ($/1M tokens) | Context (tokens) | Notes |
|---|---|---|---|---|
| Xiaomi MiMo-V2.5-Pro | $1.00 | $3.00 | 1.0M | Open-sourced under the MIT license. Overseas API price shown applies to inputs up to 256K tokens; inputs of 256K to 1M tokens use the long-context prices. Domestic pricing is ¥7 / ¥1.40 / ¥21 per 1M tokens for cache-miss input / cache-hit input / output. |
| Xiaomi MiMo-V2.5 | $0.40 | $2.00 | 1.0M | Native full-modal model with text, image, video, and audio understanding. Overseas API price shown applies to inputs up to 256K tokens; inputs of 256K to 1M tokens use the long-context prices. |
| Xiaomi MiMo-V2.5-Flash | $0.10 | $0.30 | 256K | Low-cost MiMo model for high-throughput text and coding tasks. Cache writes are currently free for a limited time. |

Moonshot AI
1 model

| Model | Input ($/1M tokens) | Output ($/1M tokens) | Context (tokens) | Notes |
|---|---|---|---|---|
| Kimi K2.5 | $0.60 | $2.00 | 128K | 76.8% on SWE-bench. Strongest open-source model. Agent Swarm support. |

Alibaba
2 models

| Model | Input ($/1M tokens) | Output ($/1M tokens) | Context (tokens) | Notes |
|---|---|---|---|---|
| Qwen 2.5 72B | $0.40 | $1.20 | 128K | Open-source. Competitive with Llama 3.3 70B. |
| Qwen 2.5 Coder 32B | $0.20 | $0.60 | 128K | Code-specialized. Open-source. |

Cohere
2 models

| Model | Input ($/1M tokens) | Output ($/1M tokens) | Context (tokens) | Notes |
|---|---|---|---|---|
| Command R+ | $2.50 | $10.00 | 128K | Enterprise model optimized for RAG, with citation grounding. |
| Command R | $0.15 | $0.60 | 128K | |

AI21 Labs
2 models

| Model | Input ($/1M tokens) | Output ($/1M tokens) | Context (tokens) | Notes |
|---|---|---|---|---|
| Jamba 1.5 Large | $2.00 | $8.00 | 256K | Mamba-based SSM architecture. 256K context. |
| Jamba 1.5 Mini | $0.20 | $0.40 | 256K | |

Amazon
3 models

| Model | Input ($/1M tokens) | Output ($/1M tokens) | Context (tokens) | Notes |
|---|---|---|---|---|
| Amazon Nova Pro | $0.80 | $3.20 | 300K | Via AWS Bedrock. 300K context. |
| Amazon Nova Lite | $0.06 | $0.24 | 300K | |
| Amazon Nova Micro | $0.035 | $0.14 | 128K | Text-only. Lowest-cost option on Bedrock. |