Current Anthropic Claude API Pricing 2026: Opus, Sonnet & Haiku Costs
Current Anthropic Claude API pricing per 1M tokens for Opus 4.6, Sonnet 4.6, and Haiku 4.5. Includes prompt caching, Batch API discounts, 1M context, and monthly cost examples.
Anthropic Claude API pricing in 2026 is premium, but straightforward: Opus 4.6 costs $5/$25 per 1M input/output tokens, Sonnet 4.6 costs $3/$15, and Haiku 4.5 costs $1/$5 in DevTk.AI’s canonical model table. For coding agents and tool-heavy workflows, the real bill depends less on the headline model and more on context size, prompt caching, output length, and whether the task can run through Batch API.
Important freshness note: Anthropic’s official model docs now list Claude Opus 4.7 as the most capable generally available Opus model. Opus 4.7 has the same listed base API price as Opus 4.6 ($5/M input, $25/M output), but this guide keeps the model set aligned with src/data/models.ts until the canonical table is updated.
Quick Answer: Claude API Prices Per 1M Tokens
| Claude model | Input | Output | Cache hit | Batch input | Batch output | Best fit |
|---|---|---|---|---|---|---|
| Opus 4.6 | $5.00 | $25.00 | $0.50 | $2.50 | $12.50 | Highest-capability Claude work |
| Sonnet 4.6 | $3.00 | $15.00 | $0.30 | $1.50 | $7.50 | Default production coding and agent work |
| Haiku 4.5 | $1.00 | $5.00 | $0.10 | $0.50 | $2.50 | Fast, lower-cost routing |
For a coding-agent workload with 2M input tokens + 500K output tokens per month, the standard-price bill is about $22.50 on Opus 4.6, $13.50 on Sonnet 4.6, or $4.50 on Haiku 4.5 before prompt caching or Batch discounts. Compare against OpenAI and DeepSeek in the AI coding agent cost comparison, or check throughput constraints in the AI API rate limits guide.
Claude Pricing Table (May 2026)
| Model | Input Price | Output Price | Context | Max Output | Best For |
|---|---|---|---|---|---|
| Claude Opus 4.6 | $5.00/M | $25.00/M | 1M | 128K | Highest-capability Opus tier in the canonical table |
| Claude Sonnet 4.6 | $3.00/M | $15.00/M | 1M | 64K | Best Claude price-to-performance for production |
| Claude Haiku 4.5 | $1.00/M | $5.00/M | 200K | 64K | Fast, lower-cost Claude routing tier |
All prices are USD per 1 million tokens. Anthropic’s current pricing docs say Opus 4.6 and Sonnet 4.6 include the full 1M context window at standard pricing; a 900K-token request is billed at the same per-token rate as a 9K-token request. That replaces the older claim that 1M context was beta-only or subject to a separate long-context premium for these models.
Prompt Caching Prices
Anthropic prompt caching charges cache writes and cache reads separately:
| Model | Base Input | 5m Cache Write | 1h Cache Write | Cache Hit / Refresh |
|---|---|---|---|---|
| Opus 4.6 | $5.00/M | $6.25/M | $10.00/M | $0.50/M |
| Sonnet 4.6 | $3.00/M | $3.75/M | $6.00/M | $0.30/M |
| Haiku 4.5 | $1.00/M | $1.25/M | $2.00/M | $0.10/M |
A cache hit costs 10% of the standard input price. Use prompt caching for repeated system prompts, tool schemas, long reference documents, and stable conversation history.
Batch API Prices
Anthropic’s Message Batches API gives a 50% discount on input and output tokens for asynchronous workloads:
| Model | Batch Input | Batch Output |
|---|---|---|
| Opus 4.6 | $2.50/M | $12.50/M |
| Sonnet 4.6 | $1.50/M | $7.50/M |
| Haiku 4.5 | $0.50/M | $2.50/M |
Batch processing and prompt caching can be combined, so high-volume jobs with repeated context should use both when latency is not critical.
Claude vs. GPT-5 vs. DeepSeek
| Model | Input | Output | Notes |
|---|---|---|---|
| Claude Opus 4.6 | $5.00 | $25.00 | Premium Claude tier in canonical data |
| Claude Sonnet 4.6 | $3.00 | $15.00 | Best Claude production default |
| GPT-5 | $1.25 | $10.00 | Lower input price, smaller canonical context than Opus/Sonnet 4.6 |
| Claude Haiku 4.5 | $1.00 | $5.00 | Fast Claude tier |
| DeepSeek V4 Flash | $0.14 | $0.28 | Lower-cost text alternative |
| GPT-5 Mini | $0.25 | $2.00 | Lower-cost OpenAI routing tier |
Key takeaways:
- Opus 4.6 costs more than GPT-5 and should be reserved for tasks that justify the premium.
- Sonnet 4.6 is the default Claude choice for most production apps that need Claude’s coding, tool-use, and instruction-following profile.
- Haiku 4.5 is the Claude cost-control tier, with the same $1/$5 base price and 64K max output reflected in canonical data.
- DeepSeek and smaller OpenAI models remain much cheaper for simple text tasks, so routing matters.
Monthly Cost Estimates
Light Usage (Solo Developer)
100K input + 50K output tokens per day:
| Claude Model | Monthly Cost |
|---|---|
| Opus 4.6 | $52.50 |
| Sonnet 4.6 | $31.50 |
| Haiku 4.5 | $10.50 |
| GPT-5 reference | $18.75 |
| DeepSeek V4 Flash reference | $0.84 |
Medium Usage (Startup)
1M input + 500K output tokens per day:
| Claude Model | Monthly Cost |
|---|---|
| Opus 4.6 | $525 |
| Sonnet 4.6 | $315 |
| Haiku 4.5 | $105 |
| GPT-5 reference | $187.50 |
| DeepSeek V4 Flash reference | $8.40 |
Heavy Usage (Production)
10M input + 5M output tokens per day:
| Claude Model | Monthly Cost |
|---|---|
| Opus 4.6 | $5,250 |
| Sonnet 4.6 | $3,150 |
| Haiku 4.5 | $1,050 |
| GPT-5 reference | $1,875 |
| DeepSeek V4 Flash reference | $84 |
Calculate your exact costs: Use our AI Model Pricing Calculator.
Cost-Saving Strategies for Claude
1. Use Prompt Caching
For Sonnet 4.6 with a 3,000-token stable system prompt and 10K requests/day:
- Without caching: 3,000 x 10,000 x 30 x $3/M = $2,700/month
- With cache hits: repeated cached tokens bill at $0.30/M after the cache write
- Best fit: long system prompts, reusable tool definitions, shared documents, and agent instructions
2. Use Tiered Model Routing
Route by task complexity:
- Simple Q&A, classification, extraction -> Haiku 4.5
- General chat, code generation, tool use -> Sonnet 4.6
- Hard research, difficult reasoning, premium workflows -> Opus 4.6
3. Use Batch API for Non-Urgent Work
Data processing, content generation, offline summarization, and evaluation jobs should use Batch API when they do not need real-time responses.
4. Optimize Token Usage
- Keep system prompts short and cacheable.
- Use structured outputs when you need compact JSON instead of prose.
- Set practical
max_tokensvalues to avoid expensive runaway output.
5. Treat Extended Thinking as a Budget Choice
Claude reasoning features can improve quality, but output and thinking tokens still affect cost. Enable deeper reasoning only for tasks where the quality gain is worth the extra token spend.
Which Claude Model Should You Choose?
| Your Need | Recommended Model | Monthly Budget (1M input + 500K output/day) |
|---|---|---|
| Highest Claude capability in canonical data | Opus 4.6 | $525 |
| Best Claude production balance | Sonnet 4.6 | $315 |
| High-volume Claude routing | Haiku 4.5 | $105 |
| Lowest-cost text alternative | DeepSeek V4 Flash | $8.40 |
Bottom Line
Claude pricing in 2026 is still premium, but the current official docs are more favorable than older March-era guidance in one important way: Opus 4.6 and Sonnet 4.6 now have 1M context at standard per-token pricing. For most teams, Sonnet 4.6 plus prompt caching is the practical default, Haiku 4.5 handles lower-cost routing, and Opus should be reserved for requests that justify the premium.
Official sources checked: Anthropic pricing and Anthropic models overview.
Related tools:
- AI Model Pricing Calculator — Compare model costs side by side
- AI Token Counter — Count tokens before you send requests
- AI Coding Agent Cost Comparison 2026 — Codex, Claude Code, Cursor, DeepSeek, and API routing costs
- DeepSeek API Pricing Guide 2026 — The budget alternative to Claude
- OpenAI API Pricing Guide 2026 — GPT pricing and batch discounts
- Google Gemini API Pricing Guide 2026 — Gemini pricing, free tier, and long-context notes
- Full AI API Pricing Comparison — All providers compared
Related Posts
MCP vs A2A: The Two Protocols Shaping the AI Agent Ecosystem (2026)
2026-03-01
Mistral API Pricing 2026: Small $0.20/M, Large $2/M + Free Tier
2026-02-24
Current OpenAI API Pricing 2026: GPT-5.5, GPT-5.4, GPT-4o & Codex Costs
2026-02-24