OpenAI API Pricing 2026: GPT-5 at $1.25/M, GPT-4.1 at $2/M — Complete Guide
Full OpenAI API pricing for 2026. GPT-5 at $1.25/$10, GPT-4.1 at $2/$8, GPT-4o at $2.50/$10, o3 at $2/$8 per 1M tokens. Includes free tier options, batch API discounts, and cost comparison with Claude and DeepSeek.
OpenAI’s model lineup in February 2026 is the largest of any single provider. With GPT-5 as the flagship, GPT-4.1 for million-token context, the o3 family for reasoning, and budget options ranging from GPT-5 Mini down to GPT-4.1 Nano, there is an OpenAI model for virtually every use case and budget.
But navigating nine different models with different pricing tiers, batch discounts, and rate limits can be confusing. This guide breaks down every OpenAI API model’s pricing, compares them against Claude, Gemini, and DeepSeek, and shows you exactly how to minimize your monthly bill.
OpenAI API Pricing Table (February 2026)
All prices are in USD per 1 million tokens.
Flagship Models
| Model | Input Price | Output Price | Context | Max Output | Best For |
|---|---|---|---|---|---|
| GPT-5 | $1.25/M | $10.00/M | 400K | 64K | Best overall quality, multimodal |
| o3 | $2.00/M | $8.00/M | 200K | 100K | Reasoning, math, complex logic |
GPT-5 is OpenAI’s most capable general-purpose model. It handles text, images, and structured output with the highest quality across benchmarks. The o3 model is the reasoning specialist — it excels at math, logic, coding puzzles, and multi-step analysis tasks where chain-of-thought matters.
Mid-Tier Models
| Model | Input Price | Output Price | Context | Max Output | Best For |
|---|---|---|---|---|---|
| GPT-4.1 | $2.00/M | $8.00/M | 1M | 64K | Long documents, large codebases |
| GPT-4o | $2.50/M | $10.00/M | 128K | 16K | Previous-gen flagship, stable |
| o4-mini | $1.10/M | $4.40/M | 200K | 100K | Budget reasoning tasks |
GPT-4.1 is the long-context champion with a full 1 million token context window — ideal for processing entire codebases, legal documents, or book-length content in a single request. GPT-4o remains available as the previous-generation flagship and is still a solid choice if you have existing integrations. The o4-mini offers reasoning capabilities at roughly half the cost of o3.
Budget Models
| Model | Input Price | Output Price | Context | Max Output | Best For |
|---|---|---|---|---|---|
| GPT-5 Mini | $0.25/M | $2.00/M | 400K | 64K | Budget flagship, good quality |
| GPT-4.1 Mini | $0.40/M | $1.60/M | 1M | 64K | Budget long-context |
| GPT-4.1 Nano | $0.10/M | $0.40/M | 1M | 64K | Highest volume, lowest cost |
| GPT-4o Mini | $0.15/M | $0.60/M | 128K | 16K | Previous-gen budget option |
All prices last updated: February 2026. Source: OpenAI pricing page.
The budget tier is where OpenAI really shines in 2026. GPT-5 Mini delivers surprisingly strong quality at just $0.25 per million input tokens — 5x cheaper than the full GPT-5. GPT-4.1 Nano is the cheapest model in the entire lineup at $0.10/$0.40, making it suitable for high-volume classification, extraction, and routing tasks where you need millions of calls per day.
OpenAI Free Tier and Rate Limits
OpenAI uses a tier-based system that determines your rate limits and access to models. Your tier automatically upgrades as you spend more on the platform.
| Tier | Spend Requirement | RPM (GPT-5) | TPM (GPT-5) |
|---|---|---|---|
| Free | $0 | 3 | 40,000 |
| Tier 1 | $5 | 500 | 200,000 |
| Tier 2 | $50 | 5,000 | 2,000,000 |
| Tier 3 | $100 | 5,000 | 4,000,000 |
| Tier 4 | $250 | 10,000 | 10,000,000 |
| Tier 5 | $1,000 | 10,000 | 30,000,000 |
The free tier gives you access to GPT-5 Mini and GPT-4o Mini with very limited rate limits — enough for prototyping and experimentation but not production use. Starting at Tier 1 ($5 spend), you unlock all models including GPT-5 and o3.
Key points about rate limits:
- RPM = requests per minute. Most production apps need Tier 2+ to avoid throttling.
- TPM = tokens per minute. Long-context workloads with GPT-4.1 (1M context) can hit TPM limits quickly at lower tiers.
- Rate limits are per-model, so using GPT-5 and o3 simultaneously gives you separate allocations for each.
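When you do hit a limit, the API returns HTTP 429 and the openai Python SDK raises `openai.RateLimitError`. A minimal retry sketch with exponential backoff and jitter (written generically around any callable so the sketch stays dependency-free):

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry a zero-argument callable with exponential backoff and jitter.

    In real code, catch openai.RateLimitError specifically; the bare
    Exception here keeps the sketch self-contained.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the error
            # Wait 1x, 2x, 4x, ... base_delay, plus jitter to spread retries
            time.sleep((2 ** attempt + random.random()) * base_delay)

# Usage:
# result = with_backoff(lambda: client.chat.completions.create(...))
```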
Batch API Pricing — 50% Off
OpenAI’s Batch API lets you submit large sets of requests for asynchronous processing, typically completed within 24 hours. The tradeoff: you give up real-time responses in exchange for a 50% discount on all models.
| Model | Standard Input | Batch Input | Standard Output | Batch Output |
|---|---|---|---|---|
| GPT-5 | $1.25/M | $0.625/M | $10.00/M | $5.00/M |
| GPT-4.1 | $2.00/M | $1.00/M | $8.00/M | $4.00/M |
| o3 | $2.00/M | $1.00/M | $8.00/M | $4.00/M |
| GPT-5 Mini | $0.25/M | $0.125/M | $2.00/M | $1.00/M |
| GPT-4.1 Nano | $0.10/M | $0.05/M | $0.40/M | $0.20/M |
Best use cases for Batch API:
- Bulk content generation (product descriptions, summaries)
- Large-scale data extraction and classification
- Evaluation and testing across hundreds of prompts
- Nightly data processing pipelines
At batch pricing, GPT-5 drops to $0.625/$5.00 per million tokens — making it cheaper than standard Claude Sonnet 4.5 pricing ($3.00/$15.00) while delivering comparable or better quality.
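Submitting a batch is a file-based workflow: you build a JSONL file where each line is one request with a unique `custom_id`. A minimal sketch of building that file (the line shape follows OpenAI's Batch API format; the `gpt-5` model ID matches this guide's lineup):

```python
import json

def batch_request_line(custom_id: str, model: str, prompt: str) -> str:
    """One JSONL line in the Batch API input-file format."""
    return json.dumps({
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {"model": model, "messages": [{"role": "user", "content": prompt}]},
    })

prompts = ["Summarize product A", "Summarize product B"]
lines = [batch_request_line(f"req-{i}", "gpt-5", p) for i, p in enumerate(prompts)]
with open("batch_input.jsonl", "w") as f:
    f.write("\n".join(lines) + "\n")
```

From there, upload the file with `client.files.create(file=open("batch_input.jsonl", "rb"), purpose="batch")`, submit it via `client.batches.create(input_file_id=..., endpoint="/v1/chat/completions", completion_window="24h")`, then poll `client.batches.retrieve(batch.id)` until the status is `completed` and download results from the batch's `output_file_id`.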
GPT-5 vs Claude vs Gemini vs DeepSeek: Price Comparison
How does OpenAI stack up against the competition? Here is a head-to-head comparison of flagship and popular models across all major providers.
| Model | Provider | Input | Output | vs GPT-5 |
|---|---|---|---|---|
| GPT-5 | OpenAI | $1.25 | $10.00 | Baseline |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | Same price, 1M context |
| Claude Sonnet 4.5 | Anthropic | $3.00 | $15.00 | 2.4x input / 1.5x output |
| Claude Opus 4.5 | Anthropic | $5.00 | $25.00 | 4x input / 2.5x output |
| Grok 3 | xAI | $3.00 | $15.00 | 2.4x input / 1.5x output |
| DeepSeek V3.2 | DeepSeek | $0.27 | $1.10 | 4.6x cheaper input / 9.1x cheaper output |
| GPT-5 Mini | OpenAI | $0.25 | $2.00 | 5x cheaper input / 5x cheaper output |
| Gemini 2.5 Flash | Google | $0.15 | $0.60 | 8.3x cheaper input / 16.7x cheaper output |
Key takeaways:
- GPT-5 and Gemini 2.5 Pro are price-matched at $1.25/$10.00. Google’s advantage is a 1M context window vs. OpenAI’s 400K. OpenAI’s advantage is broader tool ecosystem integration and vision capabilities.
- Claude models cost significantly more. Sonnet 4.5 costs 2.4x the input price of GPT-5 and 1.5x the output. Opus 4.5 is 4x more expensive on input. Claude’s edge is in nuanced instruction following and creative writing quality.
- DeepSeek is the budget king at 4.6x cheaper than GPT-5 on input and 9.1x cheaper on output. If raw cost is your primary concern and you don’t need multimodal capabilities, DeepSeek V3.2 is hard to beat.
- GPT-5 Mini offers the best value within the OpenAI ecosystem — 5x cheaper than GPT-5 with surprisingly strong quality for most tasks.
Monthly Cost Estimates
Here is what you can expect to pay monthly across three usage tiers, using OpenAI models vs. key competitors.
Solo Developer
100K input + 50K output tokens per day (3M input + 1.5M output per month)
| Model | Monthly Cost |
|---|---|
| GPT-4.1 Nano | $0.90 |
| DeepSeek V3.2 | $2.46 |
| GPT-5 Mini | $3.75 |
| GPT-4.1 | $18.00 |
| GPT-5 | $18.75 |
| Claude Sonnet 4.5 | $31.50 |
Math: GPT-5 = (3M x $1.25/M) + (1.5M x $10.00/M) = $3.75 + $15.00 = $18.75
Startup Team
1M input + 500K output tokens per day (30M input + 15M output per month)
| Model | Monthly Cost |
|---|---|
| GPT-4.1 Nano | $9.00 |
| DeepSeek V3.2 | $24.60 |
| GPT-5 Mini | $37.50 |
| GPT-4.1 | $180.00 |
| GPT-5 | $187.50 |
| Claude Sonnet 4.5 | $315.00 |
Enterprise / Production Scale
10M input + 5M output tokens per day (300M input + 150M output per month)
| Model | Monthly Cost |
|---|---|
| GPT-4.1 Nano | $90 |
| DeepSeek V3.2 | $246 |
| GPT-5 Mini | $375 |
| GPT-5 (Batch API) | $937.50 |
| GPT-4.1 | $1,800 |
| GPT-5 | $1,875 |
| Claude Sonnet 4.5 | $3,150 |
At enterprise scale, the Batch API makes a massive difference — cutting GPT-5 costs from $1,875 to $937.50 per month for workloads that don’t need real-time responses.
Want exact numbers for your usage pattern? Try our AI Model Pricing Calculator.
When to Choose Which OpenAI Model
With nine models to choose from, here is a straightforward decision guide:
GPT-5 — Your Default Choice
Use GPT-5 when you need the best overall quality and don’t have extreme cost constraints. It handles text, vision, structured output, and function calling better than any other OpenAI model. If you are unsure which model to use, start here.
Best for: General chat, content generation, code generation, multimodal tasks, production applications where quality matters.
GPT-4.1 — When You Need 1M Context
GPT-4.1’s 1 million token context window is its defining feature. Choose it when your input data simply won’t fit in other models’ context windows.
Best for: Processing entire codebases, long legal documents, book-length content, large CSV/JSON datasets, multi-file analysis.
o3 — Reasoning and Math
The o3 model uses internal chain-of-thought reasoning to solve complex problems. It outperforms GPT-5 on math, logic, and scientific reasoning benchmarks.
Best for: Math problems, formal logic, scientific analysis, complex multi-step reasoning, competitive programming.
o4-mini — Budget Reasoning
When you need reasoning capabilities but o3 is too expensive, o4-mini delivers solid reasoning at roughly half the cost ($1.10/$4.40 vs. $2.00/$8.00).
Best for: Moderate reasoning tasks, math tutoring, code review, logic puzzles where cost matters.
GPT-5 Mini — Budget Flagship
GPT-5 Mini is the sweet spot for most cost-conscious developers. At $0.25/$2.00, it delivers 80-90% of GPT-5’s quality at 20% of the cost.
Best for: Chatbots, customer support, content generation at scale, any task where “good enough” quality saves significant money.
GPT-4.1 Mini — Budget Long-Context
Need to process long documents but GPT-4.1 is too expensive? GPT-4.1 Mini gives you the full 1M context window at $0.40/$1.60.
Best for: Summarizing large documents, extracting data from long inputs, budget-friendly RAG with large context.
GPT-4.1 Nano — Highest Volume, Lowest Cost
At $0.10/$0.40, GPT-4.1 Nano is designed for high-volume pipelines where you need millions of API calls per day. Quality is lower than Mini variants but sufficient for structured tasks.
Best for: Classification, entity extraction, routing, data labeling, sentiment analysis, any task with clear structure and simple outputs.
GPT-4o and GPT-4o Mini — Previous Generation
These models remain available for backwards compatibility. If you have production systems running on GPT-4o, there is no urgency to migrate — but new projects should generally start with GPT-5 or GPT-5 Mini for better price-performance.
Getting Started with OpenAI API
Step 1: Get Your API Key
- Visit platform.openai.com
- Create an account or sign in
- Navigate to API Keys and generate a new key
- Add credits to your account (minimum $5 to unlock Tier 1)
Step 2: Install the SDK
```bash
pip install openai
```
Step 3: Make Your First Request
Python:
```python
from openai import OpenAI

client = OpenAI()  # Uses OPENAI_API_KEY env variable

response = client.chat.completions.create(
    model="gpt-5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)
print(response.choices[0].message.content)
```
Using GPT-4.1 for long-context:
```python
# Process an entire codebase in one request
with open("codebase.txt") as f:
    code = f.read()

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {"role": "system", "content": "You are a code review expert."},
        {"role": "user", "content": f"Review this codebase:\n\n{code}"}
    ],
    max_tokens=8000
)
```
Using o3 for reasoning:
```python
response = client.chat.completions.create(
    model="o3",
    messages=[
        {"role": "user", "content": "Prove that the square root of 2 is irrational."}
    ]
)
```
JavaScript / TypeScript:
```javascript
import OpenAI from 'openai';

const client = new OpenAI();

const response = await client.chat.completions.create({
  model: 'gpt-5',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Write a React hook for debouncing' },
  ],
});
console.log(response.choices[0].message.content);
```
Cost Optimization Tips
1. Model Routing
The single biggest cost-saving strategy is routing requests to different models based on complexity. Build a simple classifier (or use GPT-4.1 Nano as the router) to determine which model handles each request:
- Simple queries (greetings, FAQs, classification) -> GPT-4.1 Nano ($0.10/$0.40)
- Standard tasks (chat, code, content) -> GPT-5 Mini ($0.25/$2.00)
- Complex tasks (research, analysis, creative) -> GPT-5 ($1.25/$10.00)
- Reasoning-heavy (math, logic, proofs) -> o3 ($2.00/$8.00)
A typical 60/30/8/2 split across these tiers can reduce your average cost by 70% compared to sending everything to GPT-5.
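The routing split above can be sketched as a lookup table plus a blended-cost check. A minimal sketch (model IDs are assumed from the names in this guide; the classifier that produces the tier label is not shown):

```python
# Tier labels a cheap classifier (e.g. GPT-4.1 Nano) might return,
# mapped to the models from the routing split above. Model IDs are
# illustrative, taken from the names used in this guide.
MODEL_BY_TIER = {
    "simple": "gpt-4.1-nano",
    "standard": "gpt-5-mini",
    "complex": "gpt-5",
    "reasoning": "o3",
}

def route(tier: str) -> str:
    """Pick a model for a request tier; fall back to the budget flagship."""
    return MODEL_BY_TIER.get(tier, "gpt-5-mini")

def blended_input_price(split: dict) -> float:
    """Blended input price per 1M tokens for a traffic split (fractions sum to 1)."""
    prices = {"simple": 0.10, "standard": 0.25, "complex": 1.25, "reasoning": 2.00}
    return sum(frac * prices[tier] for tier, frac in split.items())

# The 60/30/8/2 split from the text:
blended = blended_input_price(
    {"simple": 0.60, "standard": 0.30, "complex": 0.08, "reasoning": 0.02}
)
```

For that split, the blended input price works out to about $0.275 per million tokens versus $1.25 for sending everything to GPT-5, roughly a 78% reduction on the input side.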
2. Use the Batch API Aggressively
Any workload that can tolerate 24-hour latency should use the Batch API. At 50% off, batch GPT-5 ($0.625/$5.00) is cheaper than standard GPT-4o ($2.50/$10.00) on both input and output.
3. Prompt Caching
OpenAI supports automatic prompt caching for repeated system prompts and prefixes. If your system prompt is consistent across requests, the cached portion costs significantly less on subsequent calls.
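Caching keys on a shared prefix, so the practical rule is to put stable content first and per-request content last. A minimal sketch (the product name and prompt text are hypothetical):

```python
# Static instructions first: prompt caching discounts repeated prefixes,
# so identical leading content across requests is what earns the discount.
SYSTEM_PROMPT = (
    "You are a support assistant for ExampleCo. "  # hypothetical product
    "Answer concisely and cite the relevant help-center article."
)

def build_messages(user_query: str, retrieved_context: str) -> list:
    """Stable system prompt first, per-request content last."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"{retrieved_context}\n\n{user_query}"},
    ]
```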
4. Optimize Token Usage
- Use our AI Token Counter to measure token counts before deploying prompts
- Keep system prompts concise: every 1,000 unnecessary tokens adds about $1.25 per thousand requests at GPT-5 input pricing
- Use structured output (JSON mode) to get focused responses and avoid verbose filler text
- Set `max_tokens` to prevent runaway output costs
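For a quick pre-deployment sanity check without a tokenizer dependency, the common rule of thumb of roughly 4 characters per token for English text gives a usable estimate (use a real tokenizer for exact counts):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def estimate_input_cost(text: str, price_per_m: float) -> float:
    """Approximate input cost in USD for one request at a per-1M-token price."""
    return estimate_tokens(text) / 1_000_000 * price_per_m

prompt = "You are a helpful assistant." * 100
cost = estimate_input_cost(prompt, 1.25)  # at GPT-5 input pricing
```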
5. Consider Alternatives for Simple Tasks
Not every request needs OpenAI. For high-volume, cost-sensitive workloads:
- DeepSeek V3.2 at $0.27/$1.10 handles most general tasks at a fraction of the cost
- Gemini 2.5 Flash at $0.15/$0.60 is even cheaper for simple classification and extraction
- Self-hosted Llama 4 eliminates per-token costs entirely if you have GPU infrastructure
Bottom Line
OpenAI’s 2026 pricing strategy gives developers more options than ever. GPT-5 at $1.25/$10.00 is competitively priced against Gemini 2.5 Pro and significantly cheaper than Claude Sonnet 4.5. The budget tier — GPT-5 Mini at $0.25/$2.00 and GPT-4.1 Nano at $0.10/$0.40 — makes OpenAI accessible for high-volume production use cases that were previously cost-prohibitive.
For most developers, the optimal strategy is:
- Start with GPT-5 Mini for development and testing
- Upgrade to GPT-5 only for tasks where Mini’s quality falls short
- Use GPT-4.1 when you need the 1M context window
- Route reasoning tasks to o3 or o4-mini
- Batch everything you can for the 50% discount
The models and pricing will continue to evolve, but one thing is clear: the cost of top-tier AI capabilities is dropping fast, and OpenAI is making sure it stays competitive.
Related guides:
- AI API Pricing Comparison 2026 — Full pricing table for all 7 major providers
- AI Model Pricing Calculator — Compare monthly costs across 40+ models
- AI Token Counter — Count tokens accurately before API calls
- DeepSeek API Pricing Guide 2026 — The cheapest capable AI model
- Claude API Pricing Guide 2026 — Anthropic’s premium API breakdown
- Grok API Pricing Guide 2026 — Grok 3 at $3/M, Mini at $0.30/M, $25 free credits
- Mistral API Pricing Guide 2026 — Large 3 at $2/M, Small 3.1 at $0.20/M, EU GDPR compliant
- Gemini 3.1 Pro Pricing Guide — $1.25/M, 77.1% ARC-AGI-2, 1M context
- GPT-5.3 Codex Pricing Guide — $2/M, agentic coding, 200K context, 32K output