Current OpenAI API Pricing 2026: GPT-5.5, GPT-5.4, GPT-4o & Codex Costs
Current OpenAI API pricing per 1M tokens for GPT-5.5, GPT-5.4, GPT-5.2-Codex, GPT-5, GPT-4o, and o3. Includes cached input, Batch/Flex discounts, long-context pricing, and monthly cost examples.
OpenAI API pricing in 2026 is easiest to reason about per 1 million tokens. The short version: GPT-5.5 costs $5/M input and $30/M output, GPT-5.4 costs $2.50/M input and $15/M output, and GPT-5.2-Codex costs $1.75/M input and $14/M output, all before caching, Batch, Flex, or long-context modifiers.
For developer workloads, the practical shortlist is now GPT-5.5 for the hardest coding and agent tasks, GPT-5.4 for lower-cost frontier work, GPT-5.2-Codex for dedicated coding-agent API flows, and the older GPT-5, GPT-4o, and o3-family models for compatibility or specialized routing.
This guide focuses on facts that change often: model names, token pricing, context windows, processing modes, and rate-limit behavior. Always confirm availability in your OpenAI dashboard before migrating production traffic.
Quick Answer: OpenAI API Prices Per 1M Tokens
| Model | Input | Cached input | Output | Best fit |
|---|---|---|---|---|
| GPT-5.5 | $5.00 | $0.50 | $30.00 | Hard coding, agents, long-context professional work |
| GPT-5.4 | $2.50 | $0.25 | $15.00 | Frontier quality at lower cost |
| GPT-5.2-Codex | $1.75 | $0.175 | $14.00 | Dedicated Codex API coding-agent workloads |
| GPT-5 | $1.25 | - | $10.00 | Existing GPT-5 integrations |
| GPT-5 Mini | $0.25 | - | $2.00 | Cost-sensitive production routing |
| GPT-4o mini | $0.15 | - | $0.60 | Legacy budget multimodal work |
For a coding agent that uses 2M input tokens + 500K output tokens per month, the rough standard-price bill is about $25 on GPT-5.5, $12.50 on GPT-5.4, $10.50 on GPT-5.2-Codex, or $7.50 on GPT-5 before caching, Batch, or Flex discounts. Use the AI Model Pricing Calculator for your own token mix, or compare provider limits in the AI API rate limits guide.
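The arithmetic behind those figures is simple enough to script. The sketch below is a minimal illustration, not an official calculator: the `RATES` dictionary and the `monthly_cost` helper are names made up for this example, and the rates are copied from the quick-answer table above.

```python
# Standard short-context rates in USD per 1M tokens (input, output),
# copied from the quick-answer table. Keys are display names, not API model IDs.
RATES = {
    "GPT-5.5": (5.00, 30.00),
    "GPT-5.4": (2.50, 15.00),
    "GPT-5.2-Codex": (1.75, 14.00),
    "GPT-5": (1.25, 10.00),
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the standard-rate bill in USD, before caching or mode discounts."""
    input_rate, output_rate = RATES[model]
    total = input_tokens * input_rate + output_tokens * output_rate
    return round(total / 1_000_000, 2)


# The coding-agent example: 2M input tokens and 500K output tokens per month.
for name in RATES:
    print(f"{name}: ${monthly_cost(name, 2_000_000, 500_000):.2f}")
```

Swap in your own token counts to see how quickly output tokens dominate the bill on the higher-priced models.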
OpenAI API Pricing Table (May 2026)
Prices are USD per 1 million tokens. For the latest GPT-5.5 and GPT-5.4 models, OpenAI lists separate short-context and long-context rates; long context starts above roughly 270K input tokens.
| Model | Standard input | Cached input | Standard output | Long-context input | Long-context output | Context | Max output | Best for |
|---|---|---|---|---|---|---|---|---|
| GPT-5.5 | $5.00 | $0.50 | $30.00 | $10.00 | $45.00 | 1M | 128K | Hard coding, agents, professional work |
| GPT-5.5 Pro | $30.00 | - | $180.00 | $60.00 | $270.00 | 1M | 128K | Highest-accuracy work |
| GPT-5.4 | $2.50 | $0.25 | $15.00 | $5.00 | $22.50 | 1M | 128K | Frontier quality at lower cost |
| GPT-5.4 mini | $0.75 | $0.075 | $4.50 | - | - | 400K | 128K | Lower-latency, lower-cost production |
| GPT-5.4 nano | $0.20 | $0.02 | $1.25 | - | - | See docs | See docs | High-volume simple work |
| GPT-5.2-Codex | $1.75 | $0.175 | $14.00 | - | - | 400K | 128K | Dedicated Codex API agent work |
Source: OpenAI API pricing and OpenAI models.
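To see how the long-context columns change a single request's cost, here is a rough sketch. It rests on two assumptions that the table does not confirm: that the threshold is exactly 270,000 input tokens (the text above only says "roughly 270K"), and that the long-context rates apply to the whole request once the input crosses it. `LONG_CONTEXT_THRESHOLD` and `request_cost` are hypothetical names for this example.

```python
# Assumption: the article only says long context starts above "roughly 270K" input tokens.
LONG_CONTEXT_THRESHOLD = 270_000

# (input, output) rates in USD per 1M tokens, copied from the pricing table.
TIERED_RATES = {
    "GPT-5.5": {"short": (5.00, 30.00), "long": (10.00, 45.00)},
    "GPT-5.4": {"short": (2.50, 15.00), "long": (5.00, 22.50)},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> tuple[str, float]:
    """Return (tier, cost in USD) for one request.

    Assumption: once the input exceeds the threshold, the long-context
    rates apply to every token in the request.
    """
    tier = "long" if input_tokens > LONG_CONTEXT_THRESHOLD else "short"
    input_rate, output_rate = TIERED_RATES[model][tier]
    total = input_tokens * input_rate + output_tokens * output_rate
    return tier, round(total / 1_000_000, 4)


print(request_cost("GPT-5.5", 200_000, 4_000))  # below the threshold
print(request_cost("GPT-5.5", 400_000, 4_000))  # above the threshold
```

Check the exact threshold and billing rule on OpenAI's pricing page before relying on numbers like these for a budget.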
Compatibility Models In The Site Catalog
The DevTk.AI canonical model table also keeps these OpenAI families for existing integrations and historical comparison:
| Model | Input | Output | Context | Max output | Use when |
|---|---|---|---|---|---|
| GPT-5 | $1.25 | $10.00 | 400K | 128K | You need the older GPT-5 baseline already deployed |
| GPT-5 Mini | $0.25 | $2.00 | 400K | 16K | Cost-sensitive GPT-5-family workloads |
| GPT-5 Nano | $0.05 | $0.40 | 128K | 16K | Very high-volume routing and extraction |
| GPT-4o | $2.50 | $10.00 | 128K | 16K | Legacy multimodal integrations |
| GPT-4o mini | $0.15 | $0.60 | 128K | 16K | Legacy budget multimodal integrations |
| o3-pro | $20.00 | $80.00 | 200K | 100K | Highest-cost reasoning tasks |
| o3 | $2.00 | $8.00 | 200K | 100K | Standard reasoning tasks |
| o3-mini | $1.10 | $4.40 | 200K | 100K | Lower-cost reasoning |
These entries are maintained in src/data/models.ts; check OpenAI’s live pricing page before quoting them in contracts or customer-facing estimates.
Batch, Flex, Priority, And Data Residency
OpenAI now prices the latest GPT-5.5 and GPT-5.4 families by processing mode:
| Mode | Pricing behavior | Use it for |
|---|---|---|
| Standard | Baseline published token rates | Interactive production requests |
| Batch | 50% of standard rates | Offline jobs that can wait for asynchronous processing |
| Flex | 50% of standard rates | Cost-sensitive work that can tolerate variable latency |
| Priority | 2.5x standard rates for listed models | Latency-sensitive production spikes |
| Data residency / regional processing | 10% uplift for listed GPT-5.5 and GPT-5.4 models | Workloads with regional processing requirements |
Batch and Flex cut GPT-5.5 standard short-context pricing from $5/$30 to $2.50/$15 per million input/output tokens. Priority raises GPT-5.5 short-context pricing to $12.50/$75.
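The mode table reduces to a multiplier on the standard rates. The sketch below shows that under the multipliers listed above; `MODE_MULTIPLIER` and `rates_for_mode` are illustrative names, and the data-residency uplift is left out because the table does not say how it combines with the other modes.

```python
# Multipliers on standard rates, taken from the processing-mode table.
MODE_MULTIPLIER = {
    "standard": 1.0,
    "batch": 0.5,
    "flex": 0.5,
    "priority": 2.5,
}


def rates_for_mode(input_rate: float, output_rate: float, mode: str) -> tuple[float, float]:
    """Scale standard (input, output) rates in USD per 1M tokens for a processing mode."""
    multiplier = MODE_MULTIPLIER[mode]
    return round(input_rate * multiplier, 4), round(output_rate * multiplier, 4)


# GPT-5.5 short-context standard rates are $5 input and $30 output per 1M tokens.
for mode in MODE_MULTIPLIER:
    print(mode, rates_for_mode(5.00, 30.00, mode))
```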
Rate Limits And Usage Tiers
Do not hard-code public RPM or TPM tables into your planning. OpenAI states that rate limits are set at the organization and project level, vary by model, may be shared by model family, and can include separate limits for long-context requests.
OpenAI’s docs also distinguish rate limits from monthly usage limits. Your account can automatically graduate to higher usage tiers as API spend increases, but the exact limits for your organization should be read from the OpenAI dashboard.
Key planning rules:
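- Track requests per minute (RPM), requests per day (RPD), tokens per minute (TPM), tokens per day (TPD), and, where relevant, images per minute (IPM).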
- Treat long-context workloads separately because they can have separate limits.
- Check shared-limit groups before assuming two related model IDs provide independent capacity.
- Monitor Batch queue limits; queued tokens count against the limit until the batch completes.
Source: OpenAI rate limits and usage tiers.
Monthly Cost Examples
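These examples use the published short-context standard rates for the GPT-5.5 and GPT-5.4 families unless a row is marked Batch/Flex. Each figure is input tokens times the input rate plus output tokens times the output rate, with both rates quoted per 1M tokens.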
Solo Developer
3M input + 1.5M output tokens per month.
| Model | Monthly cost |
|---|---|
| GPT-5.4 nano | $2.48 |
| GPT-5.4 mini | $9.00 |
| GPT-5.4 | $30.00 |
| GPT-5.5 | $60.00 |
| GPT-5.5 Batch/Flex | $30.00 |
Startup Team
30M input + 15M output tokens per month.
| Model | Monthly cost |
|---|---|
| GPT-5.4 nano | $24.75 |
| GPT-5.4 mini | $90.00 |
| GPT-5.4 | $300.00 |
| GPT-5.5 | $600.00 |
| GPT-5.5 Batch/Flex | $300.00 |
Production Scale
300M input + 150M output tokens per month.
| Model | Monthly cost |
|---|---|
| GPT-5.4 nano | $247.50 |
| GPT-5.4 mini | $900.00 |
| GPT-5.4 | $3,000.00 |
| GPT-5.5 | $6,000.00 |
| GPT-5.5 Batch/Flex | $3,000.00 |
Long-context requests above the pricing threshold cost more for GPT-5.5, GPT-5.5 Pro, and GPT-5.4, so run your real prompts through a token counter before budgeting large document or repository workflows.
Want exact numbers for your usage pattern? Try our AI Model Pricing Calculator.
Which OpenAI Model Should You Choose?
GPT-5.5: Best For Hard Production Work
Start with GPT-5.5 when quality matters more than token price: coding agents, tool-heavy workflows, grounded assistants, long-context retrieval, product-spec-to-plan work, and customer-facing tasks where polished execution matters.
GPT-5.5 Pro: Highest Accuracy, Highest Cost
Use GPT-5.5 Pro only after evaluations show that the higher price produces enough quality lift. It is priced for the hardest professional tasks, not routine traffic.
GPT-5.4: Frontier Quality At A Lower Price
GPT-5.4 is the practical fallback when GPT-5.5 is too expensive but you still need a frontier model with a 1M-token context window.
GPT-5.4 Mini And Nano: Default Routing Targets
Use GPT-5.4 mini for ordinary production requests that need good quality at lower latency and lower cost. Use GPT-5.4 nano for simple classification, extraction, tagging, routing, and formatting.
GPT-5, GPT-4o, And o3 Families: Compatibility And Specialized Routing
Keep existing GPT-5 or GPT-4o integrations if migration risk is higher than the expected savings. Route math, logic, and complex multi-step reasoning to the o3 family only when evals show it beats the general GPT-5 family on your task.
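The guidance in this section can be collapsed into a small routing function. This is a sketch under stated assumptions, not a recommended policy: the task labels, the `needs_1m_context` flag, and the `choose_model` name are invented for illustration, and the function returns display names rather than API model IDs. Your own evals should decide the real boundaries.

```python
# Hypothetical task labels, grouped by the guidance in this section.
HARD_TASKS = {"coding_agent", "tool_workflow", "grounded_assistant", "long_context_retrieval"}
SIMPLE_TASKS = {"classification", "extraction", "tagging", "routing", "formatting"}


def choose_model(task: str, needs_1m_context: bool = False) -> str:
    """Pick a model display name for a task label.

    Hard tasks go to GPT-5.5, other work that needs the 1M context window
    goes to GPT-5.4, simple tasks go to nano, and everything else goes to mini.
    """
    if task in HARD_TASKS:
        return "GPT-5.5"
    if needs_1m_context:
        return "GPT-5.4"
    if task in SIMPLE_TASKS:
        return "GPT-5.4 nano"
    return "GPT-5.4 mini"


print(choose_model("tagging"))
print(choose_model("summarization"))
print(choose_model("summarization", needs_1m_context=True))
print(choose_model("coding_agent"))
```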
Getting Started With The Current API
Prefer the Responses API for new work unless you have an existing Chat Completions integration.
```python
from openai import OpenAI

# The client reads OPENAI_API_KEY from the environment by default.
client = OpenAI()

response = client.responses.create(
    model="gpt-5.5",
    input="Review this API design and identify the highest-risk edge cases.",
)
print(response.output_text)
```
For lower-cost routing:
```python
# Reuses the client created in the previous example.
response = client.responses.create(
    model="gpt-5.4-mini",
    input="Extract company names, dates, and dollar amounts as JSON.",
)
print(response.output_text)
```
Cost Optimization Checklist
- Route simple tasks to mini or nano models before using GPT-5.5.
- Use Batch or Flex for asynchronous workloads to cut token rates in half.
- Keep reusable instructions and reference material stable so prompt caching can apply.
- Treat long-context pricing as a separate budget line above the threshold.
- Set output limits and structured formats so runaway generations do not dominate cost.
- Read actual rate and usage limits from your OpenAI dashboard, not a static blog table.
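To put a number on the prompt-caching item, you can blend the standard and cached input rates by the share of input tokens you expect to hit the cache. The sketch below assumes that share is known or measured; `effective_input_rate` is a hypothetical helper, and the 60% hit share is purely illustrative.

```python
def effective_input_rate(input_rate: float, cached_rate: float, cache_hit_fraction: float) -> float:
    """Blend standard and cached input rates (USD per 1M tokens).

    cache_hit_fraction is the share of input tokens billed at the cached rate.
    """
    if not 0.0 <= cache_hit_fraction <= 1.0:
        raise ValueError("cache_hit_fraction must be between 0 and 1")
    blended = cache_hit_fraction * cached_rate + (1.0 - cache_hit_fraction) * input_rate
    return round(blended, 4)


# GPT-5.5: $5.00 standard input, $0.50 cached input, with an assumed 60% cache-hit share.
print(effective_input_rate(5.00, 0.50, 0.6))
```

Because cached input is priced at a tenth of the standard rate in the tables above, even a moderate hit share moves the blended input rate substantially.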
Bottom Line
The old early-2026 framing is no longer the right starting point. In May 2026, the current OpenAI pricing conversation starts with GPT-5.5 for highest-quality work, GPT-5.4 for a lower-cost frontier option, and GPT-5.4 mini/nano for routed production traffic.
For most teams, the practical architecture is: GPT-5.4 mini or nano for routine work, GPT-5.4 for higher-quality long-context tasks, GPT-5.5 for the workflows where better execution changes the outcome, and Batch/Flex for every job that does not need an immediate response.