OpenAI API Pricing 2026: GPT-5 at $1.25/M, GPT-4.1 at $2/M — Complete Guide
Full OpenAI API pricing for 2026. GPT-5 at $1.25/$10, GPT-4.1 at $2/$8, GPT-4o at $2.50/$10, o3 at $2/$8 per 1M tokens. Includes free tier options, batch API discounts, and cost comparison with Claude and DeepSeek.
OpenAI’s model lineup in February 2026 is the largest of any single provider. With GPT-5 as the flagship, GPT-4.1 for million-token context, the o3 family for reasoning, and budget options ranging from GPT-5 Mini down to GPT-4.1 Nano, there is an OpenAI model for virtually every use case and budget.
But navigating nine different models with different pricing tiers, batch discounts, and rate limits can be confusing. This guide breaks down every OpenAI API model’s pricing, compares them against Claude, Gemini, and DeepSeek, and shows you exactly how to minimize your monthly bill.
OpenAI API Pricing Table (February 2026)
All prices are in USD per 1 million tokens.
Flagship Models
| Model | Input Price | Output Price | Context | Max Output | Best For |
|---|---|---|---|---|---|
| GPT-5 | $1.25/M | $10.00/M | 400K | 64K | Best overall quality, multimodal |
| o3 | $2.00/M | $8.00/M | 200K | 100K | Reasoning, math, complex logic |
GPT-5 is OpenAI’s most capable general-purpose model. It handles text, images, and structured output with the highest quality across benchmarks. The o3 model is the reasoning specialist — it excels at math, logic, coding puzzles, and multi-step analysis tasks where chain-of-thought matters.
Mid-Tier Models
| Model | Input Price | Output Price | Context | Max Output | Best For |
|---|---|---|---|---|---|
| GPT-4.1 | $2.00/M | $8.00/M | 1M | 64K | Long documents, large codebases |
| GPT-4o | $2.50/M | $10.00/M | 128K | 16K | Previous-gen flagship, stable |
| o4-mini | $1.10/M | $4.40/M | 200K | 100K | Budget reasoning tasks |
GPT-4.1 is the long-context champion with a full 1 million token context window — ideal for processing entire codebases, legal documents, or book-length content in a single request. GPT-4o remains available as the previous-generation flagship and is still a solid choice if you have existing integrations. The o4-mini offers reasoning capabilities at roughly half the cost of o3.
Budget Models
| Model | Input Price | Output Price | Context | Max Output | Best For |
|---|---|---|---|---|---|
| GPT-5 Mini | $0.25/M | $2.00/M | 400K | 64K | Budget flagship, good quality |
| GPT-4.1 Mini | $0.40/M | $1.60/M | 1M | 64K | Budget long-context |
| GPT-4.1 Nano | $0.10/M | $0.40/M | 1M | 64K | Highest volume, lowest cost |
| GPT-4o Mini | $0.15/M | $0.60/M | 128K | 16K | Previous-gen budget option |
All prices last updated: February 2026. Source: OpenAI pricing page.
The budget tier is where OpenAI really shines in 2026. GPT-5 Mini delivers surprisingly strong quality at just $0.25 per million input tokens — 5x cheaper than the full GPT-5. GPT-4.1 Nano is the cheapest model in the entire lineup at $0.10/$0.40, making it suitable for high-volume classification, extraction, and routing tasks where you need millions of calls per day.
OpenAI Free Tier and Rate Limits
OpenAI uses a tier-based system that determines your rate limits and access to models. Your tier automatically upgrades as you spend more on the platform.
| Tier | Spend Requirement | RPM (GPT-5) | TPM (GPT-5) |
|---|---|---|---|
| Free | $0 | 3 | 40,000 |
| Tier 1 | $5 | 500 | 200,000 |
| Tier 2 | $50 | 5,000 | 2,000,000 |
| Tier 3 | $100 | 5,000 | 4,000,000 |
| Tier 4 | $250 | 10,000 | 10,000,000 |
| Tier 5 | $1,000 | 10,000 | 30,000,000 |
The free tier gives you access to GPT-5 Mini and GPT-4o Mini with very limited rate limits — enough for prototyping and experimentation but not production use. Starting at Tier 1 ($5 spend), you unlock all models including GPT-5 and o3.
Key points about rate limits:
- RPM = requests per minute. Most production apps need Tier 2+ to avoid throttling.
- TPM = tokens per minute. Long-context workloads with GPT-4.1 (1M context) can hit TPM limits quickly at lower tiers.
- Rate limits are per-model, so using GPT-5 and o3 simultaneously gives you separate allocations for each.
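When you do hit a limit, the API returns HTTP 429 and the openai Python SDK raises `openai.RateLimitError`. A minimal retry sketch with exponential backoff and jitter (written generically around any callable so the sketch stays dependency-free):

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry a zero-argument callable with exponential backoff and jitter.

    In real code, catch openai.RateLimitError specifically; the bare
    Exception here keeps the sketch self-contained.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the error
            # Wait 1x, 2x, 4x, ... base_delay, plus jitter to spread retries
            time.sleep((2 ** attempt + random.random()) * base_delay)

# Usage:
# result = with_backoff(lambda: client.chat.completions.create(...))
```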
Batch API Pricing — 50% Off
OpenAI’s Batch API lets you submit large sets of requests for asynchronous processing, typically completed within 24 hours. The tradeoff: you give up real-time responses in exchange for a 50% discount on all models.
| Model | Standard Input | Batch Input | Standard Output | Batch Output |
|---|---|---|---|---|
| GPT-5 | $1.25/M | $0.625/M | $10.00/M | $5.00/M |
| GPT-4.1 | $2.00/M | $1.00/M | $8.00/M | $4.00/M |
| o3 | $2.00/M | $1.00/M | $8.00/M | $4.00/M |
| GPT-5 Mini | $0.25/M | $0.125/M | $2.00/M | $1.00/M |
| GPT-4.1 Nano | $0.10/M | $0.05/M | $0.40/M | $0.20/M |
Best use cases for Batch API:
- Bulk content generation (product descriptions, summaries)
- Large-scale data extraction and classification
- Evaluation and testing across hundreds of prompts
- Nightly data processing pipelines
At batch pricing, GPT-5 drops to $0.625/$5.00 per million tokens — making it cheaper than standard Claude Sonnet 4.5 pricing ($3.00/$15.00) while delivering comparable or better quality.
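Submitting a batch is a file-based workflow: you build a JSONL file where each line is one request with a unique `custom_id`. A minimal sketch of building that file (the line shape follows OpenAI's Batch API format; the `gpt-5` model ID matches this guide's lineup):

```python
import json

def batch_request_line(custom_id: str, model: str, prompt: str) -> str:
    """One JSONL line in the Batch API input-file format."""
    return json.dumps({
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {"model": model, "messages": [{"role": "user", "content": prompt}]},
    })

prompts = ["Summarize product A", "Summarize product B"]
lines = [batch_request_line(f"req-{i}", "gpt-5", p) for i, p in enumerate(prompts)]
with open("batch_input.jsonl", "w") as f:
    f.write("\n".join(lines) + "\n")
```

From there, upload the file with `client.files.create(file=open("batch_input.jsonl", "rb"), purpose="batch")`, submit it via `client.batches.create(input_file_id=..., endpoint="/v1/chat/completions", completion_window="24h")`, then poll `client.batches.retrieve(batch.id)` until the status is `completed` and download results from the batch's `output_file_id`.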
GPT-5 vs Claude vs Gemini vs DeepSeek: Price Comparison
How does OpenAI stack up against the competition? Here is a head-to-head comparison of flagship and popular models across all major providers.
| Model | Provider | Input | Output | vs GPT-5 |
|---|---|---|---|---|
| GPT-5 | OpenAI | $1.25 | $10.00 | Baseline |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | Same price, 1M context |
| Claude Sonnet 4.5 | Anthropic | $3.00 | $15.00 | 2.4x input / 1.5x output |
| Claude Opus 4.5 | Anthropic | $5.00 | $25.00 | 4x input / 2.5x output |
| Grok 3 | xAI | $3.00 | $15.00 | 2.4x input / 1.5x output |
| DeepSeek V3.2 | DeepSeek | $0.27 | $1.10 | 4.6x cheaper input / 9.1x cheaper output |
| GPT-5 Mini | OpenAI | $0.25 | $2.00 | 5x cheaper input / 5x cheaper output |
| Gemini 2.5 Flash | Google | $0.15 | $0.60 | 8.3x cheaper input / 16.7x cheaper output |
Key takeaways:
- GPT-5 and Gemini 2.5 Pro are price-matched at $1.25/$10.00. Google’s advantage is a 1M context window vs. OpenAI’s 400K. OpenAI’s advantage is broader tool ecosystem integration and vision capabilities.
- Claude models cost significantly more. Sonnet 4.5 costs 2.4x the input price of GPT-5 and 1.5x the output. Opus 4.5 is 4x more expensive on input. Claude’s edge is in nuanced instruction following and creative writing quality.
- DeepSeek is the budget king at 4.6x cheaper than GPT-5 on input and 9.1x cheaper on output. If raw cost is your primary concern and you don’t need multimodal capabilities, DeepSeek V3.2 is hard to beat.
- GPT-5 Mini offers the best value within the OpenAI ecosystem — 5x cheaper than GPT-5 with surprisingly strong quality for most tasks.
Monthly Cost Estimates
Here is what you can expect to pay monthly across three usage tiers, using OpenAI models vs. key competitors.
Solo Developer
100K input + 50K output tokens per day (3M input + 1.5M output per month)
| Model | Monthly Cost |
|---|---|
| GPT-4.1 Nano | $0.90 |
| DeepSeek V3.2 | $2.46 |
| GPT-5 Mini | $3.75 |
| GPT-4.1 | $18.00 |
| GPT-5 | $18.75 |
| Claude Sonnet 4.5 | $31.50 |
Math: GPT-5 = (3M x $1.25/M) + (1.5M x $10.00/M) = $3.75 + $15.00 = $18.75
Startup Team
1M input + 500K output tokens per day (30M input + 15M output per month)
| Model | Monthly Cost |
|---|---|
| GPT-4.1 Nano | $9.00 |
| DeepSeek V3.2 | $24.60 |
| GPT-5 Mini | $37.50 |
| GPT-4.1 | $180.00 |
| GPT-5 | $187.50 |
| Claude Sonnet 4.5 | $315.00 |
Enterprise / Production Scale
10M input + 5M output tokens per day (300M input + 150M output per month)
| Model | Monthly Cost |
|---|---|
| GPT-4.1 Nano | $90 |
| DeepSeek V3.2 | $246 |
| GPT-5 Mini | $375 |
| GPT-5 (Batch API) | $937.50 |
| GPT-4.1 | $1,800 |
| GPT-5 | $1,875 |
| Claude Sonnet 4.5 | $3,150 |
At enterprise scale, the Batch API makes a massive difference — cutting GPT-5 costs from $1,875 to $937.50 per month for workloads that don’t need real-time responses.
Want exact numbers for your usage pattern? Try our AI Model Pricing Calculator.
When to Choose Which OpenAI Model
With nine models to choose from, here is a straightforward decision guide:
GPT-5 — Your Default Choice
Use GPT-5 when you need the best overall quality and don’t have extreme cost constraints. It handles text, vision, structured output, and function calling better than any other OpenAI model. If you are unsure which model to use, start here.
Best for: General chat, content generation, code generation, multimodal tasks, production applications where quality matters.
GPT-4.1 — When You Need 1M Context
GPT-4.1’s 1 million token context window is its defining feature. Choose it when your input data simply won’t fit in other models’ context windows.
Best for: Processing entire codebases, long legal documents, book-length content, large CSV/JSON datasets, multi-file analysis.
o3 — Reasoning and Math
The o3 model uses internal chain-of-thought reasoning to solve complex problems. It outperforms GPT-5 on math, logic, and scientific reasoning benchmarks.
Best for: Math problems, formal logic, scientific analysis, complex multi-step reasoning, competitive programming.
o4-mini — Budget Reasoning
When you need reasoning capabilities but o3 is too expensive, o4-mini delivers solid reasoning at roughly half the cost ($1.10/$4.40 vs. $2.00/$8.00).
Best for: Moderate reasoning tasks, math tutoring, code review, logic puzzles where cost matters.
GPT-5 Mini — Budget Flagship
GPT-5 Mini is the sweet spot for most cost-conscious developers. At $0.25/$2.00, it delivers 80-90% of GPT-5’s quality at 20% of the cost.
Best for: Chatbots, customer support, content generation at scale, any task where “good enough” quality saves significant money.
GPT-4.1 Mini — Budget Long-Context
Need to process long documents but GPT-4.1 is too expensive? GPT-4.1 Mini gives you the full 1M context window at $0.40/$1.60.
Best for: Summarizing large documents, extracting data from long inputs, budget-friendly RAG with large context.
GPT-4.1 Nano — Highest Volume, Lowest Cost
At $0.10/$0.40, GPT-4.1 Nano is designed for high-volume pipelines where you need millions of API calls per day. Quality is lower than Mini variants but sufficient for structured tasks.
Best for: Classification, entity extraction, routing, data labeling, sentiment analysis, any task with clear structure and simple outputs.
GPT-4o and GPT-4o Mini — Previous Generation
These models remain available for backwards compatibility. If you have production systems running on GPT-4o, there is no urgency to migrate — but new projects should generally start with GPT-5 or GPT-5 Mini for better price-performance.
Getting Started with OpenAI API
Step 1: Get Your API Key
- Visit platform.openai.com
- Create an account or sign in
- Navigate to API Keys and generate a new key
- Add credits to your account (minimum $5 to unlock Tier 1)
Step 2: Install the SDK
```bash
pip install openai
```
Step 3: Make Your First Request
Python:
```python
from openai import OpenAI

client = OpenAI()  # Uses OPENAI_API_KEY env variable

response = client.chat.completions.create(
    model="gpt-5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)
print(response.choices[0].message.content)
```
Using GPT-4.1 for long-context:
```python
# Process an entire codebase in one request
with open("codebase.txt") as f:
    code = f.read()

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {"role": "system", "content": "You are a code review expert."},
        {"role": "user", "content": f"Review this codebase:\n\n{code}"}
    ],
    max_tokens=8000
)
```
Using o3 for reasoning:
```python
response = client.chat.completions.create(
    model="o3",
    messages=[
        {"role": "user", "content": "Prove that the square root of 2 is irrational."}
    ]
)
```
JavaScript / TypeScript:
```javascript
import OpenAI from 'openai';

const client = new OpenAI();

const response = await client.chat.completions.create({
  model: 'gpt-5',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Write a React hook for debouncing' },
  ],
});
console.log(response.choices[0].message.content);
```
Cost Optimization Tips
1. Model Routing
The single biggest cost-saving strategy is routing requests to different models based on complexity. Build a simple classifier (or use GPT-4.1 Nano as the router) to determine which model handles each request:
- Simple queries (greetings, FAQs, classification) -> GPT-4.1 Nano ($0.10/$0.40)
- Standard tasks (chat, code, content) -> GPT-5 Mini ($0.25/$2.00)
- Complex tasks (research, analysis, creative) -> GPT-5 ($1.25/$10.00)
- Reasoning-heavy (math, logic, proofs) -> o3 ($2.00/$8.00)
A typical 60/30/8/2 split across these tiers can reduce your average cost by 70% compared to sending everything to GPT-5.
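The routing split above can be sketched as a lookup table plus a blended-cost check. A minimal sketch (model IDs are assumed from the names in this guide; the classifier that produces the tier label is not shown):

```python
# Tier labels a cheap classifier (e.g. GPT-4.1 Nano) might return,
# mapped to the models from the routing split above. Model IDs are
# illustrative, taken from the names used in this guide.
MODEL_BY_TIER = {
    "simple": "gpt-4.1-nano",
    "standard": "gpt-5-mini",
    "complex": "gpt-5",
    "reasoning": "o3",
}

def route(tier: str) -> str:
    """Pick a model for a request tier; fall back to the budget flagship."""
    return MODEL_BY_TIER.get(tier, "gpt-5-mini")

def blended_input_price(split: dict) -> float:
    """Blended input price per 1M tokens for a traffic split (fractions sum to 1)."""
    prices = {"simple": 0.10, "standard": 0.25, "complex": 1.25, "reasoning": 2.00}
    return sum(frac * prices[tier] for tier, frac in split.items())

# The 60/30/8/2 split from the text:
blended = blended_input_price(
    {"simple": 0.60, "standard": 0.30, "complex": 0.08, "reasoning": 0.02}
)
```

For that split, the blended input price works out to about $0.275 per million tokens versus $1.25 for sending everything to GPT-5, roughly a 78% reduction on the input side.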
2. Use the Batch API Aggressively
Any workload that can tolerate 24-hour latency should use the Batch API. At 50% off, batch GPT-5 ($0.625/$5.00) is cheaper than standard GPT-4o ($2.50/$10.00) on both input and output.
3. Prompt Caching
OpenAI supports automatic prompt caching for repeated system prompts and prefixes. If your system prompt is consistent across requests, the cached portion costs significantly less on subsequent calls.
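Caching keys on a shared prefix, so the practical rule is to put stable content first and per-request content last. A minimal sketch (the product name and prompt text are hypothetical):

```python
# Static instructions first: prompt caching discounts repeated prefixes,
# so identical leading content across requests is what earns the discount.
SYSTEM_PROMPT = (
    "You are a support assistant for ExampleCo. "  # hypothetical product
    "Answer concisely and cite the relevant help-center article."
)

def build_messages(user_query: str, retrieved_context: str) -> list:
    """Stable system prompt first, per-request content last."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"{retrieved_context}\n\n{user_query}"},
    ]
```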
4. Optimize Token Usage
- Use our AI Token Counter to measure token counts before deploying prompts
- Keep system prompts concise: every 1,000 unnecessary tokens adds about $1.25 per thousand requests at GPT-5 input pricing
- Use structured output (JSON mode) to get focused responses and avoid verbose filler text
- Set `max_tokens` to prevent runaway output costs
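For a quick pre-deployment sanity check without a tokenizer dependency, the common rule of thumb of roughly 4 characters per token for English text gives a usable estimate (use a real tokenizer for exact counts):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def estimate_input_cost(text: str, price_per_m: float) -> float:
    """Approximate input cost in USD for one request at a per-1M-token price."""
    return estimate_tokens(text) / 1_000_000 * price_per_m

prompt = "You are a helpful assistant." * 100
cost = estimate_input_cost(prompt, 1.25)  # at GPT-5 input pricing
```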
5. Consider Alternatives for Simple Tasks
Not every request needs OpenAI. For high-volume, cost-sensitive workloads:
- DeepSeek V3.2 at $0.27/$1.10 handles most general tasks at a fraction of the cost
- Gemini 2.5 Flash at $0.15/$0.60 is even cheaper for simple classification and extraction
- Self-hosted Llama 4 eliminates per-token costs entirely if you have GPU infrastructure
Bottom Line
OpenAI’s 2026 pricing strategy gives developers more options than ever. GPT-5 at $1.25/$10.00 is competitively priced against Gemini 2.5 Pro and significantly cheaper than Claude Sonnet 4.5. The budget tier — GPT-5 Mini at $0.25/$2.00 and GPT-4.1 Nano at $0.10/$0.40 — makes OpenAI accessible for high-volume production use cases that were previously cost-prohibitive.
For most developers, the optimal strategy is:
- Start with GPT-5 Mini for development and testing
- Upgrade to GPT-5 only for tasks where Mini’s quality falls short
- Use GPT-4.1 when you need the 1M context window
- Route reasoning tasks to o3 or o4-mini
- Batch everything you can for the 50% discount
The models and pricing will continue to evolve, but one thing is clear: the cost of top-tier AI capabilities is dropping fast, and OpenAI is making sure it stays competitive.
Related guides:
- AI API Pricing Comparison 2026 — Full pricing table for all 7 major providers
- AI Model Pricing Calculator — Compare monthly costs across 40+ models
- AI Token Counter — Count tokens accurately before API calls
- DeepSeek API Pricing Guide 2026 — The cheapest capable AI model
- Claude API Pricing Guide 2026 — Anthropic’s premium API breakdown
- Grok API Pricing Guide 2026 — Grok 3 at $3/M, Mini at $0.30/M, $25 free credits
- Mistral API Pricing Guide 2026 — Large 3 at $2/M, Small 3.1 at $0.20/M, EU GDPR compliant
- Gemini 3.1 Pro Pricing Guide — $1.25/M, 77.1% ARC-AGI-2, 1M context
- GPT-5.3 Codex Pricing Guide — $2/M, agentic coding, 200K context, 32K output