
Mistral API Pricing 2026: Small 3.1 at $0.20/M, Large 3 at $2/M — Complete Guide

Updated February 2026 — Mistral AI API pricing: Large 3 at $2/$6, Medium 3 at $1/$3, Small 3.1 at $0.20/$0.60 per 1M tokens. Europe's leading AI provider, with a free tier included. Compared with GPT-5, Claude, Gemini, and DeepSeek.

DevTk.AI 2026-02-24 Updated 2026-02-24

Mistral AI is the most important AI company most developers outside Europe have never seriously evaluated. Headquartered in Paris and founded by former Meta and DeepMind researchers, Mistral has built a three-tier model lineup that competes on price, performance, and something no US or Chinese provider can match: native GDPR compliance and EU data residency.

In early 2026, Mistral offers three commercial models — Large 3 for complex reasoning, Medium 3 for balanced workloads, and Small 3.1 for high-throughput budget tasks — plus a free tier for experimentation and a family of open-weight models that you can self-host with zero API costs. The pricing is aggressive: Large 3 at $2/$6 per million tokens has the cheapest output pricing of any mid-to-high tier model, and Small 3.1 at $0.20/$0.60 directly competes with Gemini Flash and GPT-4.1 Nano at the budget end.

This guide covers every Mistral model’s pricing, head-to-head comparisons with GPT-5, Claude, Gemini, and DeepSeek, monthly cost estimates across three usage tiers, and optimization strategies — with a particular focus on when Mistral’s unique advantages (EU compliance, cheap output, open-weight options) make it the right choice for your project.

Mistral 2026 Model Lineup and Pricing

Mistral offers a clean three-tier commercial lineup, each targeting a distinct use case and budget:

| Model | Input Price | Output Price | Context | Max Output | Capabilities |
| --- | --- | --- | --- | --- | --- |
| Mistral Large 3 | $2.00/M | $6.00/M | 128K | 8K | Text, Vision, Function Calling |
| Mistral Medium 3 | $1.00/M | $3.00/M | 128K | 8K | Text, Function Calling |
| Mistral Small 3.1 | $0.20/M | $0.60/M | 128K | 4K | Text, Function Calling |

All prices in USD per 1 million tokens. Source: Mistral AI pricing. Last updated: February 2026.

What Makes This Lineup Stand Out

Mistral’s pricing has two distinctive features that set it apart from every other major provider:

  1. Large 3 has the cheapest output pricing in its tier. At $6.00 per million output tokens, it costs 40% less on output than GPT-5 ($10.00), 60% less than Claude Sonnet 4.5 ($15.00), and 25% less than o3 ($8.00). If your application generates long responses — code generation, content creation, detailed analysis — Large 3 saves significant money on every request.

  2. Small 3.1 at $0.20/$0.60 is one of the cheapest capable models on the market. It sits between Gemini 2.5 Flash ($0.15/$0.60) and DeepSeek V3.2 ($0.27/$1.10), offering a strong balance of price and capability for high-throughput tasks. At this price point, you can process millions of tokens per day without breaking the budget.

Specialized Models

Beyond the three commercial tiers, Mistral also offers specialized models:

  • Codestral — A code-focused model optimized for code generation, completion, and refactoring. Available through the Mistral API with dedicated pricing for code-heavy workloads.
  • Pixtral — Mistral’s dedicated vision model for image understanding and analysis tasks. Useful when you need multimodal capabilities without paying for Large 3.

Free Tier and Open-Weight Models

Mistral is one of only two major providers (alongside Google) that offer a free API tier for experimentation.

Free Tier

Mistral provides free, rate-limited access to its models through the La Plateforme API:

| Feature | Free Tier |
| --- | --- |
| Access | All commercial models |
| Rate Limits | Lower RPM than paid plans |
| Use Case | Prototyping, evaluation, learning |
| Credit Card | Not required to start |

This is meaningful because neither OpenAI nor Anthropic offers free API access. OpenAI provides a small initial credit that expires, and Anthropic requires payment from day one. Mistral lets you build and test against their full model lineup before committing any budget.

Open-Weight Models

Mistral’s open-weight models are a major differentiator. These models can be downloaded and self-hosted with no API costs at all:

| Model | Parameters | License | Best For |
| --- | --- | --- | --- |
| Mistral 7B | 7B | Apache 2.0 | Edge deployment, low-resource environments |
| Mixtral 8x7B | 46.7B (MoE) | Apache 2.0 | Self-hosted general tasks, high throughput |
| Mixtral 8x22B | 141B (MoE) | Apache 2.0 | Self-hosted complex reasoning |

If you have GPU infrastructure (or access to affordable cloud GPUs), self-hosting Mixtral eliminates per-token API costs entirely. A Mixtral 8x7B deployment on a single A100 GPU can handle significant throughput at a fixed compute cost — often cheaper than API pricing at scale.

For developers who need full control over their model deployment — custom fine-tuning, on-premise requirements, air-gapped environments — Mistral’s open-weight strategy is a genuine advantage that OpenAI, Anthropic, and Google cannot match (Meta’s Llama is the main competitor here).

Mistral vs GPT-5 vs Claude vs Gemini vs DeepSeek: Price Comparison

Here is how Mistral’s three tiers compare against every major competitor:

Flagship and Mid-Tier Comparison

| Model | Provider | Input | Output | vs Mistral Large 3 (Output) |
| --- | --- | --- | --- | --- |
| Mistral Large 3 | Mistral | $2.00 | $6.00 | Baseline |
| GPT-5 | OpenAI | $1.25 | $10.00 | 1.67x more on output |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | 1.67x more on output |
| Claude Sonnet 4.5 | Anthropic | $3.00 | $15.00 | 2.5x more on output |
| o3 | OpenAI | $2.00 | $8.00 | 1.33x more on output |
| Grok 3 | xAI | $3.00 | $15.00 | 2.5x more on output |
| Claude Opus 4.5 | Anthropic | $5.00 | $25.00 | 4.2x more on output |

Key insight: Mistral Large 3’s output price of $6.00/M is the lowest among all models in the mid-to-high capability tier. If your application is output-heavy — generating long code blocks, writing articles, producing detailed reports — Large 3 can save 40-60% on output costs compared to GPT-5 or Claude Sonnet.

Budget Tier Comparison

| Model | Provider | Input | Output | vs Mistral Small 3.1 |
| --- | --- | --- | --- | --- |
| Mistral Small 3.1 | Mistral | $0.20 | $0.60 | Baseline |
| Gemini 2.5 Flash | Google | $0.15 | $0.60 | 25% cheaper on input, same output |
| DeepSeek V3.2 | DeepSeek | $0.27 | $1.10 | 35% more on input, 83% more on output |
| GPT-4.1 Nano | OpenAI | $0.10 | $0.40 | 50% cheaper on input, 33% cheaper on output |
| Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | 5x / 8.3x more expensive |
| GPT-5 Mini | OpenAI | $0.25 | $2.00 | 25% / 233% more expensive |

Key insight: Mistral Small 3.1 sits in a competitive sweet spot. It is cheaper than DeepSeek V3.2 on both input and output, slightly more expensive than Gemini Flash on input (but with function calling support), and dramatically cheaper than Claude Haiku. For developers who need a budget model with solid text capabilities, Small 3.1 is a strong contender.

Monthly Cost Estimates

Here is what Mistral costs at three common usage levels, with reference prices from major competitors.

Solo Developer

100K input + 50K output tokens per day

| Model | Monthly Cost |
| --- | --- |
| Mistral Small 3.1 | $1.50 |
| Gemini 2.5 Flash | $1.35 |
| DeepSeek V3.2 | $2.46 |
| Mistral Medium 3 | $7.50 |
| Mistral Large 3 | $15.00 |
| GPT-5 | $18.75 |
| Claude Sonnet 4.5 | $31.50 |

Calculation: (100K x 30 / 1M) x input price + (50K x 30 / 1M) x output price

At solo developer scale, Mistral Small 3.1 costs just $1.50/month — less than a cup of coffee. Even Large 3 at $15.00 is cheaper than GPT-5 ($18.75) thanks to its lower output pricing.
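If you want to reproduce or adapt these estimates, the arithmetic is easy to script. A minimal sketch in Python, using the prices listed in this guide (adjust the table if Mistral's published rates change):

```python
# Rough monthly cost estimator based on the prices listed in this guide.
# Prices are USD per 1M tokens (input, output); adjust if the published rates change.
PRICES = {
    "mistral-large-3": (2.00, 6.00),
    "mistral-medium-3": (1.00, 3.00),
    "mistral-small-3.1": (0.20, 0.60),
}

def monthly_cost(model: str, daily_input: int, daily_output: int, days: int = 30) -> float:
    """Estimate monthly USD cost from daily input/output token volumes."""
    input_price, output_price = PRICES[model]
    monthly_input_m = daily_input * days / 1_000_000
    monthly_output_m = daily_output * days / 1_000_000
    return monthly_input_m * input_price + monthly_output_m * output_price

# Solo developer tier: 100K input + 50K output tokens per day
print(monthly_cost("mistral-small-3.1", 100_000, 50_000))  # 1.50
print(monthly_cost("mistral-large-3", 100_000, 50_000))    # 15.00
```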

Startup

1M input + 500K output tokens per day

| Model | Monthly Cost |
| --- | --- |
| Mistral Small 3.1 | $15.00 |
| Gemini 2.5 Flash | $13.50 |
| DeepSeek V3.2 | $24.60 |
| Mistral Medium 3 | $75.00 |
| Mistral Large 3 | $150.00 |
| GPT-5 | $187.50 |
| Claude Sonnet 4.5 | $315.00 |

At startup scale, the cost hierarchy becomes clear. Mistral Large 3 at $150/month is 20% cheaper than GPT-5 ($187.50) and 52% cheaper than Claude Sonnet ($315.00) — a meaningful difference that compounds over time. Small 3.1 at $15/month is remarkably cheap for production-grade text processing.

Enterprise / Production

10M input + 5M output tokens per day

| Model | Monthly Cost |
| --- | --- |
| Mistral Small 3.1 | $150 |
| Gemini 2.5 Flash | $135 |
| DeepSeek V3.2 | $246 |
| Mistral Medium 3 | $750 |
| Mistral Large 3 | $1,500 |
| GPT-5 | $1,875 |
| Claude Sonnet 4.5 | $3,150 |

At enterprise scale, Mistral Large 3 saves $375/month vs GPT-5 and $1,650/month vs Claude Sonnet 4.5. Over a year, that is $4,500 saved against GPT-5 and nearly $20,000 saved against Claude — and the gap widens further on workloads that skew even more heavily toward output tokens.

Calculate your exact costs: Use our AI Model Pricing Calculator.

When to Choose Mistral Over Other Providers

Mistral is not the right choice for every use case, but there are several scenarios where it is clearly the optimal pick.

1. EU Data Residency and GDPR Compliance

This is Mistral’s single biggest advantage for European developers and companies. As a French company, Mistral processes all API data within the EU by default. There is no need to negotiate data processing agreements, worry about transatlantic data transfers, or evaluate Privacy Shield successors.

For EU-based companies subject to GDPR, using Mistral eliminates an entire category of compliance overhead. Compare this to:

  • OpenAI — US-based, data processed in the US (though EU hosting options exist via Azure)
  • Anthropic — US-based, data processed in the US
  • Google — US-based, Vertex AI offers EU regions but requires careful configuration
  • DeepSeek — China-based, significant compliance concerns for EU data

If your legal team is involved in your AI vendor selection, Mistral simplifies the conversation dramatically.

2. Output-Heavy Workloads (Code Generation, Content Creation)

Any application where the output is significantly longer than the input benefits from Mistral Large 3’s cheap output pricing:

Example: Code generation (500-token prompt, 2000-token output)

| Model | Cost per Request | Monthly (100K requests) |
| --- | --- | --- |
| Mistral Large 3 | $0.013 | $1,300 |
| GPT-5 | $0.021 | $2,063 |
| Claude Sonnet 4.5 | $0.032 | $3,150 |

Mistral Large 3 saves 37% vs GPT-5 and 59% vs Claude Sonnet on output-heavy workloads. For code assistants, content generators, and report builders, this adds up fast.

3. Budget-Conscious High-Volume Processing

Mistral Small 3.1 at $0.20/$0.60 is ideal for:

  • Document classification and tagging
  • Content moderation at scale
  • Data extraction from structured text
  • Simple chatbot responses
  • Email categorization and routing

At these prices, processing 10 million input and 5 million output tokens per day comes to roughly $150/month (the enterprise estimate above).

4. Self-Hosted and Open-Weight Requirements

If your organization requires on-premise deployment, air-gapped operation, or full model weight access for fine-tuning, Mistral's open-weight models (Mistral 7B, Mixtral 8x7B, Mixtral 8x22B) are among the best options available. They are released under the permissive Apache 2.0 license, which allows commercial use and modification.

5. Fine-Tuning

Mistral offers fine-tuning through their API, allowing you to customize Small and Medium models for your specific domain. Fine-tuned smaller models often outperform general-purpose large models on narrow tasks at a fraction of the cost — turning a $2.00/M Large 3 workload into a $0.20/M fine-tuned Small 3.1 workload.

Getting Started with Mistral API

Step 1: Get Your API Key

  1. Visit console.mistral.ai
  2. Create an account (free tier available)
  3. Navigate to the API Keys section
  4. Generate a new API key
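Before writing application code, you can sanity-check the new key with a quick call. A minimal sketch using the OpenAI Python SDK pointed at Mistral's OpenAI-compatible endpoint — it assumes the models listing endpoint behaves like OpenAI's /v1/models and that the key is stored in a MISTRAL_API_KEY environment variable:

```python
import os
from openai import OpenAI

# Point the OpenAI SDK at Mistral's OpenAI-compatible endpoint
client = OpenAI(
    api_key=os.environ["MISTRAL_API_KEY"],
    base_url="https://api.mistral.ai/v1",
)

# List the models your key can access to confirm it works
for model in client.models.list():
    print(model.id)
```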

Step 2: Make Your First Request

Mistral uses an OpenAI-compatible API format, so if you already use the OpenAI SDK, switching is straightforward.

Python:

from openai import OpenAI

client = OpenAI(
    api_key="your-mistral-api-key",
    base_url="https://api.mistral.ai/v1"
)

response = client.chat.completions.create(
    model="mistral-large-latest",  # Large 3
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function to validate EU VAT numbers"}
    ],
    temperature=0.7,
    max_tokens=2000
)

print(response.choices[0].message.content)

JavaScript / TypeScript:

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'your-mistral-api-key',
  baseURL: 'https://api.mistral.ai/v1',
});

const response = await client.chat.completions.create({
  model: 'mistral-large-latest',
  messages: [
    { role: 'system', content: 'You are a helpful coding assistant.' },
    { role: 'user', content: 'Write a React component with TypeScript for a data table' },
  ],
});

console.log(response.choices[0].message.content);

Using Mistral Small 3.1 for budget tasks:

response = client.chat.completions.create(
    model="mistral-small-latest",  # Small 3.1
    messages=[
        {"role": "user", "content": "Classify this text as positive, negative, or neutral: 'The API is fast but documentation could be better'"}
    ],
    max_tokens=50
)

Using Mistral Medium 3 for balanced workloads:

response = client.chat.completions.create(
    model="mistral-medium-latest",  # Medium 3
    messages=[
        {"role": "system", "content": "You are a senior software engineer."},
        {"role": "user", "content": "Review this pull request and suggest improvements..."}
    ],
    max_tokens=4000
)

Step 3: Migrate from OpenAI

If you are currently using OpenAI, migration takes three small changes:

  1. Change base_url to https://api.mistral.ai/v1
  2. Replace your OpenAI api_key with your Mistral API key
  3. Set model to mistral-large-latest, mistral-medium-latest, or mistral-small-latest

Message format, parameters, and streaming all work identically.

Using Function Calling

All three Mistral commercial models support function calling (tool use), which is essential for building agentic applications:

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City name"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    tool_choice="auto"
)
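The model does not run the function itself; it returns the call it wants made, and you send the result back in a second request. A sketch of that round trip, reusing the client and tools defined above — the shape of the assistant and tool messages follows the OpenAI-compatible convention, so check Mistral's function-calling docs if the API rejects a field:

```python
import json

message = response.choices[0].message

if message.tool_calls:
    call = message.tool_calls[0]
    args = json.loads(call.function.arguments)

    # Run your real implementation here; this result is a stand-in
    weather = {"location": args["location"], "temperature": 12, "unit": "celsius"}

    follow_up = client.chat.completions.create(
        model="mistral-large-latest",
        messages=[
            {"role": "user", "content": "What's the weather in Paris?"},
            {
                # Echo the assistant turn that requested the tool call
                "role": "assistant",
                "content": message.content or "",
                "tool_calls": [{
                    "id": call.id,
                    "type": "function",
                    "function": {
                        "name": call.function.name,
                        "arguments": call.function.arguments,
                    },
                }],
            },
            {
                # Return the tool result so the model can answer in natural language
                "role": "tool",
                "tool_call_id": call.id,
                "name": call.function.name,
                "content": json.dumps(weather),
            },
        ],
        tools=tools,
    )
    print(follow_up.choices[0].message.content)
```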

Cost Optimization Tips

1. Use Tiered Model Routing

The single most impactful optimization: route requests to the cheapest model that can handle them.

  • Simple classification, extraction, formatting — Small 3.1 ($0.20/$0.60)
  • General chat, code review, summarization — Medium 3 ($1.00/$3.00)
  • Complex reasoning, multi-step analysis, creative writing — Large 3 ($2.00/$6.00)

A typical 60/25/15 split (Small/Medium/Large) instead of all-Large cuts overall API costs by roughly two-thirds at the prices above.
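In code, that routing is mostly a lookup table in front of the API call. A minimal sketch — the task labels are illustrative, and production routers often classify requests with a cheap model or rely on request metadata:

```python
from openai import OpenAI

client = OpenAI(api_key="your-mistral-api-key", base_url="https://api.mistral.ai/v1")

# Cheapest model that can handle each class of task (input/output prices per 1M tokens)
ROUTES = {
    "classify": "mistral-small-latest",   # $0.20 / $0.60 — classification, extraction, formatting
    "general": "mistral-medium-latest",   # $1.00 / $3.00 — chat, code review, summarization
    "reason": "mistral-large-latest",     # $2.00 / $6.00 — complex reasoning, long generation
}

def complete(task_type: str, prompt: str, max_tokens: int = 1000) -> str:
    """Route the prompt to the cheapest model suited to the task type."""
    model = ROUTES.get(task_type, "mistral-medium-latest")
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
    )
    return response.choices[0].message.content

# Routine tagging goes to Small 3.1; a multi-step plan goes to Large 3
print(complete("classify", "Tag this ticket as billing, bug, or feature: 'My invoice is wrong'"))
print(complete("reason", "Design a step-by-step plan to split this monolith into services..."))
```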

2. Leverage Cheap Output Pricing for Generation Tasks

If your application generates long outputs (code, articles, reports), Mistral Large 3’s $6.00/M output pricing is your biggest advantage. Structure your prompts to be concise (minimize input tokens) and let the model generate detailed output (where Mistral’s pricing advantage is strongest).

3. Consider Self-Hosting for Predictable Workloads

If your monthly API spend on Mistral exceeds $500-1,000/month and your workload is predictable, evaluate self-hosting Mixtral 8x7B or 8x22B. A dedicated GPU instance often costs less than API pricing at scale, with the added benefits of no rate limits and full data control.
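As a concrete starting point, one common way to self-host the open weights is an inference engine such as vLLM. A minimal offline-inference sketch, assuming vLLM is installed and your GPU setup has enough memory for Mixtral 8x7B (in practice that usually means a large GPU or quantized weights):

```python
# Sketch: self-hosted Mixtral 8x7B via vLLM offline inference (no per-token API costs).
from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # open weights, Apache 2.0
    tensor_parallel_size=1,                        # raise this to shard across GPUs
)
params = SamplingParams(temperature=0.7, max_tokens=512)

prompts = ["[INST] Summarize the key GDPR data-residency requirements. [/INST]"]
outputs = llm.generate(prompts, params)
print(outputs[0].outputs[0].text)
```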

4. Fine-Tune Small Models for Repetitive Tasks

If you are making thousands of similar API calls (email classification, content tagging, sentiment analysis), fine-tune Mistral Small on your domain data. A fine-tuned Small 3.1 at $0.20/M can match or exceed a general-purpose Large 3 at $2.00/M on narrow tasks — a 10x cost reduction.

5. Optimize Prompt Length

Every 1,000 tokens trimmed from your system prompt saves money on every request. At Large 3 pricing with 10K daily requests:

  • 1,000 extra tokens x 10,000 requests x 30 days = 300M wasted tokens — about $600/month at Large 3's $2.00/M input price
  • The same overhead on Small 3.1 ($0.20/M input): about $60/month

Prompt bloat is invisible per request but adds up quickly at scale, especially when combined with model routing.

6. Count Tokens Before Sending

Use our AI Token Counter to estimate token counts before making API calls. This helps you predict costs accurately and avoid surprise bills from oversized inputs.
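For a rough in-code estimate, a character-based heuristic is usually enough for budgeting — roughly four characters per token for English text is a common rule of thumb, though the real count depends on Mistral's tokenizer, so treat this sketch as an approximation:

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate: ~4 characters per token for English text."""
    return max(1, round(len(text) / chars_per_token))

def estimate_request_cost(prompt: str, expected_output_tokens: int,
                          input_price: float, output_price: float) -> float:
    """Approximate USD cost of one request; prices are USD per 1M tokens."""
    input_tokens = estimate_tokens(prompt)
    return (input_tokens * input_price + expected_output_tokens * output_price) / 1_000_000

# Budget check before sending a long prompt to Mistral Large 3 ($2.00 / $6.00 per 1M tokens)
prompt = "Review this pull request and suggest improvements...\n" + "diff line\n" * 500
print(f"~{estimate_tokens(prompt)} input tokens, "
      f"~${estimate_request_cost(prompt, 2000, 2.00, 6.00):.4f} per request")
```

After the call, the usage field on the response (prompt_tokens and completion_tokens) reports the exact counts, which you can reconcile against your estimates and logs.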

Mistral Strengths and Limitations

Strengths

  • EU data residency by default — GDPR compliance without extra configuration
  • Cheapest output pricing in the mid-tier ($6.00/M on Large 3 vs $10-15/M for competitors)
  • Open-weight models — Mistral 7B and Mixtral available for self-hosting under Apache 2.0
  • OpenAI-compatible API — minimal migration friction
  • Free tier — prototype without entering a credit card
  • Function calling on all three commercial models
  • Fine-tuning API — customize Small and Medium for domain-specific tasks
  • Three clean tiers — easy to understand pricing without context-based multipliers

Limitations

  • 128K context window — about two-thirds of Claude’s 200K, one-eighth of Gemini’s 1M
  • 8K max output on Large and Medium (4K on Small) — competitors offer 64K+
  • No prompt caching — Anthropic and OpenAI offer 50-90% discounts on repeated system prompts; Mistral does not
  • Smaller ecosystem — fewer third-party integrations, community tools, and tutorials compared to OpenAI or Anthropic
  • Limited multimodal — among the commercial tiers, only Large 3 supports vision (Pixtral is a separate model); no audio or video input
  • Benchmark gap — Large 3 generally trails GPT-5 and Claude Sonnet 4.5 on the hardest benchmarks (complex math, advanced reasoning)

Mistral vs. Competitors: When to Pick What

| Use Case | Best Choice | Why |
| --- | --- | --- |
| EU compliance required | Mistral | Native GDPR, French hosting |
| Output-heavy generation | Mistral Large 3 | $6/M output — cheapest in tier |
| Budget high-volume processing | Mistral Small 3.1 | $0.20/$0.60 is hard to beat |
| Self-hosted / on-premise | Mistral (Mixtral) | Apache 2.0, no API costs |
| Maximum reasoning quality | Claude Opus 4.5 / GPT-5 | Still ahead on hardest tasks |
| Long context (>128K) | Gemini 2.5 Pro (1M) | Mistral caps at 128K |
| Cheapest possible price | Gemini 2.5 Flash ($0.15) | Slightly cheaper than Small 3.1 |
| Vision + audio + video | GPT-5 / Gemini 2.5 | Broader multimodal support |
| Lowest cost with strong quality | DeepSeek V3.2 ($0.27/$1.10) | Best quality-per-dollar for text |
| Free tier for prototyping | Gemini / Mistral | Both offer no-cost access |

Bottom Line

Mistral occupies a unique position in the 2026 AI API market. It is not the cheapest (Gemini Flash and DeepSeek are cheaper), not the most capable (GPT-5 and Claude Opus lead on benchmarks), and not the largest ecosystem (OpenAI dominates). But it excels in three areas where no single competitor matches it: EU data residency, best-in-class output pricing, and a full open-weight model family.

For EU-based companies, Mistral should be your first evaluation — the compliance advantage alone can save weeks of legal review. For output-heavy applications, Large 3 at $6.00/M output is 40-60% cheaper than GPT-5 or Claude Sonnet on every generated token. And for developers who want full model control, Mixtral under Apache 2.0 offers capabilities that closed-source providers simply cannot.

The recommended strategy: start with Small 3.1 for high-volume tasks, use Medium 3 for general workloads, and reserve Large 3 for complex reasoning and generation where its cheap output pricing delivers the most savings. If your volume justifies it, evaluate self-hosting Mixtral to eliminate per-token costs entirely.

Related tools and guides: