
Mistral API Pricing 2026: Small 3.1 at $0.20/M, Large 3 at $2/M — Complete Guide

Updated February 2026 — Mistral AI API pricing: Large 3 at $2/$6, Medium 3 at $1/$3, Small 3.1 at $0.20/$0.60 per 1M tokens. Europe's leading AI provider, with a free tier included. Compared with GPT-5, Claude, Gemini, and DeepSeek.

DevTk.AI 2026-02-24 Updated 2026-02-24

Mistral AI is the most important AI company most developers outside Europe have never seriously evaluated. Headquartered in Paris and founded by former Meta and DeepMind researchers, Mistral has built a three-tier model lineup that competes on price, performance, and something no US or Chinese provider can match: native GDPR compliance and EU data residency.

In early 2026, Mistral offers three commercial models — Large 3 for complex reasoning, Medium 3 for balanced workloads, and Small 3.1 for high-throughput budget tasks — plus a free tier for experimentation and a family of open-weight models that you can self-host with zero API costs. The pricing is aggressive: Large 3 at $2/$6 per million tokens has the cheapest output pricing of any mid-to-high tier model, and Small 3.1 at $0.20/$0.60 directly competes with Gemini Flash and GPT-4.1 Nano at the budget end.

This guide covers every Mistral model’s pricing, head-to-head comparisons with GPT-5, Claude, Gemini, and DeepSeek, monthly cost estimates across three usage tiers, and optimization strategies — with a particular focus on when Mistral’s unique advantages (EU compliance, cheap output, open-weight options) make it the right choice for your project.

Mistral 2026 Model Lineup and Pricing

Mistral offers a clean three-tier commercial lineup, each targeting a distinct use case and budget:

| Model | Input Price | Output Price | Context | Max Output | Capabilities |
| --- | --- | --- | --- | --- | --- |
| Mistral Large 3 | $2.00/M | $6.00/M | 128K | 8K | Text, Vision, Function Calling |
| Mistral Medium 3 | $1.00/M | $3.00/M | 128K | 8K | Text, Function Calling |
| Mistral Small 3.1 | $0.20/M | $0.60/M | 128K | 4K | Text, Function Calling |

All prices in USD per 1 million tokens. Source: Mistral AI pricing. Last updated: February 2026.

What Makes This Lineup Stand Out

Mistral’s pricing has two distinctive features that set it apart from every other major provider:

  1. Large 3 has the cheapest output pricing in its tier. At $6.00 per million output tokens, it costs 40% less on output than GPT-5 ($10.00), 60% less than Claude Sonnet 4.5 ($15.00), and 25% less than o3 ($8.00). If your application generates long responses — code generation, content creation, detailed analysis — Large 3 saves significant money on every request.

  2. Small 3.1 at $0.20/$0.60 is one of the cheapest capable models on the market. It sits between Gemini 2.5 Flash ($0.15/$0.60) and DeepSeek V3.2 ($0.27/$1.10), offering a strong balance of price and capability for high-throughput tasks. At this price point, you can process millions of tokens per day without breaking the budget.

Specialized Models

Beyond the three commercial tiers, Mistral also offers specialized models:

  • Codestral — A code-focused model optimized for code generation, completion, and refactoring. Available through the Mistral API with dedicated pricing for code-heavy workloads.
  • Pixtral — Mistral’s dedicated vision model for image understanding and analysis tasks. Useful when you need multimodal capabilities without paying for Large 3.

Free Tier and Open-Weight Models

Mistral is one of only two major providers (alongside Google) that offer a free API tier for experimentation.

Free Tier

Mistral provides free, rate-limited access to its models through the La Plateforme API:

| Feature | Free Tier |
| --- | --- |
| Access | All commercial models |
| Rate Limits | Lower RPM than paid plans |
| Use Case | Prototyping, evaluation, learning |
| Credit Card | Not required to start |

This is meaningful because neither OpenAI nor Anthropic offers free API access. OpenAI provides a small initial credit that expires, and Anthropic requires payment from day one. Mistral lets you build and test against their full model lineup before committing any budget.

Open-Weight Models

Mistral’s open-weight models are a major differentiator. These models can be downloaded and self-hosted with no API costs at all:

| Model | Parameters | License | Best For |
| --- | --- | --- | --- |
| Mistral 7B | 7B | Apache 2.0 | Edge deployment, low-resource environments |
| Mixtral 8x7B | 46.7B (MoE) | Apache 2.0 | Self-hosted general tasks, high throughput |
| Mixtral 8x22B | 141B (MoE) | Apache 2.0 | Self-hosted complex reasoning |

If you have GPU infrastructure (or access to affordable cloud GPUs), self-hosting Mixtral eliminates per-token API costs entirely. A Mixtral 8x7B deployment on a single A100 GPU can handle significant throughput at a fixed compute cost — often cheaper than API pricing at scale.

For developers who need full control over their model deployment — custom fine-tuning, on-premise requirements, air-gapped environments — Mistral’s open-weight strategy is a genuine advantage that OpenAI, Anthropic, and Google cannot match (Meta’s Llama is the main competitor here).

Mistral vs GPT-5 vs Claude vs Gemini vs DeepSeek: Price Comparison

Here is how Mistral’s three tiers compare against every major competitor:

Flagship and Mid-Tier Comparison

| Model | Provider | Input | Output | vs Mistral Large 3 (Output) |
| --- | --- | --- | --- | --- |
| Mistral Large 3 | Mistral | $2.00 | $6.00 | Baseline |
| GPT-5 | OpenAI | $1.25 | $10.00 | 1.67x more on output |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | 1.67x more on output |
| Claude Sonnet 4.5 | Anthropic | $3.00 | $15.00 | 2.5x more on output |
| o3 | OpenAI | $2.00 | $8.00 | 1.33x more on output |
| Grok 3 | xAI | $3.00 | $15.00 | 2.5x more on output |
| Claude Opus 4.5 | Anthropic | $5.00 | $25.00 | 4.2x more on output |

Key insight: Mistral Large 3’s output price of $6.00/M is the lowest among all models in the mid-to-high capability tier. If your application is output-heavy — generating long code blocks, writing articles, producing detailed reports — Large 3 can save 40-60% on output costs compared to GPT-5 or Claude Sonnet.

Budget Tier Comparison

| Model | Provider | Input | Output | vs Mistral Small 3.1 |
| --- | --- | --- | --- | --- |
| Mistral Small 3.1 | Mistral | $0.20 | $0.60 | Baseline |
| Gemini 2.5 Flash | Google | $0.15 | $0.60 | 25% cheaper on input, same output |
| DeepSeek V3.2 | DeepSeek | $0.27 | $1.10 | 35% more on input, 83% more on output |
| GPT-4.1 Nano | OpenAI | $0.10 | $0.40 | 50% cheaper on input, 33% cheaper on output |
| Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | 5x / 8.3x more expensive |
| GPT-5 Mini | OpenAI | $0.25 | $2.00 | 25% / 233% more expensive |

Key insight: Mistral Small 3.1 sits in a competitive sweet spot. It is cheaper than DeepSeek V3.2 on both input and output, slightly more expensive than Gemini Flash on input (but with function calling support), and dramatically cheaper than Claude Haiku. For developers who need a budget model with solid text capabilities, Small 3.1 is a strong contender.

Monthly Cost Estimates

Here is what Mistral costs at three common usage levels, with reference prices from major competitors.

Solo Developer

100K input + 50K output tokens per day

| Model | Monthly Cost |
| --- | --- |
| Mistral Small 3.1 | $1.50 |
| Gemini 2.5 Flash | $1.35 |
| DeepSeek V3.2 | $2.46 |
| Mistral Medium 3 | $7.50 |
| Mistral Large 3 | $15.00 |
| GPT-5 | $18.75 |
| Claude Sonnet 4.5 | $31.50 |

Calculation: (100K x 30 / 1M) x input price + (50K x 30 / 1M) x output price

At solo developer scale, Mistral Small 3.1 costs just $1.50/month — less than a cup of coffee. Even Large 3 at $15.00 is cheaper than GPT-5 ($18.75) thanks to its lower output pricing.
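If you want to reproduce or adapt these estimates, the arithmetic is easy to script. A minimal sketch in Python, using the prices listed in this guide (adjust the table if Mistral's published rates change):

```python
# Rough monthly cost estimator based on the prices listed in this guide.
# Prices are USD per 1M tokens (input, output); adjust if the published rates change.
PRICES = {
    "mistral-large-3": (2.00, 6.00),
    "mistral-medium-3": (1.00, 3.00),
    "mistral-small-3.1": (0.20, 0.60),
}

def monthly_cost(model: str, daily_input: int, daily_output: int, days: int = 30) -> float:
    """Estimate monthly USD cost from daily input/output token volumes."""
    input_price, output_price = PRICES[model]
    monthly_input_m = daily_input * days / 1_000_000
    monthly_output_m = daily_output * days / 1_000_000
    return monthly_input_m * input_price + monthly_output_m * output_price

# Solo developer tier: 100K input + 50K output tokens per day
print(monthly_cost("mistral-small-3.1", 100_000, 50_000))  # 1.50
print(monthly_cost("mistral-large-3", 100_000, 50_000))    # 15.00
```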

Startup

1M input + 500K output tokens per day

| Model | Monthly Cost |
| --- | --- |
| Mistral Small 3.1 | $15.00 |
| Gemini 2.5 Flash | $13.50 |
| DeepSeek V3.2 | $24.60 |
| Mistral Medium 3 | $75.00 |
| Mistral Large 3 | $150.00 |
| GPT-5 | $187.50 |
| Claude Sonnet 4.5 | $315.00 |

At startup scale, the cost hierarchy becomes clear. Mistral Large 3 at $150/month is 20% cheaper than GPT-5 ($187.50) and 52% cheaper than Claude Sonnet ($315.00) — a meaningful difference that compounds over time. Small 3.1 at $15/month is remarkably cheap for production-grade text processing.

Enterprise / Production

10M input + 5M output tokens per day

| Model | Monthly Cost |
| --- | --- |
| Mistral Small 3.1 | $150 |
| Gemini 2.5 Flash | $135 |
| DeepSeek V3.2 | $246 |
| Mistral Medium 3 | $750 |
| Mistral Large 3 | $1,500 |
| GPT-5 | $1,875 |
| Claude Sonnet 4.5 | $3,150 |

At enterprise scale, Mistral Large 3 saves $375/month vs GPT-5 and $1,650/month vs Claude Sonnet 4.5. Over a year, that is $4,500 saved against GPT-5 and nearly $20,000 saved against Claude — and the gap widens further on workloads that skew even more heavily toward output tokens.

Calculate your exact costs: Use our AI Model Pricing Calculator.

When to Choose Mistral Over Other Providers

Mistral is not the right choice for every use case, but there are several scenarios where it is clearly the optimal pick.

1. EU Data Residency and GDPR Compliance

This is Mistral’s single biggest advantage for European developers and companies. As a French company, Mistral processes all API data within the EU by default. There is no need to negotiate data processing agreements, worry about transatlantic data transfers, or evaluate Privacy Shield successors.

For EU-based companies subject to GDPR, using Mistral eliminates an entire category of compliance overhead. Compare this to:

  • OpenAI — US-based, data processed in the US (though EU hosting options exist via Azure)
  • Anthropic — US-based, data processed in the US
  • Google — US-based, Vertex AI offers EU regions but requires careful configuration
  • DeepSeek — China-based, significant compliance concerns for EU data

If your legal team is involved in your AI vendor selection, Mistral simplifies the conversation dramatically.

2. Output-Heavy Workloads (Code Generation, Content Creation)

Any application where the output is significantly longer than the input benefits from Mistral Large 3’s cheap output pricing:

Example: Code generation (500-token prompt, 2000-token output)

| Model | Cost per Request | Monthly (100K requests) |
| --- | --- | --- |
| Mistral Large 3 | $0.013 | $1,300 |
| GPT-5 | $0.021 | $2,063 |
| Claude Sonnet 4.5 | $0.032 | $3,150 |

Mistral Large 3 saves 37% vs GPT-5 and 59% vs Claude Sonnet on output-heavy workloads. For code assistants, content generators, and report builders, this adds up fast.

3. Budget-Conscious High-Volume Processing

Mistral Small 3.1 at $0.20/$0.60 is ideal for:

  • Document classification and tagging
  • Content moderation at scale
  • Data extraction from structured text
  • Simple chatbot responses
  • Email categorization and routing

At these prices, processing 10 million input and 5 million output tokens per day comes to roughly $150/month (the enterprise estimate above).

4. Self-Hosted and Open-Weight Requirements

If your organization requires on-premise deployment, air-gapped operation, or full model weight access for fine-tuning, Mistral's open-weight models (Mistral 7B, Mixtral 8x7B, Mixtral 8x22B) are among the best options available. They are released under the permissive Apache 2.0 license, which allows commercial use and modification.

5. Fine-Tuning

Mistral offers fine-tuning through their API, allowing you to customize Small and Medium models for your specific domain. Fine-tuned smaller models often outperform general-purpose large models on narrow tasks at a fraction of the cost — turning a $2.00/M Large 3 workload into a $0.20/M fine-tuned Small 3.1 workload.

Getting Started with Mistral API

Step 1: Get Your API Key

  1. Visit console.mistral.ai
  2. Create an account (free tier available)
  3. Navigate to the API Keys section
  4. Generate a new API key
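Before writing application code, you can sanity-check the new key with a quick call. A minimal sketch using the OpenAI Python SDK pointed at Mistral's OpenAI-compatible endpoint — it assumes the models listing endpoint behaves like OpenAI's /v1/models and that the key is stored in a MISTRAL_API_KEY environment variable:

```python
import os
from openai import OpenAI

# Point the OpenAI SDK at Mistral's OpenAI-compatible endpoint
client = OpenAI(
    api_key=os.environ["MISTRAL_API_KEY"],
    base_url="https://api.mistral.ai/v1",
)

# List the models your key can access to confirm it works
for model in client.models.list():
    print(model.id)
```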

Step 2: Make Your First Request

Mistral uses an OpenAI-compatible API format, so if you already use the OpenAI SDK, switching is straightforward.

Python:

from openai import OpenAI

client = OpenAI(
    api_key="your-mistral-api-key",
    base_url="https://api.mistral.ai/v1"
)

response = client.chat.completions.create(
    model="mistral-large-latest",  # Large 3
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function to validate EU VAT numbers"}
    ],
    temperature=0.7,
    max_tokens=2000
)

print(response.choices[0].message.content)

JavaScript / TypeScript:

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'your-mistral-api-key',
  baseURL: 'https://api.mistral.ai/v1',
});

const response = await client.chat.completions.create({
  model: 'mistral-large-latest',
  messages: [
    { role: 'system', content: 'You are a helpful coding assistant.' },
    { role: 'user', content: 'Write a React component with TypeScript for a data table' },
  ],
});

console.log(response.choices[0].message.content);

Using Mistral Small 3.1 for budget tasks:

response = client.chat.completions.create(
    model="mistral-small-latest",  # Small 3.1
    messages=[
        {"role": "user", "content": "Classify this text as positive, negative, or neutral: 'The API is fast but documentation could be better'"}
    ],
    max_tokens=50
)

Using Mistral Medium 3 for balanced workloads:

response = client.chat.completions.create(
    model="mistral-medium-latest",  # Medium 3
    messages=[
        {"role": "system", "content": "You are a senior software engineer."},
        {"role": "user", "content": "Review this pull request and suggest improvements..."}
    ],
    max_tokens=4000
)

Step 3: Migrate from OpenAI

If you are currently using OpenAI, migration takes three small changes:

  1. Change base_url to https://api.mistral.ai/v1
  2. Replace your OpenAI api_key with your Mistral API key
  3. Set model to mistral-large-latest, mistral-medium-latest, or mistral-small-latest

Message format, parameters, and streaming all work identically.

Using Function Calling

All three Mistral commercial models support function calling (tool use), which is essential for building agentic applications:

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City name"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    tool_choice="auto"
)
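The model does not run the function itself; it returns the call it wants made, and you send the result back in a second request. A sketch of that round trip, reusing the client and tools defined above — the shape of the assistant and tool messages follows the OpenAI-compatible convention, so check Mistral's function-calling docs if the API rejects a field:

```python
import json

message = response.choices[0].message

if message.tool_calls:
    call = message.tool_calls[0]
    args = json.loads(call.function.arguments)

    # Run your real implementation here; this result is a stand-in
    weather = {"location": args["location"], "temperature": 12, "unit": "celsius"}

    follow_up = client.chat.completions.create(
        model="mistral-large-latest",
        messages=[
            {"role": "user", "content": "What's the weather in Paris?"},
            {
                # Echo the assistant turn that requested the tool call
                "role": "assistant",
                "content": message.content or "",
                "tool_calls": [{
                    "id": call.id,
                    "type": "function",
                    "function": {
                        "name": call.function.name,
                        "arguments": call.function.arguments,
                    },
                }],
            },
            {
                # Return the tool result so the model can answer in natural language
                "role": "tool",
                "tool_call_id": call.id,
                "name": call.function.name,
                "content": json.dumps(weather),
            },
        ],
        tools=tools,
    )
    print(follow_up.choices[0].message.content)
```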

Cost Optimization Tips

1. Use Tiered Model Routing

The single most impactful optimization: route requests to the cheapest model that can handle them.

  • Simple classification, extraction, formatting — Small 3.1 ($0.20/$0.60)
  • General chat, code review, summarization — Medium 3 ($1.00/$3.00)
  • Complex reasoning, multi-step analysis, creative writing — Large 3 ($2.00/$6.00)

A typical 60/25/15 split (Small/Medium/Large) instead of all-Large cuts overall API costs by roughly two-thirds at the prices above.
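In code, that routing is mostly a lookup table in front of the API call. A minimal sketch — the task labels are illustrative, and production routers often classify requests with a cheap model or rely on request metadata:

```python
from openai import OpenAI

client = OpenAI(api_key="your-mistral-api-key", base_url="https://api.mistral.ai/v1")

# Cheapest model that can handle each class of task (input/output prices per 1M tokens)
ROUTES = {
    "classify": "mistral-small-latest",   # $0.20 / $0.60 — classification, extraction, formatting
    "general": "mistral-medium-latest",   # $1.00 / $3.00 — chat, code review, summarization
    "reason": "mistral-large-latest",     # $2.00 / $6.00 — complex reasoning, long generation
}

def complete(task_type: str, prompt: str, max_tokens: int = 1000) -> str:
    """Route the prompt to the cheapest model suited to the task type."""
    model = ROUTES.get(task_type, "mistral-medium-latest")
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
    )
    return response.choices[0].message.content

# Routine tagging goes to Small 3.1; a multi-step plan goes to Large 3
print(complete("classify", "Tag this ticket as billing, bug, or feature: 'My invoice is wrong'"))
print(complete("reason", "Design a step-by-step plan to split this monolith into services..."))
```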

2. Leverage Cheap Output Pricing for Generation Tasks

If your application generates long outputs (code, articles, reports), Mistral Large 3’s $6.00/M output pricing is your biggest advantage. Structure your prompts to be concise (minimize input tokens) and let the model generate detailed output (where Mistral’s pricing advantage is strongest).

3. Consider Self-Hosting for Predictable Workloads

If your monthly API spend on Mistral exceeds $500-1,000/month and your workload is predictable, evaluate self-hosting Mixtral 8x7B or 8x22B. A dedicated GPU instance often costs less than API pricing at scale, with the added benefits of no rate limits and full data control.
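As a concrete starting point, one common way to self-host the open weights is an inference engine such as vLLM. A minimal offline-inference sketch, assuming vLLM is installed and your GPU setup has enough memory for Mixtral 8x7B (in practice that usually means a large GPU or quantized weights):

```python
# Sketch: self-hosted Mixtral 8x7B via vLLM offline inference (no per-token API costs).
from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # open weights, Apache 2.0
    tensor_parallel_size=1,                        # raise this to shard across GPUs
)
params = SamplingParams(temperature=0.7, max_tokens=512)

prompts = ["[INST] Summarize the key GDPR data-residency requirements. [/INST]"]
outputs = llm.generate(prompts, params)
print(outputs[0].outputs[0].text)
```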

4. Fine-Tune Small Models for Repetitive Tasks

If you are making thousands of similar API calls (email classification, content tagging, sentiment analysis), fine-tune Mistral Small on your domain data. A fine-tuned Small 3.1 at $0.20/M can match or exceed a general-purpose Large 3 at $2.00/M on narrow tasks — a 10x cost reduction.

5. Optimize Prompt Length

Every 1,000 tokens trimmed from your system prompt saves money on every request. At Large 3 pricing with 10K daily requests:

  • 1,000 extra tokens x 10,000 requests x 30 days = 300M wasted tokens — about $600/month at Large 3's $2.00/M input price
  • The same overhead on Small 3.1 ($0.20/M input): about $60/month

Prompt bloat is invisible per request but adds up quickly at scale, especially when combined with model routing.

6. Count Tokens Before Sending

Use our AI Token Counter to estimate token counts before making API calls. This helps you predict costs accurately and avoid surprise bills from oversized inputs.
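For a rough in-code estimate, a character-based heuristic is usually enough for budgeting — roughly four characters per token for English text is a common rule of thumb, though the real count depends on Mistral's tokenizer, so treat this sketch as an approximation:

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate: ~4 characters per token for English text."""
    return max(1, round(len(text) / chars_per_token))

def estimate_request_cost(prompt: str, expected_output_tokens: int,
                          input_price: float, output_price: float) -> float:
    """Approximate USD cost of one request; prices are USD per 1M tokens."""
    input_tokens = estimate_tokens(prompt)
    return (input_tokens * input_price + expected_output_tokens * output_price) / 1_000_000

# Budget check before sending a long prompt to Mistral Large 3 ($2.00 / $6.00 per 1M tokens)
prompt = "Review this pull request and suggest improvements...\n" + "diff line\n" * 500
print(f"~{estimate_tokens(prompt)} input tokens, "
      f"~${estimate_request_cost(prompt, 2000, 2.00, 6.00):.4f} per request")
```

After the call, the usage field on the response (prompt_tokens and completion_tokens) reports the exact counts, which you can reconcile against your estimates and logs.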

Mistral Strengths and Limitations

Strengths

  • EU data residency by default — GDPR compliance without extra configuration
  • Cheapest output pricing in the mid-tier ($6.00/M on Large 3 vs $10-15/M for competitors)
  • Open-weight models — Mistral 7B and Mixtral available for self-hosting under Apache 2.0
  • OpenAI-compatible API — minimal migration friction
  • Free tier — prototype without entering a credit card
  • Function calling on all three commercial models
  • Fine-tuning API — customize Small and Medium for domain-specific tasks
  • Three clean tiers — easy to understand pricing without context-based multipliers

Limitations

  • 128K context window — about two-thirds of Claude’s 200K, one-eighth of Gemini’s 1M
  • 8K max output on Large and Medium (4K on Small) — competitors offer 64K+
  • No prompt caching — Anthropic and OpenAI offer 50-90% discounts on repeated system prompts; Mistral does not
  • Smaller ecosystem — fewer third-party integrations, community tools, and tutorials compared to OpenAI or Anthropic
  • Limited multimodal — among the commercial tiers, only Large 3 supports vision (Pixtral is a separate model); no audio or video input
  • Benchmark gap — Large 3 generally trails GPT-5 and Claude Sonnet 4.5 on the hardest benchmarks (complex math, advanced reasoning)

Mistral vs. Competitors: When to Pick What

| Use Case | Best Choice | Why |
| --- | --- | --- |
| EU compliance required | Mistral | Native GDPR, French hosting |
| Output-heavy generation | Mistral Large 3 | $6/M output — cheapest in tier |
| Budget high-volume processing | Mistral Small 3.1 | $0.20/$0.60 is hard to beat |
| Self-hosted / on-premise | Mistral (Mixtral) | Apache 2.0, no API costs |
| Maximum reasoning quality | Claude Opus 4.5 / GPT-5 | Still ahead on hardest tasks |
| Long context (>128K) | Gemini 2.5 Pro (1M) | Mistral caps at 128K |
| Cheapest possible price | Gemini 2.5 Flash ($0.15) | Slightly cheaper than Small 3.1 |
| Vision + audio + video | GPT-5 / Gemini 2.5 | Broader multimodal support |
| Lowest cost with strong quality | DeepSeek V3.2 ($0.27/$1.10) | Best quality-per-dollar for text |
| Free tier for prototyping | Gemini / Mistral | Both offer no-cost access |

Bottom Line

Mistral occupies a unique position in the 2026 AI API market. It is not the cheapest (Gemini Flash and DeepSeek are cheaper), not the most capable (GPT-5 and Claude Opus lead on benchmarks), and not the largest ecosystem (OpenAI dominates). But it excels in three areas where no single competitor matches it: EU data residency, best-in-class output pricing, and a full open-weight model family.

For EU-based companies, Mistral should be your first evaluation — the compliance advantage alone can save weeks of legal review. For output-heavy applications, Large 3 at $6.00/M output is 40-60% cheaper than GPT-5 or Claude Sonnet on every generated token. And for developers who want full model control, Mixtral under Apache 2.0 offers capabilities that closed-source providers simply cannot.

The recommended strategy: start with Small 3.1 for high-volume tasks, use Medium 3 for general workloads, and reserve Large 3 for complex reasoning and generation where its cheap output pricing delivers the most savings. If your volume justifies it, evaluate self-hosting Mixtral to eliminate per-token costs entirely.

Related tools and guides: