MiniMax M3

MiniMax

Updated August 2026. MiniMax M3 by MiniMax costs $0.30 per 1M cache-miss input tokens and $1.20 per 1M output tokens. Cached input costs $0.06 per 1M tokens, and explicit cache writes cost $0.375 per 1M tokens. Above 512K input tokens, long-context pricing applies: $0.60 per 1M input tokens and $2.40 per 1M output tokens. The model has a 1.0M-token context window and a 524K-token maximum output, and it accepts image and video input.

Input Price: $0.30 per 1M cache-miss tokens
Cached Input: $0.06 per 1M tokens
Output Price: $1.20 per 1M tokens
Context Window: 1.0M tokens

Specifications

Provider	MiniMax
Model ID	MiniMax-M3
Input Price	$0.30 / 1M cache-miss tokens
Cached Input Price	$0.06 / 1M tokens
Explicit Cache Write Price	$0.375 / 1M tokens
Output Price	$1.20 / 1M tokens
Long Context Threshold	512K input tokens
Long Context Pricing	$0.60 / 1M cache-miss input, $0.12 / 1M cached input, $0.375 / 1M explicit cache writes, $2.40 / 1M output
Context Window	1.0M tokens
Max Output	524K tokens
Capabilities	text, vision, video, function_calling
Release Date	2026-06-01
Pricing Source	Official MiniMax pricing
Price Verified	2026-08-01 · MiniMax M3 international API prices include the current 50% token discount; long-context rates apply above 512K tokens.
Notes	Latest MiniMax frontier coding and agent model. Supports Adaptive Thinking, tool use, and native image/video input. Inputs above 512K tokens use long-context pricing and currently require limited-access availability.
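
The rates in the table above can be combined into a per-request estimate. The following is a minimal sketch, not MiniMax's billing logic. It assumes that "512K" means 512,000 tokens, that cache-write tokens count toward the input total, and that a request whose total input exceeds the threshold is billed entirely at long-context rates. The names `Rates` and `request_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD per 1M tokens, copied from the specifications table."""
    input_miss: float
    input_cached: float
    cache_write: float
    output: float


STANDARD = Rates(input_miss=0.30, input_cached=0.06, cache_write=0.375, output=1.20)
LONG_CONTEXT = Rates(input_miss=0.60, input_cached=0.12, cache_write=0.375, output=2.40)

# Assumption: "512K" is read as 512,000 tokens.
LONG_CONTEXT_THRESHOLD = 512_000


def request_cost(miss_tokens, cached_tokens, output_tokens, cache_write_tokens=0):
    """Estimate the USD cost of a single request.

    Assumption: if total input exceeds the threshold, every token in the
    request is billed at long-context rates.
    """
    total_input = miss_tokens + cached_tokens + cache_write_tokens
    rates = LONG_CONTEXT if total_input > LONG_CONTEXT_THRESHOLD else STANDARD
    return (
        miss_tokens * rates.input_miss
        + cached_tokens * rates.input_cached
        + cache_write_tokens * rates.cache_write
        + output_tokens * rates.output
    ) / 1_000_000


if __name__ == "__main__":
    print(f"standard request:     ${request_cost(100_000, 0, 50_000):.4f}")
    print(f"long-context request: ${request_cost(600_000, 0, 10_000):.4f}")
```

If the provider tiers pricing differently (for example, only the tokens above the threshold), the rate selection line is the only part that would need to change.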

Monthly Cost Estimates

Estimated monthly costs at different daily usage levels, assuming a 50% input / 50% output token split and a 30-day month. Input estimates use cache-miss pricing, so cache-heavy workloads can cost less.

Daily Tokens	Monthly Cost	Annual Cost
10K	$0.225	$2.70
50K	$1.13	$13.50
100K	$2.25	$27.00
500K	$11.25	$135.00
1.0M	$22.50	$270.00
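
The table's figures can be reproduced with a few lines of arithmetic. This sketch assumes a 30-day month and a 12-month year, which is what the listed numbers imply. The function name `monthly_cost` is hypothetical.

```python
INPUT_PRICE = 0.30   # USD per 1M cache-miss input tokens
OUTPUT_PRICE = 1.20  # USD per 1M output tokens
DAYS_PER_MONTH = 30  # assumption inferred from the table's figures
MONTHS_PER_YEAR = 12


def monthly_cost(daily_tokens, input_share=0.5):
    """Monthly USD cost for a daily token volume split between input and output."""
    input_tokens = daily_tokens * input_share
    output_tokens = daily_tokens * (1 - input_share)
    daily = (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000
    return daily * DAYS_PER_MONTH


if __name__ == "__main__":
    for daily_tokens in (10_000, 50_000, 100_000, 500_000, 1_000_000):
        monthly = monthly_cost(daily_tokens)
        annual = monthly * MONTHS_PER_YEAR
        print(f"{daily_tokens:>9,} tokens/day  ${monthly:.3f}/month  ${annual:.2f}/year")
```

Changing `input_share` shows how sensitive the estimate is to the split: output tokens cost four times as much as cache-miss input tokens, so output-heavy workloads cost more for the same total volume.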

About MiniMax M3

MiniMax M3 is a large language model by MiniMax. It features a 1.0M token context window with up to 524K tokens of output per request. The model supports 4 capabilities: text, vision, video, function_calling.

At $0.30 per million cache-miss input tokens and $1.20 per million output tokens, MiniMax M3 is positioned as a cost-effective option in the MiniMax lineup. Repeated prompt prefixes that hit the cache are billed at $0.06 per million tokens. Use our Token Counter to estimate how many tokens your prompts use, and our Pricing Calculator to compare costs across all models.

MiniMax M3 Key Details

Pricing: $0.30/M cache-miss input tokens, $0.06/M cached input tokens, $0.375/M explicit cache writes, $1.20/M output tokens
Context window: 1.0M tokens (one of the largest available)
Max output: 524K tokens per response
Capabilities: text, vision, video, function_calling
Highlights: Latest MiniMax frontier coding and agent model. Supports Adaptive Thinking, tool use, and native image/video input. Inputs above 512K tokens use long-context pricing and currently require limited-access availability.
Released: 2026-06-01

Similar Price Range

Model	Provider	Input / Output (per 1M tokens)	Cached Input (per 1M tokens)
Gemini 3.5 Flash-Lite	Google	$0.30 / $2.50	$0.03
Gemini 2.5 Flash	Google	$0.30 / $2.50	$0.03
Amazon Nova 2 Lite	Amazon	$0.30 / $2.50	not listed

Related Tools

AI Token Counter: Count tokens for MiniMax M3
Pricing Calculator: Compare all model prices
Throughput Planner: Plan RPM, TPM, and monthly cost for MiniMax M3

FAQ

How much does MiniMax M3 cost?

MiniMax M3 costs $0.30 per million cache-miss input tokens and $1.20 per million output tokens. Cached input costs $0.06 per million tokens. For a typical workload of 100K input tokens/day and 50K output tokens/day, expect approximately $2.70/month (30-day month) before cache-hit savings.
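
To see how cache hits change that estimate, the sketch below blends cache-miss and cached input rates by a hit ratio. It assumes a 30-day month and ignores explicit cache-write charges. The `cache_hit_ratio` parameter and the function name are hypothetical, and actual hit rates depend on how much of each prompt repeats.

```python
INPUT_MISS = 0.30    # USD per 1M cache-miss input tokens
INPUT_CACHED = 0.06  # USD per 1M cached input tokens
OUTPUT = 1.20        # USD per 1M output tokens
DAYS_PER_MONTH = 30  # assumption


def monthly_cost_with_cache(daily_input, daily_output, cache_hit_ratio=0.0):
    """Monthly USD cost when a fraction of input tokens is served from cache.

    Explicit cache-write charges are not modeled here.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    cached = daily_input * cache_hit_ratio
    missed = daily_input - cached
    daily = (missed * INPUT_MISS + cached * INPUT_CACHED + daily_output * OUTPUT) / 1_000_000
    return daily * DAYS_PER_MONTH


if __name__ == "__main__":
    for ratio in (0.0, 0.5, 0.8):
        cost = monthly_cost_with_cache(100_000, 50_000, cache_hit_ratio=ratio)
        print(f"cache hit ratio {ratio:.0%}: ${cost:.2f}/month")
```

Because output tokens dominate this workload's bill, even a high hit ratio reduces the total only modestly; input-heavy workloads benefit more.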

What is MiniMax M3's context window?

MiniMax M3 supports a context window of 1.0M tokens. This means your combined input prompt and output response can be up to 1.0M tokens. The maximum output per response is 524K tokens.
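
Since input and output share the window, the output a request can receive shrinks as the prompt grows. The following sketch computes that budget, assuming "1.0M" means 1,000,000 tokens and "524K" means 524,000 tokens. The exact limits may differ, and `max_output_budget` is a hypothetical helper.

```python
CONTEXT_WINDOW = 1_000_000  # assumption: "1.0M" read as 1,000,000 tokens
MAX_OUTPUT = 524_000        # assumption: "524K" read as 524,000 tokens


def max_output_budget(input_tokens):
    """Largest output (in tokens) that still fits alongside the given input."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - input_tokens
    # Output is capped both by the per-response limit and by remaining window space.
    return max(0, min(MAX_OUTPUT, remaining))


if __name__ == "__main__":
    for prompt_size in (100_000, 800_000, 1_200_000):
        print(f"{prompt_size:>9,} input tokens -> up to {max_output_budget(prompt_size):,} output tokens")
```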

Is MiniMax M3 good for my use case?

MiniMax M3 supports text, vision, video, and function_calling, and it is listed as the latest MiniMax frontier coding and agent model. Its low per-token price also suits high-volume tasks such as classification, summarization, and simple generation. Use our Pricing Calculator to compare with alternatives.