What capabilities does Qwen 2.5 72B have?

Qwen 2.5 72B supports text and function_calling. It is open-source and competitive with Llama 3.3 70B.

Qwen 2.5 72B

Alibaba

Updated May 2026. Qwen 2.5 72B by Alibaba: $0.40/M cache-miss input tokens, $1.20/M output tokens. 128K context, 8K max output. Function calling.

Input Price: $0.40 per 1M tokens (cache miss)

Output Price: $1.20 per 1M tokens

Context Window: 128K tokens

Specifications

Provider	Alibaba
Model ID	qwen-2-5-72b
Input Price	$0.40 / 1M cache-miss tokens
Output Price	$1.20 / 1M tokens
Context Window	128K tokens
Max Output	8K tokens
Capabilities	text, function_calling
Release Date	2024-09
Pricing Source	Official Alibaba pricing
Price Verified	2026-02-26 · DashScope model pricing should be re-verified before it is used in new pricing content.
Notes	Open-source. Competitive with Llama 3.3 70B.

Monthly Cost Estimates

Estimated monthly costs at different daily usage levels, assuming a 50% input / 50% output token split and a 30-day month. Annual cost is 12 times the monthly figure. Input estimates use cache-miss pricing, so cache-heavy workloads can cost less.

Daily Tokens	Monthly Cost	Annual Cost
10K	$0.24	$2.88
50K	$1.20	$14.40
100K	$2.40	$28.80
500K	$12.00	$144.00
1.0M	$24.00	$288.00
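
The table rows can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name `monthly_cost` and its parameters are hypothetical, and it assumes the 50/50 split, a 30-day month, and the cache-miss prices listed in the specifications.

```python
# Prices in USD per 1M tokens, taken from the specifications on this page.
INPUT_PRICE_PER_M = 0.40   # cache-miss input
OUTPUT_PRICE_PER_M = 1.20  # output


def monthly_cost(daily_tokens, input_share=0.5, days_per_month=30):
    """Estimate the monthly USD cost for a total number of tokens per day.

    input_share is the fraction of daily tokens that are input tokens;
    the remainder is billed at the output price.
    """
    input_tokens = daily_tokens * input_share
    output_tokens = daily_tokens * (1 - input_share)
    daily_usd = (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000
    return daily_usd * days_per_month


# Print one line per usage level from the table, rounded to cents.
for daily in (10_000, 50_000, 100_000, 500_000, 1_000_000):
    monthly = monthly_cost(daily)
    print(f"{daily:>9,} tokens/day  ${monthly:,.2f}/month  ${monthly * 12:,.2f}/year")
```

Changing `input_share` shows how sensitive the estimate is to the input/output mix, since output tokens cost three times as much as cache-miss input tokens.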

About Qwen 2.5 72B

Qwen 2.5 72B is a large language model by Alibaba. It features a 128K token context window with up to 8K tokens of output per request. The model supports two capabilities: text and function_calling.

At $0.40 per million cache-miss input tokens and $1.20 per million output tokens, Qwen 2.5 72B is positioned as a cost-effective option in the Alibaba lineup. Use our Token Counter to estimate how many tokens your prompts use, and our Pricing Calculator to compare costs across all models.

Qwen 2.5 72B Key Details

Pricing: $0.40/M cache-miss input tokens, $1.20/M output tokens
Context window: 128K tokens, shared between the input prompt and the response
Max output: 8K tokens per response
Capabilities: text, function_calling
Highlights: Open-source. Competitive with Llama 3.3 70B.
Released: 2024-09

Other Alibaba Models

Qwen 2.5 Coder 32B

$0.2 / $0.6 per 1M

Similar Price Range

DeepSeek V4 Pro

DeepSeek

$0.435 / $0.87 per 1M · cached input $0.003625

Xiaomi MiMo-V2.5-Pro

Xiaomi MiMo

$0.435 / $0.87 per 1M · cached input $0.0036

Gemini 2.5 Flash

Google

$0.3 / $2.5 per 1M
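
Which of these models is cheapest depends on the input/output mix of the workload, not only on the headline prices. The sketch below ranks the models listed on this page for a given workload. It assumes each "$X / $Y per 1M" pair means cache-miss input price / output price, ignores cached-input discounts, and uses hypothetical names (`PRICES`, `workload_cost`, `rank_models`).

```python
# (input, output) prices in USD per 1M tokens, as listed on this page.
# Assumption: the first number is the cache-miss input price, the second the output price.
PRICES = {
    "Qwen 2.5 72B": (0.40, 1.20),
    "Qwen 2.5 Coder 32B": (0.20, 0.60),
    "DeepSeek V4 Pro": (0.435, 0.87),
    "Xiaomi MiMo-V2.5-Pro": (0.435, 0.87),
    "Gemini 2.5 Flash": (0.30, 2.50),
}


def workload_cost(input_tokens, output_tokens, prices):
    """Return the USD cost of a workload for one (input, output) price pair."""
    input_price, output_price = prices
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def rank_models(input_tokens, output_tokens):
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = {
        name: workload_cost(input_tokens, output_tokens, pair)
        for name, pair in PRICES.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


# Compare a balanced workload with an input-heavy one.
for label, tokens in (("balanced", (1_000_000, 1_000_000)),
                      ("input-heavy", (10_000_000, 1_000_000))):
    print(label)
    for name, cost in rank_models(*tokens):
        print(f"  {name:<22} ${cost:.2f}")
```

With these listed prices, Qwen 2.5 72B ranks behind DeepSeek V4 Pro on the balanced workload but slightly ahead of it on the input-heavy one, because its input price is lower and its output price is higher.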

Related Tools

AI Token Counter

Count tokens for Qwen 2.5 72B

Pricing Calculator

Compare all model prices

Throughput Planner

Plan RPM, TPM, and monthly cost for Qwen 2.5 72B

FAQ

How much does Qwen 2.5 72B cost?

Qwen 2.5 72B costs $0.40 per million cache-miss input tokens and $1.20 per million output tokens. For a typical workload of 100K input tokens/day and 50K output tokens/day, expect approximately $3.00 per 30-day month before cache-hit savings.
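
The monthly figure follows from the per-token prices. As a worked example, assuming a 30-day month (which is what the $3.00 estimate implies) and no cache hits:

```latex
\begin{align*}
\text{daily input cost}  &= \frac{100{,}000}{1{,}000{,}000} \times \$0.40 = \$0.04 \\
\text{daily output cost} &= \frac{50{,}000}{1{,}000{,}000} \times \$1.20 = \$0.06 \\
\text{monthly cost}      &= (\$0.04 + \$0.06) \times 30 = \$3.00
\end{align*}
```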

What is Qwen 2.5 72B's context window?

Qwen 2.5 72B supports a context window of 128K tokens. This means your combined input prompt and output response can be up to 128K tokens. The maximum output per response is 8K tokens.
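
Because the prompt and the response share the same window, a long prompt reduces how much output can be requested. The sketch below illustrates that budget check. It assumes "128K" and "8K" mean 128,000 and 8,000 tokens (the page does not say whether they are multiples of 1,024), and `max_output_budget` is a hypothetical helper, not part of any SDK.

```python
# Limits from this page. Assumption: "K" means 1,000 tokens.
CONTEXT_WINDOW = 128_000  # combined prompt + response tokens
MAX_OUTPUT = 8_000        # maximum tokens in a single response


def max_output_budget(prompt_tokens):
    """Return how many output tokens can be requested for a prompt of this size.

    The result is capped by both the per-response output limit and the space
    left in the context window; it is 0 if the prompt alone fills the window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    return max(0, min(MAX_OUTPUT, remaining))


for prompt in (1_000, 125_000, 130_000):
    print(f"prompt={prompt:>7,} tokens -> up to {max_output_budget(prompt):,} output tokens")
```

Under these assumptions, the 8K output cap is the binding limit until the prompt exceeds 120,000 tokens; beyond that, the remaining window space is what constrains the response.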

Is Qwen 2.5 72B good for my use case?

Qwen 2.5 72B supports text and function_calling. As a budget-friendly model, it works well for high-volume tasks like classification, summarization, and simple generation. Use our Pricing Calculator to compare with alternatives.