DevTk.AI

Chinese AI Models in 2026: GLM-5.1, MiniMax M3, Qwen 3.7, DeepSeek V4, and MiMo Pricing

Compare current Chinese AI model API pricing, context windows, and agent use cases across GLM-5.1, MiniMax M3, Qwen 3.7, DeepSeek V4, and Xiaomi MiMo.

By DevTk.AI. Published 2026-06-14, updated 2026-06-14.

Chinese AI APIs are no longer one budget category. The current market includes ultra-cheap text routes, premium long-horizon coding models, native multimodal agents, and million-token context models.

The most useful question is not which model wins a vendor benchmark. It is which model completes your workload at the lowest total cost after output tokens, cached context, long-context premiums, latency, and retries.

Current Price Snapshot

USD prices below use official international API pricing. RMB prices remain in RMB because converting them would introduce exchange-rate drift.

| Model | Input / 1M | Cached Input / 1M | Output / 1M | Context | Best Fit |
| --- | --- | --- | --- | --- | --- |
| DeepSeek V4 Flash | $0.14 | $0.0028 | $0.28 | 1M | Lowest-cost text routing |
| Xiaomi MiMo-V2.5 | $0.14 | $0.0028 | $0.28 | 1M | Affordable multimodal agents |
| MiniMax M3, up to 512K input | $0.30 | $0.06 | $1.20 | 1M | Long-context multimodal agents |
| GLM-5.1 | $1.40 | $0.26 | $4.40 | 200K | Long-horizon engineering tasks |
| Qwen3.7 Plus, up to 256K input | ¥2 | Not listed | ¥8 | 1M | Alibaba Cloud production workloads |
| Qwen3.7 Max | ¥12 | Not listed | ¥36 | 1M | Premium Qwen flagship |

MiniMax M3 input above 512K costs $0.60/M input, $0.12/M cached input, and $2.40/M output. Qwen3.7 Plus input from 256K to 1M costs ¥6/M input and ¥24/M output.
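The tiered prices above can be turned into a per-request estimate. The following is a minimal sketch, not a billing reference: the model keys (`minimax-m3`, `qwen3.7-plus`) are hypothetical labels rather than API model IDs, the tier boundaries assume 1K = 1,000 tokens, and it assumes the tier is chosen by total input size and then applied to the whole request. Check each vendor's billing rules before relying on the numbers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    max_input_tokens: int          # tier applies when total input <= this value
    input_per_m: float             # price per 1M uncached input tokens
    cached_per_m: Optional[float]  # price per 1M cached input tokens, None if not listed
    output_per_m: float            # price per 1M output tokens


# Prices copied from the snapshot; keys are hypothetical labels, not API model IDs.
# Assumption: "512K" and "256K" mean 512,000 and 256,000 tokens.
PRICING = {
    "minimax-m3": ("USD", [
        Tier(512_000, 0.30, 0.06, 1.20),
        Tier(1_000_000, 0.60, 0.12, 2.40),
    ]),
    "qwen3.7-plus": ("CNY", [
        Tier(256_000, 2.0, None, 8.0),
        Tier(1_000_000, 6.0, None, 24.0),
    ]),
}


def request_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """Return (currency, estimated cost) for a single request.

    Assumption: the tier is selected by total input size (cached plus
    uncached) and its rates apply to the entire request.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    currency, tiers = PRICING[model]
    for tier in tiers:
        if input_tokens <= tier.max_input_tokens:
            break
    else:
        raise ValueError("input exceeds the largest listed tier")
    # When no cached price is listed, bill cached tokens at the normal input rate.
    cached_rate = tier.input_per_m if tier.cached_per_m is None else tier.cached_per_m
    fresh_tokens = input_tokens - cached_tokens
    cost = (
        fresh_tokens * tier.input_per_m
        + cached_tokens * cached_rate
        + output_tokens * tier.output_per_m
    ) / 1_000_000
    return currency, cost
```

Under these assumptions, a MiniMax M3 request with 400,000 input tokens (300,000 of them cached) and 10,000 output tokens comes to about $0.06, while a 600,000-token uncached input with the same output lands in the higher tier at about $0.384.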

What Changed

MiniMax M3 makes long context affordable

MiniMax M3 launched on June 1, 2026 with a 1M-token context window, native image and video input, tool use, and a very large maximum output allowance. Its standard international price of $0.30 per 1M input tokens and $1.20 per 1M output tokens (for inputs up to 512K) makes it a strong candidate for long codebases and multimodal agents.

GLM-5.1 focuses on sustained execution

Z.AI positions GLM-5.1 around long-horizon coding and agent work rather than short benchmark prompts. At $1.40 input and $4.40 output per 1M tokens, its international API price is more than three times that of MiniMax M3 and ten times or more that of DeepSeek V4 Flash or MiMo-V2.5, so teams should test whether fewer retries and better task completion justify the premium.
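One way to frame that test is a break-even calculation. The sketch below assumes a deliberately simple model that is not from any vendor: every attempt succeeds independently with a fixed probability, failed attempts are retried until one succeeds, and both models consume the same number of tokens per attempt. Under that model the expected cost per completed task is the cost per attempt divided by the success rate. The function names are illustrative.

```python
def attempt_cost(input_per_m, output_per_m, input_tokens, output_tokens):
    """Token cost of one attempt, given per-1M-token prices."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


def expected_cost_per_completion(cost_per_attempt, success_rate):
    """Expected spend until the first success, assuming independent retries."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


def break_even_success_rate(premium_cost, cheap_cost, cheap_success_rate):
    """Success rate the premium model needs to match the cheaper model's
    expected cost per completed task. Returns None when even a 100%
    success rate cannot close the gap on token cost alone."""
    if cheap_cost <= 0:
        raise ValueError("cheap_cost must be positive")
    needed = cheap_success_rate * premium_cost / cheap_cost
    return needed if needed <= 1 else None


# Hypothetical workload: 100K input tokens and 20K output tokens per attempt.
glm = attempt_cost(1.40, 4.40, 100_000, 20_000)      # GLM-5.1 list prices
minimax = attempt_cost(0.30, 1.20, 100_000, 20_000)  # MiniMax M3, up to 512K input
needed = break_even_success_rate(glm, minimax, cheap_success_rate=0.2)
```

With these illustrative numbers an attempt costs about $0.228 on GLM-5.1 and about $0.054 on MiniMax M3. If the cheaper model completes only 20% of attempts, GLM-5.1 would need to complete roughly 84% to break even on token cost. If the cheaper model already completes 50%, the function returns `None`, and the premium would have to be justified by something other than token spend, such as latency or human repair time.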

Qwen has moved far beyond Qwen 2.5

Qwen3.7 Plus and Max now offer 1M context through Alibaba Cloud Model Studio. Plus is the more practical price-performance route; Max is a premium model and should be reserved for tasks where its quality difference is measurable.

| Workload | Starting Model |
| --- | --- |
| Classification, extraction, simple tool calls | DeepSeek V4 Flash |
| Cost-sensitive coding agent | Xiaomi MiMo-V2.5 |
| Long codebase, image/video input, long output | MiniMax M3 |
| Multi-hour engineering workflow | GLM-5.1 |
| Existing Alibaba Cloud stack | Qwen3.7 Plus |
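In an application, the starting points in this table could live in a small routing map. This is a sketch only: the workload keys are hypothetical labels, and the values are display names from the table, not API model IDs. Unknown workloads raise an error instead of silently falling back to a default model.

```python
# Hypothetical workload labels mapped to the starting models in the table.
STARTING_MODEL = {
    "classification_extraction_simple_tools": "DeepSeek V4 Flash",
    "cost_sensitive_coding_agent": "Xiaomi MiMo-V2.5",
    "long_codebase_multimodal_long_output": "MiniMax M3",
    "multi_hour_engineering": "GLM-5.1",
    "alibaba_cloud_stack": "Qwen3.7 Plus",
}


def starting_model(workload: str) -> str:
    """Return the suggested first model to benchmark for a workload label."""
    try:
        return STARTING_MODEL[workload]
    except KeyError:
        raise ValueError(
            f"no starting model for workload {workload!r}; "
            "add a route after benchmarking"
        ) from None
```

Failing loudly on an unmapped workload keeps new traffic from defaulting to an untested model.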

Do not route everything to one flagship. Measure task success rate, total generated tokens, retries, latency, and human repair time. A model with a higher token price can still be cheaper per completed task, but only your workload can prove that.
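The metrics above can be computed from a log of attempts. The following is a minimal sketch under stated assumptions: the `Attempt` record and its fields are a hypothetical logging schema, all costs are in one currency, and human repair time is priced at a flat hourly rate that you supply.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    task_id: str
    succeeded: bool
    token_cost: float            # API spend for this attempt
    output_tokens: int           # tokens generated in this attempt
    latency_s: float             # wall-clock time for this attempt
    repair_minutes: float = 0.0  # human time spent fixing the result


def summarize(attempts, hourly_rate):
    """Aggregate logged attempts into per-model comparison metrics."""
    if not attempts:
        raise ValueError("no attempts to summarize")
    tasks = {a.task_id for a in attempts}
    completed = {a.task_id for a in attempts if a.succeeded}
    token_spend = sum(a.token_cost for a in attempts)
    repair_spend = sum(a.repair_minutes for a in attempts) / 60 * hourly_rate
    total_spend = token_spend + repair_spend
    return {
        "task_success_rate": len(completed) / len(tasks),
        "total_generated_tokens": sum(a.output_tokens for a in attempts),
        # Every attempt beyond the first for a task counts as a retry.
        "retries": len(attempts) - len(tasks),
        "mean_latency_s": sum(a.latency_s for a in attempts) / len(attempts),
        "cost_per_completed_task": (
            total_spend / len(completed) if completed else float("inf")
        ),
    }
```

Running `summarize` once per candidate model over the same task set gives figures that can be compared directly. Because repair time is included, a model with cheap tokens but frequent manual fixes can come out more expensive per completed task.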

Use the AI Model Pricing Calculator to compare the USD-priced models in the canonical dataset.

Official sources checked: Z.AI pricing, GLM-5.1 docs, MiniMax pay-as-you-go pricing, MiniMax M3 announcement, Alibaba Cloud Model Studio pricing, DeepSeek pricing, and Xiaomi MiMo pricing.
