GLM-5.1
Z.AI
Updated June 2026. GLM-5.1 by Z.AI: $1.40/M cache-miss input tokens, $0.26/M cached input tokens, $4.40/M output tokens. 200K context window, 128K max output. Supports function calling and JSON mode.
Input Price: $1.40 per 1M tokens (cache miss)
Cached Input: $0.26 per 1M tokens
Output Price: $4.40 per 1M tokens
Context Window: 200K tokens
Specifications
| Provider | Z.AI |
| Model ID | glm-5.1 |
| Input Price | $1.40 / 1M cache-miss tokens |
| Cached Input Price | $0.26 / 1M tokens |
| Output Price | $4.40 / 1M tokens |
| Context Window | 200K tokens |
| Max Output | 128K tokens |
| Capabilities | text, function_calling, structured_output |
| Release Date | 2026-04-07 |
| Pricing Source | Official Z.AI pricing |
| Price Verified | 2026-06-14 · International Z.AI API pricing is listed in USD. China BigModel pricing is listed separately in RMB. |
| Notes | Latest generally documented GLM flagship API model for long-horizon coding and agent tasks. Z.AI says it can work autonomously on a single task for up to eight hours. |
Monthly Cost Estimates
Estimated monthly costs at different daily usage levels, assuming a 50% input / 50% output token split and a 30-day month; annual cost is 12 times the monthly figure. Input estimates use cache-miss pricing, so cache-heavy workloads can cost less.
| Daily Tokens | Monthly Cost | Annual Cost |
|---|---|---|
| 10K | $0.87 | $10.44 |
| 50K | $4.35 | $52.20 |
| 100K | $8.70 | $104.40 |
| 500K | $43.50 | $522.00 |
| 1M | $87.00 | $1,044.00 |
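The estimates in this table can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name `monthly_cost` and its parameters are hypothetical, and it assumes a 30-day month, which is the value that matches the figures shown.

```python
# Prices from this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 1.40   # cache-miss input
OUTPUT_PRICE_PER_M = 4.40  # output


def monthly_cost(daily_tokens, input_share=0.5, days_per_month=30):
    """Estimate monthly spend for a given total daily token volume.

    input_share is the fraction of tokens that are input (assumed 50%).
    days_per_month=30 is an assumption that matches the table above.
    """
    input_tokens = daily_tokens * input_share
    output_tokens = daily_tokens * (1 - input_share)
    daily_usd = (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000
    return round(daily_usd * days_per_month, 2)


for daily in (10_000, 50_000, 100_000, 500_000, 1_000_000):
    month = monthly_cost(daily)
    print(f"{daily:>9,} tokens/day: ${month:,.2f}/month, ${month * 12:,.2f}/year")
```

Changing `input_share` shows how sensitive the bill is to the input/output mix, since output tokens cost a little over three times as much as cache-miss input tokens.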
About GLM-5.1
GLM-5.1 is a large language model by Z.AI. It features a 200K token context window with up to 128K tokens of output per request. The model supports three capabilities: text, function_calling, and structured_output.
At $1.40 per million cache-miss input tokens and $4.40 per million output tokens, GLM-5.1 is positioned as a mid-range option in the Z.AI lineup. Repeated prefix input can be charged at $0.26 per million cached tokens. Use our Token Counter to estimate how many tokens your prompts use, and our Pricing Calculator to compare costs across all models.
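To see how much cached input can lower the bill, the two input rates can be blended by cache-hit ratio. This is a rough sketch under a simplifying assumption: cached tokens are billed only at the cached rate, with no separate cache-write or storage fee. The helper name `blended_input_price` is hypothetical.

```python
# Input prices from this page, in USD per 1M tokens.
CACHE_MISS_PRICE_PER_M = 1.40
CACHED_PRICE_PER_M = 0.26


def blended_input_price(cache_hit_ratio):
    """Effective input price per 1M tokens for a given cache-hit ratio.

    cache_hit_ratio is the fraction of input tokens served from cache (0 to 1).
    Assumes no extra cache-write or storage charges (not confirmed by this page).
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return (
        cache_hit_ratio * CACHED_PRICE_PER_M
        + (1 - cache_hit_ratio) * CACHE_MISS_PRICE_PER_M
    )


for ratio in (0.0, 0.5, 0.9):
    print(f"{ratio:.0%} cache hits: ${blended_input_price(ratio):.2f} per 1M input tokens")
```

With half of the input served from cache, the effective input rate under these assumptions is about $0.83 per million tokens.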
GLM-5.1 Key Details
- Pricing: $1.40/M cache-miss input tokens, $0.26/M cached input tokens, $4.40/M output tokens
- Context window: 200K tokens, suitable for large documents and codebases
- Max output: 128K tokens per response
- Capabilities: text, function_calling, structured_output
- Highlights: Latest generally documented GLM flagship API model for long-horizon coding and agent tasks. Z.AI says it can work autonomously on a single task for up to eight hours.
- Released: 2026-04-07
FAQ
How much does GLM-5.1 cost?
GLM-5.1 costs $1.40 per million cache-miss input tokens and $4.40 per million output tokens. Cached input costs $0.26 per million tokens. For a typical workload of 100K input tokens/day and 50K output tokens/day, that is $0.14 + $0.22 = $0.36 per day, or approximately $10.80 per 30-day month before cache-hit savings.
What is GLM-5.1's context window?
GLM-5.1 supports a context window of 200K tokens. This means your combined input prompt and output response can be up to 200K tokens. The maximum output per response is 128K tokens.
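The interaction between the two limits can be sketched as a simple budget check. This example assumes "200K" and "128K" mean exactly 200,000 and 128,000 tokens, which this page does not confirm, and the function name `max_output_budget` is hypothetical.

```python
# Limits from this page; the exact token counts are an assumption.
CONTEXT_WINDOW = 200_000  # combined input + output tokens
MAX_OUTPUT = 128_000      # output tokens per response


def max_output_budget(prompt_tokens):
    """Return how many output tokens remain for a prompt of the given size."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # Output is capped by both the per-response limit and the remaining window.
    return max(0, min(MAX_OUTPUT, remaining))


for prompt in (50_000, 150_000, 250_000):
    print(f"{prompt:,} prompt tokens leave room for {max_output_budget(prompt):,} output tokens")
```

Under these assumptions, prompts up to 72K tokens leave the full 128K output available; beyond that, the remaining context window becomes the binding limit.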
Is GLM-5.1 good for my use case?
GLM-5.1 supports text, function_calling, and structured_output. As a mid-range model, it balances capability and cost for most production use cases. Use our Pricing Calculator to compare it with alternatives.