GLM-5 Turbo
Z.AI
Updated June 2026. GLM-5 Turbo by Z.AI: $1.20/M cache-miss input tokens, $0.24/M cached input tokens, $4.00/M output tokens. 200K context window, 128K max output. Supports function calling and JSON mode.
- Input Price: $1.20 per 1M cache-miss tokens
- Cached Input: $0.24 per 1M tokens
- Output Price: $4.00 per 1M tokens
- Context Window: 200K tokens
Specifications
| Field | Value |
|---|---|
| Provider | Z.AI |
| Model ID | glm-5-turbo |
| Input Price | $1.20 / 1M cache-miss tokens |
| Cached Input Price | $0.24 / 1M tokens |
| Output Price | $4.00 / 1M tokens |
| Context Window | 200K tokens |
| Max Output | 128K tokens |
| Capabilities | text, function_calling, structured_output |
| Release Date | 2026-05 |
| Pricing Source | Official Z.AI pricing |
| Price Verified | 2026-06-14 · International Z.AI API pricing is listed in USD. China BigModel pricing is listed separately in RMB. |
| Notes | Current faster GLM model optimized for tool use and agent workflows. |
Monthly Cost Estimates
Estimated monthly costs at different daily usage levels, assuming a 50% input / 50% output token split. The figures correspond to a 30-day month, and each annual figure is 12 times the monthly one. Input is priced at the cache-miss rate, so cache-heavy workloads can cost less.
| Daily Tokens | Monthly Cost | Annual Cost |
|---|---|---|
| 10K | $0.78 | $9.36 |
| 50K | $3.90 | $46.80 |
| 100K | $7.80 | $93.60 |
| 500K | $39.00 | $468.00 |
| 1M | $78.00 | $936.00 |
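The table above can be reproduced with a short calculation. This is a minimal sketch, not an official calculator: the prices come from this page, while the 30-day month, the 12-month year, and the function name `monthly_cost` are assumptions chosen to match the listed figures.

```python
# Prices are taken from this page; the 30-day month is an assumption
# that happens to match the table.
INPUT_PRICE_PER_M = 1.20   # USD per 1M cache-miss input tokens
OUTPUT_PRICE_PER_M = 4.00  # USD per 1M output tokens
DAYS_PER_MONTH = 30


def monthly_cost(daily_tokens: int, input_share: float = 0.5) -> float:
    """Estimate monthly USD cost for a daily token volume, with no cache hits."""
    input_tokens = daily_tokens * input_share
    output_tokens = daily_tokens * (1 - input_share)
    daily_cost = (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000
    return round(daily_cost * DAYS_PER_MONTH, 2)


for daily in (10_000, 50_000, 100_000, 500_000, 1_000_000):
    month = monthly_cost(daily)
    # Annual cost is taken as 12 months, as in the table.
    print(f"{daily:>9,} tokens/day: ${month:.2f}/month, ${month * 12:.2f}/year")
```

Changing `input_share` shows how sensitive the estimate is to the input/output mix: output tokens cost more than three times as much as cache-miss input tokens, so output-heavy workloads land above the table values.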
About GLM-5 Turbo
GLM-5 Turbo is a large language model by Z.AI. It has a 200K-token context window and can produce up to 128K tokens of output per request. The model supports three capabilities: text, function_calling, and structured_output.
At $1.20 per million cache-miss input tokens and $4.00 per million output tokens, GLM-5 Turbo is positioned as a mid-range option in the Z.AI lineup. Input that repeats a previously cached prefix can be charged at $0.24 per million tokens, one fifth of the cache-miss rate. Use our Token Counter to estimate how many tokens your prompts use, and our Pricing Calculator to compare costs across all models.
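To see how cached input changes the effective input price, the two rates can be blended by cache-hit share. This is a rough sketch under stated assumptions: the page does not describe how Z.AI decides what counts as a cache hit, so `cache_hit_ratio` is a hypothetical input you would measure from your own usage, and `blended_input_price` is an illustrative name.

```python
CACHE_MISS_PRICE_PER_M = 1.20  # USD per 1M cache-miss input tokens (from this page)
CACHED_PRICE_PER_M = 0.24      # USD per 1M cached input tokens (from this page)


def blended_input_price(cache_hit_ratio: float) -> float:
    """Effective USD price per 1M input tokens for a given cache-hit share."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return (
        cache_hit_ratio * CACHED_PRICE_PER_M
        + (1.0 - cache_hit_ratio) * CACHE_MISS_PRICE_PER_M
    )


for ratio in (0.0, 0.5, 0.9):
    price = blended_input_price(ratio)
    print(f"{ratio:.0%} cache hits: ${price:.3f} per 1M input tokens")
```

Under these assumptions, a workload where half of the input tokens hit the cache pays about $0.72 per million input tokens instead of $1.20.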
GLM-5 Turbo Key Details
- Pricing: $1.20/M cache-miss input tokens, $0.24/M cached input tokens, $4.00/M output tokens
- Context window: 200K tokens, suitable for large documents and codebases
- Max output: 128K tokens per response
- Capabilities: text, function_calling, structured_output
- Highlights: Current faster GLM model optimized for tool use and agent workflows.
- Released: 2026-05
FAQ
How much does GLM-5 Turbo cost?
GLM-5 Turbo costs $1.20 per million cache-miss input tokens and $4.00 per million output tokens. Cached input costs $0.24 per million tokens. For a workload of 100K input tokens/day and 50K output tokens/day, that is $0.12 of input plus $0.20 of output per day, or approximately $9.60 over a 30-day month before cache-hit savings.
What is GLM-5 Turbo's context window?
GLM-5 Turbo supports a context window of 200K tokens. This means your combined input prompt and output response can be up to 200K tokens. The maximum output per response is 128K tokens.
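The two limits interact: the output budget is capped by both the 128K maximum and whatever room the prompt leaves in the 200K window. A small sketch of that check follows. It assumes "200K" and "128K" mean 200,000 and 128,000 tokens exactly, which this page does not confirm, and `max_output_budget` is a hypothetical helper rather than part of any Z.AI SDK.

```python
CONTEXT_WINDOW = 200_000  # tokens; "200K" read as 200,000 (assumption)
MAX_OUTPUT = 128_000      # tokens; "128K" read as 128,000 (assumption)


def max_output_budget(prompt_tokens: int) -> int:
    """Largest output token budget that fits alongside a prompt of the given size."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # The budget is limited by the per-response cap and by the space left
    # in the context window; it never goes below zero.
    return max(0, min(MAX_OUTPUT, remaining))


for prompt in (50_000, 150_000, 250_000):
    print(f"prompt of {prompt:,} tokens leaves {max_output_budget(prompt):,} output tokens")
```

With these numbers, the full 128K output is only available when the prompt is 72K tokens or smaller; longer prompts reduce the output budget one for one.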
Is GLM-5 Turbo good for my use case?
GLM-5 Turbo supports text, function_calling, and structured_output, and its notes describe it as optimized for tool use and agent workflows. As a mid-range model, it balances capability and cost for most production use cases. Use our Pricing Calculator to compare with alternatives.