DevTk.AI

GLM-5.1

Z.AI

Updated June 2026. GLM-5.1 by Z.AI costs $1.40 per million cache-miss input tokens, $0.26 per million cached input tokens, and $4.40 per million output tokens. It has a 200K-token context window and a 128K-token maximum output, and supports function calling and JSON mode.

Input Price

$1.40

cache miss / 1M tokens

Cached Input

$0.26

per 1M tokens

Output Price

$4.40

per 1M tokens

Context Window

200K

tokens

Specifications

Provider: Z.AI
Model ID: glm-5.1
Input Price: $1.40 / 1M cache-miss tokens
Cached Input Price: $0.26 / 1M tokens
Output Price: $4.40 / 1M tokens
Context Window: 200K tokens
Max Output: 128K tokens
Capabilities: text, function_calling, structured_output
Release Date: 2026-04-07
Pricing Source: Official Z.AI pricing
Price Verified: 2026-06-14. International Z.AI API pricing is listed in USD. China BigModel pricing is listed separately in RMB.
Notes: Latest generally documented GLM flagship API model for long-horizon coding and agent tasks. Z.AI says it can work autonomously on a single task for up to eight hours.

Monthly Cost Estimates

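Estimated monthly costs at different daily usage levels, assuming a 50% input / 50% output token split. The figures are consistent with a 30-day month, with the annual cost equal to twelve times the monthly cost. Input is priced at the cache-miss rate, so cache-heavy workloads can cost less.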

Daily Tokens    Monthly Cost    Annual Cost
10K             $0.87           $10.44
50K             $4.35           $52.20
100K            $8.70           $104.40
500K            $43.50          $522.00
1M              $87.00          $1,044.00
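
The estimates in the table can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name `monthly_cost` and its parameters are hypothetical, and the 30-day month and 12-month year are assumptions inferred from the table figures rather than stated by the page.

```python
# Prices in USD per 1M tokens, as listed on this page.
INPUT_PRICE_PER_M = 1.40   # cache-miss input
OUTPUT_PRICE_PER_M = 4.40  # output


def monthly_cost(daily_tokens: int, input_share: float = 0.5,
                 days_per_month: int = 30) -> float:
    """Estimate monthly cost in USD (hypothetical helper).

    Assumes every input token is billed at the cache-miss rate and
    that a month has `days_per_month` days (an assumption, not a
    documented billing rule).
    """
    input_tokens = daily_tokens * input_share
    output_tokens = daily_tokens * (1 - input_share)
    daily_usd = (input_tokens * INPUT_PRICE_PER_M
                 + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000
    return round(daily_usd * days_per_month, 2)


for daily in (10_000, 50_000, 100_000, 500_000, 1_000_000):
    month = monthly_cost(daily)
    print(f"{daily:>9,} tokens/day  ${month:>8.2f}/month  ${month * 12:>9.2f}/year")
```

Changing `input_share` shows how sensitive the bill is to the input/output mix, since output tokens cost roughly three times as much as cache-miss input tokens.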

About GLM-5.1

GLM-5.1 is a large language model by Z.AI. It has a 200K-token context window and can produce up to 128K tokens of output per request. The model supports three capabilities: text, function_calling, and structured_output.

At $1.40 per million cache-miss input tokens and $4.40 per million output tokens, GLM-5.1 is positioned as a mid-range option in the Z.AI lineup. Input that repeats a previously cached prefix can be billed at the cached rate of $0.26 per million tokens. Use our Token Counter to estimate how many tokens your prompts use, and our Pricing Calculator to compare costs across all models.

GLM-5.1 Key Details

  • Pricing: $1.40/M cache-miss input tokens, $0.26/M cached input tokens, $4.40/M output tokens
  • Context window: 200K tokens — suitable for large documents and codebases
  • Max output: 128K tokens per response
  • Capabilities: text, function_calling, structured_output
  • Highlights: Latest generally documented GLM flagship API model for long-horizon coding and agent tasks. Z.AI says it can work autonomously on a single task for up to eight hours.
  • Released: 2026-04-07

FAQ

How much does GLM-5.1 cost?

GLM-5.1 costs $1.40 per million cache-miss input tokens and $4.40 per million output tokens. Cached input costs $0.26 per million tokens. For a workload of 100K input tokens and 50K output tokens per day, expect approximately $10.80 per 30-day month before cache-hit savings.
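
To see how cache hits change that figure, the sketch below blends the cache-miss and cached input rates. It is a rough estimate under stated assumptions: `cache_hit_rate` is a hypothetical parameter meaning the share of input tokens billed at the cached rate, the function name is illustrative, and a 30-day month is assumed. Actual billing depends on how Z.AI counts cached prefixes.

```python
# Prices in USD per 1M tokens, as listed on this page.
CACHE_MISS_INPUT_PER_M = 1.40
CACHED_INPUT_PER_M = 0.26
OUTPUT_PER_M = 4.40


def monthly_cost_with_cache(input_per_day: int, output_per_day: int,
                            cache_hit_rate: float = 0.0,
                            days: int = 30) -> float:
    """Estimate monthly cost in USD with a share of input served from cache.

    `cache_hit_rate` is an assumed fraction (0.0 to 1.0) of input tokens
    billed at the cached rate; it is not a value reported by the API.
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    cached = input_per_day * cache_hit_rate
    missed = input_per_day - cached
    daily_usd = (missed * CACHE_MISS_INPUT_PER_M
                 + cached * CACHED_INPUT_PER_M
                 + output_per_day * OUTPUT_PER_M) / 1_000_000
    return daily_usd * days


# The FAQ workload: 100K input and 50K output tokens per day.
print(f"no cache hits:  ${monthly_cost_with_cache(100_000, 50_000):.2f}")
print(f"50% cache hits: ${monthly_cost_with_cache(100_000, 50_000, 0.5):.2f}")
```

With no cache hits this reproduces the $10.80 figure above. Output tokens dominate this workload's cost, so even a high hit rate lowers the total only modestly.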

What is GLM-5.1's context window?

GLM-5.1 supports a context window of 200K tokens. This means your combined input prompt and output response can be up to 200K tokens. The maximum output per response is 128K tokens.
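
The two limits interact: a long prompt leaves less room for the response. The sketch below computes the largest output budget for a given prompt size. It assumes "200K" and "128K" mean exactly 200,000 and 128,000 tokens, which the page does not confirm, and the function name `max_output_budget` is hypothetical.

```python
# Assumed exact values for the limits listed on this page.
CONTEXT_WINDOW = 200_000  # combined prompt + response tokens
MAX_OUTPUT = 128_000      # cap on response tokens


def max_output_budget(prompt_tokens: int) -> int:
    """Return the most output tokens that fit alongside the prompt."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # The response is limited by both the output cap and the space
    # left in the context window; it can never be negative.
    return max(0, min(MAX_OUTPUT, remaining))


for prompt in (50_000, 150_000, 250_000):
    print(f"{prompt:>7,} prompt tokens -> up to {max_output_budget(prompt):,} output tokens")
```

Under these assumptions, the full 128K output is available only when the prompt is 72K tokens or shorter.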

Is GLM-5.1 good for my use case?

GLM-5.1 supports text, function_calling, and structured_output, and Z.AI positions it for long-horizon coding and agent tasks. As a mid-range model on price, it balances capability and cost for most production use cases. Use our Pricing Calculator to compare with alternatives.