DevTk.AI

GLM-5V Turbo

Z.AI

Updated June 2026. GLM-5V Turbo by Z.AI costs $1.20 per 1M cache-miss input tokens, $0.24 per 1M cached input tokens, and $4.00 per 1M output tokens. It offers a 200K context window and 128K max output, with vision and video input. Free calculator and comparison across 40+ models.

Input Price: $1.20 per 1M tokens (cache miss)
Cached Input: $0.24 per 1M tokens
Output Price: $4.00 per 1M tokens
Context Window: 200K tokens

Specifications

Provider: Z.AI
Model ID: glm-5v-turbo
Input Price: $1.20 / 1M cache-miss tokens
Cached Input Price: $0.24 / 1M tokens
Output Price: $4.00 / 1M tokens
Context Window: 200K tokens
Max Output: 128K tokens
Capabilities: text, vision, video, function_calling, structured_output
Release Date: 2026-05
Pricing Source: Official Z.AI pricing
Price Verified: 2026-06-14. International Z.AI API pricing is listed in USD. China BigModel pricing is listed separately in RMB.
Notes: Current multimodal GLM agent model for text, image, and video input.

Monthly Cost Estimates

Estimated monthly costs at different daily usage levels, assuming a 50% input / 50% output token split. The figures correspond to a 30-day month, with annual cost equal to 12 times the monthly cost. Input is priced at the cache-miss rate, so cache-heavy workloads can cost less.

Daily Tokens    Monthly Cost    Annual Cost
10K             $0.78           $9.36
50K             $3.90           $46.80
100K            $7.80           $93.60
500K            $39.00          $468.00
1M              $78.00          $936.00
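
The estimates above can be reproduced with a short calculation. The sketch below is illustrative only: the function name `monthly_cost` and its parameters are hypothetical, and the 30-day month and 12-month year are assumptions inferred from the figures in the table rather than stated by the page.

```python
# Prices listed on this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 1.20   # cache-miss input
OUTPUT_PRICE_PER_M = 4.00  # output


def monthly_cost(daily_tokens, input_share=0.5, days_per_month=30):
    """Estimate monthly cost in USD for a given daily token volume.

    Assumes every input token is a cache miss and a 30-day month
    (an assumption that matches the table, not a documented rule).
    """
    input_tokens = daily_tokens * input_share
    output_tokens = daily_tokens * (1 - input_share)
    daily_usd = (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000
    return daily_usd * days_per_month


for daily in (10_000, 50_000, 100_000, 500_000, 1_000_000):
    month = monthly_cost(daily)
    # Annual cost is taken as 12 x monthly, as in the table.
    print(f"{daily:>9,} tokens/day  ${month:,.2f}/month  ${month * 12:,.2f}/year")
```

Changing `input_share` shows how sensitive the estimate is to the input/output mix, since output tokens cost more than three times as much as cache-miss input tokens.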

About GLM-5V Turbo

GLM-5V Turbo is a multimodal large language model by Z.AI. It has a 200K-token context window and can produce up to 128K tokens of output per request. The model supports five capabilities: text, vision, video, function_calling, and structured_output.

At $1.20 per million cache-miss input tokens and $4.00 per million output tokens, GLM-5V Turbo is positioned as a mid-range option in the Z.AI lineup. Input that repeats a cached prefix is billed at $0.24 per million tokens. Use our Token Counter to estimate how many tokens your prompts use, and our Pricing Calculator to compare costs across all models.

GLM-5V Turbo Key Details

  • Pricing: $1.20/M cache-miss input tokens, $0.24/M cached input tokens, $4.00/M output tokens
  • Context window: 200K tokens, suitable for large documents and codebases
  • Max output: 128K tokens per response
  • Capabilities: text, vision, video, function_calling, structured_output
  • Highlights: Current multimodal GLM agent model for text, image, and video input.
  • Released: 2026-05

FAQ

How much does GLM-5V Turbo cost?

GLM-5V Turbo costs $1.20 per million cache-miss input tokens and $4.00 per million output tokens. Cached input costs $0.24 per million tokens. For a typical workload of 100K input tokens/day and 50K output tokens/day, expect approximately $9.60 per 30-day month before cache-hit savings.
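
To see how cache hits change that figure, here is a minimal sketch that blends the cache-miss and cached input rates. The function `monthly_cost_with_cache` and the `cache_hit_rate` parameter are hypothetical names used for illustration. The sketch assumes a 30-day month and that a fixed fraction of input tokens is billed at the cached rate, and actual billing may differ.

```python
# Prices listed on this page, in USD per 1M tokens.
CACHE_MISS_PER_M = 1.20
CACHED_PER_M = 0.24
OUTPUT_PER_M = 4.00


def monthly_cost_with_cache(input_per_day, output_per_day,
                            cache_hit_rate=0.0, days=30):
    """Estimate monthly USD cost when part of the input hits the cache.

    cache_hit_rate is the assumed fraction (0.0 to 1.0) of input tokens
    billed at the cached rate; the rest is billed at the cache-miss rate.
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0.0 and 1.0")
    cached_tokens = input_per_day * cache_hit_rate
    missed_tokens = input_per_day - cached_tokens
    daily_usd = (
        missed_tokens * CACHE_MISS_PER_M
        + cached_tokens * CACHED_PER_M
        + output_per_day * OUTPUT_PER_M
    ) / 1_000_000
    return daily_usd * days


# The FAQ workload: 100K input and 50K output tokens per day.
for rate in (0.0, 0.5, 1.0):
    cost = monthly_cost_with_cache(100_000, 50_000, cache_hit_rate=rate)
    print(f"cache hit rate {rate:.0%}: ${cost:.2f}/month")
```

With no cache hits this reproduces the $9.60 estimate. Because output tokens dominate the bill in this workload, even a 100% cache hit rate only lowers the total to $6.72.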

What is GLM-5V Turbo's context window?

GLM-5V Turbo supports a context window of 200K tokens. This means your combined input prompt and output response can be up to 200K tokens. The maximum output per response is 128K tokens.
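
Because prompt and response share the same window, a long prompt leaves less room for output. The sketch below shows one way to check the remaining output budget. The function `max_output_budget` is a hypothetical helper, and it assumes that "200K" and "128K" mean exactly 200,000 and 128,000 tokens, which the page does not confirm.

```python
# Limits listed on this page; exact token counts are an assumption.
CONTEXT_WINDOW = 200_000  # combined prompt + response
MAX_OUTPUT = 128_000      # per-response output cap


def max_output_budget(prompt_tokens):
    """Return how many output tokens can still fit after the prompt."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # The response is limited by both the remaining window and the output cap.
    return max(0, min(MAX_OUTPUT, remaining))


for prompt in (50_000, 150_000, 250_000):
    print(f"{prompt:>7,} prompt tokens -> up to {max_output_budget(prompt):,} output tokens")
```

Under these assumptions, prompts of up to 72,000 tokens leave the full 128K output cap available, and longer prompts reduce it token for token.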

Is GLM-5V Turbo good for my use case?

GLM-5V Turbo supports text, vision, video, function_calling, and structured_output. As a mid-range model in the Z.AI lineup, it balances capability and cost for most production use cases. Use our Pricing Calculator to compare it with alternatives.