GLM-5
Overview
GLM-5 is Zhipu AI's flagship foundation model, designed for complex systems engineering and long-horizon agent tasks and shifting its focus from coding alone to broader engineering work. It is a Mixture-of-Experts model with 744B total parameters (40B activated), trained on 28.5T tokens, and it integrates DeepSeek Sparse Attention for higher token efficiency while preserving long-context quality. GLM-5 supports a 200K-token context window and up to 128K output tokens, with capabilities that include thinking modes, real-time streaming, function calling, context caching, and structured output. It approaches Claude Opus 4.5 in code-logic density and systems-engineering capability.
GLM-5 was released on February 11, 2026. API access is available through ZAI.
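The capabilities listed above map onto a standard chat-completions workflow. The sketch below shows a minimal streaming request, assuming an OpenAI-compatible endpoint; the base URL, model identifier, and the `thinking` parameter are assumptions rather than confirmed API details, so check the provider's documentation before relying on them.

```python
# Minimal sketch of a streaming GLM-5 request through an assumed
# OpenAI-compatible endpoint. Base URL, model name, and the "thinking"
# field are placeholders, not confirmed API details.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_ZAI_API_KEY",                 # assumed: key issued by the provider
    base_url="https://api.z.ai/api/paas/v4/",   # assumed endpoint URL
)

stream = client.chat.completions.create(
    model="glm-5",                               # assumed model identifier
    messages=[
        {"role": "system", "content": "You are a systems-engineering assistant."},
        {"role": "user", "content": "Outline a migration plan from a monolith to microservices."},
    ],
    max_tokens=4096,                             # well under the 128K output ceiling
    stream=True,                                 # real-time streaming, as listed above
    extra_body={"thinking": {"type": "enabled"}},  # assumed control for thinking mode
)

for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
```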
Benchmarks
GLM-5 Performance Across Datasets
Scores sourced from the model's scorecard, paper, or official blog posts
Pricing
Pricing, performance, and capabilities for GLM-5 across different providers:
| Provider | Input ($/M) | Output ($/M) | Max Input | Max Output | Latency (s) | Throughput | Quantization | Input Modalities | Output Modalities |
|---|---|---|---|---|---|---|---|---|---|
| ZAI | $1.00 | $3.20 | 200.0K | 128.0K | 3.0 | 30.0 c/s | — | Text, Image, Audio, Video | Text, Image, Audio, Video |
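For a rough sense of what the ZAI rates imply per request, the sketch below computes an estimated cost from the table's $1.00 per million input tokens and $3.20 per million output tokens. The token counts are illustrative, and actual billing may differ (for example, with context caching).

```python
# Estimate request cost from the ZAI rates in the table above.
INPUT_PRICE_PER_M = 1.00    # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 3.20   # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for a single request."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Example: a long-context request with 150K input tokens and 8K output tokens.
print(f"${estimate_cost(150_000, 8_000):.4f}")  # -> $0.1756
```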
API Access
API Access Coming Soon
API access for GLM-5 will be available soon through our gateway.
FAQ
Common questions about GLM-5
