GLM-4.7-Flash: Benchmarks, Pricing & Context Window
GLM-4.7-Flash is a language model from ZAI, released in January 2026.
GLM-4.7-Flash is a high-speed, cost-efficient variant of GLM-4.7, optimized for fast inference and lower latency. It retains the coding-centric capabilities of GLM-4.7, including thinking before acting and preserved reasoning across turns.
Input: Text
Output: Text
GLM-4.7-Flash pricing
Providers
GLM-4.7-Flash starts at $0.0700 per million input tokens and $0.400 per million output tokens via ZAI.
| Provider | Input $/M | Output $/M | Max Input | Max Output | Latency (s) | Throughput | Quant | Input Modality | Output Modality |
|---|---|---|---|---|---|---|---|---|---|
| ZAI | $0.0700 | $0.400 | 128.0K | 16.4K | 2.00 | 50 c/s | — | Text | Text |
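To make the per-million-token pricing concrete, the following is a minimal sketch of how a per-request cost could be estimated from the listed ZAI prices, along with a check against the listed input and output limits. The function names `estimate_cost` and `fits_limits` are hypothetical helpers for illustration, not part of any ZAI SDK, and the token limits use the rounded figures shown in the table (128.0K and 16.4K), so the exact limits may differ.

```python
from decimal import Decimal

# Listed ZAI prices in USD per million tokens (taken from the pricing table).
INPUT_PRICE_PER_M = Decimal("0.0700")
OUTPUT_PRICE_PER_M = Decimal("0.400")

# Assumption: rounded limits as displayed (128.0K / 16.4K); exact values may differ.
MAX_INPUT_TOKENS = 128_000
MAX_OUTPUT_TOKENS = 16_400

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = INPUT_PRICE_PER_M * input_tokens + OUTPUT_PRICE_PER_M * output_tokens
    return total / TOKENS_PER_MILLION


def fits_limits(input_tokens: int, output_tokens: int) -> bool:
    """Check token counts against the (rounded) listed input and output limits."""
    return (
        0 <= input_tokens <= MAX_INPUT_TOKENS
        and 0 <= output_tokens <= MAX_OUTPUT_TOKENS
    )


if __name__ == "__main__":
    cost = estimate_cost(10_000, 2_000)
    print(f"Estimated cost: ${cost:.4f}")
    print(f"Within limits: {fits_limits(10_000, 2_000)}")
```

Under these listed prices, a request with 10,000 input tokens and 2,000 output tokens comes to $0.0007 for input plus $0.0008 for output, or $0.0015 in total. `Decimal` is used rather than floats so that small per-token amounts add up without rounding drift.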
GLM-4.7-Flash API
API access coming soon
GLM-4.7-Flash will be available through our gateway shortly.
GLM-4.7-Flash examples
Recent arena outputs from GLM-4.7-Flash, picked from the highest-ranked matchups.
GLM-4.7-Flash license
GLM-4.7-Flash is released under the MIT license, which permits commercial use, and has 30.0B parameters.
- License: MIT (commercial use allowed)
- Parameters: 30.0B
FAQ
Common questions about GLM-4.7-Flash.