GLM-4.7
Overview
GLM 4.7 is a coding‑centric model that thinks before acting, preserves its reasoning across turns, and lets you control thinking per request for speed or accuracy. It upgrades agentic workflows with stronger multi‑step tool use, better terminal and multilingual coding, and a noticeable jump in UI output quality for modern, clean webpages and slides. You can use it in popular coding agents, call it via the Z.ai API, and even run it locally with public weights on HuggingFace and ModelScope using vLLM or SGLang.
GLM-4.7 was released on December 22, 2025. API access is available through Novita.
Performance
Timeline
Specifications
Benchmarks
GLM-4.7 Performance Across Datasets
Scores sourced from the model's scorecard, paper, or official blog posts
Pricing
Pricing, performance, and capabilities for GLM-4.7 across different providers:
| Provider | Input ($/M) | Output ($/M) | Max Input | Max Output | Latency (s) | Throughput | Quantization | Input | Output |
|---|---|---|---|---|---|---|---|---|---|
Novitabf16 | $0.60 | $2.20 | 204.8K | 131.1K | — | — | bf16 | Text Image Audio Video | Text Image Audio Video |
API Access
API Access Coming Soon
API access for GLM-4.7 will be available soon through our gateway.
Recent Posts
Recent Reviews
Blog Posts
FAQ
Common questions about GLM-4.7

