GLM-4.7-Flash

Overview

GLM-4.7-Flash is a high-speed, cost-efficient variant of GLM-4.7 optimized for fast inference and lower latency. It retains the coding-centric capabilities of GLM-4.7, including thinking before acting, reasoning preserved across turns, and per-request thinking control for trading speed against accuracy. It is well suited to applications that require quick responses while maintaining strong performance on coding, agentic workflows, and general reasoning tasks.

GLM-4.7-Flash was released on January 19, 2026. API access is available through ZAI.
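The per-request thinking control mentioned above can be illustrated with a minimal request-payload sketch. The model identifier "glm-4.7-flash" and the shape of the "thinking" field are assumptions here, not confirmed API details; check ZAI's API reference for the exact parameter names.

```python
# Sketch of an OpenAI-style chat request body with per-request thinking
# control. The "thinking" field and the model id are assumptions.

def build_request(prompt: str, enable_thinking: bool) -> dict:
    """Build a chat completion payload for GLM-4.7-Flash.

    When latency matters more than accuracy, thinking can be
    disabled for that single request.
    """
    return {
        "model": "glm-4.7-flash",  # assumed model identifier
        "messages": [{"role": "user", "content": prompt}],
        "thinking": {"type": "enabled" if enable_thinking else "disabled"},
    }

fast = build_request("Summarize this diff.", enable_thinking=False)
careful = build_request("Find the race condition.", enable_thinking=True)
```

A low-latency request disables thinking; a harder task re-enables it, all without changing any account-level setting.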

Performance

Timeline

Released: January 19, 2026
Knowledge Cutoff: Unknown

Specifications

Parameters: Unknown
License: MIT
Training Data: Unknown

Benchmarks

[Chart: GLM-4.7-Flash performance across datasets. Scores sourced from the model's scorecard, paper, or official blog posts.]

Source: llm-stats.com, Mon Jan 19 2026

Pricing

Pricing, performance, and capabilities for GLM-4.7-Flash across different providers:

Provider: ZAI
Input ($/M): $0.07
Output ($/M): $0.40
Max Input: 128.0K
Max Output: 16.4K
Latency (s): 2.0
Throughput: 50.0 c/s
Quantization: Unknown
Input Modalities: Text, Image, Audio, Video
Output Modalities: Text, Image, Audio, Video
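At the listed ZAI rates, the cost of a request is a simple linear function of token counts. A minimal sketch (the token counts are illustrative, not from the source):

```python
# Estimate per-request cost at the listed ZAI rates for GLM-4.7-Flash:
# $0.07 per million input tokens, $0.40 per million output tokens.

INPUT_PER_M = 0.07   # USD per 1M input tokens
OUTPUT_PER_M = 0.40  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# Example: 10,000 input tokens and 2,000 output tokens ≈ $0.0015
cost = request_cost(10_000, 2_000)
```

Note the asymmetry: output tokens cost roughly 5.7x more than input tokens, so long generations dominate the bill even when prompts are large.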

API Access

API Access Coming Soon

API access for GLM-4.7-Flash will be available soon through our gateway.

FAQ

Common questions about GLM-4.7-Flash

When was GLM-4.7-Flash released?
GLM-4.7-Flash was released on January 19, 2026.