DeepSeek-V4-Flash-Max: Benchmarks, Pricing & Context Window
DeepSeek-V4-Flash-Max is a language model from DeepSeek, released in April 2026.
DeepSeek-V4-Flash-Max is the maximum reasoning effort mode of DeepSeek-V4-Flash, a 284B-parameter MoE model with 13B activated parameters and a 1M-token context window. Sharing the V4 series' hybrid attention architecture (Compressed
DeepSeek-V4-Flash-Max pricing
Providers
DeepSeek-V4-Flash-Max starts at $0.140 per million input tokens and $0.280 per million output tokens via DeepSeek.
| Provider | Input $/M | Output $/M | Max Input | Max Output | Latency (s) | Throughput | Quant |
|---|---|---|---|---|---|---|---|
| DeepSeek | $0.140 | $0.280 | 1.0M | 393.2K | 6.04 | 49 c/s | — |
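To make the per-million-token rates concrete, here is a minimal Python sketch that estimates the cost of a single request from the listed DeepSeek prices. The function name `estimate_cost_usd` and the example token counts are illustrative assumptions, not part of any DeepSeek API, and the sketch ignores any caching discounts or other billing rules a provider might apply.

```python
# Rates copied from the pricing section (USD per million tokens).
INPUT_USD_PER_M = 0.140
OUTPUT_USD_PER_M = 0.280


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD at the listed rates.

    Hypothetical helper for illustration; it only multiplies token
    counts by the per-million prices and sums the two parts.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Assumed workload: 200,000 input tokens and 50,000 output tokens.
    print(f"${estimate_cost_usd(200_000, 50_000):.4f}")  # prints $0.0420
```

At these rates, output tokens cost twice as much as input tokens, so long generations in maximum reasoning mode will tend to dominate the bill.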
DeepSeek-V4-Flash-Max API
API access coming soon
DeepSeek-V4-Flash-Max will be available through our gateway shortly.
DeepSeek-V4-Flash-Max license
DeepSeek-V4-Flash-Max is released under the MIT license, which permits commercial use, and has 284.0B parameters.
- License: MIT (commercial use allowed)
- Parameters: 284.0B
FAQ
Common questions about DeepSeek-V4-Flash-Max.