Qwen3 32B: Benchmarks, Pricing & Context Window
Qwen3 32B is a language model from Qwen, released in April 2025.
Qwen3-32B is a large language model from Alibaba's Qwen3 series. It features 32.8 billion parameters, a 128k token context window, support for 119 languages, and hybrid thinking modes allowing switching between deep reasoning and fast
Qwen3 32B pricing
Providers
Qwen3 32B starts at $0.100 per million input tokens and $0.300 per million output tokens via DeepInfra. See all 3 providers below with their per-token pricing, latency, throughput, and modality support.
| Provider | Input $/M | Output $/M | Max Input | Max Output | Latency (s) | Throughput | Quant | Input modalities | Output modalities |
|---|---|---|---|---|---|---|---|---|---|
| DeepInfra | $0.100 | $0.300 | 128.0K | 128.0K | 0.84 | 57 c/s | — | | |
| | $0.100 | $0.440 | 128.0K | 128.0K | 0.93 | 32 c/s | — | | |
| | $0.400 | $0.800 | 128.0K | 128.0K | 1.08 | 328 c/s | — | | |
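To make the per-token prices concrete, here is a minimal Python sketch that estimates the cost of a single request at each listed rate. The `Pricing` class, the `cheapest` helper, and the labels for the second and third rows are hypothetical, since only DeepInfra is named on this page. The dollar figures come from the table above; the token counts are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Per-million-token prices in USD, copied from the provider table."""
    label: str
    input_per_million: float
    output_per_million: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Return the estimated USD cost of one request."""
        total = (input_tokens * self.input_per_million
                 + output_tokens * self.output_per_million)
        return total / 1_000_000


# Only the first provider is named on the page; the other labels are placeholders.
ROWS = [
    Pricing("DeepInfra", 0.100, 0.300),
    Pricing("provider 2 (name not shown)", 0.100, 0.440),
    Pricing("provider 3 (name not shown)", 0.400, 0.800),
]


def cheapest(rows, input_tokens: int, output_tokens: int) -> Pricing:
    """Pick the row with the lowest estimated cost for this token mix."""
    return min(rows, key=lambda r: r.cost(input_tokens, output_tokens))


if __name__ == "__main__":
    # Illustrative request: 50,000 prompt tokens and 2,000 completion tokens.
    for row in ROWS:
        print(f"{row.label}: ${row.cost(50_000, 2_000):.5f}")
```

This compares list price only. The table shows that the most expensive row also has the highest throughput, so the cheapest option is not necessarily the best fit for latency-sensitive workloads.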
Qwen3 32B API
API access coming soon
Qwen3 32B will be available through our gateway shortly.
Qwen3 32B examples
Recent arena outputs from Qwen3 32B, picked from the highest-ranked matchups.
Qwen3 32B license
Qwen3 32B is released under the Apache 2.0 license, which permits commercial use. The model has 32.8B parameters.
- License: Apache 2.0 (commercial use allowed)
- Parameters: 32.8B
FAQ
Common questions about Qwen3 32B.