Qwen3 32B: Benchmarks, Pricing & Size
Qwen3 32B is a language model from Qwen, released in April 2025.
Qwen3-32B is a large language model from Alibaba's Qwen3 series. It has 32.8 billion parameters, a 128k-token context window, support for 119 languages, and hybrid thinking modes that switch between deep reasoning and fast…
Qwen3 32B pricing
Providers
Qwen3 32B starts at $0.100 per million input tokens and $0.300 per million output tokens via DeepInfra. See all 3 providers below with their per-token pricing, latency, throughput, and modality support.
| Provider | Input $/M | Output $/M | Max Input | Max Output | Latency p95 (s) | Throughput p95 | Quant | Input modality | Output modality |
|---|---|---|---|---|---|---|---|---|---|
| DeepInfra | $0.100 | $0.300 | 128.0K | 128.0K | 0.69 | 67 c/s | — | | |
| | $0.100 | $0.440 | 128.0K | 128.0K | 0.93 | 32 c/s | — | | |
| | $0.400 | $0.800 | 128.0K | 128.0K | 1.08 | 328 c/s | — | | |
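Per-token pricing turns into a per-request cost by scaling each token count by its rate per million tokens. The sketch below applies that arithmetic to the three price rows listed above. The `Price` class, the `request_cost` helper, and the 10,000-input / 2,000-output token counts are illustrative assumptions, not part of any provider's API.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Price:
    """Hypothetical container for one row of the pricing table (USD per million tokens)."""
    input_per_million: float
    output_per_million: float


def request_cost(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the given per-million-token rates."""
    total = (input_tokens * price.input_per_million
             + output_tokens * price.output_per_million)
    return total / TOKENS_PER_MILLION


# The three price rows from the table, in the order listed.
PRICES = [
    Price(0.100, 0.300),
    Price(0.100, 0.440),
    Price(0.400, 0.800),
]

# Assumed workload: 10,000 input tokens and 2,000 output tokens per request.
for price in PRICES:
    cost = request_cost(price, input_tokens=10_000, output_tokens=2_000)
    print(f"in ${price.input_per_million:.3f}/M, "
          f"out ${price.output_per_million:.3f}/M -> ${cost:.5f} per request")
```

Because output tokens cost more than input tokens in every row, the output length tends to dominate the bill for generation-heavy workloads.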
Qwen3 32B model size
Qwen3 32B has 32.8 billion parameters. See how it compares to other models in the same parameter range.
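As a rough guide to what 32.8 billion parameters means in practice, the weight footprint can be estimated as parameter count times bytes per parameter. The sketch below is a back-of-the-envelope calculation only: the precision names and byte widths are common conventions assumed for illustration, the page does not state which precision any provider serves, and the estimate excludes the KV cache, activations, and runtime overhead.

```python
PARAMS = 32.8e9  # parameter count stated on this page

# Assumed storage widths in bytes per parameter (illustrative, not provider data).
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "bf16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_gb(params: float, bytes_per_param: float) -> float:
    """Estimate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


for name, width in BYTES_PER_PARAM.items():
    print(f"{name}: about {weight_gb(PARAMS, width):.1f} GB of weights")
```

Under these assumptions, halving the bytes per parameter halves the weight footprint, which is why quantized variants fit on smaller accelerators.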
Qwen3 32B API
API access coming soon
Qwen3 32B will be available through our gateway shortly.
Qwen3 32B examples
Recent arena outputs from Qwen3 32B, picked from the highest-ranked matchups.
Qwen3 32B license
Qwen3 32B is released under the Apache 2.0 license, which permits commercial use. The model has 32.8B parameters.
- License: Apache 2.0 (commercial use allowed)
- Parameters: 32.8B
FAQ
Common questions about Qwen3 32B.