
Qwen3 32B
QwenOverview
Qwen3-32B is a large language model from Alibaba's Qwen3 series. It features 32.8 billion parameters, a 128k token context window, support for 119 languages, and hybrid thinking modes allowing switching between deep reasoning and fast responses. It demonstrates strong performance in reasoning, instruction-following, and agent capabilities.
Qwen3 32B was released on April 29, 2025. API access is available through DeepInfra, Novita, Sambanova.
Performance
Timeline
Other Details
Related Models
Compare Qwen3 32B to other models by quality (GPQA score) vs cost. Higher scores and lower costs represent better value.
Performance visualization loading...
Gathering benchmark data from similar models
Benchmarks
Qwen3 32B Performance Across Datasets
Scores sourced from the model's scorecard, paper, or official blog posts
Pricing
Pricing, performance, and capabilities for Qwen3 32B across different providers:
| Provider | Input ($/M) | Output ($/M) | Max Input | Max Output | Latency (s) | Throughput | Quantization | Input | Output |
|---|---|---|---|---|---|---|---|---|---|
DeepInfra | $0.10 | $0.30 | 128.0K | 128.0K | 1.19 | 26.95 tok/s | — | Text Image Audio Video | Text Image Audio Video |
Novita | $0.10 | $0.44 | 128.0K | 128.0K | 0.93 | 32.43 tok/s | — | Text Image Audio Video | Text Image Audio Video |
Sambanova | $0.40 | $0.80 | 128.0K | 128.0K | 1.08 | 327.7 tok/s | — | Text Image Audio Video | Text Image Audio Video |
Price Comparison for Qwen3 32B
Price per 1M input tokens (USD), lower is better
Throughput Comparison for Qwen3 32B
Tokens per second, higher is better
Latency Comparison for Qwen3 32B
Time to first token (s), lower is better
Qwen3 32B API Providers: Price vs Throughput
Example Outputs
Recent Posts
Recent Reviews
API Access
API Access Coming Soon
API access for Qwen3 32B will be available soon through our gateway.
FAQ
Common questions about Qwen3 32B
