
Qwen3 30B A3B
Overview
Qwen3-30B-A3B is a smaller Mixture-of-Experts (MoE) model from the Qwen3 series by Alibaba, with 30.5 billion total parameters and 3.3 billion activated parameters. It features hybrid thinking/non-thinking modes, support for 119 languages, and enhanced agent capabilities. It aims to outperform previous models like QwQ-32B while using significantly fewer activated parameters.
Qwen3 30B A3B was released on April 29, 2025. API access is available through DeepInfra, Novita, and Fireworks.
Compare Qwen3 30B A3B to other models by quality (GPQA score) vs cost. Higher scores and lower costs represent better value.
Benchmarks
Qwen3 30B A3B Performance Across Datasets
Scores sourced from the model's scorecard, paper, or official blog posts
Pricing
Pricing, performance, and capabilities for Qwen3 30B A3B across different providers:
| Provider | Input ($/M tokens) | Output ($/M tokens) | Max Input (tokens) | Max Output (tokens) | Latency (s) | Throughput | Quantization | Input Modalities | Output Modalities |
|---|---|---|---|---|---|---|---|---|---|
| DeepInfra | $0.10 | $0.30 | 128.0K | 128.0K | 0.84 | 82.57 tok/s | — | Text Image Audio Video | Text Image Audio Video |
| Novita | $0.10 | $0.44 | 128.0K | 128.0K | 0.73 | 88.84 tok/s | — | Text Image Audio Video | Text Image Audio Video |
| Fireworks | $0.89 | $0.89 | 128.0K | 128.0K | 0.66 | 122.4 tok/s | — | Text Image Audio Video | Text Image Audio Video |
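To make the per-million-token prices concrete, here is a minimal Python sketch that estimates the cost of a single request for each provider in the table. The `ProviderPrice` class and the `request_cost` and `cheapest` helpers are hypothetical names introduced for illustration. The sketch assumes billing is linear in token count, with no minimum charge, caching discount, or volume tier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderPrice:
    """Hypothetical container for one row of the pricing table."""
    name: str
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


# Prices copied from the table above.
PRICES = [
    ProviderPrice("DeepInfra", 0.10, 0.30),
    ProviderPrice("Novita", 0.10, 0.44),
    ProviderPrice("Fireworks", 0.89, 0.89),
]


def request_cost(price: ProviderPrice, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request, assuming purely linear billing."""
    total = input_tokens * price.input_per_m + output_tokens * price.output_per_m
    return total / 1_000_000


def cheapest(prices: list[ProviderPrice], input_tokens: int, output_tokens: int) -> ProviderPrice:
    """Provider with the lowest estimated cost for the given token counts."""
    return min(prices, key=lambda p: request_cost(p, input_tokens, output_tokens))


if __name__ == "__main__":
    # Example workload: 2,000 input tokens and 500 output tokens per request.
    for p in PRICES:
        print(f"{p.name}: ${request_cost(p, 2_000, 500):.6f}")
    print("Cheapest:", cheapest(PRICES, 2_000, 500).name)
```

Because input and output are priced separately, the ranking can depend on the input/output mix of the workload, so it is worth rerunning the comparison with representative token counts.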
Price Comparison for Qwen3 30B A3B
Price per 1M input tokens (USD), lower is better
Throughput Comparison for Qwen3 30B A3B
Tokens per second, higher is better
Latency Comparison for Qwen3 30B A3B
Time to first token (s), lower is better
Qwen3 30B A3B API Providers: Price vs Throughput
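The latency and throughput figures can be combined into a rough end-to-end response time. The Python sketch below assumes that the reported latency is time to first token, that tokens then stream at the reported throughput, and that both figures hold steady under load, which real deployments may not satisfy. The `PROVIDERS` mapping and `estimated_seconds` function are hypothetical names used only for this illustration.

```python
# (latency in seconds, throughput in tokens/second), taken from the pricing table.
PROVIDERS = {
    "DeepInfra": (0.84, 82.57),
    "Novita": (0.73, 88.84),
    "Fireworks": (0.66, 122.4),
}


def estimated_seconds(latency_s: float, tokens_per_s: float, output_tokens: int) -> float:
    """Rough response time: time to first token plus steady-state generation time."""
    if tokens_per_s <= 0:
        raise ValueError("throughput must be positive")
    return latency_s + output_tokens / tokens_per_s


if __name__ == "__main__":
    # Example: a 500-token completion on each provider.
    for name, (latency, throughput) in PROVIDERS.items():
        print(f"{name}: {estimated_seconds(latency, throughput, 500):.2f} s")
```

For short completions the latency term dominates, while for long completions throughput matters more, so the best provider depends on the typical response length.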
API Access
API access for Qwen3 30B A3B will be available soon through our gateway.
FAQ
Common questions about Qwen3 30B A3B
