
Qwen3-Next-80B-A3B-Instruct
Overview
Qwen3-Next-80B-A3B-Instruct is the first model in the Qwen3-Next series and introduces several architectural changes. It uses Hybrid Attention, which combines Gated DeltaNet and Gated Attention for efficient ultra-long-context modeling; a High-Sparsity MoE with 512 experts (10 activated + 1 shared), which gives an extremely low activation ratio; and Multi-Token Prediction, which improves performance and speeds up inference. With 80B total parameters and only 3B activated per token, it outperforms Qwen3-32B-Base at 10% of the training cost and delivers 10x the throughput for contexts longer than 32K tokens. The model performs on par with Qwen3-235B-A22B-Instruct-2507 and excels at ultra-long-context tasks of up to 256K tokens (extensible to 1M with YaRN). Architecture: 48 layers, 15T training tokens, and a hybrid layout of 12 * (3 * (Gated DeltaNet -> MoE) -> (Gated Attention -> MoE)).
Qwen3-Next-80B-A3B-Instruct was released on September 10, 2025. API access is available through Novita.
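The layout string in the overview can be checked with a little arithmetic. The following is a minimal sketch, not the model's actual implementation: the function name and layer labels are hypothetical, the parameter counts are the rounded figures quoted above, and the shared expert is assumed to sit outside the pool of 512 routed experts.

```python
# Sketch: expand 12 * (3 * (Gated DeltaNet -> MoE) -> (Gated Attention -> MoE)).
# Function and label names are hypothetical, chosen for illustration only.

def build_layer_layout(num_blocks=12, deltanet_per_block=3):
    """Return one mixer label per layer; each mixer is followed by an MoE block."""
    layout = []
    for _ in range(num_blocks):
        # Three Gated DeltaNet layers, then one Gated Attention layer.
        layout.extend(["gated_deltanet"] * deltanet_per_block)
        layout.append("gated_attention")
    return layout

layout = build_layer_layout()
num_layers = len(layout)
num_deltanet = layout.count("gated_deltanet")
num_attention = layout.count("gated_attention")

# Rounded figures quoted in the overview.
TOTAL_PARAMS_B = 80
ACTIVE_PARAMS_B = 3
TOTAL_EXPERTS = 512
ACTIVATED_EXPERTS = 10  # assumption: the shared expert is not one of the 512

param_activation_ratio = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
routed_expert_ratio = ACTIVATED_EXPERTS / TOTAL_EXPERTS

print(f"layers: {num_layers} ({num_deltanet} Gated DeltaNet, {num_attention} Gated Attention)")
print(f"activated parameters: {param_activation_ratio:.2%}")
print(f"routed experts used per token: {routed_expert_ratio:.2%}")
```

Under these assumptions the expansion gives 48 layers (36 Gated DeltaNet and 12 Gated Attention), which matches the stated layer count. About 3.75% of parameters and roughly 1.95% of routed experts are active per token.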
Performance
[Chart: Qwen3-Next-80B-A3B-Instruct compared with other models by quality (GPQA score) versus cost. Higher scores and lower costs represent better value.]
Benchmarks
Qwen3-Next-80B-A3B-Instruct Performance Across Datasets
Scores sourced from the model's scorecard, paper, or official blog posts
Pricing
Pricing, performance, and capabilities for Qwen3-Next-80B-A3B-Instruct across different providers:
| Provider | Input ($/M tokens) | Output ($/M tokens) | Max Input (tokens) | Max Output (tokens) | Latency (s) | Throughput | Quantization | Input Modalities | Output Modalities |
|---|---|---|---|---|---|---|---|---|---|
| Novita | $0.15 | $1.50 | 65.5K | 65.5K | — | — | bf16 | Text, Image, Audio, Video | Text, Image, Audio, Video |
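The per-token prices in the table translate into a request cost as follows. This is a minimal sketch that assumes simple linear pricing with no caching discounts or minimum charges; the function name is hypothetical and the prices are copied from the Novita row.

```python
# Prices from the Novita row, in USD per million tokens.
INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 1.50

def estimate_cost_usd(input_tokens, output_tokens):
    """Estimate the cost of one request, assuming purely linear per-token pricing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000

# Example: a 50,000-token prompt with a 2,000-token reply.
example_cost = estimate_cost_usd(50_000, 2_000)
print(f"${example_cost:.4f}")
```

For this example the estimate is about $0.0105. Output tokens cost ten times as much as input tokens, so long generations dominate the bill more quickly than long prompts.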
API Access
API Access Coming Soon
API access for Qwen3-Next-80B-A3B-Instruct will be available soon through our gateway.
FAQ
Common questions about Qwen3-Next-80B-A3B-Instruct
