Nemotron 3 Super (120B A12B): Benchmarks, Pricing & Context Window
Nemotron 3 Super (120B A12B) is a language model from Nvidia, released in March 2026.
Nemotron 3 Super is a 120B total / 12B active parameter hybrid Mamba-Attention Mixture-of-Experts model optimized for agentic reasoning, coding, planning, tool calling, and long-context analysis. It introduces LatentMoE (projecting tokens
Nemotron 3 Super (120B A12B) pricing
Providers
Nemotron 3 Super (120B A12B) starts at $0.100 per million input tokens and $0.500 per million output tokens via DeepInfra.
| Provider | Input ($/M tokens) | Output ($/M tokens) | Max input (tokens) | Max output (tokens) | Latency (s) | Throughput | Quantization |
|---|---|---|---|---|---|---|---|
| DeepInfra | $0.100 | $0.500 | 262.1K | 262.1K | n/a | n/a | bfloat16 |
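The per-million-token rates above translate into a per-request cost with simple arithmetic. The following is a minimal sketch, assuming the listed DeepInfra rates apply uniformly to every token; the helper name `estimate_cost_usd` and the constants are hypothetical and not part of any provider SDK.

```python
# Rates copied from the provider table (USD per million tokens).
# Assumption: flat pricing with no caching or volume discounts.
INPUT_USD_PER_M = 0.100
OUTPUT_USD_PER_M = 0.500


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: estimate the cost of one request in USD."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 200K-token prompt with a 4K-token completion.
    print(f"${estimate_cost_usd(200_000, 4_000):.4f}")
```

Under these assumptions, a long-context request is dominated by input cost even though output tokens are five times more expensive per token.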
Nemotron 3 Super (120B A12B) API
API access coming soon
Nemotron 3 Super (120B A12B) will be available through our gateway shortly.
Nemotron 3 Super (120B A12B) license
Nemotron 3 Super (120B A12B) is released under the NVIDIA Open Model License Agreement, which permits commercial use. The model has 120.0B total parameters and a knowledge cutoff of June 2025.
- License: NVIDIA Open Model License Agreement (commercial use allowed)
- Parameters: 120.0B
- Knowledge cutoff: June 2025