Nemotron 3 Nano (30B A3B)
Overview
Nemotron 3 Nano is a 31.6B-parameter hybrid mixture-of-experts (MoE) model optimized for fast, long-context agentic reasoning. It combines Mamba-2 and Transformer layers with a sparse MoE router that activates about 3.6B parameters per token, delivering up to 4× higher throughput than Nemotron 2 and strong accuracy on math, coding, and tool-use tasks. It supports a 1M-token context window, offers a Reasoning ON/OFF switch and a thinking budget to control costs, and ships with open weights, data, and RL tooling (NeMo Gym/RL). Released Dec 15, 2025 under the NVIDIA Open Model License, it is built as an efficient backbone for multi-agent systems at scale.
Nemotron 3 Nano (30B A3B) was released on December 15, 2025. API access is available through DeepInfra.
Benchmarks
Nemotron 3 Nano (30B A3B) Performance Across Datasets
Scores sourced from the model's scorecard, paper, or official blog posts
Pricing
Pricing, performance, and capabilities for Nemotron 3 Nano (30B A3B) across different providers:
| Provider | Input ($/M tokens) | Output ($/M tokens) | Max Input (tokens) | Max Output (tokens) | Latency (s) | Throughput | Quantization | Input Modalities | Output Modalities |
|---|---|---|---|---|---|---|---|---|---|
| DeepInfra | $0.06 | $0.24 | 262.1K | 262.1K | — | — | bfloat16 | Text Image Audio Video | Text Image Audio Video |
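
To make the per-token rates concrete, here is a minimal sketch that estimates the cost of a single request from the DeepInfra prices listed above ($0.06 per million input tokens, $0.24 per million output tokens). The function name `estimate_cost_usd` and the example workload are hypothetical and chosen only for illustration; actual billing may differ (for example, in how reasoning tokens are counted).

```python
# Rates taken from the DeepInfra row of the pricing table (USD per million tokens).
INPUT_USD_PER_M = 0.06
OUTPUT_USD_PER_M = 0.24


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a 200,000-token prompt and 8,000 generated tokens.
    print(f"${estimate_cost_usd(200_000, 8_000):.4f}")  # prints $0.0139
```

Because output tokens cost four times as much as input tokens at these rates, long generations dominate the bill faster than long prompts do, which is where the thinking budget described in the overview can help.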
API Access
API access for Nemotron 3 Nano (30B A3B) through our gateway is coming soon.
