Llama 3.1 Nemotron Ultra 253B v1
Overview
A 253B parameter derivative of Meta Llama 3.1 405B Instruct, developed by NVIDIA using Neural Architecture Search (NAS) and vertical compression. It underwent multi-phase post-training (SFT for Math, Code, Reasoning, Chat, Tool Calling; RL with GRPO) to enhance reasoning and instruction-following. Optimized for accuracy/efficiency tradeoff on NVIDIA GPUs. Supports 128k context.
Llama 3.1 Nemotron Ultra 253B v1 was released on April 7, 2025.
Performance
Timeline
Specifications
Benchmarks
Llama 3.1 Nemotron Ultra 253B v1 Performance Across Datasets
Scores sourced from the model's scorecard, paper, or official blog posts
Pricing
Pricing, performance, and capabilities for Llama 3.1 Nemotron Ultra 253B v1 across different providers:
API Access
API Access Coming Soon
API access for Llama 3.1 Nemotron Ultra 253B v1 will be available soon through our gateway.
Recent Posts
Recent Reviews
FAQ
Common questions about Llama 3.1 Nemotron Ultra 253B v1