DeepSeek R1 Zero
Overview
DeepSeek-R1-Zero is a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, and it demonstrates remarkable performance on reasoning. Through RL, numerous powerful and interesting reasoning behaviors emerged naturally in DeepSeek-R1-Zero. However, the model also suffers from endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
DeepSeek R1 Zero was released on January 20, 2025.
DeepSeek R1 Zero Performance Across Datasets
Scores sourced from the model's scorecard, paper, or official blog posts
Pricing
Pricing, performance, and capabilities for DeepSeek R1 Zero across different providers:
API Access
API access for DeepSeek R1 Zero will be available soon through our gateway.
