DeepSeek-V3 0324
Overview
A powerful Mixture-of-Experts (MoE) language model with 671B total parameters (37B activated per token). Features Multi-head Latent Attention (MLA), auxiliary-loss-free load balancing, and multi-token prediction training. Pre-trained on 14.8T tokens with strong performance in reasoning, math, and code tasks.
DeepSeek-V3 0324 was released on March 25, 2025. API access is available through Novita.
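To make the sparse-activation idea above concrete (only ~37B of the 671B parameters are active per token), here is a minimal top-k expert-routing sketch in Python. The expert count, hidden size, and top-k value are illustrative placeholders, not DeepSeek-V3's actual configuration, and the router is a toy linear gate rather than the model's real routing scheme.

```python
import numpy as np

# Toy sketch of top-k MoE routing: each token is scored against a set of
# experts and only the k best experts run, so only a fraction of the total
# parameters is activated per token. Sizes below are placeholders.
rng = np.random.default_rng(0)

num_experts, top_k, d_model = 8, 2, 16          # illustrative toy values
router_w = rng.standard_normal((d_model, num_experts))
experts = [rng.standard_normal((d_model, d_model)) for _ in range(num_experts)]

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route a single token vector x through its top-k experts."""
    logits = x @ router_w                        # affinity score per expert
    top = np.argsort(logits)[-top_k:]            # indices of the k highest-scoring experts
    gate = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over chosen experts
    # Weighted sum of the selected experts' outputs; all other experts stay idle.
    return sum(g * (x @ experts[i]) for g, i in zip(gate, top))

token = rng.standard_normal(d_model)
print(moe_layer(token).shape)                    # (16,)
```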
Performance
Timeline
Released
March 25, 2025
Knowledge Cutoff
Unknown
Specifications
Parameters
671.0B
License
MIT + Model License (Commercial use allowed)
Training Data
14.8T tokens
Benchmarks
DeepSeek-V3 0324 Performance Across Datasets
Scores are sourced from the model's scorecard, paper, or official blog posts.
Pricing
Pricing, performance, and capabilities for DeepSeek-V3 0324 across different providers:
| Provider | Input ($/M tokens) | Output ($/M tokens) | Max Input | Max Output | Latency (s) | Throughput | Quantization | Input Modalities | Output Modalities |
|---|---|---|---|---|---|---|---|---|---|
| Novita | $0.28 | $1.14 | 163.8K | 163.8K | — | — | fp8 | Text, Image, Audio, Video | Text, Image, Audio, Video |
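To make the per-million-token rates above concrete, the following is a small cost-estimation sketch. The token counts in the example are hypothetical; the rates are simply the $0.28 / $1.14 per million figures from the Novita row.

```python
# Estimate request cost from the per-million-token rates listed above.
INPUT_RATE = 0.28 / 1_000_000    # USD per input token (Novita, fp8)
OUTPUT_RATE = 1.14 / 1_000_000   # USD per output token (Novita, fp8)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: a 4,000-token prompt with a 1,000-token completion (hypothetical sizes).
print(f"${request_cost(4_000, 1_000):.6f}")  # $0.002260
```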
API Access
API access for DeepSeek-V3 0324 through our gateway is coming soon; in the meantime, the model is available via Novita (see Pricing above).
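For reference, a typical call pattern against an OpenAI-compatible chat endpoint looks like the sketch below. The base_url and model id are placeholders, not the gateway's or Novita's published values; substitute the endpoint, model identifier, and API key from your provider.

```python
from openai import OpenAI

# Hypothetical example of calling DeepSeek-V3 0324 via an OpenAI-compatible API.
# The base_url and model id below are placeholders; use your provider's values.
client = OpenAI(
    base_url="https://example-provider.com/v1",   # placeholder endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="deepseek-v3-0324",                     # placeholder model id
    messages=[{"role": "user", "content": "Explain Mixture-of-Experts in one sentence."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```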
FAQ
Common questions about DeepSeek-V3 0324
Q: When was DeepSeek-V3 0324 released?
A: DeepSeek-V3 0324 was released on March 25, 2025.
Q: How many parameters does DeepSeek-V3 0324 have?
A: DeepSeek-V3 0324 has 671.0 billion parameters.
