MathArena Apex
MathArena Apex Leaderboard
| Rank | Model | Organization | Parameters | Context | Cost | License |
|---|---|---|---|---|---|---|
| 1 | DeepSeek-V4-Pro-Max | DeepSeek | 1.6T | 1.0M | $1.60 / $3.20 | |
| 2 | | DeepSeek | 284B | 1.0M | $0.10 / $0.20 | |
| 3 | | Alibaba Cloud / Qwen Team | — | 1.0M | $1.25 / $3.75 | |
| 4 | | ByteDance | — | — | — | |
| 5 | Seed 2.1 Pro | ByteDance | — | — | — | |
| 6 | | Google | — | — | — | |
What is MathArena Apex?
MathArena Apex is a math contest benchmark built from the most difficult problems, designed to test the advanced reasoning and problem-solving abilities of AI models. It focuses on olympiad-level mathematics and complex multi-step reasoning.
It is a text-based benchmark covering math and reasoning tasks. LLM Stats tracks 6 models on it, scored on a 0–1 scale. The current average score is 0.5, and the leader scores 0.902.
Current leaders
Of the 6 evaluated AI models, DeepSeek-V4-Pro-Max from DeepSeek currently leads the MathArena Apex leaderboard with a score of 0.902.