AIME 2026

Progress Over Time

[Figure: interactive timeline of model performance on AIME 2026 over time. It traces the state-of-the-art frontier and distinguishes open from proprietary models.]

AIME 2026 Leaderboard

17 models

| Rank | Organization | Parameters | Context | Cost per 1M tokens (input / output) | License |
|---|---|---|---|---|---|
| 1 | Zhipu AI | 753B | 1.0M | $0.95 / $3.00 | |
| 2 | Moonshot AI | 1.0T | 262K | $0.75 / $3.50 | |
| 3 | Zhipu AI | 754B | 200K | $1.40 / $4.40 | |
| 3 | Alibaba Cloud / Qwen Team | | 1.0M | $0.50 / $3.00 | |
| 5 | | 1.0T | | | |
| 6 | ByteDance | | 256K | $0.50 / $3.00 | |
| 7 | Alibaba Cloud / Qwen Team | 28B | 262K | $0.60 / $3.60 | |
| 8 | Alibaba Cloud / Qwen Team | 35B | | | |
| 9 | | | | | |
| 10 | Alibaba Cloud / Qwen Team | 397B | | | |
| 11 | | 31B | 262K | $0.13 / $0.38 | |
| 12 | | 25B | 262K | $0.13 / $0.40 | |
| 12 | ByteDance | | | | |
| 14 | | 12B | | | |
| 15 | | 25B | | | |
| 16 | | 8B | | | |
| 17 | | 5B | | | |
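
The rank column in the leaderboard above repeats 3 and 12 and skips 4 and 13, which is consistent with standard competition ranking, where tied scores share a rank and the next rank is skipped. The page does not state its ranking rule, so the following is only a minimal sketch under that assumption. The function name `competition_ranks` is hypothetical, and the fifth score in the example is a placeholder rather than a value from the leaderboard.

```python
def competition_ranks(scores):
    """Return 1-based ranks for scores sorted in descending order.

    Tied scores share a rank and the following rank is skipped,
    giving sequences such as 1, 2, 3, 3, 5.
    """
    ordered = sorted(scores, reverse=True)
    ranks = []
    for position, score in enumerate(ordered, start=1):
        # Reuse the previous rank when this score equals the previous score.
        if ranks and score == ordered[position - 2]:
            ranks.append(ranks[-1])
        else:
            ranks.append(position)
    return ranks


# The first four scores come from this page; 0.940 is a hypothetical placeholder.
example_scores = [0.992, 0.964, 0.953, 0.953, 0.940]
print(competition_ranks(example_scores))  # [1, 2, 3, 3, 5]
```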
About this benchmark

What is AIME 2026?

AIME 2026 consists of all 30 problems from the 2026 American Invitational Mathematics Examination (AIME I and AIME II). The problems test olympiad-level mathematical reasoning, and every answer is an integer from 000 to 999. As an AI benchmark, it evaluates how well large language models solve complex mathematical problems that require multi-step logical deduction and structured symbolic reasoning.

AIME 2026 is a text benchmark that evaluates models on math and reasoning tasks. LLM Stats tracks 17 models on this benchmark, scored on a 0–1 scale. The current average is 0.846, with the leader at 0.992.
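
Because every AIME answer is an integer from 000 to 999, a score on the 0–1 scale can be read as the fraction of problems answered correctly. The page does not describe the grading procedure, so the sketch below simply assumes exact integer matching. The function name `aime_accuracy` and the answer values in the example are hypothetical; they are not real AIME 2026 answers.

```python
def aime_accuracy(predictions, answer_key):
    """Fraction of problems answered correctly under exact integer match.

    Assumption: an answer counts only if it parses as an integer in
    0..999 and equals the key; anything else is marked wrong.
    """
    if not answer_key or len(predictions) != len(answer_key):
        raise ValueError("predictions and answer_key must be non-empty and equal in length")
    correct = 0
    for predicted, expected in zip(predictions, answer_key):
        try:
            value = int(str(predicted).strip())  # "073" parses to 73
        except ValueError:
            continue  # an unparseable answer counts as wrong
        if 0 <= value <= 999 and value == expected:
            correct += 1
    return correct / len(answer_key)


# Hypothetical three-problem example: two of three answers match the key.
print(aime_accuracy(["073", "5", "12"], [73, 5, 999]))
```

A single pass over 30 problems can only produce multiples of 1/30, so a reported score such as 0.992 presumably averages several attempts per problem; the page does not say how many.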

Compare leaders on the best AI for math and best AI for reasoning leaderboards.

Current leaders

GLM-5.2 from Zhipu AI currently leads the AIME 2026 leaderboard with a score of 0.992 across 17 evaluated AI models.

1. GLM-5.2 (Zhipu AI): 99.2%
2. Kimi K2.6 (Moonshot AI): 96.4%
3. GLM-5.1 (Zhipu AI): 95.3%

FAQ

Common questions about the AIME 2026 benchmark and leaderboard.

What is the AIME 2026 benchmark?

It is a benchmark built from all 30 problems of the 2026 American Invitational Mathematics Examination (AIME I and AIME II). The problems test olympiad-level mathematical reasoning, with integer answers from 000 to 999, and are used to evaluate how well large language models handle multi-step logical deduction and structured symbolic reasoning.

What is the AIME 2026 leaderboard?

The AIME 2026 leaderboard ranks 17 AI models based on their performance on this benchmark. Currently, GLM-5.2 by Zhipu AI leads with a score of 0.992. The average score across all models is 0.846.

What is the highest AIME 2026 score?

The highest AIME 2026 score is 0.992, achieved by GLM-5.2 from Zhipu AI.

How many models are evaluated on AIME 2026?

Seventeen models have been evaluated on the AIME 2026 benchmark. All 17 results are self-reported; none are verified.

What categories does AIME 2026 cover?

AIME 2026 is categorized under math and reasoning. The benchmark evaluates text models.

What is the best open-source model on AIME 2026?

GLM-5.2 by Zhipu AI is the top-ranked open-source model on AIME 2026, with a score of 0.992 (rank #1).

Which model offers the best value on AIME 2026?

Among models scoring within 10% of the leader, Qwen3.6 Plus from Alibaba Cloud / Qwen Team is the cheapest, at $0.50 per million input tokens with a score of 0.953.
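
The selection rule in this answer can be sketched as a filter followed by a minimum. This is only an illustration: it assumes "within 10% of the leader" means a score of at least 90% of the top score, which the page does not confirm, and it uses only the four models whose names, scores, and input prices can be matched from this page. The names `Entry` and `best_value` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    model: str
    score: float        # benchmark score on the 0-1 scale
    input_price: float  # USD per million input tokens


def best_value(entries, tolerance=0.10):
    """Cheapest entry whose score is within `tolerance` of the leader.

    Assumption: "within 10%" is relative, i.e. score >= leader * 0.9.
    """
    leader = max(e.score for e in entries)
    threshold = leader * (1 - tolerance)
    eligible = [e for e in entries if e.score >= threshold]
    # Break price ties in favor of the higher score.
    return min(eligible, key=lambda e: (e.input_price, -e.score))


# Scores and input prices matched by rank from the leaderboard on this page.
entries = [
    Entry("GLM-5.2", 0.992, 0.95),
    Entry("Kimi K2.6", 0.964, 0.75),
    Entry("GLM-5.1", 0.953, 1.40),
    Entry("Qwen3.6 Plus", 0.953, 0.50),
]
print(best_value(entries).model)  # Qwen3.6 Plus
```

Ranking by input price alone ignores output-token cost, which matters for reasoning-heavy workloads; a blended price would be a reasonable alternative key.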

How recent are the AIME 2026 leaderboard results?

The AIME 2026 leaderboard was last updated in July 2026 and currently includes 17 evaluated models.