LLM Leaderboard 2026

Compare 300+ AI models by the LLM Stats Score, which combines intelligence, speed and price and is updated continuously from public benchmarks and live API metrics.

🇺🇸 | Closed | 1M | $5.00 | $25.00 | 20c/s | 2,127 | 58.0 | 52.8 | 43.0 | 36.7 | 45.5 | 35.3 | 34.1 | 34.4 | 32.0 | 29.6 | 13.8 | 91.3% | 99.8% | 80.8% | 68.8% | 91.1% | 84.0% | 77.4% | 77.3% | 62.7% | 53.1% | 72.7% | 76.0% | - | 2.7s | - | Feb. 2026 | Anthropic
🇺🇸 | Closed | 1.1M | $5.00 | $30.00 | 42c/s | 2,115 | 60.7 | 48.4 | 50.1 | 36.0 | 31.8 | 44.8 | 39.0 | 26.9 | 28.4 | 93.6% | 85.0% | 84.4% | 83.2% | 75.3% | 52.2% | 55.6% | 35.4% | 74.0% | 58.6% | - | 14.2s | Dec. 2025 | Apr. 2026 | OpenAI
🇺🇸 | Closed | 1.0M | $2.50 | $15.00 | 126c/s | 2,102 | 58.0 | 55.5 | 42.7 | 37.4 | 40.5 | 32.8 | 15.1 | 3.9 | 3.9 | 28.9 | 94.3% | 80.6% | 77.1% | 92.6% | 85.9% | 80.5% | 69.2% | 51.4% | 26.3% | 59.0% | 33.5% | 54.2% | - | 5.9s | Jan. 2025 | Feb. 2026 | Google
🇺🇸 | Closed | 1M | $5.00 | $25.00 | 36c/s | 1,923 | 57.8 | 52.4 | 48.6 | 30.1 | 43.4 | 37.9 | 17.0 | 36.5 | 36.8 | 94.2% | 87.6% | 91.5% | 79.3% | 91.0% | 77.3% | 54.7% | 64.3% | - | 1.1s | - | Apr. 2026 | Anthropic
🇺🇸 | Closed | 1M | $10.00 | $50.00 | 36c/s | 1,899 | 70.5 | 56.1 | 56.7 | 50.0 | 36.4 | 38.1 | 32.3 | 95.0% | 64.5% | 80.0% | - | 16.8s | - | Jun. 2026 | Anthropic
🇨🇳 | Open | 200k | $1.40 | $4.40 | 115c/s | 1,741 | 52.3 | 46.7 | 42.4 | 28.9 | 38.8 | 29.0 | 86.2% | 79.3% | 71.8% | 52.3% | 40.7% | 58.4% | 754 | 12.2s | - | Apr. 2026 | ZAI
🇺🇸 | Closed | 1M | $2.50 | $15.00 | 73c/s | 1,735 | 54.6 | 46.7 | 42.5 | 30.6 | 36.5 | 36.9 | 35.5 | 19.4 | 21.8 | 38.6 | 92.8% | 73.3% | 82.7% | 81.2% | 67.2% | 39.8% | 54.6% | 47.6% | 57.7% | - | 1.4s | - | Mar. 2026 | OpenAI
🇺🇸 | Closed | 1M | $0.50 | $3.00 | 168c/s | 1,714 | 48.4 | 51.3 | 30.2 | 32.5 | 23.7 | 10.4 | 42.1 | 90.4% | 99.7% | 78.0% | 33.6% | 91.8% | 80.3% | 81.2% | 69.1% | 57.4% | 43.5% | 68.7% | 49.4% | 22.1% | - | 1.6s | Jan. 2025 | Dec. 2025 | Google
🇨🇳 | Closed | 1M | $1.25 | $3.75 | 72c/s | 1,677 | 59.4 | 54.7 | 47.6 | 33.8 | 37.6 | 60.8 | 60.8 | 60.0 | 92.4% | 80.4% | 90.3% | 76.4% | 41.4% | 53.5% | 60.6% | - | 13.0s | - | May 2026 | Qwen
🇺🇸 | Closed | 200k | $3.00 | $15.00 | 46c/s | 1,677 | 47.2 | 44.1 | 36.0 | 24.1 | 36.6 | 33.1 | 29.3 | 14.2 | 34.3 | 31.2 | 25.5 | 89.9% | 79.6% | 58.3% | 89.3% | 74.7% | 75.6% | 61.3% | 49.0% | 72.5% | - | 924ms | - | Feb. 2026 | Anthropic
🇺🇸 | Closed | 1.0M | $1.50 | $9.00 | 144c/s | 1,632 | 55.7 | 40.6 | 45.8 | 40.0 | 43.1 | 19.6 | 30.6 | 32.7 | 72.1% | 84.2% | 83.6% | 83.6% | 40.2% | 56.5% | 26.6% | 55.1% | - | 2.4s | Jan. 2026 | May 2026 | Google
🇺🇸 | Closed | - | - | - | 110c/s | 1,614 | 53.1 | 43.4 | 39.2 | 36.6 | 29.0 | 30.1 | 28.4 | 87.0% | 80.9% | 37.6% | 90.8% | 62.3% | 66.3% | - | 1.1s | Mar. 2025 | Nov. 2025 | Anthropic
🇨🇳 | Open | 200k | $1.00 | $3.20 | 91c/s | 1,591 | 50.3 | 35.5 | 25.7 | 25.8 | 77.8% | 75.9% | 67.8% | 744 | 9.2s | - | Feb. 2026 | ZAI
🇺🇸 | Closed | - | - | - | 1,579 | 49.0 | 51.9 | 31.8 | 33.4 | 19.2 | 13.7 | 49.3 | 91.9% | 100.0% | 76.2% | 31.1% | 91.8% | 81.4% | 81.0% | 72.7% | 45.8% | 72.1% | 26.3% | - | Jan. 2025 | Nov. 2025 | Google
🇺🇸 | Closed | 1M | $5.00 | $25.00 | 44c/s | 1,553 | 64.3 | 53.8 | 51.7 | 39.1 | 47.1 | 42.6 | 32.2 | 28.2 | 35.4 | 29.0 | 93.6% | 88.6% | 84.3% | 89.9% | 87.9% | 82.2% | 57.9% | 59.9% | 69.2% | - | 1.0s | - | May 2026 | Anthropic
🇨🇳 | Open | 262.1k | $0.95 | $4.00 | 32c/s | 1,553 | 57.2 | 50.0 | 43.2 | 37.2 | 37.1 | 31.4 | 90.5% | 80.2% | 86.3% | 86.7% | 80.1% | 36.4% | 50.0% | 52.2% | 27.9% | 58.6% | 1000 | 12.8s | - | Apr. 2026 | MoonshotAI
🇺🇸 | Closed | 400k | $1.75 | $14.00 | 178c/s | 1,534 | 52.7 | 50.2 | 34.2 | 25.8 | 34.1 | 34.8 | 28.4 | 43.9 | 92.4% | 100.0% | 80.0% | 52.9% | 89.6% | 65.8% | 82.1% | 79.5% | 86.3% | 60.6% | 34.5% | 46.3% | 40.3% | - | 28.7s | Aug. 2025 | Dec. 2025 | OpenAI
🇨🇳 | Open | - | - | - | 41c/s | 1,462 | 49.6 | 47.7 | 30.5 | 29.6 | 34.9 | 13.9 | 37.0 | 47.7 | 47.7 | 46.7 | 87.6% | 96.1% | 76.8% | 74.9% | 77.5% | 78.5% | 50.2% | 48.7% | 50.7% | 1000 | 50.1s | - | Jan. 2026 | MoonshotAI
🇺🇸 | Closed | 200k | $3.00 | $15.00 | 43c/s | 1,420 | 40.0 | 32.8 | 31.2 | 33.6 | 18.3 | 33.4 | 21.8 | 16.6 | 83.4% | 87.0% | 89.1% | 61.4% | 50.0% | 86.2% | - | 1.3s | Jan. 2025 | Sep. 2025 | Anthropic
🇺🇸 | Closed | 400k | $0.75 | $4.50 | 250c/s | 1,411 | 40.3 | 35.3 | 33.1 | 23.9 | 27.9 | 24.5 | 8.1 | 34.4 | 88.0% | 76.6% | 57.7% | 28.2% | 42.9% | 33.6% | 54.4% | - | 1.3s | Aug. 2025 | Mar. 2026 | OpenAI
🇺🇸 | Closed | 400k | $5.00 | $30.00 | 210c/s | 1,380 | 41.8 | 20.9 | 29.0 | 31.8 | 85.6% | 81.2% | 81.6% | 76.0% | - | 961ms | Aug. 2025 | May 2026 | OpenAI
🇺🇸 | Closed | 400k | $1.75 | $14.00 | 188c/s | 1,379 | 53.7 | 42.7 | 26.0 | 39.0 | 56.8% | - | 1.1s | - | Feb. 2026 | OpenAI
🇺🇸 | Closed | - | - | - | 1,301 | 48.5 | 40.5 | 87.3% | 94.6% | - | Sep. 2024 | Aug. 2025 | OpenAI
🇨🇳 | Open | 262.1k | $0.60 | $3.60 | 96c/s | 1,294 | 48.4 | 45.8 | 29.6 | 25.4 | 23.1 | 27.6 | 22.7 | 37.5 | 53.9 | 54.0 | 53.2 | 88.4% | 76.4% | 88.5% | 69.0% | 28.7% | 38.3% | 397 | 19.5s | - | Feb. 2026 | Qwen
🇨🇳 | Open | 1.0M | $1.74 | $3.48 | 28c/s | 1,291 | 55.5 | 53.3 | 42.8 | 32.0 | 33.5 | 33.2 | 19.8 | 40.1 | 40.3 | 48.6 | 90.1% | 80.6% | 83.4% | 73.6% | 48.2% | 57.9% | 51.8% | 55.4% | 1600 | 38.6s | - | Apr. 2026 | DeepSeek
🇺🇸 | Open | 262.1k | $0.14 | $0.40 | 124c/s | 1,284 | 42.7 | 39.2 | 26.9 | 20.0 | 24.1 | 43.0 | 43.0 | 37.5 | 84.3% | 88.4% | 76.9% | 26.5% | 30.7 | 12.5s | Jan. 2025 | Apr. 2026 | Google
🇺🇸 | Closed | 400k | $1.25 | $10.00 | 147c/s | 1,250 | 47.0 | 41.6 | 30.6 | 22.0 | 28.9 | 34.3 | 25.2 | 47.6 | 88.1% | 94.0% | 76.3% | 85.4% | 26.7% | - | 1.4s | Sep. 2024 | Nov. 2025 | OpenAI
🇺🇸 | Closed | 400k | $1.75 | $14.00 | 277c/s | 1,230 | 49.3 | 37.9 | 25.9 | 56.4% | - | 1.3s | - | Jan. 2026 | OpenAI
🇺🇸 | Open | 262.1k | $0.13 | $0.40 | 97c/s | 1,229 | 36.1 | 32.5 | 20.7 | 18.4 | 18.0 | 33.1 | 33.1 | 32.8 | 82.3% | 86.3% | 73.8% | 17.2% | 25.2 | 2.7s | Jan. 2025 | Apr. 2026 | Google
🇺🇸 | Closed | - | - | - | 1,189 | 38.0 | 30.7 | 27.8 | 23.0 | 21.1 | 22.6 | 23.1 | 80.9% | 78.0% | 74.5% | 89.5% | 43.3% | 82.4% | - | - | Aug. 2025 | Anthropic

LLM Leaderboard highlights

The LLM Leaderboard ranks 300+ AI models by intelligence, output speed, latency and per-token pricing, aggregated into the LLM Stats Score. The data is updated continuously from provider APIs and verified benchmarks. See the LLM Stats Score methodology for how rankings are computed.

FAQ

Common questions about the LLM Leaderboard

What is the best LLM right now?

Based on coding-arena performance, the most discriminating signal at the frontier, Claude Opus 4.6 currently leads. For knowledge-heavy reasoning (GPQA Diamond), Claude Mythos Preview scores highest. Choose by axis rather than by a single ranking; see the highlights above for per-metric leaders.

How does the LLM Leaderboard rank models?

Models are sorted by coding-arena score (when available), then by GPQA Diamond. Each row aggregates verified benchmark results, provider-reported pricing, and live performance metrics (output throughput and time-to-first-token) sampled across the major API providers. See the LLM Stats Score methodology for the full weighting and refresh cadence.
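The sort order described above can be sketched as a two-level key. This is only an illustration: the `Row` type and its field names are hypothetical, and it assumes that models without a coding-arena score are placed after scored ones and ordered among themselves by GPQA Diamond, which the page does not spell out.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    """Hypothetical leaderboard row; field names are illustrative only."""
    name: str
    coding_arena: Optional[float]   # arena score, None when unavailable
    gpqa_diamond: Optional[float]   # GPQA Diamond accuracy in percent


def rank(rows: list[Row]) -> list[Row]:
    """Sort by coding-arena score (descending), then GPQA Diamond (descending).

    Assumption: rows missing a value sort after rows that have it.
    """
    def key(r: Row):
        return (
            r.coding_arena is None,        # False sorts before True
            -(r.coding_arena or 0.0),      # higher arena score first
            r.gpqa_diamond is None,
            -(r.gpqa_diamond or 0.0),      # higher GPQA first
        )
    return sorted(rows, key=key)


rows = [
    Row("a", None, 90.0),
    Row("b", 2000.0, 80.0),
    Row("c", 2100.0, None),
    Row("d", None, 95.0),
    Row("e", 2000.0, 85.0),
]
ranked_names = [r.name for r in rank(rows)]
```

With these made-up rows, the two models tied on arena score are separated by GPQA Diamond, and the two models without an arena score fall to the bottom in GPQA order.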

How many models are tracked?

This leaderboard tracks 316 canonical models across every major lab and provider. New releases typically appear within hours.

Where does pricing data come from?

Per-model input/output pricing is pulled from each provider's public API price list and verified against billing samples from the LLM Stats proxy. When a model is hosted by multiple providers, the cheapest available rate is shown by default.
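A minimal sketch of the "cheapest rate by default" rule follows. The page does not say how input and output prices are combined when providers differ on both, so this example assumes the sum of the two prices is compared; the `Offer` type, its field names, the per-million-token unit, and the provider names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """Hypothetical price quote for one model from one provider."""
    provider: str
    input_usd: float    # assumed unit: USD per 1M input tokens
    output_usd: float   # assumed unit: USD per 1M output tokens


def default_offer(offers: list[Offer]) -> Offer:
    """Pick the offer shown by default.

    Assumption: "cheapest" means the lowest input + output price;
    ties are broken by provider name so the result is deterministic.
    """
    if not offers:
        raise ValueError("model has no provider offers")
    return min(offers, key=lambda o: (o.input_usd + o.output_usd, o.provider))


offers = [
    Offer("provider-a", 3.00, 15.00),
    Offer("provider-b", 2.50, 15.00),
    Offer("provider-c", 2.50, 16.00),
]
chosen = default_offer(offers)
```

A different blend, for example weighting input tokens more heavily than output tokens, would only change the `key` function.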

How is performance measured?

Output throughput (tokens/second) and time-to-first-token are measured by routing standardized prompts through each provider's API and averaging over a 7-day rolling window. Numbers update hourly. Per-model splits live on each model detail page.
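The 7-day rolling average can be illustrated with a short sketch. The sample format (timestamp, value) and the function name are assumptions made for this example; the page does not describe how samples are stored or weighted.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def rolling_mean(
    samples: list[tuple[datetime, float]],
    now: datetime,
    window: timedelta = timedelta(days=7),
) -> Optional[float]:
    """Average the values whose timestamp falls in (now - window, now].

    Returns None when no sample falls inside the window.
    """
    recent = [value for ts, value in samples if now - window < ts <= now]
    if not recent:
        return None
    return sum(recent) / len(recent)


now = datetime(2026, 1, 8, tzinfo=timezone.utc)
throughput = [
    (now - timedelta(days=8), 10.0),    # outside the window, ignored
    (now - timedelta(days=6), 100.0),
    (now - timedelta(hours=1), 120.0),
]
avg = rolling_mean(throughput, now)
```

Recomputing this value every hour with a fresh `now` gives a figure that changes hourly while still reflecting a full week of measurements.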

How often does the data update?

Pricing and model metadata revalidate every hour. Live performance metrics are averaged over a 7-day rolling window and refreshed hourly. Benchmark scores update when a new verified result is published or a new evaluation lands on LLM Stats.