LLM Leaderboard

Analyze and compare API models across benchmarks, pricing, and capabilities.

Updated daily with the latest models
Tracking official benchmark and pricing data from leading organizations
Notice missing or incorrect data? Let us know on GitHub.

LLM Rankings

Best models and API providers in each category

Developer Platform

One API. Every model.

Test in our playground. Deploy with our unified API. Access 100+ models through a single OpenAI-compatible endpoint.

OpenAI · Anthropic · Google · Meta · Mistral · DeepSeek · xAI · Qwen
99.9% uptime
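
As a rough illustration of what "a single OpenAI-compatible endpoint" means in practice, the sketch below points the official openai Python client at a stand-in base URL. The base_url, API key, and model ID are placeholders, not this platform's actual values.

```python
# Minimal sketch: any OpenAI-compatible endpoint can be called with the
# standard openai client by overriding base_url. All values below are
# placeholders (assumptions), not real endpoint or model identifiers.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example.com/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",                 # placeholder key
)

response = client.chat.completions.create(
    model="provider/model-name",  # placeholder model ID from the catalog
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```

Because the endpoint is OpenAI-compatible, swapping models is a one-line change to the model parameter rather than a new integration.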

Context Window

Maximum input context length for each model

While tokenization varies between models, on average 1 token ≈ 3.5 characters in English. Note: each model uses its own tokenizer, so actual token counts may vary significantly.

As a rough guide, 1 million tokens is approximately equivalent to the following (a quick estimation sketch follows the list):

30 hours of a podcast (~150 words per minute)
1,000 pages of a book (~500 words per page)
60,000 lines of code [1] (~60 characters per line)

[1] Based on average characters per line. See Wikipedia.
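
A quick back-of-the-envelope check of these figures, assuming the ~3.5 characters per token heuristic above (actual counts depend on each model's tokenizer):

```python
# Rough estimation only: real token counts depend on the model's tokenizer.
CHARS_PER_TOKEN = 3.5  # heuristic from above

def estimate_tokens(text: str) -> int:
    """Estimate token count from character length."""
    return round(len(text) / CHARS_PER_TOKEN)

tokens = 1_000_000
chars = tokens * CHARS_PER_TOKEN   # ~3.5M characters
pages = chars / (500 * 7)          # ~500 words/page, ~7 chars/word incl. spaces
lines = chars / 60                 # ~60 characters per line of code
print(f"{chars:,.0f} characters ~ {pages:,.0f} book pages ~ {lines:,.0f} lines of code")
```

With these assumptions the arithmetic reproduces the figures above: roughly 1,000 book pages or about 58,000 lines of code per million tokens.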

Comparisons

Capability versus price across models (GPQA score vs. price per 1M input tokens), plus other comparisons
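
As a sketch of what this comparison captures, the snippet below derives a simple "benchmark points per dollar" ratio from a GPQA score and an input-token price. The model names and numbers are placeholders, not real benchmark results or prices.

```python
# Placeholder data only -- not real benchmark scores or prices.
def quality_per_dollar(gpqa: float, usd_per_1m_input: float) -> float:
    """Benchmark points per dollar of input tokens; higher means better value."""
    return gpqa / usd_per_1m_input

models = {
    "model_a": (72.0, 3.00),   # (GPQA %, $ per 1M input tokens) -- placeholders
    "model_b": (55.0, 0.25),
}
ranked = sorted(models.items(), key=lambda kv: quality_per_dollar(*kv[1]), reverse=True)
for name, (gpqa, price) in ranked:
    print(f"{name}: {quality_per_dollar(gpqa, price):.1f} GPQA points per $")
```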

API Providers: Open LLM Providers

Price and performance across providers for Llama 4 Maverick

Provider performance varies significantly. Some providers run full-precision models on specialized hardware accelerators (such as Groq's LPU or Cerebras' CS-3), while others may use quantization (8-bit or 4-bit) to reach higher throughput on commodity hardware. Check provider documentation for specific hardware and quantization details, as both can affect speed and output quality.

Chart: model quantization trade-off, with quality (FP16/BF16) at one end and speed (8-bit/4-bit) at the other
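
To make the trade-off concrete, here is a small sketch of weight-memory footprints at the precisions mentioned above. Weights only; KV cache and activations are excluded, and the 70B parameter count is just an example.

```python
# Fewer bytes per weight -> smaller footprint and less memory traffic,
# which generally means higher throughput but potentially lower quality.
BYTES_PER_PARAM = {"fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_memory_gb(num_params: float, precision: str) -> float:
    """Memory needed to hold the weights alone, in gigabytes."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

for precision in BYTES_PER_PARAM:  # example: a 70B-parameter model
    print(f"{precision:>10}: ~{weight_memory_gb(70e9, precision):.0f} GB of weights")
```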

Tracking AI progress across nations, model types, and organizations

AI Progress: US vs China

SOTA comparison over time between the United States and China.


Open vs Closed Models

Comparing SOTA progression of open-source and proprietary models over time.


Organization Progress

SOTA progression by organization over time.

Showing the top 10 organizations by model count; each line represents SOTA progression

Observe how different processing speeds affect real-time token generation. Try adjusting the speeds using the number inputs above each panel; a minimal simulation sketch follows below.


Values reset every 5 seconds to demonstrate different speeds
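
For readers without access to the interactive panels, the following minimal sketch simulates what different generation speeds feel like by emitting whitespace-delimited "tokens" at a fixed rate. Real tokens are subword units, so this is only an approximation.

```python
import time

def simulate_stream(text: str, tokens_per_second: float) -> None:
    """Print whitespace-delimited 'tokens' at a fixed rate to mimic streaming."""
    delay = 1.0 / tokens_per_second
    for token in text.split():
        print(token, end=" ", flush=True)
        time.sleep(delay)
    print()

# Compare how the same output feels at different generation speeds.
for speed in (10, 50, 200):  # tokens per second
    print(f"\n--- {speed} t/s ---")
    simulate_stream("The quick brown fox jumps over the lazy dog.", speed)
```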