LLM Leaderboard
Analyze and compare AI models across benchmarks, pricing, and capabilities.




AI Ranking
Best models and API providers in each category
Most Popular Benchmark Categories
Explore AI model performance across specialized domains
Best LLM for Research
Compare LLMs for research, factual accuracy, truthfulness, and reliability in providing accurate information.
Best LLM for Reasoning
Compare reasoning LLMs with benchmarks for logical thinking, problem-solving, and complex analytical tasks.
Best LLM for Coding
Compare coding LLMs with real benchmarks: code generation, debugging, tests, and software engineering tasks.
Best LLM for Math
Compare math-focused LLMs with benchmarks for problem solving, equation solving, and mathematical reasoning.
Best Multimodal LLMs
Compare multimodal AI models that process text, images, audio, and video for comprehensive understanding.
Best LLM for Long Context
Compare LLMs with extended context windows for processing long documents, books, and large-scale text.
LLM Benchmark Leaderboards
Best 15 models across popular benchmarks
Open LLM Leaderboard
Best performing open source models ranked by GPQA reasoning benchmark
Context Window
Maximum input context length for each model
While tokenization varies between models, on average 1 token ≈ 3.5 characters in English. Note: each model uses its own tokenizer, so actual token counts may vary significantly.
As a rough guide, 1 million tokens is approximately equivalent to:
30 hours of a podcast (~150 words per minute)
1,000 pages of a book (~500 words per page)
60,000 lines of code[1] (~60 characters per line)
[1] Based on average characters per line. See Wikipedia.
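The character-based heuristic above can be sketched as a small helper. This is a minimal illustration, not any model's real tokenizer: the function name and the 3.5 characters-per-token default are assumptions taken from the rough guide, and actual counts depend on each model's own tokenizer.

```python
def estimate_tokens(text: str, chars_per_token: float = 3.5) -> int:
    """Rough token estimate using the ~3.5 characters/token heuristic
    for English text. Real tokenizers (BPE, SentencePiece, etc.) differ
    per model, so treat this as a ballpark figure only."""
    return round(len(text) / chars_per_token)

# A 60-character line of code comes out to about 17 tokens
# under this heuristic (60 / 3.5 ≈ 17.1).
line_of_code = "x" * 60
print(estimate_tokens(line_of_code))
```

By the same arithmetic, 60,000 lines × 60 characters ≈ 3.6 million characters, which divided by 3.5 gives roughly the 1 million tokens quoted above.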
LLM Comparisons
Top models ranked by capabilities with pricing visualization
API Providers - Open LLM Providers
Price and performance across providers for Llama 4 Maverick
Provider performance varies significantly. Some providers run full-precision models on specialized hardware accelerators (such as Groq's LPU or Cerebras' CS-3), while others use quantization (4-bit or 8-bit) to achieve faster speeds on commodity hardware. Check each provider's documentation for specific hardware and quantization details, as both can affect speed and model quality.
Trends
Tracking AI progress across nations, model types, and organizations