LMArena Text Leaderboard


Progress Over Time

[Interactive timeline: model performance over time on the LMArena Text Leaderboard. Legend: state-of-the-art frontier, open models, proprietary models.]

LMArena Text Leaderboard Rankings

2 models
About this benchmark

What is LMArena Text Leaderboard?

LMArena Text Leaderboard is a blind human preference evaluation benchmark that ranks models based on pairwise comparisons in real-world conversations. The leaderboard uses Elo ratings computed from user preferences in head-to-head model battles, providing a comprehensive measure of overall model capability and style.
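
As a rough illustration of how an Elo-style rating responds to a single head-to-head battle, the sketch below applies the classic Elo update in Python. The function names, the K-factor of 32, and the 400-point scale are assumptions chosen for illustration. The text above does not specify the exact rating procedure the leaderboard uses, so this should not be read as its implementation.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Probability that model A is preferred over model B under classic Elo."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def update_elo(rating_a: float, rating_b: float, outcome_a: float, k: float = 32.0):
    """Return both updated ratings after one battle.

    outcome_a is 1.0 if the user preferred A, 0.0 if B, and 0.5 for a tie.
    k (hypothetical value) controls how far one vote moves the ratings.
    """
    delta = k * (outcome_a - expected_score(rating_a, rating_b))
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two equally rated models; the user prefers model A.
    print(update_elo(1500.0, 1500.0, 1.0))
```

With equal starting ratings the expected score is 0.5, so a single win moves each rating by half the K-factor in opposite directions.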

LMArena Text Leaderboard is a text benchmark evaluating models on reasoning and general tasks. LLM Stats tracks 2 models on this benchmark, scored on a 0–2000 scale. The current average is 1474.0, with the leader at 1483.0.


Current leaders

Grok-4.1 Thinking from xAI currently leads the LMArena Text Leaderboard with a score of 1483.000, the higher of the 2 evaluated AI models.

Rank  Model              Organization  Score
1     Grok-4.1 Thinking  xAI           1483.000
2     Grok-4.1           xAI           1465.000
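
The listed scores can be checked with a few lines of arithmetic. The sketch below recomputes the average and the gap between the two models, then converts the gap into an implied head-to-head preference rate. That last step assumes the classic Elo logistic curve with a 400-point scale, which is an assumption rather than something the leaderboard states.

```python
# Scores as listed on the leaderboard.
scores = {"Grok-4.1 Thinking": 1483.0, "Grok-4.1": 1465.0}

average = sum(scores.values()) / len(scores)
gap = scores["Grok-4.1 Thinking"] - scores["Grok-4.1"]

# Assumption: classic Elo logistic curve with a 400-point scale.
implied_win_rate = 1.0 / (1.0 + 10 ** (-gap / 400.0))

print(f"average={average:.1f} gap={gap:.1f} implied_win_rate={implied_win_rate:.3f}")
```

The average matches the reported 1474.0. Under the stated assumption, an 18-point gap corresponds to the leader being preferred only slightly more than half the time.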

FAQ

Common questions about the LMArena Text Leaderboard benchmark and leaderboard.

What is the LMArena Text Leaderboard benchmark?

LMArena Text Leaderboard is a blind human preference evaluation benchmark that ranks models based on pairwise comparisons in real-world conversations. The leaderboard uses Elo ratings computed from user preferences in head-to-head model battles, providing a comprehensive measure of overall model capability and style.

What is the LMArena Text Leaderboard leaderboard?

The LMArena Text Leaderboard leaderboard ranks 2 AI models based on their performance on this benchmark. Currently, Grok-4.1 Thinking by xAI leads with a score of 1483.000. The average score across all models is 1474.000.

What is the highest LMArena Text Leaderboard score?

The highest LMArena Text Leaderboard score is 1483.000, achieved by Grok-4.1 Thinking from xAI.

How many models are evaluated on LMArena Text Leaderboard?

Two models have been evaluated on the LMArena Text Leaderboard benchmark. Both results are self-reported, and none are verified.

Where can I find the LMArena Text Leaderboard paper?

The LMArena Text Leaderboard paper is available at https://arena.lmsys.org/. The paper details the methodology, dataset construction, and evaluation criteria.

Where can I find the LMArena Text Leaderboard dataset?

The LMArena Text Leaderboard dataset is available at https://arena.lmsys.org/.

What categories does LMArena Text Leaderboard cover?

LMArena Text Leaderboard is categorized under reasoning and general. The benchmark evaluates text models.

How recent are the LMArena Text Leaderboard leaderboard results?

The LMArena Text Leaderboard leaderboard was last updated in July 2026 and currently includes 2 evaluated models.