EQ-Bench
EQ-Bench Leaderboard
| Rank | Organization | Context | Cost | License |
|---|---|---|---|---|
| 1 | xAI | — | — | — |
| 2 | xAI | — | — | — |
What is EQ-Bench?
EQ-Bench is an LLM-judged test that evaluates active emotional intelligence abilities, understanding, insight, empathy, and interpersonal skills. The test set contains 45 challenging roleplay scenarios, most of which consist of pre-written prompts spanning 3 turns. The benchmark scores each model's responses against several criteria, then runs pairwise comparisons between models and reports a normalized Elo score for each model.
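To make the pairwise-comparison step concrete, here is a minimal sketch of a standard sequential Elo update in Python. It is an illustration only: the function names, the K-factor of 16, the starting rating of 1500, and the model labels are all assumptions, and the benchmark's actual rating solver and its normalization step are not described in this page, so neither is reproduced here.

```python
from typing import Dict, Iterable, Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_elo(
    ratings: Dict[str, float],
    comparisons: Iterable[Tuple[str, str, float]],
    k: float = 16.0,       # assumed K-factor, not taken from EQ-Bench
    start: float = 1500.0,  # assumed starting rating for unseen models
) -> Dict[str, float]:
    """Apply one pass of Elo updates over judged pairwise comparisons.

    Each comparison is (model_a, model_b, outcome), where outcome is
    1.0 if the judge preferred A, 0.0 if it preferred B, 0.5 for a tie.
    """
    ratings = dict(ratings)  # work on a copy, leave the input untouched
    for a, b, outcome in comparisons:
        ra = ratings.setdefault(a, start)
        rb = ratings.setdefault(b, start)
        delta = k * (outcome - expected_score(ra, rb))
        ratings[a] = ra + delta
        ratings[b] = rb - delta  # zero-sum: B loses what A gains
    return ratings


if __name__ == "__main__":
    # Hypothetical judge verdicts between two placeholder models.
    verdicts = [("model_a", "model_b", 1.0), ("model_a", "model_b", 0.5)]
    print(update_elo({}, verdicts))
```

Because each update is zero-sum, the total rating across models stays constant; only the gaps between models carry information, which is why a separate normalization step is needed to place ratings on a fixed reporting scale.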
EQ-Bench is a text benchmark that covers reasoning, roleplay, general, creativity, and writing tasks. LLM Stats tracks 2 models on this benchmark, scored on a 0–2000 scale. The current average score is 1585.5, and the leading score is 1586.0.
Current leaders
Grok-4.1 Thinking from xAI currently leads the EQ-Bench leaderboard with a score of 1586.0 among the 2 evaluated models.