EMMA

Name: EMMA Leaderboard — AI Model Scores
Creator: LLM Stats
License: https://llm-stats.com/legal/terms-of-service

EMMA Leaderboard

2 models

Rank	Model	Organization	Context	Cost	License
1	Seed 2.1 Pro	ByteDance	—	—	—
2	Seed 2.1 Turbo	ByteDance	—	—	—

About this benchmark

What is EMMA?

EMMA (Enhanced MultiModal reAsoning) is a benchmark for organic multimodal reasoning across mathematics, physics, chemistry, and coding.

LLM Stats lists EMMA under the math, multimodal, reasoning, and vision categories and tracks 2 models on it, scored on a 0–1 scale. The current average is 0.788, with the leader at 0.793.

Current leaders

Seed 2.1 Pro from ByteDance currently leads the EMMA leaderboard with a score of 0.793, the higher of the 2 evaluated models.

Seed 2.1 Pro (ByteDance): 79.3%

Seed 2.1 Turbo (ByteDance): 78.4%
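
The page reports scores both on a 0–1 scale and as percentages, alongside an average. As a minimal sketch of how those figures relate, the snippet below recomputes the leader and the mean from the two listed scores. The dictionary layout and the `summarize` helper are illustrative assumptions, not part of any LLM Stats API. The mean of the two listed scores is 0.7885, which lines up with the reported 0.788 if the site truncates to three decimals (an assumption about its rounding).

```python
# Scores as listed on the leaderboard, expressed on the 0-1 scale.
# The dict layout is illustrative; it is not an LLM Stats data format.
scores = {
    "Seed 2.1 Pro": 0.793,
    "Seed 2.1 Turbo": 0.784,
}


def summarize(scores):
    """Return (leader name, leader score, mean score) for a name -> score mapping."""
    leader = max(scores, key=scores.get)
    average = sum(scores.values()) / len(scores)
    return leader, scores[leader], average


leader, top, average = summarize(scores)

# The same score shown on the 0-1 scale and as a percentage.
print(f"Leader: {leader} at {top:.3f} ({top:.1%})")
print(f"Average: {average:.4f}")
```

With only two models, the average sits within half a point of either score, so it says little about the benchmark's difficulty on its own.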

FAQ

Common questions about the EMMA benchmark and leaderboard.

What is the EMMA leaderboard?

The EMMA leaderboard ranks 2 AI models based on their performance on this benchmark. Currently, Seed 2.1 Pro by ByteDance leads with a score of 0.793. The average score across all models is 0.788.

What is the highest EMMA score?

The highest EMMA score is 0.793, achieved by Seed 2.1 Pro from ByteDance.

How many models are evaluated on EMMA?

Two models have been evaluated on the EMMA benchmark. Both results are self-reported; none has been independently verified.

What categories does EMMA cover?

EMMA is categorized under math, multimodal, reasoning, and vision. The benchmark evaluates multimodal models.

How recent are the EMMA leaderboard results?

The EMMA leaderboard was last updated in June 2026 and currently includes 2 evaluated models.