What is the MRCR leaderboard?

The MRCR leaderboard ranks 7 AI models based on their performance on this benchmark. Currently, Gemini 2.5 Pro by Google leads with a score of 0.930. The average score across all models is 0.642.

What is the highest MRCR score?

The highest MRCR score is 0.930, achieved by Gemini 2.5 Pro from Google.

How many models are evaluated on MRCR?

7 models have been evaluated on the MRCR benchmark, with 0 verified results and 7 self-reported results.

Where can I find the MRCR paper?

The MRCR paper is available at https://arxiv.org/abs/2409.12640. The paper details the methodology, dataset construction, and evaluation criteria.

What categories does MRCR cover?

MRCR is categorized under general, long context, and reasoning. The benchmark evaluates text models.

Are there variants of MRCR?

Yes. MRCR has 6 related variants: MRCR 128K (2-needle), MRCR 128K (4-needle), MRCR 128K (8-needle), MRCR 64K (2-needle).

What is the best open-source model on MRCR?

MiMo-V2-Flash by Xiaomi is the top-ranked open-source model on MRCR, with a score of 0.457 (rank #6).

Which model offers the best value on MRCR?

Among models scoring within 10% of the leader, Gemini 2.5 Pro from Google is the cheapest, at $1.25 per million input tokens with a score of 0.930.

How recent are the MRCR leaderboard results?

The MRCR leaderboard was last updated in May 2026 and currently includes 7 evaluated models.

All benchmarks

MRCR

MRCR (Multi-Round Coreference Resolution) is a synthetic long-context reasoning task where models must navigate long conversations to reproduce specific model outputs. It tests the ability to distinguish between similar requests and reason about ordering while maintaining attention across extended contexts.

Gemini 2.5 Pro from Google currently leads the MRCR leaderboard with a score of 0.930 across 7 evaluated AI models.

Paper