MRCR 128K (8-needle) Leaderboard

Progress Over Time

[Chart: timeline of model performance on MRCR 128K (8-needle). Legend: State-of-the-art frontier, Open, Proprietary.]

MRCR 128K (8-needle) Leaderboard

1 model

Rank	Model	Organization	Parameters	Context	Cost	License
1	MiniCPM-SALA	OpenBMB	9B	—	—

FAQ

Common questions about MRCR 128K (8-needle)

What is the MRCR 128K (8-needle) benchmark?

MRCR (Multi-Round Coreference Resolution) at a 128K-token context length with 8 needles. Models must navigate a long conversation and reproduce specific model outputs from it, which tests attention and reasoning across 128K-token contexts with 8 items to retrieve.

Where can I find the MRCR 128K (8-needle) paper?

The MRCR 128K (8-needle) paper is available at https://arxiv.org/abs/2409.12640. It describes the benchmark methodology, dataset creation, and evaluation criteria.

What is the MRCR 128K (8-needle) leaderboard?

The MRCR 128K (8-needle) leaderboard ranks AI models by their performance on this benchmark. It currently lists 1 model: MiniCPM-SALA by OpenBMB, with a score of 0.101. Because there is only one entry, the average score across all models is also 0.101.

What is the highest MRCR 128K (8-needle) score?

The highest MRCR 128K (8-needle) score is 0.101, achieved by MiniCPM-SALA from OpenBMB.

How many models are evaluated on MRCR 128K (8-needle)?

1 model has been evaluated on the MRCR 128K (8-needle) benchmark, with 0 verified results and 1 self-reported result.

What categories does MRCR 128K (8-needle) cover?

MRCR 128K (8-needle) is categorized under general, long context, and reasoning. The benchmark evaluates text models.
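
As a rough illustration of how the summary figures quoted in the FAQ (top score, average score, verified and self-reported counts) follow from the leaderboard rows, the sketch below recomputes them in Python. The `Entry` record and the `summarize` helper are hypothetical names chosen for this example and are not part of any leaderboard API; the only data used is the single self-reported entry listed on this page.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Entry:
    """One leaderboard row (hypothetical structure, for illustration only)."""
    model: str
    organization: str
    score: float
    verified: bool


def summarize(entries):
    """Return the headline figures for a non-empty list of entries."""
    if not entries:
        raise ValueError("leaderboard has no entries")
    best = max(entries, key=lambda e: e.score)
    verified = sum(1 for e in entries if e.verified)
    return {
        "models": len(entries),
        "top_model": best.model,
        "top_organization": best.organization,
        "top_score": best.score,
        "average_score": mean(e.score for e in entries),
        "verified": verified,
        "self_reported": len(entries) - verified,
    }


# The single entry reported on this page (self-reported, not verified).
ENTRIES = [Entry("MiniCPM-SALA", "OpenBMB", 0.101, verified=False)]

summary = summarize(ENTRIES)
print(summary)
```

With one entry, the top score and the average coincide, which is why both are reported as 0.101.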