MRCR
MRCR (Multi-Round Coreference Resolution) is a synthetic long-context reasoning task in which a model must navigate a long conversation and reproduce a specific earlier model output. It tests the ability to distinguish between similar requests and reason about their ordering while maintaining attention across an extended context.
Of the 7 AI models evaluated, Gemini 2.5 Pro from Google currently leads the MRCR leaderboard with a score of 0.930.
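To make the task format concrete, here is a minimal sketch of an MRCR-style item in Python. Everything in it is an illustrative assumption rather than the benchmark's actual data or grader: the toy conversation in `TURNS`, the helper names `build_prompt`, `expected_answer`, and `score`, the wording of the final question, and the use of a string-similarity ratio to produce a score between 0 and 1.

```python
from difflib import SequenceMatcher

# Hypothetical toy conversation: (user request, assistant response) pairs.
# Two requests are identical, so only their order tells the responses apart.
TURNS = [
    ("Write a poem about tapirs.", "Tapirs wade where rivers bend."),
    ("Write a riddle about clocks.", "I have hands but cannot clap."),
    ("Write a poem about tapirs.", "A tapir naps beneath the fern."),
    ("Write a poem about owls.", "Owls keep watch on silent wings."),
]


def build_prompt(turns, request, occurrence):
    """Flatten the turns into a transcript and append the final question."""
    lines = []
    for user, assistant in turns:
        lines.append(f"User: {user}")
        lines.append(f"Assistant: {assistant}")
    lines.append(
        f"User: Repeat your response number {occurrence} to the request: {request}"
    )
    return "\n".join(lines)


def expected_answer(turns, request, occurrence):
    """Return the response to the occurrence-th (1-based) matching request."""
    matches = [assistant for user, assistant in turns if user == request]
    return matches[occurrence - 1]


def score(candidate, reference):
    """Similarity in [0, 1]; an assumed metric, not the official grader."""
    return SequenceMatcher(None, candidate, reference).ratio()


request = "Write a poem about tapirs."
prompt = build_prompt(TURNS, request, occurrence=2)
reference = expected_answer(TURNS, request, occurrence=2)

# A model that confuses the two tapir requests returns the first poem instead.
wrong = expected_answer(TURNS, request, occurrence=1)
print(score(reference, reference))  # exact reproduction
print(score(wrong, reference))      # right topic, wrong occurrence
```

An exact reproduction scores 1.0, while returning the response to the wrong occurrence scores lower even though the topic matches. In a real item the transcript would be far longer, so the model has to keep track of the ordering of similar requests across the whole context.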