GDPval-MM

GDPval-MM is the multimodal variant of the GDPval benchmark, evaluating AI model performance on real-world economically valuable tasks that require processing and generating multimodal content including documents, slides, diagrams, spreadsheets, images, and other professional deliverables across diverse industries.

GDPval-MM Leaderboard

1 model • 0 verified
1. 230B • Context: 1.0M • Cost: $0.30 / $1.20

FAQ

Common questions about GDPval-MM

The GDPval-MM paper is available at https://arxiv.org/abs/2510.04374. This paper provides detailed information about the benchmark methodology, dataset creation, and evaluation criteria.
The GDPval-MM leaderboard currently ranks 1 AI model: MiniMax M2.5 by MiniMax, with a score of 0.590. Because there is only one entry, the average score across all models is also 0.590.
The highest GDPval-MM score is 0.590, achieved by MiniMax M2.5 from MiniMax.
1 model has been evaluated on the GDPval-MM benchmark, with 0 verified results and 1 self-reported result.
GDPval-MM is categorized under general, multimodal, and reasoning. The benchmark evaluates multimodal models.