InfoVQA

Progress Over Time

Timeline chart of model performance on InfoVQA (legend: state-of-the-art frontier, open models, proprietary models).

InfoVQA Leaderboard

9 models
Rank. Organization, parameters
1. Alibaba Cloud / Qwen Team, 34B
2. Alibaba Cloud / Qwen Team, 8B
3. DeepSeek, 27B
4. 16B
5. 6B
6. 27B
7. 3B
8. 12B
9. 4B
About this benchmark

What is InfoVQA?

InfoVQA (InfographicVQA) is a dataset of about 30,000 questions over about 5,000 infographic images. Answering the questions requires joint reasoning over document layout, textual content, graphical elements, and data visualizations, along with elementary reasoning and basic arithmetic skills.

InfoVQA is a multimodal benchmark that evaluates models on vision-and-language question answering over infographics. LLM Stats tracks 9 models on this benchmark, scored on a 0–1 scale. The current average is 0.716, with the leader at 0.834.

Current leaders

Qwen2.5 VL 32B Instruct from Alibaba Cloud / Qwen Team currently leads the InfoVQA leaderboard with a score of 0.834 across 9 evaluated AI models.

1. Qwen2.5 VL 32B Instruct (Alibaba Cloud / Qwen Team): 83.4%
2. Qwen2.5 VL 7B Instruct (Alibaba Cloud / Qwen Team): 82.6%
3. DeepSeek VL2 (DeepSeek): 78.1%
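
The page quotes scores both on a 0–1 scale (0.834) and as percentages (83.4%). The short sketch below shows how the two relate, and what the reported overall average would imply for the six models whose scores are not listed here. It uses only the three scores and the 0.716 average given on this page. The function names `as_percent` and `implied_rest_mean` are illustrative, not part of any LLM Stats tooling, and the implied figure is approximate because the reported average is already rounded.

```python
# Top-three scores as listed on this page (0-1 scale).
top_three = {
    "Qwen2.5 VL 32B Instruct": 0.834,
    "Qwen2.5 VL 7B Instruct": 0.826,
    "DeepSeek VL2": 0.781,
}
TOTAL_MODELS = 9       # models tracked on the leaderboard
REPORTED_MEAN = 0.716  # page-reported average (already rounded)


def as_percent(score: float) -> str:
    """Format a 0-1 score as a percentage with one decimal place."""
    return f"{score * 100:.1f}%"


def implied_rest_mean(known: dict, total: int, mean: float) -> float:
    """Mean of the models not in `known`, implied by the overall mean."""
    remaining = total - len(known)
    return (mean * total - sum(known.values())) / remaining


# Rank the known models from highest to lowest score.
ranked = sorted(top_three.items(), key=lambda kv: kv[1], reverse=True)
for rank, (model, score) in enumerate(ranked, start=1):
    print(rank, model, as_percent(score))

# Approximate only: REPORTED_MEAN is a rounded value.
rest = implied_rest_mean(top_three, TOTAL_MODELS, REPORTED_MEAN)
print(f"Implied mean of the other {TOTAL_MODELS - len(top_three)} models: {rest:.3f}")
```

Under these assumptions the remaining six models would average roughly 0.667, which suggests a noticeable gap between the top three and the rest of the field.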

Source paper

Title: InfographicVQA
Authors: Minesh Mathew, Viraj Bagal, Rubèn Pérez Tito, Dimosthenis Karatzas, and 2 others
Published
Abstract

Infographics are documents designed to effectively communicate information using a combination of textual, graphical and visual elements. In this work, we explore the automatic understanding of infographic images by using Visual Question Answering technique. To this end, we present InfographicVQA, a new dataset that comprises a diverse collection of infographics along with natural language questions and answers annotations. The collected questions require methods to jointly reason over the document layout, textual content, graphical elements, and data visualizations. We curate the dataset with emphasis on questions that require elementary reasoning and basic arithmetic skills. Finally, we evaluate two strong baselines based on state of the art multi-modal VQA models, and establish baseline performance for the new task. The dataset, code and leaderboard will be made available at http://docvqa.org

FAQ

Common questions about the InfoVQA benchmark and leaderboard.

What is the InfoVQA benchmark?

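InfoVQA is a visual question answering benchmark built on the InfographicVQA dataset, which contains about 30,000 questions over about 5,000 infographic images. Answering the questions requires joint reasoning over document layout, textual content, graphical elements, and data visualizations, along with elementary reasoning and basic arithmetic skills.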

What is the InfoVQA leaderboard?

The InfoVQA leaderboard ranks 9 AI models based on their performance on this benchmark. Currently, Qwen2.5 VL 32B Instruct by Alibaba Cloud / Qwen Team leads with a score of 0.834. The average score across all models is 0.716.

What is the highest InfoVQA score?

The highest InfoVQA score is 0.834, achieved by Qwen2.5 VL 32B Instruct from Alibaba Cloud / Qwen Team.

How many models are evaluated on InfoVQA?

Nine models have been evaluated on the InfoVQA benchmark. All 9 results are self-reported; none have been independently verified.

Where can I find the InfoVQA paper?

The InfoVQA paper is available at https://arxiv.org/abs/2104.12756. The paper details the methodology, dataset construction, and evaluation criteria.

What categories does InfoVQA cover?

InfoVQA is categorized under multimodal and vision. The benchmark evaluates multimodal models.

What is the best open-source model on InfoVQA?

Qwen2.5 VL 32B Instruct by Alibaba Cloud / Qwen Team is the top-ranked open-source model on InfoVQA, with a score of 0.834 (rank #1).

How recent are the InfoVQA leaderboard results?

The InfoVQA leaderboard was last updated in July 2026 and currently includes 9 evaluated models.