InfoVQA

Name: InfoVQA Leaderboard — AI Model Scores
Creator: LLM Stats
License: https://llm-stats.com/legal/terms-of-service

InfoVQA Leaderboard

9 models

Rank	Model	Organization	Parameters	Context	Cost
1	Qwen2.5 VL 32B Instruct	Alibaba Cloud / Qwen Team	34B	—	—
2	Qwen2.5 VL 7B Instruct	Alibaba Cloud / Qwen Team	8B	—	—
3	DeepSeek VL2	DeepSeek	27B	—	—
4	DeepSeek VL2 Small	DeepSeek	16B	—	—
5	Phi-4-multimodal-instruct	Microsoft	6B	—	—
6	Gemma 3 27B	Google	27B	—	—
7	DeepSeek VL2 Tiny	DeepSeek	3B	—	—
8	Gemma 3 12B	Google	12B	—	—
9	Gemma 3 4B	Google	4B	—	—

About this benchmark

What is InfoVQA?

InfoVQA is a dataset of roughly 30,000 questions over 5,000 infographic images. Answering the questions requires joint reasoning over document layout, textual content, graphical elements, and data visualizations, together with elementary reasoning and arithmetic skills.

InfoVQA is a benchmark for multimodal and vision models. LLM Stats tracks 9 models on this benchmark, scored on a 0–1 scale. The current average is 0.716, with the leader at 0.834.

Current leaders

Qwen2.5 VL 32B Instruct from Alibaba Cloud / Qwen Team currently leads the InfoVQA leaderboard with a score of 0.834 across 9 evaluated AI models.

Qwen2.5 VL 32B Instruct (Alibaba Cloud / Qwen Team): 83.4%

Qwen2.5 VL 7B Instruct (Alibaba Cloud / Qwen Team): 82.6%

DeepSeek VL2 (DeepSeek): 78.1%
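
The percentages listed for the three leaders are the same 0–1 leaderboard scores scaled by 100 (for example, 0.834 becomes 83.4%). The following is a minimal Python sketch of that conversion and of the ranking. The names `TOP_SCORES`, `to_percent`, and `rank` are hypothetical and are not part of any LLM Stats API. Only the three scores stated on this page are included, since the remaining six are not listed here.

```python
# Scores for the three leaders named on this page, on the 0-1 scale.
# The other six models' scores are not given in the text, so they are omitted.
TOP_SCORES = {
    "DeepSeek VL2": 0.781,
    "Qwen2.5 VL 32B Instruct": 0.834,
    "Qwen2.5 VL 7B Instruct": 0.826,
}


def to_percent(score: float) -> str:
    """Format a 0-1 score as a percentage with one decimal place."""
    return f"{score * 100:.1f}%"


def rank(scores: dict[str, float]) -> list[tuple[str, float]]:
    """Return (model, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Print a small ranked list in the same form as the leaders section.
    for position, (model, score) in enumerate(rank(TOP_SCORES), start=1):
        print(f"{position}. {model}: {to_percent(score)}")
```

Because only three of the nine scores are available here, this sketch cannot reproduce the reported 0.716 average.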

Source paper

Title: InfographicVQA
Authors: Minesh Mathew, Viraj Bagal, Rubèn Pérez Tito, Dimosthenis Karatzas, and 2 others
Published: April 26, 2021
arXiv: 2104.12756

Abstract

Infographics are documents designed to effectively communicate information using a combination of textual, graphical and visual elements. In this work, we explore the automatic understanding of infographic images by using Visual Question Answering technique. To this end, we present InfographicVQA, a new dataset that comprises a diverse collection of infographics along with natural language questions and answers annotations. The collected questions require methods to jointly reason over the document layout, textual content, graphical elements, and data visualizations. We curate the dataset with emphasis on questions that require elementary reasoning and basic arithmetic skills. Finally, we evaluate two strong baselines based on state of the art multi-modal VQA models, and establish baseline performance for the new task. The dataset, code and leaderboard will be made available at http://docvqa.org

FAQ

Common questions about the InfoVQA benchmark and leaderboard.

What is the InfoVQA leaderboard?

The InfoVQA leaderboard ranks 9 AI models based on their performance on this benchmark. Currently, Qwen2.5 VL 32B Instruct by Alibaba Cloud / Qwen Team leads with a score of 0.834. The average score across all models is 0.716.

What is the highest InfoVQA score?

The highest InfoVQA score is 0.834, achieved by Qwen2.5 VL 32B Instruct from Alibaba Cloud / Qwen Team.

How many models are evaluated on InfoVQA?

Nine models have been evaluated on the InfoVQA benchmark. All 9 results are self-reported; none have been verified.

Where can I find the InfoVQA paper?

The InfoVQA paper is available at https://arxiv.org/abs/2104.12756. The paper details the methodology, dataset construction, and evaluation criteria.

What categories does InfoVQA cover?

InfoVQA is categorized under multimodal and vision. The benchmark evaluates multimodal models.

What is the best open-source model on InfoVQA?

Qwen2.5 VL 32B Instruct by Alibaba Cloud / Qwen Team is the top-ranked open-source model on InfoVQA, with a score of 0.834 (rank #1).

How recent are the InfoVQA leaderboard results?

The InfoVQA leaderboard was last updated in July 2026 and currently includes 9 evaluated models.