GDP.pdf

Name: GDP.pdf Leaderboard — AI Model Scores
Creator: LLM Stats
License: https://llm-stats.com/legal/terms-of-service

Progress Over Time

[Interactive timeline showing model performance evolution on GDP.pdf, with the state-of-the-art frontier plotted for open and proprietary models.]

GDP.pdf Leaderboard

5 models

Rank	Model	Organization	Score	Context	Cost (input / output, USD per 1M tokens)
1	Claude Sonnet 5	Anthropic	—	1.0M	$3.00 / $15.00
2	GPT-5.6 Sol	OpenAI	—	1.1M	$5.00 / $30.00
3	Claude Fable 5	Anthropic	—	1.0M	$10.00 / $50.00
4	GPT-5.6 Terra	OpenAI	—	1.1M	$2.50 / $15.00
5	GPT-5.6 Luna	OpenAI	—	1.1M	$1.00 / $6.00
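
As a rough illustration of how the Cost column translates into per-request spend, the sketch below assumes the two figures are USD per million input tokens and per million output tokens respectively (the FAQ confirms only the input figure), and that pricing is linear in token count. The function name `request_cost` and the example workload are hypothetical.

```python
# Prices in USD per 1M tokens as (input, output), copied from the table above.
# Assumption: the second figure is the price per 1M output tokens.
PRICES = {
    "Claude Sonnet 5": (3.00, 15.00),
    "GPT-5.6 Sol": (5.00, 30.00),
    "Claude Fable 5": (10.00, 50.00),
    "GPT-5.6 Terra": (2.50, 15.00),
    "GPT-5.6 Luna": (1.00, 6.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request, assuming linear per-token pricing."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a 200k-token document prompt and a 4k-token answer.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 200_000, 4_000):.2f}")
```

Real bills may differ if a provider applies caching discounts, long-context surcharges, or separate pricing for image inputs, none of which this page describes.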


About this benchmark

What is GDP.pdf?

GDP.pdf is a knowledge-work vision benchmark that evaluates models on economically valuable professional tasks presented as visual documents (PDFs), testing document-based reasoning, chart and table interpretation, and problem solving without tools.

GDP.pdf is categorized as a multimodal benchmark covering multimodal, reasoning, general, and vision tasks. LLM Stats tracks 5 models on this benchmark, scored on a 0–1 scale. The current average score is 0.379, and the leader scores 0.816.

Leaders can also be compared on the related leaderboards: best AI for multimodal, best AI for reasoning, best AI for general, and best AI for vision.

Current leaders

Claude Sonnet 5 from Anthropic currently leads the GDP.pdf leaderboard of 5 evaluated AI models with a score of 0.816.

Claude Sonnet 5 (Anthropic): 81.6%

GPT-5.6 Sol (OpenAI): 30.7%

Claude Fable 5 (Anthropic): 29.8%

FAQ

Common questions about the GDP.pdf benchmark and leaderboard.

What is the GDP.pdf benchmark?

What is the GDP.pdf leaderboard?

The GDP.pdf leaderboard ranks 5 AI models based on their performance on this benchmark. Currently, Claude Sonnet 5 by Anthropic leads with a score of 0.816. The average score across all models is 0.379.
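
The page lists individual scores only for the top three models, so the reported average can be used to estimate what the other two scored. The sketch below is a back-of-the-envelope check, assuming the 0.379 average is a plain arithmetic mean over all 5 models and that the three listed scores are exact; both are assumptions, and rounding in the published figures makes the result approximate. The helper name `implied_remaining_mean` is hypothetical.

```python
N_MODELS = 5
REPORTED_MEAN = 0.379  # average score stated in the FAQ

# Scores listed under "Current leaders", converted to the 0-1 scale.
KNOWN_SCORES = {
    "Claude Sonnet 5": 0.816,
    "GPT-5.6 Sol": 0.307,
    "Claude Fable 5": 0.298,
}


def implied_remaining_mean(mean: float, n: int, known: list[float]) -> float:
    """Mean of the unlisted scores implied by the overall mean and the known scores."""
    remaining_total = mean * n - sum(known)
    return remaining_total / (n - len(known))


approx = implied_remaining_mean(REPORTED_MEAN, N_MODELS, list(KNOWN_SCORES.values()))
print(f"Implied mean of the two unlisted scores: about {approx:.3f}")
```

Under these assumptions the two unlisted models average roughly 0.237, which is consistent with their ranking below the third-place score of 0.298.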

What is the highest GDP.pdf score?

The highest GDP.pdf score is 0.816, achieved by Claude Sonnet 5 from Anthropic.

How many models are evaluated on GDP.pdf?

Five models have been evaluated on the GDP.pdf benchmark. All 5 results are self-reported; none are verified.

What categories does GDP.pdf cover?

GDP.pdf is categorized under multimodal, reasoning, general, and vision. The benchmark evaluates multimodal models.

Which model offers the best value on GDP.pdf?

Among models scoring within 10% of the leader, Claude Sonnet 5 from Anthropic is the cheapest, at $3.00 per million input tokens with a score of 0.816.
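
The selection rule described here can be sketched as a filter followed by a minimum. This is an illustrative reading, not the site's actual implementation: it assumes "within 10% of the leader" means a score of at least 90% of the top score, and that "cheapest" refers to the input-token price. The `Entry` type and `best_value` function are hypothetical names. Only the three models with listed scores are included; the fourth- and fifth-ranked models score below 0.298 and so could not qualify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    model: str
    score: float        # benchmark score on the 0-1 scale
    input_price: float  # USD per 1M input tokens


def best_value(entries: list[Entry], tolerance: float = 0.10) -> Entry:
    """Return the cheapest entry whose score is within `tolerance` of the leader.

    Assumption: "within 10%" is relative, i.e. score >= leader * (1 - tolerance).
    """
    leader_score = max(e.score for e in entries)
    cutoff = leader_score * (1 - tolerance)
    eligible = [e for e in entries if e.score >= cutoff]
    return min(eligible, key=lambda e: e.input_price)


ENTRIES = [
    Entry("Claude Sonnet 5", 0.816, 3.00),
    Entry("GPT-5.6 Sol", 0.307, 5.00),
    Entry("Claude Fable 5", 0.298, 10.00),
]

print(best_value(ENTRIES).model)  # prints "Claude Sonnet 5"
```

With the current scores the cutoff is about 0.734, so the leader is the only eligible model and wins by default regardless of price.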

How recent are the GDP.pdf leaderboard results?

The GDP.pdf leaderboard was last updated in July 2026 and currently includes 5 evaluated models.