CC-OCR

Name: CC-OCR Leaderboard — AI Model Scores
Creator: LLM Stats
License: https://llm-stats.com/legal/terms-of-service

CC-OCR Leaderboard

18 models

Rank	Model / Organization	Parameters	Context	Cost (input / output, USD per 1M tokens)
1	Qwen3.6 Plus Alibaba Cloud / Qwen Team	—	1.0M	$0.50 / $3.00
2	Qwen3 VL 235B A22B Instruct Alibaba Cloud / Qwen Team	236B	—	—
3	Qwen3.6-35B-A3B Alibaba Cloud / Qwen Team	35B	—	—
4	Qwen3.5-122B-A10B Alibaba Cloud / Qwen Team	122B	—	—
5	Qwen3 VL 235B A22B Thinking Alibaba Cloud / Qwen Team	236B	—	—
6	Qwen3.6-27B Alibaba Cloud / Qwen Team	28B	262K	$0.60 / $3.60
7	Qwen3.5-27B Alibaba Cloud / Qwen Team	27B	262K	$0.30 / $2.40
8	Qwen3.5-35B-A3B Alibaba Cloud / Qwen Team	35B	—	—
8	Qwen3 VL 30B A3B Instruct Alibaba Cloud / Qwen Team	31B	—	—
10	Qwen3 VL 32B Instruct Alibaba Cloud / Qwen Team	33B	—	—
11	Qwen3 VL 8B Instruct Alibaba Cloud / Qwen Team	9B	—	—
12	Qwen2.5 VL 72B Instruct Alibaba Cloud / Qwen Team	72B	—	—
13	Qwen3 VL 30B A3B Thinking Alibaba Cloud / Qwen Team	31B	—	—
13	Qwen2.5 VL 7B Instruct Alibaba Cloud / Qwen Team	8B	—	—
15	Qwen2.5 VL 32B Instruct Alibaba Cloud / Qwen Team	34B	—	—
16	Qwen3 VL 8B Thinking Alibaba Cloud / Qwen Team	9B	262K	$0.18 / $2.09
17	Qwen3 VL 4B Instruct Alibaba Cloud / Qwen Team	4B	262K	$0.10 / $0.60
18	Qwen3 VL 4B Thinking Alibaba Cloud / Qwen Team	4B	262K	$0.10 / $1.00

About this benchmark

What is CC-OCR?

A comprehensive OCR benchmark for evaluating Large Multimodal Models (LMMs) in literacy. Comprises four OCR-centric tracks: multi-scene text reading, multilingual text reading, document parsing, and key information extraction. Contains 39 subsets with 7,058 fully annotated images, 41% sourced from real applications. Tests capabilities including text grounding, multi-orientation text recognition, and detecting hallucination/repetition across diverse visual challenges.

CC-OCR is a benchmark categorized under multimodal, structured output, text-to-image, and vision tasks. LLM Stats tracks 18 models on this benchmark, scored on a 0–1 scale. The current average is 0.796, with the leader at 0.834.

Current leaders

Qwen3.6 Plus from Alibaba Cloud / Qwen Team currently leads the CC-OCR leaderboard with a score of 0.834 across 18 evaluated AI models.

Qwen3.6 Plus, Alibaba Cloud / Qwen Team: 83.4%

Qwen3 VL 235B A22B Instruct, Alibaba Cloud / Qwen Team: 82.2%

Qwen3.6-35B-A3B, Alibaba Cloud / Qwen Team: 81.9%

Source paper

Title: CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy
Authors: Zhibo Yang, Jun Tang, Zhaohai Li, Pengfei Wang, and 8 others
Published: December 3, 2024
arXiv: 2412.02210

Abstract

Large Multimodal Models (LMMs) have demonstrated impressive performance in recognizing document images with natural language instructions. However, it remains unclear to what extent these capabilities extend to literacy tasks with rich structure and fine-grained visual challenges. The current landscape lacks a comprehensive benchmark to effectively measure the literate capabilities of LMMs. Existing benchmarks are often limited by narrow scenarios and specified tasks. To this end, we introduce CC-OCR, a comprehensive benchmark that possesses a diverse range of scenarios, tasks, and challenges. CC-OCR comprises four OCR-centric tracks: multi-scene text reading, multilingual text reading, document parsing, and key information extraction. It includes 39 subsets with 7,058 fully annotated images, of which 41% are sourced from real applications and released for the first time. We evaluate nine prominent LMMs and reveal both the strengths and weaknesses of these models, particularly in text grounding, multi-orientation, and hallucination of repetition. CC-OCR aims to comprehensively evaluate the capabilities of LMMs on OCR-centered tasks, facilitating continued progress in this crucial area.

FAQ

Common questions about the CC-OCR benchmark and leaderboard.

What is the CC-OCR benchmark?

What is the CC-OCR leaderboard?

The CC-OCR leaderboard ranks 18 AI models based on their performance on this benchmark. Currently, Qwen3.6 Plus by Alibaba Cloud / Qwen Team leads with a score of 0.834. The average score across all models is 0.796.

What is the highest CC-OCR score?

The highest CC-OCR score is 0.834, achieved by Qwen3.6 Plus from Alibaba Cloud / Qwen Team.

How many models are evaluated on CC-OCR?

18 models have been evaluated on the CC-OCR benchmark, with 0 verified results and 18 self-reported results.

Where can I find the CC-OCR paper?

The CC-OCR paper is available at https://arxiv.org/abs/2412.02210. The paper details the methodology, dataset construction, and evaluation criteria.

Where can I find the CC-OCR dataset?

The CC-OCR dataset is available at https://github.com/AlibabaResearch/AdvancedLiterateMachinery.

What categories does CC-OCR cover?

CC-OCR is categorized under multimodal, structured output, text-to-image, and vision. The benchmark evaluates multimodal models with multilingual support.

What is the best open-source model on CC-OCR?

Qwen3 VL 235B A22B Instruct by Alibaba Cloud / Qwen Team is the top-ranked open-source model on CC-OCR, with a score of 0.822 (rank #2).

Which model offers the best value on CC-OCR?

Among models scoring within 10% of the leader, Qwen3 VL 4B Instruct from Alibaba Cloud / Qwen Team is the cheapest, at $0.10 per million input tokens with a score of 0.762.
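
The selection rule in this answer can be sketched in a few lines of Python. This is a minimal illustration, not LLM Stats' actual implementation: it assumes "within 10% of the leader" means a score of at least 90% of the top score, and it uses only the four models whose scores are quoted on this page. The names `Entry`, `ENTRIES`, and `best_value` are hypothetical.

```python
from typing import Optional, TypedDict


class Entry(TypedDict):
    name: str
    score: float                  # 0-1 scale, as quoted on this page
    input_price: Optional[float]  # USD per million input tokens; None if unlisted


# Only models whose scores are quoted on this page are included.
ENTRIES: list[Entry] = [
    {"name": "Qwen3.6 Plus", "score": 0.834, "input_price": 0.50},
    {"name": "Qwen3 VL 235B A22B Instruct", "score": 0.822, "input_price": None},
    {"name": "Qwen3.6-35B-A3B", "score": 0.819, "input_price": None},
    {"name": "Qwen3 VL 4B Instruct", "score": 0.762, "input_price": 0.10},
]


def best_value(entries: list[Entry], tolerance: float = 0.10) -> Optional[Entry]:
    """Return the cheapest priced model scoring within `tolerance` of the leader.

    Assumption: "within 10%" is read as score >= leader_score * (1 - tolerance).
    """
    leader_score = max(e["score"] for e in entries)
    threshold = leader_score * (1.0 - tolerance)
    candidates = [
        e for e in entries
        if e["score"] >= threshold and e["input_price"] is not None
    ]
    if not candidates:
        return None
    # Cheapest input price first; ties go to the higher score.
    return min(candidates, key=lambda e: (e["input_price"], -e["score"]))


best = best_value(ENTRIES)
print(best["name"] if best else "no priced model within tolerance")
```

Under this reading the threshold is 0.834 × 0.9 ≈ 0.751, so a score of 0.762 qualifies. Models without a listed price are skipped because they cannot be compared on cost.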

How recent are the CC-OCR leaderboard results?

The CC-OCR leaderboard was last updated in July 2026 and currently includes 18 evaluated models.