What is the VisualWebBench leaderboard?

The VisualWebBench leaderboard ranks AI models by their score on the benchmark. It currently lists two models: Nova Pro by Amazon leads with a score of 0.797, and the average score across both models is 0.787.
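With only two models listed, the leading score and the average together imply the score of the second model. The sketch below works through that arithmetic using the figures quoted above. The function and constant names are illustrative, not part of any leaderboard API, and the result is only as precise as the rounded figures allow (roughly 0.777).

```python
# Figures quoted on the leaderboard page.
NUM_MODELS = 2
TOP_SCORE = 0.797   # Nova Pro (Amazon)
MEAN_SCORE = 0.787  # average across all listed models


def implied_other_score(top: float, mean: float, n: int = 2) -> float:
    """Return the second model's score implied by the mean.

    Hypothetical helper: only meaningful when exactly two models are
    listed, since mean = (top + other) / 2 then has a single unknown.
    """
    if n != 2:
        raise ValueError("the other score is only determined when n == 2")
    # Rearranged from mean = (top + other) / n; rounded to the
    # three decimals the leaderboard reports.
    return round(n * mean - top, 3)


other = implied_other_score(TOP_SCORE, MEAN_SCORE, NUM_MODELS)
print(f"Implied score of the second model: {other}")
```

Because the published average is itself rounded, the implied value is an estimate rather than the second model's reported score.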

What is the highest VisualWebBench score?

The highest VisualWebBench score is 0.797, achieved by Nova Pro from Amazon.

How many models are evaluated on VisualWebBench?

Two models have been evaluated on the VisualWebBench benchmark. Both results are self-reported; none is verified.

Where can I find the VisualWebBench paper?

The VisualWebBench paper is available at https://arxiv.org/abs/2404.05955. The paper details the methodology, dataset construction, and evaluation criteria.

What categories does VisualWebBench cover?

VisualWebBench is categorized under vision, frontend development, and multimodal. It evaluates multimodal models.

VisualWebBench

VisualWebBench is a multimodal benchmark designed to assess multimodal large language models (MLLMs) on web page understanding and grounding tasks. It comprises seven tasks (captioning, webpage QA, heading OCR, element OCR, element grounding, action prediction, and action grounding) with 1.5K human-curated instances from 139 real websites across 87 sub-domains.

Nova Pro from Amazon currently leads the VisualWebBench leaderboard with a score of 0.797, the higher of the two evaluated models.
