TreeBench

Name: TreeBench Leaderboard — AI Model Scores
Creator: LLM Stats
License: https://llm-stats.com/legal/terms-of-service

TreeBench Leaderboard

2 models

Rank	Model	Organization	Score	Context	Cost	License
1	Seed 2.1 Pro	ByteDance	0.711	—	—	—
1	Seed 2.1 Turbo	ByteDance	0.711	—	—	—

About this benchmark

What is TreeBench?

TreeBench evaluates visual grounded reasoning, requiring models to localize and reason about fine-grained visual details.

TreeBench is a multimodal benchmark covering reasoning, spatial reasoning, and vision tasks. LLM Stats tracks 2 models on this benchmark, scored on a 0 to 1 scale. The current average is 0.711, and the leading score is also 0.711.

Current leaders

Seed 2.1 Pro from ByteDance is currently listed first on the TreeBench leaderboard with a score of 0.711. Seed 2.1 Turbo reports the same score, so the 2 evaluated AI models are tied.

Seed 2.1 Pro (ByteDance): 71.1%

Seed 2.1 Turbo (ByteDance): 71.1%
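
The percentages above and the 0 to 1 scores quoted elsewhere on this page are the same figures. As a minimal sketch, assuming only the two self-reported scores listed here (the `scores` dictionary and the `summarize` helper are illustrative names, not part of any LLM Stats API), the average and the top score could be derived like this:

```python
from statistics import mean

# Self-reported TreeBench scores on the 0-1 scale, as listed on this page.
scores = {
    "Seed 2.1 Pro": 0.711,
    "Seed 2.1 Turbo": 0.711,
}

def summarize(scores):
    """Return (average score, top score, names of models tied at the top)."""
    top = max(scores.values())
    leaders = sorted(name for name, s in scores.items() if s == top)
    return mean(scores.values()), top, leaders

average, top, leaders = summarize(scores)
print(f"average: {average:.3f} ({average:.1%})")
print(f"top score: {top:.3f}, held by: {', '.join(leaders)}")
```

With only two models reporting identical scores, the average and the top score coincide, which is why the page quotes 0.711 for both.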

FAQ

Common questions about the TreeBench benchmark and leaderboard.

What is the TreeBench leaderboard?

The TreeBench leaderboard ranks 2 AI models based on their performance on this benchmark. Currently, Seed 2.1 Pro by ByteDance leads with a score of 0.711. The average score across all models is 0.711.

What is the highest TreeBench score?

The highest TreeBench score is 0.711, achieved by Seed 2.1 Pro from ByteDance.

How many models are evaluated on TreeBench?

Two models have been evaluated on the TreeBench benchmark. Both results are self-reported, and none has been verified.

What categories does TreeBench cover?

TreeBench is categorized under multimodal, reasoning, spatial reasoning, and vision. The benchmark evaluates multimodal models.

How recent are the TreeBench leaderboard results?

The TreeBench leaderboard was last updated in June 2026 and currently includes 2 evaluated models.