CreativeWork

Progress Over Time

Interactive timeline showing model performance evolution on CreativeWork. Legend: state-of-the-art frontier, open models, proprietary models.

CreativeWork Leaderboard

2 models
Columns: Context, Cost, License
1. ByteDance
2. ByteDance
About this benchmark

What is CreativeWork?

CreativeWork evaluates agents on open-ended creative production tasks within realistic tool and application environments, measuring the quality and completeness of generated deliverables.

CreativeWork is a multimodal benchmark categorized under reasoning, general, and agent tasks. LLM Stats tracks 2 models on this benchmark, scored on a 0–1 scale. The current average is 0.385, with the leader at 0.425.

Compare leaders on the related leaderboards: best AI for reasoning, best AI for general tasks, and best AI for agents.

Current leaders

Seed 2.1 Pro from ByteDance currently leads the CreativeWork leaderboard with a score of 0.425, the higher of the 2 evaluated AI models.

1. Seed 2.1 Pro (ByteDance): 42.5%
2. Seed 2.1 Turbo (ByteDance): 34.5%

FAQ

Common questions about the CreativeWork benchmark and leaderboard.

What is the CreativeWork benchmark?

CreativeWork evaluates agents on open-ended creative production tasks within realistic tool and application environments, measuring the quality and completeness of generated deliverables.

What is the CreativeWork leaderboard?

The CreativeWork leaderboard ranks 2 AI models based on their performance on this benchmark. Currently, Seed 2.1 Pro by ByteDance leads with a score of 0.425. The average score across all models is 0.385.
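The leader and average quoted here can be reproduced from the two listed scores. The sketch below is only an illustration of that arithmetic: it assumes the percentages shown on the leaderboard (42.5% and 34.5%) map directly onto the 0–1 scale, and the names `scores_percent` and `summarize` are hypothetical, not part of any LLM Stats API.

```python
from statistics import mean

# Scores as listed on the leaderboard, given there as percentages.
# The dictionary name and layout are assumptions made for this example.
scores_percent = {
    "Seed 2.1 Pro": 42.5,
    "Seed 2.1 Turbo": 34.5,
}


def summarize(scores_percent):
    """Convert percentages to the 0-1 scale and return (leader, top score, average)."""
    scores = {name: pct / 100 for name, pct in scores_percent.items()}
    leader = max(scores, key=scores.get)  # model with the highest score
    top = round(scores[leader], 3)
    average = round(mean(scores.values()), 3)
    return leader, top, average


leader, top, average = summarize(scores_percent)
print(leader, top, average)  # Seed 2.1 Pro 0.425 0.385
```

With only two entries, the average is simply the midpoint of the two scores, so it will shift noticeably as soon as a third model is added.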

What is the highest CreativeWork score?

The highest CreativeWork score is 0.425, achieved by Seed 2.1 Pro from ByteDance.

How many models are evaluated on CreativeWork?

Two models have been evaluated on the CreativeWork benchmark. Both results are self-reported, and neither has been verified.

What categories does CreativeWork cover?

CreativeWork is categorized under reasoning, general, and agents. The benchmark evaluates multimodal models.

How recent are the CreativeWork leaderboard results?

The CreativeWork leaderboard was last updated in June 2026 and currently includes 2 evaluated models.