CreativeWork
CreativeWork Leaderboard
| Rank | Model | Organization | Context | Cost | License |
|---|---|---|---|---|---|
| 1 | Seed 2.1 Pro | ByteDance | — | — | — |
| 2 | | ByteDance | — | — | — |
What is CreativeWork?
CreativeWork evaluates agents on open-ended creative production tasks within realistic tool and application environments, measuring the quality and completeness of generated deliverables.
CreativeWork is a multimodal benchmark that covers reasoning, general, and agent tasks. LLM Stats tracks 2 models on this benchmark, scored on a 0–1 scale. The current average is about 0.4, with the leader at 0.425.
Compare leaders on the best AI for reasoning, best AI for general, and best AI for agents leaderboards.
Current leaders
Seed 2.1 Pro from ByteDance currently leads the CreativeWork leaderboard with a score of 0.425 among the 2 evaluated models.