Workspace Bench
Workspace Bench Leaderboard
| Rank | Model | Organization | Context | Cost | License |
|---|---|---|---|---|---|
| 1 | Seed 2.1 Turbo | ByteDance | — | — | — |
| 2 | Seed 2.1 Pro | ByteDance | — | — | — |
What is Workspace Bench?
Workspace Bench evaluates AI agents on high-economic-value workplace tasks that span multi-step planning, file processing, and tool use across realistic office and productivity workflows.
Workspace Bench is a text benchmark that evaluates models on reasoning, general, and agentic tasks. LLM Stats tracks 2 models on this benchmark, scored on a 0–1 scale. The current average is about 0.5, and the leader scores 0.547.
Compare leaders on the related leaderboards: best AI for reasoning, best AI for general tasks, and best AI for agents.
Current leaders
Seed 2.1 Turbo from ByteDance currently leads the Workspace Bench leaderboard with a score of 0.547 among the 2 evaluated AI models.
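As a rough illustration of how summary figures like the model count, average, and leader could be derived from per-model scores on a 0–1 scale, here is a minimal Python sketch. The `summarize` function and the "Model B (hypothetical)" entry are assumptions made for this example only. The page does not publish the second model's score, so the 0.453 value is a placeholder chosen so that the mean rounds to 0.5. Only the 0.547 leader score comes from the text above.

```python
from statistics import mean


def summarize(scores: dict[str, float]) -> dict:
    """Summarize a leaderboard whose scores lie on a 0-1 scale."""
    if not scores:
        raise ValueError("no scores to summarize")
    for name, score in scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name!r} is outside the 0-1 scale")

    # The leader is the model with the highest score.
    leader = max(scores, key=scores.get)
    return {
        "models": len(scores),
        # Rounded to one decimal, matching how the page reports the average.
        "average": round(mean(scores.values()), 1),
        "leader": leader,
        "leader_score": scores[leader],
    }


scores = {
    "Seed 2.1 Turbo": 0.547,          # stated in the leaderboard text
    "Model B (hypothetical)": 0.453,  # PLACEHOLDER, not a published score
}

summary = summarize(scores)
print(summary)
```

With only two entries, the rounded average hides most of the gap between models, so the per-model scores are the more informative figures.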