GameWorld
GameWorld Leaderboard
| Rank | Model | Organization | Context | Cost | License |
|---|---|---|---|---|---|
| 1 | Seed 2.1 Pro | ByteDance | — | — | — |
| 2 | | ByteDance | — | — | — |
What is GameWorld?
GameWorld evaluates agents on interactive game environments, testing perception, planning, and sequential decision-making to accomplish in-game objectives.
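The perceive, plan, act loop described above can be illustrated with a minimal sketch. Everything in it is hypothetical: `GridGame`, `greedy_agent`, and `run_episode` are invented for illustration and are not part of GameWorld or its evaluation harness, and the normalized 0 to 1 score is only an assumption about how progress toward an objective might be measured.

```python
from dataclasses import dataclass


@dataclass
class GridGame:
    """Hypothetical toy environment: reach the last cell of a 1-D track."""
    size: int = 5
    position: int = 0
    steps: int = 0
    max_steps: int = 10

    def observe(self) -> dict:
        # Perception: the state the agent is allowed to see at this step.
        return {"position": self.position, "goal": self.size - 1}

    def step(self, action: str) -> bool:
        # Apply the action, then report whether the episode has ended.
        if action == "right":
            self.position = min(self.position + 1, self.size - 1)
        elif action == "left":
            self.position = max(self.position - 1, 0)
        self.steps += 1
        return self.position == self.size - 1 or self.steps >= self.max_steps

    def score(self) -> float:
        # Assumed metric: normalized progress toward the goal, in [0, 1].
        return self.position / (self.size - 1)


def greedy_agent(observation: dict) -> str:
    # Planning: choose the action that moves toward the goal.
    return "right" if observation["position"] < observation["goal"] else "left"


def run_episode(env: GridGame, agent) -> float:
    # Sequential decision-making: observe, act, repeat until the episode ends.
    done = False
    while not done:
        done = env.step(agent(env.observe()))
    return env.score()


if __name__ == "__main__":
    print(run_episode(GridGame(), greedy_agent))
```

An agent that never moves toward the goal runs until `max_steps` and ends with a score of 0, which is the kind of gap between agents that a per-episode score is meant to expose.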
GameWorld is a multimodal benchmark that covers multimodal, reasoning, and agent tasks. LLM Stats tracks 2 models on this benchmark, scored on a 0–1 scale. The current average score is about 0.3, and the leader scores 0.312.
Compare leaders on the related leaderboards: best AI for multimodal, best AI for reasoning, and best AI for agents.
Current leaders
Seed 2.1 Pro from ByteDance currently leads the GameWorld leaderboard with a score of 0.312, out of 2 evaluated models.