Web Bench
Web Bench Leaderboard
2 models
| Rank | Model | Organization | Context | Cost | License |
|---|---|---|---|---|---|
| 1 | Seed 2.1 Pro | ByteDance | — | — | — |
| 2 | | ByteDance | — | — | — |
What is Web Bench?
Web Bench evaluates agents on realistic web-development engineering tasks, measuring end-to-end implementation in browser-based workflows.
Web Bench is a text-based benchmark covering agent and coding tasks. LLM Stats tracks 2 models on it, scored on a 0–1 scale. Rounded to one decimal place, both the current average and the leading score (0.784) come to 0.8.
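As a rough illustration of how summary figures like these can be derived, the sketch below computes a leader and a rounded average from per-model scores on a 0–1 scale. The function name `summarize` and the second model's name and score are hypothetical placeholders, not published data; only the 0.784 figure for Seed 2.1 Pro comes from this page. It is not a description of how LLM Stats actually computes its numbers.

```python
from statistics import mean


def summarize(scores: dict[str, float], digits: int = 1) -> dict:
    """Return the leader and rounded summary figures for 0-1 scale scores."""
    if not scores:
        raise ValueError("at least one score is required")
    for name, score in scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name!r} is outside the 0-1 scale")

    # The leader is the model with the highest raw (unrounded) score.
    leader = max(scores, key=scores.get)
    return {
        "leader": leader,
        "leader_score": round(scores[leader], digits),
        "average": round(mean(scores.values()), digits),
    }


# 0.784 is the score reported on this page; the second entry is a
# hypothetical placeholder, because the page does not give that score.
example = {"Seed 2.1 Pro": 0.784, "hypothetical-model-b": 0.776}
summary = summarize(example)
print(summary)
```

With one-decimal rounding, a leading score of 0.784 and an average near it both display as 0.8, which is why the two summary figures can look identical even when the underlying scores differ.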
Current leaders
Seed 2.1 Pro from ByteDance currently leads the Web Bench leaderboard with a score of 0.784, the higher of the 2 evaluated models.
FAQ
Common questions about the Web Bench benchmark and leaderboard.