Agent Startup Bench
Agent Startup Bench Leaderboard
| Rank | Model | Organization | Context | Cost | License |
|---|---|---|---|---|---|
| 1 | Seed 2.1 Pro | ByteDance | — | — | — |
| 2 | | ByteDance | — | — | — |
What is Agent Startup Bench?
Agent Startup Bench measures AI agents on high-economic-value, startup-style tasks that require autonomous planning and execution to deliver practical, verifiable results.
Agent Startup Bench is a text benchmark that evaluates models on reasoning, general, and agent tasks. LLM Stats tracks 2 models on this benchmark, scored on a 0–1 scale. The current average is about 0.6, and the leader scores about 0.7.
Compare leaders on the best AI for reasoning, best AI for general tasks, and best AI for agents leaderboards.
Current leaders
Seed 2.1 Pro from ByteDance currently leads the Agent Startup Bench leaderboard with a score of 0.688, the highest among the 2 evaluated AI models.