MCP Atlas
MCP Atlas Leaderboard
| Rank | Organization | Parameters | Context | Cost | License | |
|---|---|---|---|---|---|---|
| 1 | ByteDance | — | — | — | ||
| 2 | Google | — | 1.0M | $1.50 / $9.00 | ||
| 3 | Anthropic | — | 1.0M | $5.00 / $25.00 | ||
| 4 | ByteDance | — | — | — | ||
| 5 | Anthropic | — | 1.0M | $5.00 / $25.00 | ||
| 6 | Zhipu AI | 753B | 1.0M | $0.95 / $3.00 | ||
| 7 | Alibaba Cloud / Qwen Team | — | 1.0M | $1.25 / $3.75 | ||
| 8 | Moonshot AI | 1.0T | 262K | $0.74 / $3.50 | ||
| 9 | OpenAI | — | 1.1M | $5.00 / $30.00 | ||
| 10 | MiniMax | — | 1.0M | $0.30 / $1.20 | ||
| 11 | Alibaba Cloud / Qwen Team | — | 1.0M | $0.50 / $3.00 | ||
| 12 | DeepSeek | 1.6T | 1.0M | $1.60 / $3.20 | ||
| 13 | Alibaba Cloud / Qwen Team | — | 1.0M | $0.32 / $1.28 | ||
| 14 | Zhipu AI | 754B | 200K | $1.40 / $4.40 | ||
| 15 | Google | — | 1.0M | $2.50 / $15.00 | ||
| 16 | DeepSeek | 284B | 1.0M | $0.10 / $0.20 | ||
| 17 | Zhipu AI | 744B | 200K | $1.00 / $3.20 | ||
| 18 | OpenAI | — | 1.0M | $2.50 / $15.00 | ||
| 19 | Alibaba Cloud / Qwen Team | 35B | — | — | ||
| 20 | Anthropic | — | 1.0M | $5.00 / $25.00 | ||
| 21 | Anthropic | — | — | — | ||
| 22 | Anthropic | — | 200K | $3.00 / $15.00 | ||
| 23 | OpenAI | — | 400K | $1.75 / $14.00 | ||
| 24 | OpenAI | — | 400K | $0.75 / $4.50 | ||
| 25 | Google | — | 1.0M | $0.50 / $3.00 | ||
| 26 | OpenAI | — | 400K | $0.20 / $1.25 | ||
| 27 | Amazon | — | 1.0M | $0.30 / $2.50 | ||
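To compare rows by context window or price, the table cells need to be turned into numbers. The following is a minimal sketch of one way to do that, not part of the leaderboard itself. It assumes the cell order rank, organization, parameter count, context, cost; that `K`, `M`, `B`, and `T` mean thousand, million, billion, and trillion; and that the cost cell holds two dollar figures separated by a slash (the page does not label them, though input and output prices are a plausible reading). The names `parse_size`, `parse_cost`, `parse_row`, and `Row` are hypothetical helpers.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumed meaning of the size suffixes used in the table.
SUFFIXES = {"K": 1e3, "M": 1e6, "B": 1e9, "T": 1e12}
MISSING = "—"  # the table's marker for an unknown value


def parse_size(text: str) -> Optional[float]:
    """Turn a cell such as '262K' or '1.0T' into a number, or None if missing."""
    text = text.strip()
    if not text or text == MISSING:
        return None
    return float(text[:-1]) * SUFFIXES[text[-1]]


def parse_cost(text: str) -> Optional[Tuple[float, float]]:
    """Turn '$0.74 / $3.50' into (0.74, 3.5), or None if missing."""
    text = text.strip()
    if not text or text == MISSING:
        return None
    first, second = (float(part.strip().lstrip("$")) for part in text.split("/"))
    return first, second


@dataclass
class Row:
    rank: int
    organization: str
    parameters: Optional[float]
    context: Optional[float]
    cost: Optional[Tuple[float, float]]


def parse_row(line: str) -> Row:
    """Parse one pipe-delimited table row; trailing empty cells are ignored."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    return Row(
        rank=int(cells[0]),
        organization=cells[1],
        parameters=parse_size(cells[2]),
        context=parse_size(cells[3]),
        cost=parse_cost(cells[4]),
    )
```

For example, `parse_row("| 8 | Moonshot AI | 1.0T | 262K | $0.74 / $3.50 | ||")` gives a `Row` whose `context` is `262000.0` and whose `cost` is `(0.74, 3.5)`. Rows with missing values keep `None` in those fields, so they can be filtered out before sorting.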
What is MCP Atlas?
MCP Atlas is a text benchmark for evaluating AI models on tool use at scale: it measures how well a model coordinates multiple tools across complex multi-step tasks, covering reasoning, agents, code, and tool calling. LLM Stats tracks 27 models on this benchmark, scored on a 0–1 scale. The current average is about 0.7, and the leader scores 0.838.
Current leaders
Seed 2.1 Pro from ByteDance currently leads the MCP Atlas leaderboard with a score of 0.838, the highest among the 27 evaluated models.