Agents' Last Exam

Name: Agents' Last Exam Leaderboard — AI Model Scores
Creator: LLM Stats
License: https://llm-stats.com/legal/terms-of-service

Progress Over Time

[Interactive timeline: model performance on Agents' Last Exam over time. Legend: state-of-the-art frontier, open, proprietary.]

Agents' Last Exam Leaderboard

1 model

Rank	Model	Organization	Score	Context	Cost	License
1	Seed 2.1 Pro	ByteDance	0.414	—	—	—


About this benchmark

What is Agents' Last Exam?

Agents' Last Exam is a challenging benchmark for AI agents on hard, long-horizon tasks that test sustained reasoning, planning, and tool use, reported with and without tool access.

Agents' Last Exam is a text benchmark that evaluates models on reasoning, agent, and tool-calling tasks. LLM Stats tracks 1 model on this benchmark, scored on a 0–1 scale. With a single entry, the current average and the leading score are both 0.414.
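The summary figures above (model count, leader, average, and the percentage form of a 0–1 score) follow from simple arithmetic. The sketch below is illustrative only: the `scores` dictionary and the `summarize` helper are hypothetical names, not part of any LLM Stats API, and the single entry is the score reported on this page.

```python
from statistics import mean

# Hypothetical in-memory copy of the leaderboard.
# The only entry is the score reported on this page (0-1 scale).
scores = {"Seed 2.1 Pro": 0.414}


def summarize(scores):
    """Return the model count, leader, average, and leader score as a percentage."""
    leader = max(scores, key=scores.get)  # model with the highest score
    return {
        "models": len(scores),
        "leader": leader,
        "leader_score": scores[leader],
        "average": mean(scores.values()),
        # A 0-1 score is shown as a percentage by multiplying by 100.
        "leader_percent": f"{scores[leader] * 100:.1f}%",
    }


summary = summarize(scores)
print(summary)
```

With only one model evaluated, the average necessarily equals the leader's score, which is why both figures coincide on this page.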

Compare leaders on the best AI for reasoning, best AI for agents and best AI for tool calling leaderboards.

Current leaders

Seed 2.1 Pro from ByteDance currently leads the Agents' Last Exam leaderboard with a score of 0.414; it is the only model evaluated so far.

Seed 2.1 Pro (ByteDance): 41.4%

FAQ

Common questions about the Agents' Last Exam benchmark and leaderboard.

What is the Agents' Last Exam benchmark?

Agents' Last Exam is a challenging benchmark for AI agents on hard, long-horizon tasks that test sustained reasoning, planning, and tool use, reported with and without tool access.

What is the Agents' Last Exam leaderboard?

The Agents' Last Exam leaderboard ranks AI models by their performance on this benchmark. It currently lists 1 model: Seed 2.1 Pro by ByteDance, which leads with a score of 0.414. Because it is the only entry, the average score is also 0.414.

What is the highest Agents' Last Exam score?

The highest Agents' Last Exam score is 0.414, achieved by Seed 2.1 Pro from ByteDance.

How many models are evaluated on Agents' Last Exam?

1 model has been evaluated on the Agents' Last Exam benchmark, with 0 verified results and 1 self-reported result.

What categories does Agents' Last Exam cover?

Agents' Last Exam is categorized under reasoning, agents, and tool calling. It evaluates text models.

How recent are the Agents' Last Exam leaderboard results?

The Agents' Last Exam leaderboard was last updated in June 2026 and currently includes 1 evaluated model.