SWE-Atlas

Progress Over Time

[Interactive timeline: model performance on SWE-Atlas over time, showing the state-of-the-art frontier for open and proprietary models.]

SWE-Atlas Leaderboard

2 models
About this benchmark

What is SWE-Atlas?

SWE-Atlas is a software engineering benchmark focused on debugging, evaluating a model's ability to localize and fix bugs in real-world codebases.

SWE-Atlas is a text benchmark that evaluates models on agent and coding tasks. LLM Stats tracks 2 models on this benchmark, scored on a 0–1 scale. The current average is 0.329, with the leader at 0.352.


Current leaders

Seed 2.1 Pro from ByteDance currently leads the SWE-Atlas leaderboard with a score of 0.352 across 2 evaluated AI models.

1. Seed 2.1 Pro (ByteDance): 35.2%
2. Seed 2.1 Turbo (ByteDance): 30.6%

FAQ

Common questions about the SWE-Atlas benchmark and leaderboard.

What is the SWE-Atlas benchmark?

SWE-Atlas is a software engineering benchmark focused on debugging, evaluating a model's ability to localize and fix bugs in real-world codebases.

What is the SWE-Atlas leaderboard?

The SWE-Atlas leaderboard ranks 2 AI models based on their performance on this benchmark. Currently, Seed 2.1 Pro by ByteDance leads with a score of 0.352. The average score across all models is 0.329.
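As a quick check on these figures, the leader and the average can be recomputed from the two listed scores. This is a minimal sketch, assuming the scores are the 35.2% and 30.6% shown for the two models, expressed on the 0–1 scale; the variable names are illustrative and not part of any LLM Stats API.

```python
from statistics import mean

# Scores as listed on the leaderboard, on the 0-1 scale
# (assumed here: 35.2% -> 0.352, 30.6% -> 0.306).
scores = {
    "Seed 2.1 Pro": 0.352,
    "Seed 2.1 Turbo": 0.306,
}

# The leader is the model with the highest score.
leader = max(scores, key=scores.get)

# The average is the arithmetic mean, rounded to three decimals
# to match the precision used on the page.
average = round(mean(list(scores.values())), 3)

print(leader, scores[leader])  # Seed 2.1 Pro 0.352
print(average)                 # 0.329
```

With only two models, the average sits exactly halfway between the two scores.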

What is the highest SWE-Atlas score?

The highest SWE-Atlas score is 0.352, achieved by Seed 2.1 Pro from ByteDance.

How many models are evaluated on SWE-Atlas?

Two models have been evaluated on the SWE-Atlas benchmark. Both results are self-reported; neither has been verified.

What categories does SWE-Atlas cover?

SWE-Atlas is categorized under agents and coding. The benchmark evaluates text models.

How recent are the SWE-Atlas leaderboard results?

The SWE-Atlas leaderboard was last updated in June 2026 and currently includes 2 evaluated models.