Artifacts Bench

Name: Artifacts Bench Leaderboard — AI Model Scores
Creator: LLM Stats
License: https://llm-stats.com/legal/terms-of-service

Progress Over Time

[Interactive timeline: model performance on Artifacts Bench over time. Legend: state-of-the-art frontier, open models, proprietary models.]

Artifacts Bench Leaderboard

3 models

Rank	Model	Score	Context	Cost
1	Seed 2.1 Pro (ByteDance)	0.510	—	—
2	Seed 2.1 Turbo (ByteDance)	0.470	—	—
3	MAI-Code-1-Flash (Microsoft)	0.364	—	—

About this benchmark

What is Artifacts Bench?

Artifacts Bench evaluates a model's ability to generate visual code artifacts, measuring the quality of generated interactive and visual front-end outputs from natural-language requests.

Artifacts Bench is a text benchmark that evaluates models on frontend development and code tasks. LLM Stats tracks 3 models on this benchmark, scored on a 0–1 scale. The current average is 0.448, and the leader scores 0.510.

Compare leaders on the best AI for frontend development and best AI for code leaderboards.

Current leaders

Seed 2.1 Pro from ByteDance currently leads the Artifacts Bench leaderboard with a score of 0.510, the highest among the 3 evaluated models.

Seed 2.1 Pro (ByteDance): 51.0%

Seed 2.1 Turbo (ByteDance): 47.0%

MAI-Code-1-Flash (Microsoft): 36.4%

FAQ

Common questions about the Artifacts Bench benchmark and leaderboard.

What is the Artifacts Bench leaderboard?

The Artifacts Bench leaderboard ranks 3 AI models based on their performance on this benchmark. Currently, Seed 2.1 Pro by ByteDance leads with a score of 0.510. The average score across all models is 0.448.
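The leader and the 0.448 average can be checked directly from the three scores listed on this page. The following is a minimal sketch, not LLM Stats' own code: the `scores` dictionary and the `summarize` helper are illustrative names, and the average is assumed to be a plain arithmetic mean rounded to three decimals.

```python
from statistics import mean

# Scores as listed on this page, on the 0-1 scale.
# The dictionary name and layout are assumptions made for this example.
scores = {
    "Seed 2.1 Pro": 0.510,
    "Seed 2.1 Turbo": 0.470,
    "MAI-Code-1-Flash": 0.364,
}


def summarize(scores):
    """Return (leader name, leader score, mean score rounded to 3 places)."""
    leader = max(scores, key=scores.get)
    average = round(mean(scores.values()), 3)
    return leader, scores[leader], average


leader, top_score, average = summarize(scores)
print(f"{leader} leads with {top_score:.3f}; average {average:.3f}")
# prints "Seed 2.1 Pro leads with 0.510; average 0.448"
```

Under the plain-mean assumption, (0.510 + 0.470 + 0.364) / 3 = 0.448, which matches the average stated above.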

What is the highest Artifacts Bench score?

The highest Artifacts Bench score is 0.510, achieved by Seed 2.1 Pro from ByteDance.

How many models are evaluated on Artifacts Bench?

Three models have been evaluated on the Artifacts Bench benchmark. All 3 results are self-reported; none are verified.

What categories does Artifacts Bench cover?

Artifacts Bench is categorized under frontend development and code. The benchmark evaluates text models.

How recent are the Artifacts Bench leaderboard results?

The Artifacts Bench leaderboard was last updated in July 2026 and currently includes 3 evaluated models.