CC-Bench-V2 Frontend

CC-Bench-V2 Frontend evaluates coding agents on frontend development tasks, measuring ability to build UI components, handle styling, and implement client-side logic.

GLM-5V-Turbo from Zhipu AI currently leads the CC-Bench-V2 Frontend leaderboard with a score of 0.684 (68.4%). It is the only model evaluated so far.

Progress Over Time

[Interactive timeline: model performance on CC-Bench-V2 Frontend over time, showing the state-of-the-art frontier for open and proprietary models]

CC-Bench-V2 Frontend Leaderboard

1 model

Rank	Model	Organization	Score	Context	Cost	License
1	GLM-5V-Turbo	Zhipu AI	0.684	—	—	—

FAQ

Common questions about CC-Bench-V2 Frontend.

What is the CC-Bench-V2 Frontend benchmark?

CC-Bench-V2 Frontend evaluates coding agents on frontend development tasks, measuring ability to build UI components, handle styling, and implement client-side logic.

What is the CC-Bench-V2 Frontend leaderboard?

The CC-Bench-V2 Frontend leaderboard ranks AI models by their performance on this benchmark. It currently lists one model, GLM-5V-Turbo by Zhipu AI, with a score of 0.684. With a single entry, the average score across all models is also 0.684.

What is the highest CC-Bench-V2 Frontend score?

The highest CC-Bench-V2 Frontend score is 0.684, achieved by GLM-5V-Turbo from Zhipu AI.

How many models are evaluated on CC-Bench-V2 Frontend?

One model has been evaluated on the CC-Bench-V2 Frontend benchmark, with 0 verified results and 1 self-reported result.

What categories does CC-Bench-V2 Frontend cover?

CC-Bench-V2 Frontend is categorized under coding. The benchmark evaluates text models.

More evaluations to explore

Related benchmarks in the same category

Claw-Eval

Claw-Eval tests real-world agentic task completion across complex multi-step scenarios, evaluating a model's ability to use tools, navigate environments, and complete end-to-end tasks autonomously.

coding

7 models

NL2Repo

NL2Repo evaluates long-horizon coding capabilities including repository-level understanding, where models must generate or modify code across entire repositories from natural language specifications.

coding

5 models

PinchBench

PinchBench evaluates coding agents on real-world agentic coding tasks, measuring both best-case and average performance across complex software engineering scenarios.

coding

3 models

SkillsBench

SkillsBench evaluates coding agents on self-contained programming tasks, measuring practical engineering skills across diverse software development scenarios.

coding

3 models

ZClawBench

ZClawBench evaluates Claw-style agent task execution quality, measuring a model's ability to autonomously complete complex multi-step coding tasks in real-world environments.

coding

3 models

CC-Bench-V2 Backend

CC-Bench-V2 Backend evaluates coding agents on backend development tasks, measuring practical engineering ability to implement server-side logic, APIs, and system components.

coding

1 model