CC-Bench-V2 Frontend
CC-Bench-V2 Frontend evaluates coding agents on frontend development tasks, measuring ability to build UI components, handle styling, and implement client-side logic.
GLM-5V-Turbo from Zhipu AI currently leads the CC-Bench-V2 Frontend leaderboard with a score of 0.684 (68.4%); it is the only model evaluated so far.
CC-Bench-V2 Frontend Leaderboard
| Rank | Model | Organization | Score | Context | Cost | License |
|---|---|---|---|---|---|---|
| 1 | GLM-5V-Turbo | Zhipu AI | 0.684 | — | — | — |
More evaluations to explore
Related benchmarks in the same category
Claw-Eval tests real-world agentic task completion across complex multi-step scenarios, evaluating a model's ability to use tools, navigate environments, and complete end-to-end tasks autonomously.
NL2Repo evaluates long-horizon coding capabilities including repository-level understanding, where models must generate or modify code across entire repositories from natural language specifications.
PinchBench evaluates coding agents on real-world agentic coding tasks, measuring both best-case and average performance across complex software engineering scenarios.
SkillsBench evaluates coding agents on self-contained programming tasks, measuring practical engineering skills across diverse software development scenarios.
ZClawBench evaluates Claw-style agent task execution quality, measuring a model's ability to autonomously complete complex multi-step coding tasks in real-world environments.
CC-Bench-V2 Backend evaluates coding agents on backend development tasks, measuring practical engineering ability to implement server-side logic, APIs, and system components.