Kernel Bench L3

Name: Kernel Bench L3 Leaderboard — AI Model Scores
Creator: LLM Stats
License: https://llm-stats.com/legal/terms-of-service

Progress Over Time

[Interactive timeline: model performance on Kernel Bench L3 over time, with the state-of-the-art frontier marked for open and proprietary models.]

Kernel Bench L3 Leaderboard

1 model

				Context	Cost	License
1	Qwen3.7 Max Alibaba Cloud / Qwen Team		—	1.0M	$1.25 / $3.75

About this benchmark

What is Kernel Bench L3?

Kernel Bench L3 evaluates agentic GPU kernel optimization across 50 problems. Qwen reports two metrics for this benchmark: median per-problem speedup over the PyTorch eager reference and the fraction of problems faster than torch.compile.
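As a minimal sketch of how the two reported metrics could be computed from per-problem timings: the function name `summarize`, the tuple layout, and the timing values below are hypothetical and are not taken from Qwen's evaluation harness. The sketch also assumes that "faster" means strictly lower runtime than torch.compile.

```python
from statistics import median


def summarize(timings):
    """Compute both metrics from per-problem runtimes.

    timings: list of (eager_ms, compile_ms, kernel_ms) tuples, one per problem.
    This layout is an assumption made for illustration.
    """
    # Speedup of the generated kernel over the PyTorch eager reference.
    speedups = [eager / kernel for eager, _, kernel in timings]
    # Count problems where the generated kernel strictly beats torch.compile.
    faster = sum(1 for _, compiled, kernel in timings if kernel < compiled)
    return median(speedups), faster / len(timings)


# Hypothetical runtimes in milliseconds, not real benchmark data.
timings = [
    (10.0, 6.0, 5.0),
    (8.0, 4.0, 4.0),
    (12.0, 9.0, 3.0),
    (5.0, 2.0, 2.5),
]

median_speedup, fraction_faster = summarize(timings)
print(median_speedup, fraction_faster)
```

The median is used rather than the mean so that one very large speedup on a single problem does not dominate the summary.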

Kernel Bench L3 is a text benchmark that evaluates models on agents, coding, and systems tasks. LLM Stats tracks 1 model on this benchmark, scored on a 0–1 scale. With a single entry, the current average and the leading score are both 0.960.

Current leaders

Qwen3.7 Max from Alibaba Cloud / Qwen Team currently leads the Kernel Bench L3 leaderboard with a score of 0.960. It is the only model evaluated so far.

Qwen3.7 Max (Alibaba Cloud / Qwen Team): 96.0%

FAQ

Common questions about the Kernel Bench L3 benchmark and leaderboard.

What is the Kernel Bench L3 leaderboard?

The Kernel Bench L3 leaderboard ranks AI models by their performance on this benchmark. It currently lists 1 model: Qwen3.7 Max by Alibaba Cloud / Qwen Team, which leads with a score of 0.960. Because it is the only entry, the average score across all models is also 0.960.

What is the highest Kernel Bench L3 score?

The highest Kernel Bench L3 score is 0.960, achieved by Qwen3.7 Max from Alibaba Cloud / Qwen Team.

How many models are evaluated on Kernel Bench L3?

One model has been evaluated on the Kernel Bench L3 benchmark, with 0 verified results and 1 self-reported result.

What categories does Kernel Bench L3 cover?

Kernel Bench L3 is categorized under agents, coding, and systems. The benchmark evaluates text models.

Which model offers the best value on Kernel Bench L3?

Among models scoring within 10% of the leader, Qwen3.7 Max from Alibaba Cloud / Qwen Team is the cheapest, at $1.25 per million input tokens with a score of 0.960.
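The selection rule described above can be sketched as follows. This is an illustration under stated assumptions, not LLM Stats' actual implementation: the function name `best_value`, the dictionary keys, and the reading of "within 10% of the leader" as a score of at least 90% of the top score are all hypothetical. The single entry uses the figures from this page.

```python
def best_value(models, tolerance=0.10):
    """Return the cheapest model whose score is within `tolerance` of the leader.

    Assumes "within 10%" means score >= leader_score * (1 - tolerance).
    """
    leader_score = max(m["score"] for m in models)
    threshold = leader_score * (1 - tolerance)
    eligible = [m for m in models if m["score"] >= threshold]
    # Rank eligible models by price per million input tokens.
    return min(eligible, key=lambda m: m["input_price"])


# The only entry currently on the leaderboard, as listed on this page.
models = [
    {"name": "Qwen3.7 Max", "score": 0.960, "input_price": 1.25},
]

pick = best_value(models)
print(pick["name"], pick["input_price"])
```

With only one model listed, the leader is trivially also the best-value pick; the rule becomes informative once more models are added.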

How recent are the Kernel Bench L3 leaderboard results?

The Kernel Bench L3 leaderboard was last updated in July 2026 and currently includes 1 evaluated model.