Kimi Code Bench v2

Name: Kimi Code Bench v2 Leaderboard — AI Model Scores
Creator: LLM Stats
License: https://llm-stats.com/legal/terms-of-service

Progress Over Time

[Interactive timeline: model performance on Kimi Code Bench v2 over time, showing the state-of-the-art frontier and distinguishing open from proprietary models.]

Kimi Code Bench v2 Leaderboard

2 models

Rank	Model	Organization	Score	Parameters	Context	Cost (input / output per 1M tokens)	License
1	Kimi K3	Moonshot AI	0.729	2.8T	1.0M	$3.00 / $15.00	
2	Kimi K2.7 Code	Moonshot AI	0.620	1.0T	262K	$0.74 / $3.50	


About this benchmark

What is Kimi Code Bench v2?

Kimi Code Bench v2 is Moonshot AI's in-house benchmark for evaluating coding agents on realistic software engineering tasks across 10+ mainstream programming languages and a production tech stack spanning backend services, infrastructure, performance engineering, systems programming, security, frontend development, and ML/data engineering.

Kimi Code Bench v2 is a text benchmark that evaluates models on agent and coding tasks. LLM Stats tracks 2 models on this benchmark, scored on a 0–1 scale. The current average is 0.674, with the leader at 0.729.

Compare leaders on the best AI for agents and best AI for code leaderboards.

Current leaders

Of the 2 AI models evaluated, Kimi K3 from Moonshot AI currently leads the Kimi Code Bench v2 leaderboard with a score of 0.729.

Kimi K3 (Moonshot AI): 72.9%

Kimi K2.7 Code (Moonshot AI): 62.0%

FAQ

Common questions about the Kimi Code Bench v2 benchmark and leaderboard.

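What is the Kimi Code Bench v2 benchmark?

Kimi Code Bench v2 is Moonshot AI's in-house benchmark for evaluating coding agents on realistic software engineering tasks across 10+ mainstream programming languages.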

What is the Kimi Code Bench v2 leaderboard?

The Kimi Code Bench v2 leaderboard ranks 2 AI models based on their performance on this benchmark. Currently, Kimi K3 by Moonshot AI leads with a score of 0.729. The average score across all models is 0.674.
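The leader and average quoted here can be checked against the two published scores. The following is a minimal sketch, not LLM Stats' actual computation: the variable names are illustrative, and truncating to three decimals is an assumption that happens to reproduce the 0.674 shown (the exact mean of the two scores is 0.6745).

```python
import math

# Scores as listed on this page (self-reported, 0-1 scale).
scores = {
    "Kimi K3": 0.729,
    "Kimi K2.7 Code": 0.620,
}

# Leader: the model with the highest score.
leader = max(scores, key=scores.get)

# Arithmetic mean over all listed models.
mean_score = sum(scores.values()) / len(scores)

# Assumption: the page truncates (rather than rounds) to three decimals.
truncated_mean = math.floor(mean_score * 1000) / 1000

print(leader, scores[leader])
print(f"mean = {mean_score:.4f}, truncated = {truncated_mean}")
```

With only two models on the board, the average sits exactly halfway between the two scores, so it will shift noticeably as soon as a third result is added.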

What is the highest Kimi Code Bench v2 score?

The highest Kimi Code Bench v2 score is 0.729, achieved by Kimi K3 from Moonshot AI.

How many models are evaluated on Kimi Code Bench v2?

Two models have been evaluated on the Kimi Code Bench v2 benchmark. Both results are self-reported; none is verified.

What categories does Kimi Code Bench v2 cover?

Kimi Code Bench v2 is categorized under agents and code. The benchmark evaluates text models with multilingual support.

What is the best open-source model on Kimi Code Bench v2?

Kimi K3 by Moonshot AI is the top-ranked open-source model on Kimi Code Bench v2, with a score of 0.729 (rank #1).

Which model offers the best value on Kimi Code Bench v2?

Among models scoring within 10% of the leader, Kimi K3 from Moonshot AI is the cheapest, at $3.00 per million input tokens with a score of 0.729.
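The selection rule in this answer can be sketched as a filter followed by a minimum. This is an illustration under stated assumptions rather than the site's own logic: "within 10% of the leader" is read here as a relative cutoff (at least 90% of the top score), cost is compared on input-token price only, and the function name `best_value` is hypothetical.

```python
# Figures as listed on this page: score (0-1) and USD per 1M input tokens.
models = [
    {"name": "Kimi K3", "score": 0.729, "input_cost": 3.00},
    {"name": "Kimi K2.7 Code", "score": 0.620, "input_cost": 0.74},
]

def best_value(models, tolerance=0.10):
    """Cheapest model whose score is at least (1 - tolerance) * leader score."""
    top = max(m["score"] for m in models)
    cutoff = top * (1 - tolerance)  # assumed relative reading of "within 10%"
    eligible = [m for m in models if m["score"] >= cutoff]
    return min(eligible, key=lambda m: m["input_cost"])

winner = best_value(models)
print(winner["name"], winner["input_cost"])
```

Kimi K2.7 Code is the cheaper model, but its score of 0.620 falls below the cutoff of about 0.656, so Kimi K3 is the only eligible candidate. Reading the rule as an absolute margin of 0.10 instead gives a cutoff of 0.629 and the same result.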

How recent are the Kimi Code Bench v2 leaderboard results?

The Kimi Code Bench v2 leaderboard was last updated in July 2026 and currently includes 2 evaluated models.