
MLE-Bench Lite

MLE-Bench Lite evaluates AI agents on machine learning engineering tasks, testing their ability to build, train, and optimize ML models for Kaggle-style competitions in a lightweight evaluation format.

Progress Over Time

[Interactive timeline showing model performance evolution on MLE-Bench Lite, plotting open and proprietary models against the state-of-the-art frontier]

MLE-Bench Lite Leaderboard

The leaderboard currently lists 1 model: MiniMax M2.7 by MiniMax, with a self-reported score of 0.666. Context, cost, and license details are not available for this entry.

FAQ

Common questions about MLE-Bench Lite

The MLE-Bench Lite leaderboard currently ranks 1 AI model based on its performance on this benchmark. MiniMax M2.7 by MiniMax leads with a score of 0.666, which, as the only entry, is also the average score across all models.
The highest MLE-Bench Lite score is 0.666, achieved by MiniMax M2.7 from MiniMax.
1 model has been evaluated on the MLE-Bench Lite benchmark, with 0 verified results and 1 self-reported result.
MLE-Bench Lite is categorized under agents and coding. The benchmark evaluates text models.