MMStar
MMStar is an elite vision-indispensable multimodal benchmark of 1,500 challenge samples, each selected by human reviewers, that evaluates 6 core capabilities across 18 detailed axes. The benchmark addresses two problems in existing multimodal evaluations: questions that can be answered without the visual content, and unintentional data leakage.
Qwen3.6 Plus (Alibaba Cloud / Qwen Team) currently leads the MMStar leaderboard with a score of 0.833, the highest among the 22 AI models evaluated.
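As a rough illustration of how a fractional score such as 0.833 could arise, the sketch below assumes the score is plain accuracy (correct answers divided by total samples) and that each graded sample carries a capability label. The record layout, the function name accuracy_report, and the capability names are hypothetical and are not taken from the MMStar tooling.

```python
from collections import defaultdict


def accuracy_report(records):
    """Return (overall_accuracy, per_capability_accuracy).

    `records` is a list of (capability, is_correct) pairs. This layout is
    an assumption made for illustration, not the benchmark's real format.
    """
    if not records:
        raise ValueError("records must not be empty")

    totals = defaultdict(int)
    correct = defaultdict(int)
    for capability, is_correct in records:
        totals[capability] += 1
        correct[capability] += int(is_correct)

    # Accuracy within each capability, then pooled over all samples.
    per_capability = {c: correct[c] / totals[c] for c in totals}
    overall = sum(correct.values()) / sum(totals.values())
    return overall, per_capability


# Toy data with placeholder capability names.
records = [
    ("capability_a", True),
    ("capability_a", False),
    ("capability_b", True),
    ("capability_b", True),
]
overall, per_capability = accuracy_report(records)
print(f"overall={overall:.3f}", per_capability)
```

Under this assumption, the overall figure pools all samples, so it equals the mean of the per-capability figures only when every capability has the same number of samples.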