Humanity's Last Exam
Humanity's Last Exam (HLE) is a multi-modal academic benchmark of 2,500 questions spanning mathematics, the humanities, and the natural sciences. It is designed to test LLM capabilities at the frontier of human knowledge, using questions with unambiguous, verifiable solutions.
Claude Mythos Preview from Anthropic currently leads the Humanity's Last Exam leaderboard with a score of 0.647, ranking first among 77 evaluated AI models.