HealthBench Professional
HealthBench Professional Leaderboard
| Rank | Organization | Parameters | Context | Cost | License |
|---|---|---|---|---|---|
| 1 | Anthropic | — | — | — | |
| 2 | Anthropic | — | 1.0M | $5.00 / $25.00 | |
| 3 | OpenAI | — | 400K | $5.00 / $30.00 | |
| 4 | Microsoft | 1.0T | — | — | |
What is HealthBench Professional?
HealthBench Professional evaluates model capability and safety for clinician use cases using real clinician-style chats and physician-authored grading rubrics.
HealthBench Professional is a text-based benchmark that evaluates models on healthcare tasks. LLM Stats tracks 4 models on this benchmark and reports scores on a 0–1 scale. The current average is about 0.5, and the leading score rounds to 0.7.
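To make the idea of rubric grading on a 0–1 scale concrete, here is a minimal sketch of one way a weighted rubric could be turned into a score. It is an assumption for illustration, not the benchmark's confirmed scoring method: the `Criterion` class, the `rubric_score` function, the point values, and the example criteria are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    """One rubric line; every field here is hypothetical, for illustration only."""
    description: str
    points: float  # positive for desired behavior, negative for harmful behavior
    met: bool      # the grader's judgment for this response


def rubric_score(criteria: list[Criterion]) -> float:
    """Earned points divided by the maximum achievable points, clipped to [0, 1]."""
    max_points = sum(c.points for c in criteria if c.points > 0)
    if max_points == 0:
        return 0.0
    earned = sum(c.points for c in criteria if c.met)
    return min(1.0, max(0.0, earned / max_points))


# Made-up rubric for a single response (not real benchmark data).
example = [
    Criterion("States the recommended first-line option", 5, True),
    Criterion("Mentions when to escalate care", 3, False),
    Criterion("Recommends a contraindicated drug", -4, False),
]
print(rubric_score(example))  # 5 / 8 = 0.625
```

Under this assumed scheme, a model's benchmark score would be the mean of such per-response scores, and negative criteria can pull a response down to 0 but never below it.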
Current leaders
Claude Fable 5 from Anthropic currently leads the HealthBench Professional leaderboard with a score of 0.660 across 4 evaluated AI models.
FAQ
Common questions about the HealthBench Professional benchmark and leaderboard.