HealthBench Professional

HealthBench Professional Leaderboard

4 models

Rank  Context  Cost             License
1     -        -                -
2     1.0M     $5.00 / $25.00   -
3     400K     $5.00 / $30.00   -
4     1.0T     -                -

About this benchmark

What is HealthBench Professional?

HealthBench Professional evaluates model capability and safety for clinician use cases using real clinician-style chats and physician-authored grading rubrics.

HealthBench Professional is a text benchmark that evaluates models on healthcare tasks. LLM Stats tracks 4 models on this benchmark, scored on a 0–1 scale. The current average is 0.488, with the leader at 0.660.


Current leaders

Claude Fable 5 from Anthropic currently leads the HealthBench Professional leaderboard of 4 evaluated AI models with a score of 0.660.

1. Claude Fable 5 (Anthropic): 66.0%
2. Claude Opus 4.8 (Anthropic): 55.8%
3. GPT-5.5 Instant (OpenAI): 38.4%

FAQ

Common questions about the HealthBench Professional benchmark and leaderboard.

What is the HealthBench Professional benchmark?

HealthBench Professional evaluates model capability and safety for clinician use cases using real clinician-style chats and physician-authored grading rubrics.

What is the HealthBench Professional leaderboard?

The HealthBench Professional leaderboard ranks 4 AI models based on their performance on this benchmark. Currently, Claude Fable 5 by Anthropic leads with a score of 0.660. The average score across all models is 0.488.
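
As a rough cross-check of these figures, the sketch below converts the percentages from the top-three listing to the 0–1 scale and works out what the reported 0.488 average would imply for the fourth model, whose score this page does not list. The variable and function names are illustrative, and the implied value is only an estimate because the reported average is itself rounded.

```python
# Scores as shown in the top-three listing on this page (percent form).
listed_percent = {
    "Claude Fable 5": 66.0,
    "Claude Opus 4.8": 55.8,
    "GPT-5.5 Instant": 38.4,
}


def to_unit_scale(percent: float) -> float:
    """Convert a percentage score to the 0-1 scale used in the prose."""
    return round(percent / 100.0, 3)


unit_scores = {name: to_unit_scale(p) for name, p in listed_percent.items()}

# The leader is the model with the highest score.
leader = max(unit_scores, key=unit_scores.get)

# Figures reported on the page: 4 models with a mean of 0.488.
REPORTED_MEAN = 0.488
MODEL_COUNT = 4

# The fourth score is NOT listed on the page. This is only the value the
# reported mean would imply, and it is approximate because the mean is rounded.
implied_fourth = round(
    REPORTED_MEAN * MODEL_COUNT - sum(unit_scores.values()), 3
)

print(f"Leader: {leader} ({unit_scores[leader]:.3f})")
print(f"Implied fourth score (approximate): {implied_fourth:.3f}")
```

Under these assumptions the three listed scores sum to 1.602, so the reported mean is consistent with a fourth score of roughly 0.35.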

What is the highest HealthBench Professional score?

The highest HealthBench Professional score is 0.660, achieved by Claude Fable 5 from Anthropic.

How many models are evaluated on HealthBench Professional?

Four models have been evaluated on the HealthBench Professional benchmark. All 4 results are self-reported; none have been verified.

Where can I find the HealthBench Professional paper?

The HealthBench Professional paper is available at https://cdn.openai.com/dd128428-0184-4e25-b155-3a7686c7d744/HealthBench-Professional.pdf. The paper details the methodology, dataset construction, and evaluation criteria.

What categories does HealthBench Professional cover?

HealthBench Professional is categorized under healthcare and evaluates text models.

What's the difference between HealthBench Professional and HealthBench?

HealthBench Professional is a variant of HealthBench. See the HealthBench leaderboard for the broader benchmark and per-model comparison.

How recent are the HealthBench Professional leaderboard results?

The HealthBench Professional leaderboard was last updated in June 2026 and currently includes 4 evaluated models.