HealthBench Professional

Name: HealthBench Professional Leaderboard — AI Model Scores
Creator: LLM Stats
License: https://llm-stats.com/legal/terms-of-service

Progress Over Time

[Interactive timeline: model performance on HealthBench Professional over time, with the state-of-the-art frontier shown for open and proprietary models.]

HealthBench Professional Leaderboard

4 models

Rank	Model	Organization		Context	Cost
1	Claude Fable 5	Anthropic	—	—	—
2	Claude Opus 4.8	Anthropic	—	1.0M	$5.00 / $25.00
3	GPT-5.5 Instant	OpenAI	—	400K	$5.00 / $30.00
4	MAI-Thinking-1	Microsoft	1.0T	—	—

About this benchmark

What is HealthBench Professional?

HealthBench Professional evaluates model capability and safety for clinician use cases using real clinician-style chats and physician-authored grading rubrics.

HealthBench Professional is a text benchmark that evaluates models on healthcare tasks. LLM Stats tracks 4 models on this benchmark, scored on a 0–1 scale. The current average score is 0.488, and the leader scores 0.660.

Current leaders

Claude Fable 5 from Anthropic currently leads the HealthBench Professional leaderboard with a score of 0.660 across 4 evaluated AI models.

Claude Fable 5 (Anthropic): 66.0%

Claude Opus 4.8 (Anthropic): 55.8%

GPT-5.5 Instant (OpenAI): 38.4%

FAQ

Common questions about the HealthBench Professional benchmark and leaderboard.

What is the HealthBench Professional benchmark?

HealthBench Professional evaluates model capability and safety for clinician use cases using real clinician-style chats and physician-authored grading rubrics.

What is the HealthBench Professional leaderboard?

The HealthBench Professional leaderboard ranks 4 AI models based on their performance on this benchmark. Currently, Claude Fable 5 by Anthropic leads with a score of 0.660. The average score across all models is 0.488.
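The figures quoted on this page can be cross-checked with a little arithmetic. The sketch below is illustrative only: it converts the 0–1 scores to the percentages shown under "Current leaders" and, assuming the stated average of 0.488 is a plain mean over all 4 models, backs out the one score the page does not display. The variable and function names are hypothetical, and the implied value is an inference from rounded figures, not a reported result.

```python
# Scores as listed on this page (0-1 scale). The fourth model's score
# (MAI-Thinking-1) is not displayed, so it is deliberately left out here.
reported_scores = {
    "Claude Fable 5": 0.660,
    "Claude Opus 4.8": 0.558,
    "GPT-5.5 Instant": 0.384,
}
reported_mean = 0.488  # average across all models, as stated on the page
model_count = 4        # number of models on the leaderboard


def as_percent(score: float) -> str:
    """Format a 0-1 score as a percentage with one decimal place."""
    return f"{score:.1%}"


def implied_missing_score(known_scores, mean: float, count: int) -> float:
    """Back out a single missing score, assuming `mean` is an unweighted
    average over `count` models (an assumption, not stated by the source)."""
    return round(mean * count - sum(known_scores), 3)


percentages = {name: as_percent(s) for name, s in reported_scores.items()}
implied = implied_missing_score(reported_scores.values(), reported_mean, model_count)

for name, pct in percentages.items():
    print(f"{name}: {pct}")
print(f"Implied fourth score (inferred, not reported): {implied:.3f}")
```

Because the average is rounded to three decimals, the implied score is only accurate to about ±0.002. A value below 0.384 would be consistent with that model appearing in fourth place.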

What is the highest HealthBench Professional score?

The highest HealthBench Professional score is 0.660, achieved by Claude Fable 5 from Anthropic.

How many models are evaluated on HealthBench Professional?

Four models have been evaluated on the HealthBench Professional benchmark. All 4 results are self-reported; none have been verified.

Where can I find the HealthBench Professional paper?

The HealthBench Professional paper is available at https://cdn.openai.com/dd128428-0184-4e25-b155-3a7686c7d744/HealthBench-Professional.pdf. The paper details the methodology, dataset construction, and evaluation criteria.

What categories does HealthBench Professional cover?

HealthBench Professional is categorized under healthcare and evaluates text models.

What's the difference between HealthBench Professional and HealthBench?

HealthBench Professional is a variant of HealthBench. See the HealthBench leaderboard for the broader benchmark and per-model comparison.

How recent are the HealthBench Professional leaderboard results?

The HealthBench Professional leaderboard was last updated in June 2026 and currently includes 4 evaluated models.