BenchCAD

Progress Over Time

[Interactive timeline: model performance on BenchCAD over time, with the state-of-the-art frontier shown for open and proprietary models.]

BenchCAD Leaderboard

1 model

Rank  Organization  Context  Cost (input / output, per 1M tokens)  License
1     Anthropic     1.0M     $3.00 / $15.00
About this benchmark

What is BenchCAD?

BenchCAD is a benchmark for programmatic CAD reasoning built from 17,900 execution-verified CadQuery programs spanning 106 industrial part families, roughly half anchored to real ISO, DIN, EN, ASME, and IEC specification tables. It decomposes CAD capability into matched tasks; the Vision2Code task requires models to generate CadQuery code from multi-view renders, scored by voxel IoU against the reference geometry.
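As a rough illustration of the Vision2Code metric, voxel IoU is the number of voxels occupied by both shapes divided by the number occupied by either. The sketch below is not BenchCAD's scoring code: it assumes both shapes have already been voxelized onto the same aligned grid, represents each as a set of integer (i, j, k) coordinates, and treats two empty shapes as a perfect match. The helper names `box_voxels` and `voxel_iou` are hypothetical, and the benchmark's grid resolution and alignment procedure are not stated here.

```python
from itertools import product


def box_voxels(x0, x1, y0, y1, z0, z1):
    """Return the set of voxel coordinates inside an axis-aligned box.

    Ranges are half-open: x0 <= i < x1, and likewise for y and z.
    """
    return set(product(range(x0, x1), range(y0, y1), range(z0, z1)))


def voxel_iou(pred, ref):
    """Intersection-over-union of two sets of occupied voxel coordinates."""
    union = pred | ref
    if not union:
        # Assumed convention: two empty shapes count as identical.
        return 1.0
    return len(pred & ref) / len(union)


# Toy example: the reference is a 4x4x4 cube and the prediction is the
# same cube shifted by one voxel along x.
ref = box_voxels(2, 6, 2, 6, 2, 6)
pred = box_voxels(3, 7, 2, 6, 2, 6)
score = voxel_iou(pred, ref)  # 48 shared voxels out of 80 in the union
```

In this toy case a one-voxel shift of a 4-voxel-wide cube already lowers the score to 48/80, which shows how sensitive the metric is to small placement errors at coarse resolution.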

BenchCAD is a multimodal benchmark covering reasoning, code, and vision tasks. LLM Stats tracks 1 model on this benchmark, scored on a 0–1 scale. With a single entry, the average and the leading score are both 0.373.

Current leaders

Claude Sonnet 5 from Anthropic currently leads the BenchCAD leaderboard with a score of 0.373. It is the only model evaluated so far.

1. Claude Sonnet 5 (Anthropic): 37.3%

FAQ

Common questions about the BenchCAD benchmark and leaderboard.

What is the BenchCAD leaderboard?

The BenchCAD leaderboard ranks AI models by their performance on this benchmark. It currently lists 1 model, Claude Sonnet 5 by Anthropic, which leads with a score of 0.373. With only one entry, the average score is also 0.373.

What is the highest BenchCAD score?

The highest BenchCAD score is 0.373, achieved by Claude Sonnet 5 from Anthropic.

How many models are evaluated on BenchCAD?

1 model has been evaluated on the BenchCAD benchmark, with 0 verified results and 1 self-reported result.

What categories does BenchCAD cover?

BenchCAD is categorized under multimodal, reasoning, code, and vision. The benchmark evaluates multimodal models.

Which model offers the best value on BenchCAD?

Among models scoring within 10% of the leader, Claude Sonnet 5 from Anthropic is the cheapest, at $3.00 per million input tokens with a score of 0.373.
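The selection rule described above can be sketched as a filter followed by a minimum. This is only one reading of "within 10% of the leader" (a score of at least 90% of the top score), and the `Entry` type and `best_value` function are hypothetical names introduced for illustration, not part of any LLM Stats API. The single data row is the one given on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    model: str
    score: float        # BenchCAD score on the 0-1 scale
    input_price: float  # USD per million input tokens


def best_value(entries, tolerance=0.10):
    """Return the cheapest entry whose score is within `tolerance` of the leader.

    Assumption: "within 10%" is relative, i.e. score >= top * (1 - tolerance).
    """
    if not entries:
        return None
    top = max(e.score for e in entries)
    close = [e for e in entries if e.score >= top * (1 - tolerance)]
    return min(close, key=lambda e: e.input_price)


# The only row currently on the leaderboard.
leaderboard = [Entry("Claude Sonnet 5", 0.373, 3.00)]
pick = best_value(leaderboard)
```

With one model on the board the rule is trivially satisfied by the leader itself; it only becomes informative once several models score close to the top.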

How recent are the BenchCAD leaderboard results?

The BenchCAD leaderboard was last updated in June 2026 and currently includes 1 evaluated model.