BenchCAD
BenchCAD Leaderboard
| Rank | Provider | | Context | Cost | License |
|---|---|---|---|---|---|
| 1 | Anthropic | — | 1.0M | $3.00 / $15.00 | |
What is BenchCAD?
BenchCAD is a benchmark for programmatic CAD reasoning built from 17,900 execution-verified CadQuery programs spanning 106 industrial part families, roughly half anchored to real ISO, DIN, EN, ASME, and IEC specification tables. It decomposes CAD capability into matched tasks; the Vision2Code task requires models to generate CadQuery code from multi-view renders, scored by voxel IoU against the reference geometry.
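To make the voxel IoU metric concrete, here is a minimal sketch of how intersection-over-union could be computed between two occupancy grids. It assumes both shapes have already been voxelized onto the same grid and aligned. The function name `voxel_iou`, the grid resolution, and the handling of two empty grids are illustrative assumptions, not details taken from BenchCAD's scoring code.

```python
import numpy as np


def voxel_iou(pred: np.ndarray, ref: np.ndarray) -> float:
    """Intersection-over-union of two boolean occupancy grids of equal shape.

    Hypothetical helper: BenchCAD's actual voxelization, alignment, and
    resolution are not specified here.
    """
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    if pred.shape != ref.shape:
        raise ValueError("occupancy grids must share the same shape")

    union = np.logical_or(pred, ref).sum()
    if union == 0:
        # Both grids are empty; treating this as a perfect match is an
        # assumed convention.
        return 1.0
    intersection = np.logical_and(pred, ref).sum()
    return float(intersection / union)


# Toy example on a 4x4x4 grid: two slabs of 32 voxels each that share 16.
ref = np.zeros((4, 4, 4), dtype=bool)
ref[0:2, :, :] = True
pred = np.zeros((4, 4, 4), dtype=bool)
pred[1:3, :, :] = True

score = voxel_iou(pred, ref)  # 16 shared voxels / 48 voxels in the union
```

A score of 1.0 means the generated geometry fills exactly the same voxels as the reference, and 0.0 means no overlap at all, which matches the 0–1 scale used on the leaderboard.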
BenchCAD is a multimodal benchmark covering reasoning, code, and vision tasks. LLM Stats tracks 1 model on this benchmark, scored on a 0–1 scale. With a single entry, the average and the leading score are the same: 0.373, displayed as 0.4 after rounding.
Compare leaders on the best AI for multimodal, best AI for reasoning, best AI for code and best AI for vision leaderboards.
Current leaders
Claude Sonnet 5 from Anthropic currently leads the BenchCAD leaderboard with a score of 0.373; it is the only model evaluated so far.