
SAT Math

SAT Math is a benchmark from AGIEval containing standardized mathematics questions from the College Board SAT examination, designed to evaluate the mathematical reasoning capabilities of foundation models using human-centric assessment methods.
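Scoring on a multiple-choice benchmark like this reduces to accuracy: the model's selected option is compared to the answer key, and the score is the fraction of correct answers. A minimal sketch of that computation (the function name and sample data are illustrative, not taken from the AGIEval harness):

```python
# Minimal sketch of multiple-choice accuracy scoring, as used for
# benchmarks like SAT Math. Predictions and keys are option letters.
def accuracy(predictions, answers):
    """Fraction of questions where the predicted letter matches the key."""
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)

preds = ["A", "C", "B", "D"]   # hypothetical model outputs
keys  = ["A", "C", "D", "D"]   # hypothetical answer key
print(f"{accuracy(preds, keys):.3f}")  # 3 of 4 correct -> 0.750
```

A leaderboard score such as 0.890 is this ratio computed over the full question set.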

Paper: https://arxiv.org/abs/2304.06364

Progress Over Time

[Interactive timeline showing model performance evolution on SAT Math]


SAT Math Leaderboard

1 model

Rank  Model  Organization  Context  Cost (input / output)  License
1     GPT-4  OpenAI        33K      $30.00 / $60.00        —

FAQ

Common questions about SAT Math

What is SAT Math?
SAT Math is a benchmark from AGIEval containing standardized mathematics questions from the College Board SAT examination, designed to evaluate the mathematical reasoning capabilities of foundation models using human-centric assessment methods.

Where can I find the SAT Math paper?
The SAT Math paper is available at https://arxiv.org/abs/2304.06364. It provides detailed information about the benchmark methodology, dataset creation, and evaluation criteria.

Which model leads the SAT Math leaderboard?
The leaderboard ranks 1 AI model by performance on this benchmark. GPT-4 by OpenAI currently leads with a score of 0.890; since it is the only model evaluated, the average score across all models is also 0.890.

What is the highest SAT Math score?
The highest SAT Math score is 0.890, achieved by GPT-4 from OpenAI.

How many models have been evaluated?
1 model has been evaluated on the SAT Math benchmark, with 0 verified results and 1 self-reported result.

What categories does SAT Math fall under?
SAT Math is categorized under math and reasoning. The benchmark evaluates text models.