Global PIQA

Global PIQA is a multilingual commonsense reasoning benchmark that evaluates physical interaction knowledge across 100 languages and cultures. It tests how well AI systems understand the physical world in diverse cultural contexts, using multiple-choice questions about everyday situations that require physical commonsense.

Progress Over Time

[Figure: timeline of model performance on Global PIQA. Legend: state-of-the-art frontier, open, proprietary.]

Global PIQA Leaderboard

2 models • 0 verified
Context, Cost, License
1
2

FAQ

Common questions about Global PIQA

The Global PIQA leaderboard ranks two AI models by their score on this benchmark. Gemini 3 Pro by Google currently leads with a score of 0.934, and the average score across both models is 0.931.
The highest Global PIQA score is 0.934, achieved by Gemini 3 Pro from Google.
Two models have been evaluated on the Global PIQA benchmark. Both results are self-reported, and neither has been verified.
Global PIQA is categorized under general, physics, and reasoning. The benchmark evaluates text models with multilingual support.
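
The leaderboard figures quoted above (a top score of 0.934 and an average of 0.931 over two models) are enough to estimate the second model's score. The sketch below assumes the reported average is a plain arithmetic mean of the two scores; the function name `implied_remaining_score` is hypothetical and used only for illustration. Because both published numbers are rounded to three decimals, the result is an approximation, not a reported value.

```python
from decimal import Decimal


def implied_remaining_score(mean, known_scores, n_models):
    """Return the one unknown score implied by an arithmetic mean.

    Assumes `mean` is the plain average of `n_models` scores and that
    all but one of those scores are listed in `known_scores`.
    """
    if len(known_scores) != n_models - 1:
        raise ValueError("exactly one score must be unknown")
    # mean = (sum of all scores) / n_models, so solve for the missing score.
    return mean * n_models - sum(known_scores)


# Values taken from the leaderboard text; Decimal avoids float rounding noise.
top_score = Decimal("0.934")
average = Decimal("0.931")

second_score = implied_remaining_score(average, [top_score], n_models=2)
print(second_score)  # 0.928
```

Under that assumption the second entry sits at roughly 0.928, within about 0.001 either way once rounding of the published figures is taken into account.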