Global PIQA
Global PIQA is a multilingual commonsense reasoning benchmark that evaluates physical interaction knowledge across 100 languages and cultures. It tests AI systems' understanding of physical world knowledge in diverse cultural contexts through multiple choice questions about everyday situations requiring physical commonsense.
Global PIQA Leaderboard
2 models • 0 verified
| Rank | Organization | Context | Cost | License |
|---|---|---|---|---|
| 1 | Google | — | — | — |
| 2 | Google | — | — | — |
FAQ
Common questions about Global PIQA
The Global PIQA leaderboard ranks two AI models by their scores on this benchmark. Gemini 3 Pro by Google currently leads with a score of 0.934, and the average score across both models is 0.931.
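Because only two models are listed, the top score and the average together imply the second model's score. The following is a minimal sketch of that arithmetic in Python; the variable names are illustrative, and the result is only approximate because the published figures are rounded to three decimals.

```python
# Figures quoted on the leaderboard: the top score and the mean over both models.
top_score = 0.934
mean_score = 0.931
num_models = 2

# With two models, mean = (top + runner_up) / 2, so solve for runner_up.
runner_up = num_models * mean_score - top_score

print(f"Implied runner-up score: {runner_up:.3f}")
```

This gives an implied second-place score of about 0.928. Since both inputs are rounded, the true value could differ slightly in the third decimal place.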
The highest Global PIQA score is 0.934, achieved by Gemini 3 Pro from Google.
Two models have been evaluated on the Global PIQA benchmark. Both results are self-reported; none are verified.
Global PIQA is categorized under general, physics, and reasoning. The benchmark evaluates text models with multilingual support.