Global PIQA
Global PIQA Leaderboard
| Rank | Organization | Parameters | Context | Cost | License |
|---|---|---|---|---|---|
| 1 | Google | — | — | — | |
| 2 | Google | — | 1.0M | $0.50 / $3.00 | |
| 3 | Alibaba Cloud / Qwen Team | — | 1.0M | $1.25 / $3.75 | |
| 4 | Alibaba Cloud / Qwen Team | — | 1.0M | $0.32 / $1.28 | |
| 5 | Alibaba Cloud / Qwen Team | — | 1.0M | $0.50 / $3.00 | |
| 5 | Alibaba Cloud / Qwen Team | 397B | — | — | |
| 7 | Alibaba Cloud / Qwen Team | 122B | — | — | |
| 8 | Alibaba Cloud / Qwen Team | 27B | 262K | $0.30 / $2.40 | |
| 9 | Alibaba Cloud / Qwen Team | 35B | — | — | |
| 10 | Alibaba Cloud / Qwen Team | 9B | — | — | |
| 11 | Alibaba Cloud / Qwen Team | 4B | — | — | |
| 12 | Alibaba Cloud / Qwen Team | 2B | — | — | |
| 13 | Alibaba Cloud / Qwen Team | 800M | — | — | |
What is Global PIQA?
Global PIQA is a multilingual commonsense reasoning benchmark that evaluates physical interaction knowledge across 100 languages and cultures. It tests how well AI systems understand the physical world in diverse cultural contexts, using multiple-choice questions about everyday situations that require physical commonsense.
Global PIQA is a text-only benchmark listed under the physics, reasoning, and general categories. LLM Stats tracks 13 models on it, scored on a 0–1 scale. The current average is about 0.8, and the leading score is 0.934.
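As a rough illustration of how a score on a 0–1 scale can come out of multiple-choice questions, the sketch below computes plain accuracy: the fraction of questions where the model picked the correct option. It assumes the leaderboard score is accuracy, which this page does not state explicitly. The `Item` class, the `accuracy` function, and the sample questions are hypothetical and are not taken from the benchmark data.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """One multiple-choice question (hypothetical structure, not the official schema)."""
    prompt: str
    choices: list[str]
    answer: int  # index of the correct choice


def accuracy(items: list[Item], predictions: list[int]) -> float:
    """Return the fraction of items answered correctly, a value between 0 and 1."""
    if len(items) != len(predictions):
        raise ValueError("need exactly one prediction per item")
    if not items:
        return 0.0
    correct = sum(1 for item, pred in zip(items, predictions) if pred == item.answer)
    return correct / len(items)


# Made-up sample questions, for illustration only.
ITEMS = [
    Item("To keep soup warm for longer,", ["cover the pot with a lid", "leave the pot uncovered"], 0),
    Item("To dry wet laundry faster,", ["fold it tightly", "hang it in moving air"], 1),
    Item("To stop a door from slamming,", ["wedge a stopper under it", "oil the hinges"], 0),
    Item("To carry hot tea without burning your hand,", ["hold the cup by its rim", "hold the cup by its handle"], 1),
]

# A hypothetical model that answers three of the four questions correctly.
PREDICTIONS = [0, 1, 1, 1]
SCORE = accuracy(ITEMS, PREDICTIONS)
print(f"score: {SCORE:.3f}")
```

Under this assumption, a leaderboard entry would be the same ratio computed over the full question set, and the reported average would be the mean of the per-model scores.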
Compare the leaders on the related leaderboards: best AI for physics, best AI for reasoning, and best AI for general tasks.
Current leaders
Among the 13 evaluated models, Gemini 3 Pro from Google currently leads the Global PIQA leaderboard with a score of 0.934.
FAQ
Common questions about the Global PIQA benchmark and leaderboard.