NQ
NQ Leaderboard
| Rank | Model | Parameters | Score | Context | Cost | License |
|---|---|---|---|---|---|---|
| 1 | Granite 3.3 8B Base | 8B | 0.365 | — | — | |
What is NQ?
Natural Questions (NQ) is a benchmark of real user questions issued to Google Search, with answers drawn from Wikipedia. It is designed for training and evaluating automatic question answering systems.
NQ is a text benchmark that evaluates models on reasoning, search, and general tasks. LLM Stats tracks 1 model on this benchmark, scored on a 0–1 scale. With a single entry, the average and the leading score are the same: 0.365, displayed as 0.4 when rounded to one decimal place.
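The summary figures can be reproduced from the single leaderboard entry: with one model, the mean and the top score coincide, and 0.365 rounds to 0.4 at one decimal place. A minimal sketch, assuming the scores are held in a plain dictionary (the layout and variable names are illustrative, not LLM Stats' actual data model):

```python
from statistics import mean

# Scores as reported on this page (0-1 scale).
# The dictionary layout is an assumption made for illustration.
scores = {"Granite 3.3 8B Base": 0.365}

# The leader is the model with the highest score.
leader = max(scores, key=scores.get)

# The average is the arithmetic mean over all tracked models.
average = mean(list(scores.values()))

# Summary text on the page appears to use one decimal place.
rounded_average = f"{average:.1f}"
rounded_leader = f"{scores[leader]:.1f}"

print(leader, scores[leader], rounded_average, rounded_leader)
```

With more than one model in `scores`, the average and the leading score would generally differ.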
Current leaders
Granite 3.3 8B Base from IBM currently leads the NQ leaderboard with a score of 0.365. It is the only model evaluated so far.