NQ

NQ Leaderboard

1 model
About this benchmark

What is NQ?

Natural Questions (NQ) is a benchmark of real user questions issued to Google Search, with answers drawn from Wikipedia. It is designed for training and evaluating automatic question answering systems.

NQ is a text benchmark evaluating models on reasoning, search, and general tasks. LLM Stats tracks 1 model on this benchmark, scored on a 0–1 scale. With a single entry, the average and the leading score are both 0.365.

Compare leaders on the best AI for reasoning, best AI for search, and best AI for general leaderboards.

Current leaders

Granite 3.3 8B Base from IBM currently leads the NQ leaderboard with a score of 0.365. It is the only model evaluated so far.

FAQ

Common questions about the NQ benchmark and leaderboard.

What is the NQ benchmark?

Natural Questions (NQ) is a benchmark of real user questions issued to Google Search, with answers drawn from Wikipedia. It is designed for training and evaluating automatic question answering systems.

What is the NQ leaderboard?

The NQ leaderboard ranks AI models by their performance on this benchmark and currently lists 1 model. Granite 3.3 8B Base by IBM leads with a score of 0.365. Because there is only one entry, the average score across all models is also 0.365.
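
As a minimal sketch of how the leader and average figures above relate, the snippet below derives both from a list of leaderboard entries. The tuple layout and the `summarize` helper are illustrative assumptions, not LLM Stats code; the single entry is the one reported on this page.

```python
from statistics import mean

# Entries as reported on this page: (model, organization, score on a 0-1 scale).
# The tuple layout is an assumption made for illustration.
entries = [("Granite 3.3 8B Base", "IBM", 0.365)]


def summarize(entries):
    """Return the top-scoring entry and the mean score (hypothetical helper)."""
    leader = max(entries, key=lambda entry: entry[2])
    average = mean(score for _, _, score in entries)
    return leader, average


leader, average = summarize(entries)
print(f"Leader: {leader[0]} ({leader[1]}) at {leader[2]:.3f}; average {average:.3f}")
```

With one entry, the leader's score and the average necessarily coincide; the two figures only diverge once more models are added.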

What is the highest NQ score?

The highest NQ score is 0.365, achieved by Granite 3.3 8B Base from IBM.

How many models are evaluated on NQ?

1 model has been evaluated on the NQ benchmark, with 0 verified results and 1 self-reported result.

Where can I find the NQ paper?

The NQ paper is available at https://aclanthology.org/Q19-1026/. The paper details the methodology, dataset construction, and evaluation criteria.

What categories does NQ cover?

NQ is categorized under reasoning, search, and general. The benchmark evaluates text models.

What is the best open-source model on NQ?

Granite 3.3 8B Base by IBM is the top-ranked open-source model on NQ, with a score of 0.365 (rank #1).

How recent are the NQ leaderboard results?

The NQ leaderboard was last updated in July 2026 and currently includes 1 evaluated model.