RULER 32k

RULER 32k evaluates the official 13-task RULER v1 suite at a 32768-token context budget.

RULER 32k Leaderboard

0 models • 0 verified

FAQ

Common questions about RULER 32k

The RULER paper is available at https://arxiv.org/abs/2404.06654. It describes the benchmark methodology, dataset creation, and evaluation criteria.
The code for generating the RULER data and running the evaluation is available at https://github.com/NVIDIA/RULER.
RULER 32k is categorized under long context and reasoning. The benchmark evaluates text models.
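
As a rough illustration of how a single leaderboard number could be derived from the 13-task suite at the 32768-token budget, the sketch below averages one score per task. The unweighted mean, the 0 to 100 score range, the function name suite_score, and the placeholder task keys are assumptions made for illustration; they are not taken from this page or from the RULER repository.

```python
from statistics import mean

CONTEXT_BUDGET = 32768  # token budget stated on this page
NUM_TASKS = 13          # task count stated on this page


def suite_score(task_scores: dict[str, float]) -> float:
    """Return the unweighted mean of per-task scores.

    Assumption: each task reports one score in [0, 100] and all tasks
    count equally. The real aggregation rule may differ.
    """
    if len(task_scores) != NUM_TASKS:
        raise ValueError(
            f"expected {NUM_TASKS} task scores, got {len(task_scores)}"
        )
    for name, score in task_scores.items():
        if not 0.0 <= score <= 100.0:
            raise ValueError(f"score for {name!r} is outside [0, 100]: {score}")
    return mean(list(task_scores.values()))


if __name__ == "__main__":
    # Hypothetical placeholder task names and scores, for illustration only.
    example = {f"task_{i:02d}": 50.0 for i in range(NUM_TASKS)}
    print(f"Suite score at {CONTEXT_BUDGET} tokens: {suite_score(example):.1f}")
```

Rejecting inputs that do not contain exactly 13 scores keeps a partial run from being reported as a full-suite result.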