RULER 8k

RULER 8k evaluates the official 13-task RULER v1 suite at an 8192-token context budget.

RULER 8k Leaderboard

0 models • 0 verified
Columns: Context, Cost, License

FAQ

Common questions about RULER 8k

The RULER paper, "RULER: What's the Real Context Size of Your Long-Context Language Models?", is available at https://arxiv.org/abs/2404.06654. It describes the benchmark's methodology, dataset creation, and evaluation criteria.
The RULER code, including the scripts that generate the benchmark data, is available at https://github.com/NVIDIA/RULER.
RULER 8k is categorized under long context and reasoning, and it evaluates text models.