Include
Benchmark-specific documentation for Include was not found in official sources.
Progress Over Time
Interactive timeline showing model performance evolution on Include
Include Leaderboard
27 models • 0 verified
| Rank | Model | Organization | Params | Context | Cost | License |
|---|---|---|---|---|---|---|
| 1 | Qwen3.5-397B-A17B | Alibaba Cloud / Qwen Team | 397B | — | — | — |
| 2 | Qwen3.6 Plus | Alibaba Cloud / Qwen Team | — | — | — | — |
| 3 | — | Alibaba Cloud / Qwen Team | 122B | — | — | — |
| 4 | — | Alibaba Cloud / Qwen Team | 27B | — | — | — |
| 5 | — | Alibaba Cloud / Qwen Team | 236B | — | — | — |
| 5 | — | Alibaba Cloud / Qwen Team | 236B | — | — | — |
| 7 | — | Alibaba Cloud / Qwen Team | 35B | — | — | — |
| 8 | — | Alibaba Cloud / Qwen Team | 235B | — | — | — |
| 9 | — | Alibaba Cloud / Qwen Team | 80B | — | — | — |
| 9 | — | Alibaba Cloud / Qwen Team | 80B | — | — | — |
| 11 | — | Alibaba Cloud / Qwen Team | 33B | — | — | — |
| 12 | — | Alibaba Cloud / Qwen Team | 9B | — | — | — |
| 13 | — | Alibaba Cloud / Qwen Team | 31B | — | — | — |
| 14 | — | Alibaba Cloud / Qwen Team | 33B | — | — | — |
| 15 | — | Alibaba Cloud / Qwen Team | 235B | — | — | — |
| 16 | — | Alibaba Cloud / Qwen Team | 31B | — | — | — |
| 17 | — | Alibaba Cloud / Qwen Team | 4B | — | — | — |
| 18 | — | Alibaba Cloud / Qwen Team | 9B | — | — | — |
| 19 | — | Alibaba Cloud / Qwen Team | 9B | — | — | — |
| 20 | — | Alibaba Cloud / Qwen Team | 4B | — | — | — |
| 21 | — | Alibaba Cloud / Qwen Team | 4B | — | — | — |
| 22 | — | Google | 8B | — | — | — |
| 22 | — | — | 2B | — | — | — |
| 24 | — | Alibaba Cloud / Qwen Team | 2B | — | — | — |
| 25 | — | Alibaba Cloud / Qwen Team | 800M | — | — | — |
| 26 | — | Google | 8B | — | — | — |
| 26 | — | — | 2B | — | — | — |
FAQ
Common questions about Include
The Include leaderboard ranks 27 AI models based on their performance on this benchmark. Currently, Qwen3.5-397B-A17B by Alibaba Cloud / Qwen Team leads with a score of 0.856. The average score across all models is 0.696.
The highest Include score is 0.856, achieved by Qwen3.5-397B-A17B from Alibaba Cloud / Qwen Team.
27 models have been evaluated on the Include benchmark; all 27 results are self-reported, and none are verified.
Include is categorized as a general benchmark and evaluates text models.