xDailyBench

Progress Over Time

Timeline of model performance on xDailyBench over time, marking the state-of-the-art frontier and distinguishing open from proprietary models.

xDailyBench Leaderboard

2 models
About this benchmark

What is xDailyBench?

xDailyBench evaluates AI agents on white-collar office work, covering everyday professional tasks such as document handling, consultation, and multi-step productivity workflows.

xDailyBench is a text benchmark that evaluates models on reasoning, general, and agent tasks. LLM Stats tracks 2 models on this benchmark, scored on a 0–1 scale. The current average is 0.587, with the leader at 0.610.


Current leaders

Seed 2.1 Pro from ByteDance currently leads the xDailyBench leaderboard with a score of 0.610, ahead of the one other evaluated model.

1. Seed 2.1 Pro (ByteDance): 61.0%
2. Seed 2.1 Turbo (ByteDance): 56.4%

FAQ

Common questions about the xDailyBench benchmark and leaderboard.

What is the xDailyBench benchmark?

xDailyBench evaluates AI agents on white-collar office work, covering everyday professional tasks such as document handling, consultation, and multi-step productivity workflows.

What is the xDailyBench leaderboard?

The xDailyBench leaderboard ranks 2 AI models based on their performance on this benchmark. Currently, Seed 2.1 Pro by ByteDance leads with a score of 0.610. The average score across all models is 0.587.
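The leader and average quoted above follow directly from the two listed scores. Below is a minimal sketch of that arithmetic, assuming the scores are 0.610 and 0.564 on the 0–1 scale (the latter taken from the 56.4% shown for Seed 2.1 Turbo). The `summarize` helper and its output keys are hypothetical names used for illustration, not part of any LLM Stats API.

```python
from statistics import mean

# Scores on the 0-1 scale, as listed on the leaderboard.
scores = {
    "Seed 2.1 Pro": 0.610,
    "Seed 2.1 Turbo": 0.564,
}


def summarize(scores):
    """Hypothetical helper: derive leader, average, and percent views."""
    leader = max(scores, key=scores.get)
    return {
        "leader": leader,
        "top_score": scores[leader],
        # Average rounded to three decimals, matching the page's precision.
        "average": round(mean(scores.values()), 3),
        # The same scores expressed as percentages with one decimal place.
        "percent": {name: f"{s * 100:.1f}%" for name, s in scores.items()},
    }


summary = summarize(scores)
print(summary)
```

With only two entries, the average is simply the midpoint of the two scores, so it will shift noticeably as soon as a third model is added.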

What is the highest xDailyBench score?

The highest xDailyBench score is 0.610, achieved by Seed 2.1 Pro from ByteDance.

How many models are evaluated on xDailyBench?

Two models have been evaluated on the xDailyBench benchmark. Both results are self-reported; none has been independently verified.

What categories does xDailyBench cover?

xDailyBench is categorized under reasoning, general, and agents. It evaluates text models.

How recent are the xDailyBench leaderboard results?

The xDailyBench leaderboard was last updated in June 2026 and currently includes 2 evaluated models.