GiantSteps Tempo

A dataset for tempo estimation in electronic dance music containing 664 2-minute audio previews from Beatport, annotated from user corrections for evaluating automatic tempo estimation algorithms.

Qwen2.5-Omni-7B from Alibaba Cloud / Qwen Team currently leads the GiantSteps Tempo leaderboard with a score of 0.880. It is the only model evaluated so far.

Progress Over Time

[Interactive timeline: model performance evolution on GiantSteps Tempo. Legend: state-of-the-art frontier, open models, proprietary models.]

GiantSteps Tempo Leaderboard

1 model

Rank	Model	Organization	Score	Size	Context	Cost	License
1	Qwen2.5-Omni-7B	Alibaba Cloud / Qwen Team	0.880	7B	—	—	

FAQ

Common questions about GiantSteps Tempo.

What is the GiantSteps Tempo benchmark?

A dataset for tempo estimation in electronic dance music containing 664 2-minute audio previews from Beatport, annotated from user corrections for evaluating automatic tempo estimation algorithms.

What is the GiantSteps Tempo leaderboard?

The GiantSteps Tempo leaderboard ranks AI models by their performance on this benchmark. It currently lists a single model, Qwen2.5-Omni-7B by Alibaba Cloud / Qwen Team, with a score of 0.880, so the average score across all models is also 0.880.

What is the highest GiantSteps Tempo score?

The highest GiantSteps Tempo score is 0.880, achieved by Qwen2.5-Omni-7B from Alibaba Cloud / Qwen Team.

How many models are evaluated on GiantSteps Tempo?

One model has been evaluated on the GiantSteps Tempo benchmark, with 0 verified results and 1 self-reported result.

Where can I find the GiantSteps Tempo paper?

The GiantSteps Tempo paper is available at https://archives.ismir.net/ismir2015/paper/000246.pdf. The paper details the methodology, dataset construction, and evaluation criteria.
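This page does not say how the 0.880 score is computed. A common convention in tempo estimation work is to count an estimate as correct when it falls within a small relative tolerance (often 4%) of the annotated tempo, and to report the fraction of correct tracks. The following is a minimal sketch under that assumption. The function name, the tolerance default, and the sample BPM values are hypothetical and are not taken from the leaderboard or the dataset.

```python
def tempo_accuracy(estimates, references, tolerance=0.04):
    """Fraction of tracks whose estimated BPM is within +/- tolerance of the reference.

    Assumption: a tolerance-based accuracy with a 4% window. Octave errors
    (half or double tempo) are counted as wrong in this sketch.
    """
    if len(estimates) != len(references):
        raise ValueError("estimates and references must have the same length")
    if not references:
        raise ValueError("at least one track is required")

    hits = sum(
        1
        for est, ref in zip(estimates, references)
        if abs(est - ref) <= tolerance * ref
    )
    return hits / len(references)


# Hypothetical reference annotations and model estimates, in BPM.
reference_bpm = [128.0, 140.0, 174.0, 120.0]
estimated_bpm = [127.5, 70.0, 174.3, 126.0]

score = tempo_accuracy(estimated_bpm, reference_bpm)
print(f"accuracy: {score:.3f}")
```

In this example the second estimate is an octave error (70 instead of 140 BPM) and the fourth is 5% off, so two of the four tracks count as correct. A variant that also accepts half, double, or triple tempo would score the second track as correct.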

What categories does GiantSteps Tempo cover?

GiantSteps Tempo is categorized under audio. The benchmark evaluates audio models.

More evaluations to explore

Related benchmarks in the same category

CoVoST2

CoVoST 2 is a large-scale multilingual speech translation corpus derived from Common Voice, covering translations from 21 languages into English and from English into 15 languages. The dataset contains 2,880 hours of speech with 78K speakers for speech translation research.

audio

2 models

Common Voice 15

Common Voice is a massively-multilingual collection of transcribed speech intended for speech technology research and development. Version 15.0 contains 28,750 recorded hours across 114 languages, consisting of crowdsourced voice recordings with corresponding transcriptions.

audio

1 model

CoVoST2 en-zh

CoVoST 2 English-to-Chinese subset is part of the large-scale multilingual speech translation corpus derived from Common Voice. This subset focuses specifically on English to Chinese speech translation tasks within the broader CoVoST 2 dataset.

audio

1 model

MMAU

A massive multi-task audio understanding and reasoning benchmark comprising 10,000 carefully curated audio clips paired with human-annotated natural language questions spanning speech, environmental sounds, and music. Requires expert-level knowledge and complex reasoning across 27 distinct skills.

audio, multimodal

1 model

MMAU Music

A subset of the MMAU benchmark focused specifically on music understanding and reasoning tasks. Part of a comprehensive multimodal audio understanding benchmark that evaluates models on expert-level knowledge and complex reasoning across music audio clips.

audio, multimodal

1 model

MMAU Sound

A subset of the MMAU benchmark focused specifically on environmental sound understanding and reasoning tasks. Part of a comprehensive multimodal audio understanding benchmark that evaluates models on expert-level knowledge and complex reasoning across environmental sound clips.

audio, multimodal

1 model