A professionally annotated and enriched multimodal data set on popular music

Abstract

This paper presents the MusiClef data set, a multimodal data set of professionally annotated music. It includes editorial metadata about songs, albums, and artists, as well as MusicBrainz identifiers to facilitate linking to other data sets. In addition, several state-of-the-art audio features are provided. Different sets of annotations and music context data are also included: collaboratively generated user tags, web pages about artists and albums, and annotation labels provided by music experts. Versions of this data set were used for auto-tagging tasks in the MusiClef evaluation campaigns in 2011 and 2012. We report on the motivation for the data set, its composition, related data sets, and the evaluation campaigns in which versions of it have been used. These campaigns also illustrate one use case of the data set, namely music auto-tagging. The complete data set is publicly available for download at http://www.cp.jku.at/musiclef.