Impact of audio codec and quality on genre classification and BPM recognition in Essentia

Bachelor Thesis (2022)
Authors

S. Hulleman (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2022 Sjoerd Hulleman
Publication Year
2022
Language
English
Graduation Date
28-01-2022
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Related content

GitLab repository containing all code used for this research.

https://gitlab.ewi.tudelft.nl/cse3k-21q2-music-faithfulness/project-sjoerd-hulleman
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Music Information Retrieval (MIR) is a field of research that focuses on extracting information from music-related data. This includes the genre of a song and its tempo in beats per minute (BPM). Pipelines that extract this information from music are called feature extractors, and Essentia is a library for such feature extraction. Audio codec and quality are often not considered in research setups within the field of MIR, even though they could influence the results. Therefore, the main research question is "How do different audio codecs and audio quality impact genre classification and beats per minute (BPM) recognition in Essentia?". To answer this, genre classification was narrowed down to rock, and the chosen audio codecs are FLAC, MP3 (LAME) and Ogg Vorbis. A collaboration with Muziekweb, a Dutch music library that collects all music that has been released in the Netherlands, made it possible to gather music files in lossless format. To degrade the audio quality, classify songs and recognize BPM, Python pipelines for codec conversion, rock genre classification and BPM recognition were created and run on this data. It has been concluded that changes in audio codec and quality have an influence on genre classification and BPM recognition in Essentia. It has not been concluded which codec and quality are best to use in the field of MIR. Further research is needed to answer this.
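
The codec conversion and BPM recognition steps mentioned in the abstract could look roughly like the sketch below. It is not the code from the thesis repository. It assumes that the ffmpeg command-line tool performs the transcoding and that Essentia's Python bindings are installed. The helper names (build_ffmpeg_command, estimate_bpm), the ENCODERS table, the output file naming and the example bitrate are hypothetical and chosen only for illustration.

```python
from pathlib import Path

# Hypothetical mapping from codec label to (ffmpeg encoder, file extension).
# The encoder names are standard ffmpeg encoders; the labels are made up here.
ENCODERS = {
    "flac": ("flac", ".flac"),
    "mp3": ("libmp3lame", ".mp3"),
    "vorbis": ("libvorbis", ".ogg"),
}


def build_ffmpeg_command(source, codec, bitrate_kbps=None):
    """Build the ffmpeg argument list that transcodes `source` to `codec`.

    The output is written next to the source file, with the codec label
    (and the bitrate, if one is given) appended to the file name.
    """
    encoder, extension = ENCODERS[codec]
    source = Path(source)
    suffix = f"_{bitrate_kbps}k" if bitrate_kbps is not None else ""
    target = source.with_name(f"{source.stem}_{codec}{suffix}{extension}")

    command = ["ffmpeg", "-y", "-i", str(source), "-c:a", encoder]
    if bitrate_kbps is not None:
        # Target audio bitrate; only meaningful for the lossy codecs.
        command += ["-b:a", f"{bitrate_kbps}k"]
    command.append(str(target))
    return command


def estimate_bpm(path):
    """Estimate the tempo of an audio file with Essentia.

    Assumption: RhythmExtractor2013 is used here as one possible BPM
    extractor; the thesis may rely on a different Essentia algorithm.
    """
    # Imported lazily so the command builder works without Essentia installed.
    import essentia.standard as es

    audio = es.MonoLoader(filename=str(path))()
    extractor = es.RhythmExtractor2013(method="multifeature")
    bpm, _beats, _confidence, _estimates, _intervals = extractor(audio)
    return bpm
```

A degraded copy could then be produced by passing the result of build_ffmpeg_command("song.flac", "mp3", 128) to subprocess.run, after which estimate_bpm can be called on both the original and the converted file to compare the two tempo estimates.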

Files

License info not available