Towards better understanding of cover song retrieval
A modular evaluation of system choices
Abstract
This work presents a study of the cover song retrieval problem, in which multiple performances of the same musical work are sought. The problem is considered in the context of content-based audio retrieval. To gain more insight into the current state of the art in cover song retrieval research, several existing approaches to cover song retrieval are studied in a modular way. Modules from different approaches are systematically recombined, and each resulting combination is tested on several designated datasets. The modules were chosen to reflect general and independent system decisions, while each dataset was constructed to pose a specific and known subset of the broad range of cover song types and similarity challenges. For the experiments, we start from cover song system combinations that use the conventional approach of representing songs as absolute chroma vectors over time. Additionally, we transform these representations into a statistical meta-representation and study the influence of using relative first-order time-differential information. The results identify several critical choices in system modules that influence overall performance. Our work is presented by first discussing the cover song retrieval problem in a top-down way, starting from the general musical and technical issues that inspired our choice of system modules. Subsequently, we discuss the existing approaches that were studied in more detail. After explaining our experimental setup and evaluation methodology, we present the results of our work. Finally, in light of these results, we make several suggestions for future directions in cover song retrieval research, focusing especially on gaining more understanding of the relation between high-level aspects and technical solutions.
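As an illustration of the two representations mentioned in the abstract, the sketch below contrasts an absolute chroma sequence with its first-order time difference. It is only one possible reading of "first-order time-differential information", not the method used in this work: the function name `first_order_difference`, the toy chroma values, and the choice of a plain frame-to-frame subtraction are all assumptions made for the example.

```python
import numpy as np


def first_order_difference(chroma):
    """Return frame-to-frame changes of a (frames x 12) chroma sequence.

    Hypothetical helper: a plain difference between consecutive frames,
    used here as an assumed stand-in for a time-differential representation.
    """
    chroma = np.asarray(chroma, dtype=float)
    return np.diff(chroma, axis=0)


# Toy absolute chroma sequence: 3 frames, 12 pitch classes (C, C#, ..., B).
# The values are made up for illustration only.
chroma = np.array([
    [1.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0],
])

delta = first_order_difference(chroma)
print(delta.shape)  # one row fewer than the input sequence
```

The absolute sequence describes which pitch classes are active in each frame, whereas the differenced sequence describes how that content changes from one frame to the next; a frame that repeats its predecessor yields an all-zero row.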