Finding Similarities between Measures in Scanned Music Scores

Abstract

The field of Optical Music Recognition (OMR) has made steady progress over the past decades in automating the transcription of music scores into computer-readable formats, but its results are still far from generally applicable. Some research has therefore focused on incorporating crowdsourcing techniques to check and correct the errors of existing methods. However, crowdsourcing is costly, so improvements in efficiency are key to making such systems practical for music transcription. One possible improvement is to identify which measures are similar to each other, so that a transcription obtained for one measure can be shared with the others.

In this thesis, we propose a system that clusters measures segmented from music scores, in order to find measures that are similar to each other. To design and evaluate this system, we collected an extensive dataset of published music scores and their symbolic transcriptions, and manually created annotations that map the symbolic scores to the printed scores at the measure level, giving us a ground truth. We improved an existing staff detection algorithm and compared it to the original on our dataset, and we developed and evaluated our own barline detection algorithm. Using both algorithms, we segmented the pages of music into images of the individual measures in our dataset. We then applied the FastPAM k-medoids clustering algorithm, with distances between measures based on Dynamic Time Warping (DTW), to build clusters of rhythmically similar measures, and evaluated the resulting clusters against the ground truth. Although the selected clustering method does not always separate the measures into clear rhythmic classes, we discuss possible remedies and suggest how the results can still be applied in the crowdsourcing setting.
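
To make the clustering step concrete, the sketch below shows how measures reduced to rhythm sequences (here assumed to be note onset times within each measure) can be compared with Dynamic Time Warping and clustered with a k-medoids routine from the FastPAM family. This is an illustration under those assumptions, not the thesis code: the example sequences and cluster count are hypothetical, and the kmedoids package's fasterpam function (a successor of FastPAM by the same authors) stands in for the exact variant used.

    import numpy as np
    import kmedoids  # pip install kmedoids; implements the FastPAM family

    def dtw_distance(a, b):
        """Classic O(len(a) * len(b)) dynamic time warping distance
        between two 1-D sequences (here: onset times within a measure)."""
        n, m = len(a), len(b)
        cost = np.full((n + 1, m + 1), np.inf)
        cost[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                d = abs(a[i - 1] - b[j - 1])
                cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                     cost[i, j - 1],      # deletion
                                     cost[i - 1, j - 1])  # match
        return cost[n, m]

    # Hypothetical input: one rhythm sequence per segmented measure.
    measures = [
        np.array([0.0, 0.5, 1.0, 1.5]),  # four eighth notes
        np.array([0.0, 0.5, 1.0, 1.5]),  # same rhythm
        np.array([0.0, 1.0]),            # two quarter notes
    ]

    # Pairwise DTW dissimilarity matrix (symmetric, zero diagonal).
    n = len(measures)
    diss = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            diss[i, j] = diss[j, i] = dtw_distance(measures[i], measures[j])

    # Cluster around 2 medoid measures; result.labels holds a cluster
    # index per measure, result.medoids the representative measures.
    result = kmedoids.fasterpam(diss, 2, random_state=0)
    print(result.labels)

A precomputed distance matrix makes k-medoids a natural fit here: each medoid is an actual measure, so every cluster comes with a concrete representative whose transcription could be shared with, or verified against, the other measures in the cluster.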