Crowd-Assisted Annotation of Classical Music Compositions
Abstract
Music annotation and transcription of music sheets are traditionally performed by experts. Although these processes yield high-quality data, the scope of each effort is relatively narrow, resulting in highly specialised datasets of annotated music compositions and a fragmentation of the design efforts for automated tools. In music traditions such as classical music, the shortcomings of current digitization workflows become even clearer: due to the vast corpus and varying stylistic intricacies, experts tend to have specific knowledge and take up projects that concern very specific periods or composers, limiting our reach in conserving classical music information as a whole.
Crowdsourcing, on the other hand, has been used successfully in other domains to annotate different modalities (text, image, video, audio), despite the unreliable pool of expertise on online platforms. Commercially successful projects have drawn on the crowd to provide annotations of adequate quality; these annotations were later used to fuel machine learning methods that rely on large volumes of annotated data to perform automatic classification, regression, and detection tasks. However, due to the complexity of music as an artifact, there are still only a few examples where the crowd was involved in anything beyond subjective annotation tasks (e.g., indicating the mood of an excerpt).
In this thesis, we address this research gap by integrating the crowd into the annotation processes of music compositions. We surveyed current practices in Optical Music Recognition to identify the parts where the crowd could assist, and proposed hybrid annotation workflows for music compositions. We studied the capabilities of online participants with unknown musical expertise, quantified their musical abilities, and related them to their performance in music annotation tasks. With the goal of expanding the preservation efforts for classical music through the assistance of the general public, we investigated potential online sources of music information and prospective participants outside the currently available crowdsourcing platforms. To that end, we studied how composers’ popularity manifests on community-driven platforms through the online interactions of music enthusiasts. We also conducted interviews and focus group discussions with experts and semi-experts to understand their quality requirements for semantically rich digital music scores, and identified transcription patterns that could inform our task designs. Finally, we delivered our system architecture, which combines computer vision and algorithmic scheduling with microtasks designed to be performed by human annotators in parallel.
Our findings show that, with the right methods to quantify a person’s musical competence, paired with careful design of the annotation tasks and interfaces, we can successfully integrate the crowd into music annotation processes and generate meaningful, useful information about classical music compositions and beyond. This thesis enables future research by showcasing the versatility of the crowd and by providing task design methods that accommodate their lack of formal training in the field. It also provides experimental methods to reliably identify how different elements of a music composition affect the crowd’s performance, and proposes user interface elements that can mediate the complexity of the artifacts. Practices such as those presented in this thesis can help scale up our digitization efforts, generating accurate and useful annotations through the crowd even in domain-specific and knowledge-intensive topics such as classical music compositions.