The power of deep without going deep? A study of HDPGMM music representation learning

Abstract

Over the past decade, Deep Learning (DL) has proven to be one of the most effective machine learning methods for tackling a wide range of Music Information Retrieval (MIR) tasks. It offers highly expressive learning capacity that can fit any music representation needed for MIR-relevant downstream tasks. However, it has been criticized for sacrificing interpretability. The Bayesian nonparametric (BN) approach, on the other hand, promises positive properties similar to those of DL, such as high flexibility, while being robust to overfitting and preserving interpretability. The primary motivation of this work is therefore to explore the potential of Bayesian nonparametric models, in comparison to DL models, for music representation learning. More specifically, we assess the music representation learned by the Hierarchical Dirichlet Process Gaussian Mixture Model (HDPGMM), an infinite mixture model based on the Bayesian nonparametric approach, on MIR tasks including classification, auto-tagging, and recommendation. The experimental results suggest that the HDPGMM music representation can outperform DL representations in certain scenarios and is comparable overall.
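To make the representation-learning idea concrete, the following is a minimal sketch of how a Dirichlet-process mixture can turn frame-level audio features into a track-level representation. Note the assumptions: it uses scikit-learn's `BayesianGaussianMixture` with a truncated Dirichlet-process prior, which is a single-level approximation rather than the hierarchical (HDP) model studied in this work, and random arrays stand in for real audio features such as MFCC frames.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)

# Stand-in for frame-level audio features pooled from a training corpus
# (e.g. 13-dimensional MFCC frames); real features would come from audio.
corpus_frames = rng.normal(size=(500, 13))

# Truncated Dirichlet-process GMM: an "infinite" mixture approximated with
# a finite truncation; the effective number of active components is
# inferred from the data rather than fixed in advance.
dpgmm = BayesianGaussianMixture(
    n_components=20,
    weight_concentration_prior_type="dirichlet_process",
    covariance_type="diag",
    max_iter=200,
    random_state=0,
).fit(corpus_frames)

# Track-level representation: average the posterior component
# responsibilities over all frames of one (hypothetical) track.
track_frames = rng.normal(size=(120, 13))
representation = dpgmm.predict_proba(track_frames).mean(axis=0)

print(representation.shape)  # one probability vector per track: (20,)
```

The resulting vector can then be fed to a downstream classifier, tagger, or recommender, which is the evaluation setup the abstract describes; the hierarchical variant additionally shares components across tracks via a corpus-level Dirichlet process.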