Multi-modal Adaptive Mixture of Experts for Cold-start Recommendation

Conference Paper (2025)
Author(s)

Van Khang Nguyen (Vietnam National University Hanoi)

Duc Hoang Pham (Vietnam National University Hanoi)

Huy Son Nguyen (TU Delft - Multimedia Computing)

Cam Van Thi Nguyen (Vietnam National University Hanoi)

Hoang Quynh Le (Vietnam National University Hanoi)

Duc Trong Le (Vietnam National University Hanoi)

Research Group
Multimedia Computing
DOI related publication
https://doi.org/10.1145/3746252.3760837
Publication Year
2025
Language
English
Bibliographical Note
Green Open Access added to TU Delft Institutional Repository as part of the Taverne amendment. More information about this copyright law amendment can be found at https://www.openaccess.nl. Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
Pages (from-to)
5053-5057
Publisher
ACM
ISBN (electronic)
9798400720406
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Recommendation systems face significant challenges in cold-start scenarios, where new items with a limited interaction history need to be effectively recommended to users. Though multimodal data (e.g., images, text, audio) offer rich information to address this issue, existing approaches often employ simplistic integration methods such as concatenation, average pooling, or fixed weighting schemes, which fail to capture the complex relationships between modalities. Our study proposes a novel Mixture of Experts framework for multimodal cold-start recommendation (MAMEX), which dynamically leverages latent representations from different modalities. MAMEX utilizes modality-specific expert networks and introduces a learnable gating mechanism that adaptively weights the contribution of each modality based on its content characteristics. This approach enables MAMEX to emphasize the most informative modalities for each item while maintaining robustness when certain modalities are less relevant or missing. Extensive experiments on benchmark datasets show that MAMEX outperforms state-of-the-art models with superior accuracy and adaptability.
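
The gating idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the architecture, so the one-layer linear experts, the dot-product gate scores, the softmax normalization, and all names (`fuse`, `experts`, `gates`) are assumptions chosen for illustration only. The sketch shows one expert per modality, content-dependent weights, and a missing modality being left out of the mixture.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Apply a weight matrix (list of rows) to a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def fuse(modalities, experts, gates):
    """Fuse the available modality embeddings of one item.

    modalities: dict name -> embedding (list of floats), or None if missing
    experts:    dict name -> weight matrix (hypothetical one-layer expert)
    gates:      dict name -> weight vector giving a scalar gate score
    Returns the fused vector and the per-modality mixture weights.
    """
    present = [name for name, emb in modalities.items() if emb is not None]
    if not present:
        raise ValueError("at least one modality is required")
    # Each gate score depends on the content of that modality's embedding.
    scores = [sum(g * v for g, v in zip(gates[n], modalities[n])) for n in present]
    # Missing modalities are excluded, so the weights renormalize over the rest.
    weights = softmax(scores)
    outputs = [linear(experts[n], modalities[n]) for n in present]
    dim = len(outputs[0])
    fused = [sum(w * out[i] for w, out in zip(weights, outputs)) for i in range(dim)]
    return fused, dict(zip(present, weights))


# Toy parameters (assumed values, not learned).
identity = [[1.0, 0.0], [0.0, 1.0]]
experts = {"image": identity, "text": identity}
gates = {"image": [1.0, 0.0], "text": [1.0, 0.0]}

fused, weights = fuse({"image": [1.0, 2.0], "text": [1.0, 4.0]}, experts, gates)
fused_missing, weights_missing = fuse({"image": [1.0, 2.0], "text": None}, experts, gates)
```

In a trained model the expert and gate parameters would be learned jointly with the recommendation objective; here they are fixed toy values so the weighting behavior is easy to inspect.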

Files

3746252.3760837.pdf
(pdf | 3.81 Mb)
License info not available

File under embargo until 10-05-2026