Multi-modal Adaptive Mixture of Experts for Cold-start Recommendation

📅 2025-08-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the insufficient item representation caused by sparse user interactions for new items in cold-start recommendation scenarios, this paper proposes the Multimodal Adaptive Mixture-of-Experts Network (MAMEX). MAMEX employs a learnable gating mechanism to dynamically weight heterogeneous multimodal features—such as image and text embeddings—thereby capturing complex cross-modal dependencies. It further integrates modality-specific experts within a Mixture-of-Experts (MoE) architecture to ensure robust inference under partial modality absence. Unlike static fusion strategies (e.g., concatenation or average pooling), MAMEX enables fine-grained, adaptive multimodal representation fusion. Extensive experiments on multiple public benchmark datasets demonstrate significant improvements in standard recommendation metrics—including Recall@K and NDCG—validating both its effectiveness and generalizability. The source code is publicly available.
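The gating mechanism described above can be illustrated with a minimal sketch. This is not the paper's implementation: the dimensions, the linear experts, and the dot-product gate are hypothetical stand-ins for MAMEX's modality-specific expert networks and learnable gate. The key ideas it demonstrates are (1) per-modality expert outputs mixed by content-dependent softmax weights, and (2) masking absent modalities so inference degrades gracefully.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

class ModalityExpert:
    """One expert per modality: a toy linear projection (hypothetical sizes)."""
    def __init__(self, in_dim, out_dim):
        self.W = rng.normal(scale=0.1, size=(in_dim, out_dim))
    def __call__(self, x):
        return x @ self.W

class MultimodalMoE:
    """Sketch of adaptive fusion: a learnable gate scores each modality
    from its own content, a softmax turns scores into mixture weights,
    and missing modalities are masked out (score -inf -> weight 0)."""
    def __init__(self, dims, out_dim):
        self.out_dim = out_dim
        self.experts = {m: ModalityExpert(d, out_dim) for m, d in dims.items()}
        self.gates = {m: rng.normal(scale=0.1, size=(d,)) for m, d in dims.items()}
    def __call__(self, feats):
        names = list(self.experts)
        # Content-dependent gate score per modality; -inf masks missing ones.
        scores = np.array([feats[m] @ self.gates[m] if m in feats else -np.inf
                           for m in names])
        w = softmax(scores)  # adaptive modality weights, sum to 1
        outs = np.stack([self.experts[m](feats[m]) if m in feats
                         else np.zeros(self.out_dim)
                         for m in names])
        return w @ outs, dict(zip(names, w))

# Usage: an item with both modalities, then the same item with text missing.
moe = MultimodalMoE({"image": 8, "text": 6}, out_dim=4)
item = {"image": rng.normal(size=8), "text": rng.normal(size=6)}
fused, weights = moe(item)                           # both experts contribute
fused_img, weights_img = moe({"image": item["image"]})  # text weight becomes 0
```

Unlike concatenation or average pooling, the mixture weights here vary per item, which is the fine-grained, adaptive fusion the summary contrasts with static strategies.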

📝 Abstract
Recommendation systems have faced significant challenges in cold-start scenarios, where new items with a limited history of interaction need to be effectively recommended to users. Though multimodal data (e.g., images, text, audio, etc.) offer rich information to address this issue, existing approaches often employ simplistic integration methods such as concatenation, average pooling, or fixed weighting schemes, which fail to capture the complex relationships between modalities. Our study proposes a novel Mixture of Experts (MoE) framework for multimodal cold-start recommendation, named MAMEX, which dynamically leverages latent representations from different modalities. MAMEX utilizes modality-specific expert networks and introduces a learnable gating mechanism that adaptively weights the contribution of each modality based on its content characteristics. This approach enables MAMEX to emphasize the most informative modalities for each item while maintaining robustness when certain modalities are less relevant or missing. Extensive experiments on benchmark datasets show that MAMEX outperforms state-of-the-art methods in cold-start scenarios, with superior accuracy and adaptability. For reproducibility, the code has been made available on GitHub: https://github.com/L2R-UET/MAMEX.
Problem

Research questions and friction points this paper is trying to address.

Addressing cold-start recommendation for new items with limited interaction history
Improving multimodal data integration beyond simplistic methods like concatenation
Weighting each modality adaptively based on its content characteristics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Mixture of Experts framework
Learnable gating for modality weighting
Modality-specific expert networks integration
🔎 Similar Papers
No similar papers found.