Disentangling and Generating Modalities for Recommendation in Missing Modality Scenarios

📅 2025-04-23
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Multimodal recommendation systems suffer significant performance degradation when modalities are missing, primarily because existing methods neglect modality-specific characteristics and lack robust modeling for missing-modality scenarios. To address this, we propose DGMRec—a novel framework that, for the first time, decomposes multimodal features into shared and modality-specific components from an information-theoretic perspective. DGMRec employs information bottleneck–driven feature disentanglement, cross-modal alignment, and user-modality preference–guided conditional generation to dynamically reconstruct missing modality representations. It further incorporates a dedicated multimodal fusion architecture for recommendation. The approach jointly enhances modality specificity and robustness to missing data while enabling cross-modal retrieval. Extensive experiments demonstrate that DGMRec consistently outperforms state-of-the-art methods across diverse missing-ratio settings, cold-start items, and hybrid missing scenarios.
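The disentangle-then-generate idea in the summary can be illustrated with a minimal NumPy sketch. Everything here is a hypothetical stand-in: the projection matrices play the role of DGMRec's learned encoders (which the paper trains with an information-bottleneck objective), and the preference weights mimic user modality preferences; this is not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # illustrative feature dimension

# Toy projections standing in for learned encoders (hypothetical; DGMRec
# learns these with an information-bottleneck-driven objective).
modalities = ("image", "text")
W_gen = {m: rng.normal(size=(d, d)) for m in modalities}   # general (shared)
W_spec = {m: rng.normal(size=(d, d)) for m in modalities}  # modality-specific

def disentangle(feat, m):
    """Split one modality feature into general and specific parts."""
    return W_gen[m] @ feat, W_spec[m] @ feat

def generate_missing_general(available, user_pref):
    """Reconstruct a missing modality's general feature by averaging the
    aligned general features of the available modalities, weighted by the
    user's modality preferences (assumed given here)."""
    w = np.array([user_pref[m] for m in available], dtype=float)
    w = w / w.sum()
    feats = np.stack([disentangle(f, m)[0] for m, f in available.items()])
    return (w[:, None] * feats).sum(axis=0)

# Example: an item's text modality is missing; generate it from the image.
item_image = rng.normal(size=d)
user_pref = {"image": 0.7, "text": 0.3}
text_general = generate_missing_general({"image": item_image}, user_pref)
```

With a single available modality the preference weights normalize to 1, so the generated feature reduces to that modality's aligned general feature; with several available modalities it becomes a preference-weighted blend.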

📝 Abstract
Multi-modal recommender systems (MRSs) have achieved notable success in improving personalization by leveraging diverse modalities such as images, text, and audio. However, two key challenges remain insufficiently addressed: (1) insufficient consideration of missing modality scenarios and (2) the overlooking of unique characteristics of modality features. These challenges result in significant performance degradation in realistic situations where modalities are missing. To address these issues, we propose Disentangling and Generating Modality Recommender (DGMRec), a novel framework tailored for missing modality scenarios. DGMRec disentangles modality features into general and specific modality features from an information-based perspective, enabling richer representations for recommendation. Building on this, it generates missing modality features by integrating aligned features from other modalities and leveraging user modality preferences. Extensive experiments show that DGMRec consistently outperforms state-of-the-art MRSs in challenging scenarios, including missing-modality and new-item settings, as well as diverse missing ratios and varying levels of missing modalities. Moreover, DGMRec's generation-based approach enables cross-modal retrieval, a task not applicable to existing MRSs, highlighting its adaptability and potential for real-world applications. Our code is available at https://github.com/ptkjw1997/DGMRec.
Problem

Research questions and friction points this paper is trying to address.

Addressing performance drop in multi-modal recommender systems with missing modalities
Disentangling modality features into general and specific representations
Generating missing modality features using aligned features and user preferences
Innovation

Methods, ideas, or system contributions that make the work stand out.

Disentangles modality features into general and specific components
Generates missing modality features from aligned features of other modalities
Enables cross-modal retrieval in missing scenarios
Jiwan Kim
KAIST, Daejeon, Republic of Korea
Hongseok Kang
KAIST, Daejeon, Republic of Korea
Sein Kim
KAIST
Recommender Systems · Personalization · Large Language Models
Kibum Kim
KAIST, Daejeon, Republic of Korea
Chanyoung Park
Associate Professor, KAIST
Artificial intelligence · Graph data mining · Recommender systems · AI for Science