🤖 AI Summary
Multimodal recommender systems (MRS) often suffer from modality incompleteness, caused by missing images, incomplete text, or heterogeneous user content, which degrades model robustness and generalization. To address this, we propose a framework that integrates invariant learning with the information bottleneck principle. First, invariant risk minimization (IRM) is employed to learn user preference representations that remain invariant across modalities. Second, the information bottleneck is applied to compress modality-specific features, suppressing noise and yielding compact representations. Third, a missing-aware fusion module adaptively preserves critical recommendation signals under partial modality availability. To our knowledge, this is the first work to jointly leverage IRM and the information bottleneck for robust MRS modeling. Extensive experiments on three real-world datasets demonstrate significant improvements over state-of-the-art methods, particularly under diverse modality-missing patterns, with enhanced stability and generalization.
📝 Abstract
Multimodal recommender systems (MRS) improve recommendation performance by integrating diverse semantic information from multiple modalities. However, the assumption that all modalities are available rarely holds in practice due to missing images, incomplete descriptions, or inconsistent user content. These challenges significantly degrade the robustness and generalization capabilities of current models. To address them, we introduce a novel method called **I$^3$-MRec**, which uses **I**nvariant learning with the **I**nformation bottleneck principle for **I**ncomplete **M**odality **Rec**ommendation. To achieve robust performance in missing-modality scenarios, I$^3$-MRec enforces two pivotal properties: (i) cross-modal preference invariance, which ensures consistent user preference modeling across varying modality environments, and (ii) compact yet effective modality representation, which filters out task-irrelevant modality information while maximally preserving features essential to recommendation. By treating each modality as a distinct semantic environment, I$^3$-MRec employs invariant risk minimization (IRM) to learn modality-specific item representations. In parallel, a missing-aware fusion module grounded in the information bottleneck (IB) principle extracts compact and effective item embeddings by suppressing modality noise and preserving core user preference signals. Extensive experiments on three real-world datasets demonstrate that I$^3$-MRec consistently outperforms state-of-the-art MRS methods across various modality-missing scenarios, highlighting its effectiveness and robustness in practical applications. The code and processed datasets are released at https://github.com/HuilinChenJN/I3-MRec.
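To make the "each modality as an environment" idea concrete, the sketch below shows an IRMv1-style objective in plain Python: each modality contributes a per-environment risk plus a penalty equal to the squared gradient of that risk with respect to a dummy scalar multiplier on the predictions. This is a generic illustration of the IRM principle under a toy squared-error loss, not the paper's actual model; all names (`scores`, `labels`, `irm_objective`) are hypothetical.

```python
# Hedged sketch of an IRMv1-style penalty, treating each modality
# (e.g. visual, textual) as a separate training "environment".
# Toy squared-error loss; not the I^3-MRec implementation.

def risk_and_penalty(scores, labels, scale=1.0):
    """Per-environment risk plus the IRMv1 penalty: the squared gradient
    of the risk with respect to the dummy scalar multiplier `scale`."""
    n = len(scores)
    risk = sum((scale * s - y) ** 2 for s, y in zip(scores, labels)) / n
    # Analytic gradient of the risk w.r.t. `scale` (evaluated at scale=1
    # by default, as in IRMv1).
    grad = sum(2.0 * (scale * s - y) * s for s, y in zip(scores, labels)) / n
    return risk, grad ** 2

def irm_objective(env_batches, lam=1.0):
    """Average over environments of (risk + lambda * invariance penalty)."""
    total = 0.0
    for scores, labels in env_batches:
        risk, penalty = risk_and_penalty(scores, labels)
        total += risk + lam * penalty
    return total / len(env_batches)

# Two toy "modality environments": (predicted scores, relevance labels).
visual = ([0.9, 0.2, 0.7], [1.0, 0.0, 1.0])
textual = ([0.8, 0.1, 0.6], [1.0, 0.0, 1.0])
print(irm_objective([visual, textual], lam=1.0))
```

A predictor whose scores already match the labels in every environment incurs zero risk and zero penalty, which is the invariance the objective rewards; the `lam` weight trades off fit against cross-environment invariance.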