IGDMRec: Behavior Conditioned Item Graph Diffusion for Multimodal Recommendation

📅 2025-12-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
In multimodal recommendation, semantic item graphs are vulnerable to modality noise and to misalignment between item semantics and user behavior, which introduces false links and degrades recommendations. To address this, we propose IGDMRec, a method that uses a diffusion model with classifier-free guidance to denoise the semantic item graph. Its Behavior-conditioned Graph Diffusion (BGD) module conditions the denoising on user interaction data, explicitly injecting behavioral signals into graph structure learning. A Conditional Denoising Network (CD-Net) implements the denoising process with manageable complexity, and a contrastive representation augmentation scheme uses both the denoised and the original item graphs to enhance item representations. Experiments on four real-world datasets show that IGDMRec outperforms competitive baselines. Ablation studies verify the effectiveness of its key components, and robustness analysis validates its denoising capability.

📝 Abstract
Multimodal recommender systems (MRSs) are critical for various online platforms, offering users more accurate personalized recommendations by incorporating multimodal information of items. Structure-based MRSs have achieved state-of-the-art performance by constructing semantic item graphs, which explicitly model relationships between items based on modality feature similarity. However, such semantic item graphs are often noisy due to 1) inherent noise in multimodal information and 2) misalignment between item semantics and user-item co-occurrence relationships, which introduces false links and leads to suboptimal recommendations. To address this challenge, we propose Item Graph Diffusion for Multimodal Recommendation (IGDMRec), a novel method that leverages a diffusion model with classifier-free guidance to denoise the semantic item graph by integrating user behavioral information. Specifically, IGDMRec introduces a Behavior-conditioned Graph Diffusion (BGD) module, incorporating interaction data as conditioning information to guide the denoising of the semantic item graph. Additionally, a Conditional Denoising Network (CD-Net) is designed to implement the denoising process with manageable complexity. Finally, we propose a contrastive representation augmentation scheme that leverages both the denoised item graph and the original item graph to enhance item representations. Extensive experiments on four real-world datasets demonstrate the superiority of IGDMRec over competitive baselines, with robustness analysis validating its denoising capability and ablation studies verifying the effectiveness of its key components.
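
To make the notion of a semantic item graph concrete, here is a minimal sketch of one common way to build such a graph from modality feature similarity: connect each item to its k most similar items under cosine similarity. The abstract does not specify the construction, so the kNN rule, the function names (`cosine`, `knn_item_graph`), and the toy feature values are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_item_graph(features, k):
    """Map each item index to the indices of its k most similar items.

    Assumption: a kNN rule over cosine similarity; the paper may use a
    different similarity measure or sparsification scheme.
    """
    n = len(features)
    graph = {}
    for i in range(n):
        sims = [(cosine(features[i], features[j]), j) for j in range(n) if j != i]
        # Highest similarity first; ties broken by the smaller item index.
        sims.sort(key=lambda pair: (-pair[0], pair[1]))
        graph[i] = [j for _, j in sims[:k]]
    return graph


# Hypothetical 2-d modality features: items 0 and 1 are alike, as are 2 and 3.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
graph = knn_item_graph(features, k=1)
print(graph)
```

Edges built this way reflect only feature similarity, which is why noisy features or items that look alike but are rarely co-consumed can produce the false links the abstract describes.
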
Problem

Research questions and friction points this paper is trying to address.

Denoise semantic item graphs in multimodal recommendation
Align item semantics with user behavior patterns
Enhance item representations using denoised graphs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages diffusion model with classifier-free guidance for denoising
Incorporates user behavior to condition graph diffusion process
Uses contrastive augmentation with denoised and original graphs
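
Two of the ideas listed above can be sketched in a few lines. The first function shows the standard classifier-free guidance combination of a conditional and an unconditional denoiser output. The second shows an InfoNCE-style contrastive loss between two views of the same items, for example representations from the denoised graph and from the original graph. Both are generic textbook forms; the guidance weight `w`, the `temperature`, the function names, and the exact loss the paper uses are assumptions, not details taken from the source.

```python
import math


def guided_prediction(cond_pred, uncond_pred, w):
    """Classifier-free guidance: (1 + w) * conditional - w * unconditional.

    With w = 0 this reduces to the purely conditional prediction.
    """
    return [(1.0 + w) * c - w * u for c, u in zip(cond_pred, uncond_pred)]


def info_nce(view_a, view_b, temperature=0.2):
    """Mean InfoNCE loss; row i of view_a and row i of view_b are positives."""

    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    a = [normalize(v) for v in view_a]
    b = [normalize(v) for v in view_b]
    loss = 0.0
    for i, a_i in enumerate(a):
        logits = [sum(x * y for x, y in zip(a_i, b_j)) / temperature for b_j in b]
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_denominator - logits[i]
    return loss / len(a)


# Hypothetical denoiser outputs with and without the behavior condition.
guided = guided_prediction([1.0, 2.0], [0.5, 1.0], w=1.0)

# Hypothetical item representations from the two graph views.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
misaligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(guided, aligned, misaligned)
```

Larger values of `w` weight the behavior condition more heavily at sampling time, and the contrastive loss is lowest when each item's two views agree with each other more than with other items' views.
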
Ziyuan Guo
State Key Laboratory of Integrated Services Networks, Xidian University, Xi’an, Shaanxi 710071, China
Jie Guo
State Key Laboratory of Integrated Services Networks, Xidian University, Xi’an, Shaanxi 710071, China
Zhenghao Chen
Hangzhou Institute of Technology, Xidian University, Hangzhou 311231, China
Bin Song
Xidian University
multimodal data representation and fusion, reinforcement learning, recommendation systems
Fei Richard Yu
School of Information Technology, Carleton University, Canada