Knowledge Soft Integration for Multimodal Recommendation

📅 2023-05-12
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the “curse of knowledge” in multimodal recommendation—where misalignment between modality-specific feature extraction and downstream task objectives leads to isolated features, model bias, and performance degradation—this paper proposes the Knowledge Soft Integration (KSI) framework. KSI innovatively unifies structure injection and semantic soft integration: it employs a refined graph neural network (RGNN) for task-aware structural regularization to inject knowledge into representations; and introduces self-supervised retrieval-guided multimodal feature disentanglement and semantic alignment to enable controllable, progressive fusion. Extensive experiments on three benchmark datasets demonstrate consistent and significant improvements over state-of-the-art methods. Ablation studies confirm that both modules contribute stable performance gains. Quantitative analysis shows a 23% reduction in representation redundancy and a 19% enhancement in semantic discriminability, validating KSI’s effectiveness in harmonizing multimodal knowledge for recommendation.
📝 Abstract
One of the main challenges in modern recommendation systems is how to effectively utilize multimodal content to achieve more personalized recommendations. Despite various proposed solutions, most of them overlook the mismatch between the knowledge gained from independent feature extraction processes and downstream recommendation tasks. Specifically, multimodal feature extraction processes do not incorporate prior knowledge relevant to recommendation tasks, while recommendation tasks often directly use these multimodal features as side information. This mismatch can lead to model fitting biases and performance degradation, which this paper refers to as the extit{curse of knowledge} problem. To address this issue, we propose using knowledge soft integration to balance the utilization of multimodal features and the curse of knowledge problem it brings about. To achieve this, we put forward a Knowledge Soft Integration framework for the multimodal recommendation, abbreviated as KSI, which is composed of the Structure Efficiently Injection (SEI) module and the Semantic Soft Integration (SSI) module. In the SEI module, we model the modality correlation between items using Refined Graph Neural Network (RGNN), and introduce a regularization term to reduce the redundancy of user/item representations. In the SSI module, we design a self-supervised retrieval task to further indirectly integrate the semantic knowledge of multimodal features, and enhance the semantic discrimination of item representations. Extensive experiments on three benchmark datasets demonstrate the superiority of KSI and validate the effectiveness of its two modules.
Problem

Research questions and friction points this paper is trying to address.

Addressing knowledge misalignment between multimodal feature extraction and recommendation tasks
Mitigating biases from multimodal content integration in personalized recommendation systems
Reducing representation redundancy while enhancing semantic distinctiveness in recommendations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge Soft Integration balances multimodal feature utilization
Structure Efficient Injection module minimizes representation redundancy
Semantic Soft Integration enhances item semantic distinctiveness
🔎 Similar Papers
No similar papers found.
K
Kai Ouyang
Tsinghua University, China
C
Chen Tang
MMLab, The Chinese University of Hong Kong, Hong Kong
W
Wenhao Zheng
Microsoft AI Asia, China
X
Xiangjin Xie
Tsinghua University, China
X
Xuanji Xiao
Jian Dong
Jian Dong
Shopee
Computer VisionMachine Learning
Haitao Zheng
Haitao Zheng
Neubauer Professor of Computer Science, University of Chicago
Mobile ComputingSecurity and Privacy
Z
Zhi Wang
Tsinghua University, China