AI Summary
Existing multimodal recommendation systems suffer from ineffective integration of multi-source modality information during negative sampling and inadequate balanced modeling of modality-specific influences, resulting in weak discriminative capability of negative samples and insufficient recommendation diversity. To address these issues, we propose NegGen, a novel framework that pioneers a multimodal large language model (MLLM)-based negative sample generation paradigm. Specifically, NegGen designs three cross-modal prompt templates to elicit fine-grained semantic contrast, incorporates a causal learning module to disentangle critical features from confounding attributes, and enforces cross-modal representation alignment to ensure modality balance. Extensive experiments on multiple real-world datasets demonstrate that NegGen consistently outperforms state-of-the-art methods, achieving superior performance in both recommendation accuracy and negative sample quality. These results validate its effectiveness in enhancing model discriminability and recommendation diversity.
Abstract
Multi-modal recommender systems (MMRS) have gained significant attention due to their ability to leverage information from various modalities to enhance recommendation quality. However, existing negative sampling techniques often struggle to effectively utilize the multi-modal data, leading to suboptimal performance. In this paper, we identify two key challenges in negative sampling for MMRS: (1) producing cohesive negative samples contrasting with positive samples and (2) maintaining a balanced influence across different modalities. To address these challenges, we propose NegGen, a novel framework that utilizes multi-modal large language models (MLLMs) to generate balanced and contrastive negative samples. We design three different prompt templates to enable NegGen to analyze and manipulate item attributes across multiple modalities, and then generate negative samples that introduce better supervision signals and ensure modality balance. Furthermore, NegGen employs a causal learning module to disentangle the effect of intervened key features and irrelevant item attributes, enabling fine-grained learning of user preferences. Extensive experiments on real-world datasets demonstrate the superior performance of NegGen compared to state-of-the-art methods in both negative sampling and multi-modal recommendation.
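To illustrate why the choice of negative sample matters for the supervision signal, here is a minimal sketch using a pairwise ranking loss of the BPR form. This is an assumption for illustration only: the abstract does not state which loss NegGen optimizes, and the function names, embeddings, and scores below are hypothetical.

```python
import math

def bpr_loss(pos_score: float, neg_score: float) -> float:
    """Pairwise ranking loss -log(sigmoid(pos_score - neg_score)).

    Assumed loss form for illustration; not taken from the source.
    """
    diff = pos_score - neg_score
    # Numerically stable evaluation of log(1 + exp(-diff)).
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))

def dot(u, v):
    """Inner product used here as a toy user-item score."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical toy embeddings.
user = [0.5, 1.0]
positive_item = [1.0, 1.0]
easy_negative = [-1.0, -1.0]  # far from the positive item
hard_negative = [0.9, 0.8]    # close to the positive item

easy = bpr_loss(dot(user, positive_item), dot(user, easy_negative))
hard = bpr_loss(dot(user, positive_item), dot(user, hard_negative))
```

Under these assumptions, the negative that closely resembles the positive item yields the larger loss, and therefore the stronger gradient, which is the sense in which contrastive negatives can provide better supervision than trivially dissimilar ones.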