Rethinking Contrastive Learning in Session-based Recommendation

📅 2025-06-05

🤖 AI Summary

This paper targets three weaknesses of contrastive learning in session-based recommendation: (i) a focus on session-level sparsity that neglects item-level sparsity, (ii) semantic inconsistency in data augmentation, and (iii) undifferentiated contribution of positive and negative samples. It proposes MACL, a Multimodal Adaptive Contrastive Learning framework. MACL is the first to jointly model dual-granularity sparsity; it introduces a semantics-aware multimodal augmentation strategy that leverages image and text features to ensure cross-view consistency; and it incorporates an adaptive contrastive loss that dynamically weights the discriminative strength of positive and negative samples. The framework integrates graph neural networks and Transformers for multimodal collaborative representation learning. Extensive experiments on three real-world datasets show that MACL improves Recall@20 over state-of-the-art methods by 3.2%–5.8%, supporting the effectiveness of multimodal semantic enhancement and adaptive discrimination.

📝 Abstract

Session-based recommendation aims to predict the intents of anonymous users from limited behaviors. Because it can alleviate data sparsity, contrastive learning has become prevalent in this task. However, we find that existing contrastive-learning-based methods still suffer from three obstacles: (1) they overlook item-level sparsity and primarily focus on session-level sparsity; (2) they typically augment sessions with item-ID operations such as crop, mask, and reorder, which fail to ensure the semantic consistency of augmented views; (3) they treat all positive-negative signals equally, without considering their varying utility. To this end, we propose a novel multi-modal adaptive contrastive learning framework called MACL for session-based recommendation. In MACL, a multi-modal augmentation is devised to generate semantically consistent views at both the item and session levels by leveraging item multi-modal features. In addition, we present an adaptive contrastive loss that distinguishes the varying contributions of positive-negative signals to improve self-supervised learning. Extensive experiments on three real-world datasets demonstrate the superiority of MACL over state-of-the-art methods.
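
For readers unfamiliar with the ID-based augmentations the abstract criticizes, the following is a minimal sketch of crop, mask, and reorder applied to a session of item IDs. It is not taken from the paper: the function names, the ratios, and the MASK_ID placeholder are assumptions chosen for illustration, and sessions are assumed to be non-empty lists.

```python
import random

MASK_ID = 0  # hypothetical placeholder ID reserved for masked items


def crop(session, ratio=0.6, rng=random):
    """Keep a random contiguous sub-sequence covering `ratio` of the session."""
    length = max(1, int(len(session) * ratio))
    start = rng.randint(0, len(session) - length)
    return session[start:start + length]


def mask(session, ratio=0.3, rng=random):
    """Replace a random subset of positions with MASK_ID."""
    n_mask = int(len(session) * ratio)
    positions = set(rng.sample(range(len(session)), n_mask))
    return [MASK_ID if i in positions else item for i, item in enumerate(session)]


def reorder(session, ratio=0.5, rng=random):
    """Shuffle a random contiguous sub-sequence covering `ratio` of the session."""
    length = max(1, int(len(session) * ratio))
    start = rng.randint(0, len(session) - length)
    window = session[start:start + length]  # slicing copies, so the input is untouched
    rng.shuffle(window)
    return session[:start] + window + session[start + length:]


session = [12, 7, 33, 5, 91]  # toy session of item IDs
rng = random.Random(0)        # seeded generator for repeatable views
views = [crop(session, rng=rng), mask(session, rng=rng), reorder(session, rng=rng)]
```

All three operations act on IDs alone, so nothing prevents a view from dropping or scrambling exactly the items that define the session's intent. That is the semantic-consistency gap the abstract points to.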

Problem

Research questions and friction points this paper is trying to address.

Addresses item-level sparsity in session-based recommendation

Ensures semantic consistency in session augmentation methods

Differentiates utility of positive-negative contrastive signals
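
One way to picture a semantically consistent augmentation is to swap an item for its nearest neighbour in a feature space built from image or text embeddings. The sketch below illustrates that general idea only; it is not MACL's augmentation procedure, and the helper names and toy feature vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_item(item, features):
    """Return the other item whose feature vector is most similar to `item`'s."""
    candidates = [other for other in features if other != item]
    return max(candidates, key=lambda other: cosine(features[item], features[other]))


def substitute(session, features, positions):
    """Swap the items at `positions` for their nearest neighbours in feature space."""
    return [nearest_item(item, features) if i in positions else item
            for i, item in enumerate(session)]


# Toy feature vectors (hypothetical); in practice these would come from
# pretrained image or text encoders.
features = {
    "red_sneaker": [0.9, 0.1, 0.0],
    "blue_sneaker": [0.8, 0.2, 0.1],
    "coffee_mug": [0.0, 0.1, 0.9],
}
view = substitute(["red_sneaker", "coffee_mug"], features, positions={0})
# view == ["blue_sneaker", "coffee_mug"]
```

Because the replacement is chosen by content similarity rather than at random, the augmented session stays close in meaning to the original.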

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-modal augmentation for semantic consistency

Adaptive contrastive loss for signal differentiation

Item and session level sparsity alleviation
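
To make the idea of differentiating positive-negative signals concrete, the sketch below contrasts a standard InfoNCE loss, where every negative counts equally, with a variant that applies a weight to each negative. The weighting scheme (up-weighting negatives that are more similar to the anchor) is an assumption made for illustration; the summary above does not specify MACL's adaptive loss, and the function names and the beta and temperature values are hypothetical.

```python
import math


def info_nce(pos_sim, neg_sims, temperature=0.2):
    """Standard InfoNCE: every negative contributes with the same weight."""
    pos = math.exp(pos_sim / temperature)
    neg = sum(math.exp(s / temperature) for s in neg_sims)
    return -math.log(pos / (pos + neg))


def hardness_weights(neg_sims, beta=1.0):
    """Hypothetical weights: larger for more similar (harder) negatives, mean 1."""
    exps = [math.exp(beta * s) for s in neg_sims]
    mean = sum(exps) / len(exps)
    return [e / mean for e in exps]


def weighted_info_nce(pos_sim, neg_sims, neg_weights, temperature=0.2):
    """InfoNCE variant in which each negative term is scaled by its own weight."""
    pos = math.exp(pos_sim / temperature)
    neg = sum(w * math.exp(s / temperature) for w, s in zip(neg_weights, neg_sims))
    return -math.log(pos / (pos + neg))


pos_sim = 0.8                 # similarity between the two views of one session
neg_sims = [0.7, 0.1, -0.3]   # similarities to other sessions in the batch
uniform = info_nce(pos_sim, neg_sims)
adaptive = weighted_info_nce(pos_sim, neg_sims, hardness_weights(neg_sims))
```

With all weights set to 1 the weighted variant reduces to the standard loss, so the weights are the only place where the two differ.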
