🤖 AI Summary
To address the unreliability of positive sample pairs in sequential recommendation caused by sparse user-item interactions, this paper proposes a semantic retrieval-augmented dual-path contrastive learning framework. Methodologically, it introduces a cross-sequence (user-level) and intra-sequence (item-level) retrieval mechanism built on large language model (LLM)-based semantic encoding: semantic user retrieval combined with a learnable sample synthesis method generates high-confidence user-level positives, while semantic-driven item substitution constructs robust intra-sequence positives; both are jointly optimized via contrastive losses. Unlike prior approaches, the framework avoids reliance on sparse collaborative signals or random perturbations, and it is plug-and-play with standard sequential recommenders. Extensive experiments on four public benchmarks demonstrate significant improvements in Recall@20 and NDCG@20, validating the effectiveness of semantic-guided positive sample construction for preference modeling.
📝 Abstract
Sequential recommendation aims to model user preferences based on historical behavior sequences, which is crucial for various online platforms. Data sparsity remains a significant challenge in this area, as most users have limited interactions and many items receive little attention. To mitigate this issue, contrastive learning has been widely adopted. By constructing positive sample pairs from the data itself and maximizing their agreement in the embedding space, it can leverage available data more effectively. Constructing reasonable positive sample pairs is crucial for the success of contrastive learning. However, current approaches struggle to generate reliable positive pairs, as they either rely on representations learned from inherently sparse collaborative signals or apply random perturbations, which introduce significant uncertainty. To address these limitations, we propose a novel approach named Semantic Retrieval Augmented Contrastive Learning (SRA-CL), which leverages semantic information to improve the reliability of contrastive samples. SRA-CL comprises two main components: (1) Cross-Sequence Contrastive Learning via User Semantic Retrieval, which utilizes large language models (LLMs) to understand diverse user preferences and retrieve semantically similar users to form reliable positive samples through a learnable sample synthesis method; and (2) Intra-Sequence Contrastive Learning via Item Semantic Retrieval, which employs LLMs to comprehend items and retrieve similar items to perform semantic-based item substitution, thereby creating semantically consistent augmented views for contrastive learning. SRA-CL is plug-and-play and can be integrated into standard sequential recommendation models. Extensive experiments on four public datasets demonstrate the effectiveness and generalizability of the proposed approach.
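To make the intra-sequence idea concrete, the sketch below illustrates semantic-based item substitution followed by a standard InfoNCE-style contrastive loss. This is a minimal, hypothetical illustration, not the paper's implementation: the random vectors stand in for LLM-derived item embeddings, and all function names and hyperparameters (`retrieve_similar`, `sub_ratio`, `temperature`) are assumptions for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for LLM-based semantic item embeddings:
# in SRA-CL these would come from encoding item text with an LLM.
num_items, dim = 20, 8
item_emb = rng.normal(size=(num_items, dim))
item_emb /= np.linalg.norm(item_emb, axis=1, keepdims=True)

def retrieve_similar(item, k=3):
    """Retrieve the k most semantically similar items by cosine similarity."""
    sims = item_emb @ item_emb[item]
    sims[item] = -np.inf  # exclude the item itself
    return np.argsort(sims)[-k:][::-1]

def semantic_substitution(seq, sub_ratio=0.3):
    """Intra-sequence augmentation: replace a fraction of the items in a
    behavior sequence with semantically similar items, producing an
    augmented view that stays consistent with the user's preferences."""
    seq = list(seq)
    n_sub = max(1, int(len(seq) * sub_ratio))
    for pos in rng.choice(len(seq), size=n_sub, replace=False):
        seq[pos] = int(rng.choice(retrieve_similar(seq[pos])))
    return seq

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE contrastive loss for one (anchor, positive) pair."""
    logits = np.array([anchor @ positive] + [anchor @ n for n in negatives])
    logits = logits / temperature
    logits -= logits.max()  # numerical stability
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())

seq = [2, 7, 11, 4, 9]
aug = semantic_substitution(seq)  # augmented view of the same sequence
```

In the full framework, the original sequence and its augmented view would be encoded by the backbone sequential recommender, and their representations treated as a positive pair in the contrastive objective; the substitution step shown here is what keeps that pair semantically consistent rather than a random perturbation.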