🤖 AI Summary
This work addresses the limitations of existing sequential recommendation methods, which rely on sparse co-purchase statistics, often conflate spurious co-occurrences with genuine complementary relationships, and fail to capture fine-grained item semantics. To overcome these issues, the authors propose CAST, a framework that models dynamic transitions at the semantic level. CAST preserves fine-grained information through discrete semantic encoding, directly models semantic evolution between items via a dedicated semantic transition module, and injects a complementary prior verified by large language models into the attention mechanism so that it favors complementary patterns over mere co-occurrences. Experiments on multiple e-commerce datasets show that CAST consistently outperforms state-of-the-art methods, achieving gains of up to 17.6% in Recall and 16.0% in NDCG while accelerating training by 65×.
📝 Abstract
Sequential Recommendation (SR) aims to predict a user's next interaction from their behavior sequence, where complementary relations often provide essential signals for predicting the next item. However, mainstream models that rely on sparse co-purchase statistics often mistake spurious correlations (e.g., those due to popularity bias) for true complementary relations. Identifying true complementary relations requires capturing fine-grained item semantics (e.g., specifications) that simple co-occurrence statistics cannot model. While recent semantics-based methods use discrete semantic codes to represent items, they typically aggregate these codes into coarse item representations. This aggregation blurs the specific semantic details required to identify complementarity. To address these limitations and leverage semantics for capturing reliable complementary relations, we propose a Complementary-Aware Semantic Transition (CAST) framework that introduces a new modeling paradigm built upon semantic-level transitions. Specifically, a semantic-level transition module models dynamic transitions directly in the discrete semantic code space, capturing fine-grained semantic dependencies that are often lost in aggregated item representations. A complementary prior injection module then incorporates LLM-verified complementary priors into the attention mechanism, thereby prioritizing complementary patterns over co-occurrence statistics. Experiments on multiple e-commerce datasets demonstrate that CAST consistently outperforms state-of-the-art approaches, achieving gains of up to 17.6% in Recall and 16.0% in NDCG with a 65× training speedup. These results validate its effectiveness and efficiency in uncovering latent item complementarity beyond co-occurrence statistics. The code will be released upon acceptance.
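To make the idea of injecting a complementary prior into attention more concrete, here is a minimal sketch in plain Python. The abstract does not say how the prior enters the attention computation, so the additive bias on the attention logits used here is an assumption, not the authors' formulation. The names `prior_biased_attention`, `scores`, `prior`, and `weight` are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def prior_biased_attention(scores, prior, weight=1.0):
    """Attention weights for one query with an additive prior bias.

    scores: raw attention logits of the query against each key position.
    prior:  hypothetical complementarity scores for the same positions
            (e.g. 1.0 if a pair was judged complementary, else 0.0).
    weight: assumed scalar controlling how strongly the prior shifts attention.
    """
    if len(scores) != len(prior):
        raise ValueError("scores and prior must have the same length")
    # Assumption: the prior is added to the logits before normalization.
    biased = [s + weight * p for s, p in zip(scores, prior)]
    return softmax(biased)
```

With equal logits, a position flagged by the prior receives more attention mass than the others, and setting `weight=0.0` recovers ordinary softmax attention. This illustrates, under the stated assumption, how a prior could favor complementary patterns over raw co-occurrence signals.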