🤖 AI Summary
Existing RNN- and Transformer-based models struggle to balance computational efficiency with effective long-range dependency capture when modeling the long user behavior sequences common on short-video platforms. To address this challenge, this paper proposes SSD4Rec, the first efficient sequential recommendation model built on Structured State Space Duality (SSD). Adapting the Mamba architecture, SSD4Rec introduces a sequence register mechanism and bidirectional SSD blocks, enabling real-time modeling of arbitrarily long sequences while preserving hardware-friendly linear time complexity O(L). Extensive experiments on four benchmark datasets demonstrate consistent superiority over state-of-the-art methods, and SSD4Rec achieves significantly faster inference than Transformer- and RNN-based baselines. By unifying theoretical rigor with engineering practicality, SSD4Rec establishes a novel paradigm for long-sequence recommendation.
📝 Abstract
Sequential recommendation methods are crucial in modern recommender systems for their remarkable capability to understand a user's changing interests from past interactions. However, a significant challenge for current methods (e.g., RNN- or Transformer-based models) is to effectively and efficiently capture user preferences by modeling long behavior sequences, which impedes their application in scenarios such as short-video platforms, where user interactions are numerous. Recently, an emerging architecture named Mamba, built on state space models (SSMs) with efficient hardware-aware designs, has showcased tremendous potential for sequence modeling, presenting a compelling avenue for addressing this challenge. Inspired by this, we propose a novel, generic, and efficient sequential recommendation backbone, SSD4Rec, which explores the seamless adaptation of Mamba to sequential recommendation. Specifically, SSD4Rec marks variable- and long-length item sequences with sequence registers and processes the item representations with bidirectional Structured State Space Duality (SSD) blocks. This not only allows for hardware-aware matrix multiplication but also empowers outstanding capabilities in variable-length and long-range sequence modeling. Extensive evaluations on four benchmark datasets demonstrate that the proposed model achieves state-of-the-art performance while maintaining near-linear scalability with user sequence length. Our code is publicly available at https://github.com/ZhangYifeng1995/SSD4Rec.
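The core mechanism the abstract describes, a state-space recurrence that processes a length-L sequence in O(L) time and is applied in both directions, can be sketched as follows. This is a minimal hypothetical illustration, not the authors' implementation: the diagonal parameters `A`, `B`, `C`, the scalar input channel, and the additive forward/backward combination are all simplifying assumptions made here for clarity.

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Diagonal state-space scan: h_t = A * h_{t-1} + B * x_t, y_t = C . h_t.

    A single pass over the sequence, so runtime is linear in len(x).
    x: (L,) input sequence; A, B, C: (n,) diagonal SSM parameters (assumed shapes).
    """
    h = np.zeros_like(A)          # hidden state, one value per state dimension
    y = np.empty_like(x)
    for t, xt in enumerate(x):
        h = A * h + B * xt        # elementwise recurrence (diagonal A)
        y[t] = np.dot(C, h)       # read out the state
    return y

def bidirectional_ssm(x, A, B, C):
    """Combine a forward scan with a scan over the reversed sequence."""
    fwd = ssm_scan(x, A, B, C)
    bwd = ssm_scan(x[::-1], A, B, C)[::-1]
    return fwd + bwd              # additive fusion (an assumption of this sketch)

# Tiny usage example with a 1-dimensional state
x = np.array([1.0, 0.0, 0.0])
A, B, C = np.array([0.5]), np.array([1.0]), np.array([1.0])
print(ssm_scan(x, A, B, C))            # impulse response decays as 1, 0.5, 0.25
print(bidirectional_ssm(x, A, B, C))
```

The scan form is what gives SSM-style blocks their linear scaling with sequence length; the SSD formulation additionally admits an equivalent matrix-multiplication form that maps well onto modern hardware, which this pure-Python loop does not attempt to show.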