🤖 AI Summary
Traditional sequential recommendation methods rely mainly on self-attention for cross-behavior modeling, which limits their ability to model how users' interests evolve along other dimensions. To address this, we propose Pyramid Mixer, a pyramid-style MLP-Mixer architecture that jointly models dynamic user interests across three dimensions: behaviors, features, and time periods. Stacked mixer layers perform cross-behavior sequence mixing and cross-feature embedding interaction, and the pyramid arrangement of those layers captures temporal interests across periods. In an online A/B test, the model yields a +0.106% increase in user stay duration and a +0.0113% increase in user active days. The model has been deployed on an industrial platform, demonstrating its scalability and real-world impact. To the best of our knowledge, this is the first work to introduce a pyramid-style Mixer architecture into sequential recommendation for joint multidimensional interest modeling.
📝 Abstract
Sequential recommendation, a critical task in recommendation systems, predicts a user's next action from their historical behaviors. Conventional studies mainly focus on cross-behavior modeling with self-attention-based methods and neglect user interest modeling along other dimensions. In this study, we propose a novel sequential recommendation model, Pyramid Mixer, which leverages the MLP-Mixer architecture to model user interests efficiently and comprehensively. Our method learns user interests via cross-behavior and cross-feature modeling of the user sequence. The mixer layers are stacked in a pyramid structure to learn users' temporal interests across periods. Extensive offline and online experiments demonstrate the effectiveness and efficiency of our method: in the online A/B test, we obtain a +0.106% improvement in user stay duration and a +0.0113% increase in user active days. Pyramid Mixer has been successfully deployed on the industrial platform, demonstrating its scalability and impact in real-world applications.
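To make the three mixing dimensions concrete, the following is a minimal NumPy sketch of one way such a model could be arranged. It is not the authors' implementation: the abstract does not specify how the pyramid is built, so halving the sequence by averaging adjacent behaviors (`pool_pairs`) is an assumption, as are all function names, the hidden size, the random untrained weights, and the choice to concatenate one mean-pooled summary per level.

```python
import numpy as np


def gelu(x):
    # Tanh approximation of the GELU activation.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def layer_norm(x, eps=1e-6):
    # Normalize over the last (feature) axis; no learned scale or shift.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def init_mlp(rng, dim, hidden):
    # Randomly initialized two-layer MLP mapping dim -> hidden -> dim.
    return {
        "w1": rng.normal(0.0, 0.02, (dim, hidden)),
        "b1": np.zeros(hidden),
        "w2": rng.normal(0.0, 0.02, (hidden, dim)),
        "b2": np.zeros(dim),
    }


def mlp(p, x):
    return gelu(x @ p["w1"] + p["b1"]) @ p["w2"] + p["b2"]


def mixer_layer(params, x):
    # x has shape (seq_len, dim): one embedding per historical behavior.
    # Cross-behavior mixing: the MLP acts along the sequence axis.
    x = x + mlp(params["behavior"], layer_norm(x).T).T
    # Cross-feature mixing: the MLP acts along the embedding axis.
    x = x + mlp(params["feature"], layer_norm(x))
    return x


def pool_pairs(x):
    # ASSUMPTION: coarsen the time axis by averaging adjacent behaviors.
    seq_len, dim = x.shape
    return x.reshape(seq_len // 2, 2, dim).mean(axis=1)


def pyramid_mixer(x, num_levels=3, hidden=32, seed=0):
    # Each level mixes at the current temporal resolution, emits a
    # summary vector, then halves the sequence for the next level.
    assert x.shape[0] % (2 ** num_levels) == 0, "seq_len must divide evenly"
    rng = np.random.default_rng(seed)
    summaries = []
    for _ in range(num_levels):
        seq_len, dim = x.shape
        params = {
            "behavior": init_mlp(rng, seq_len, hidden),
            "feature": init_mlp(rng, dim, hidden),
        }
        x = mixer_layer(params, x)
        summaries.append(x.mean(axis=0))
        x = pool_pairs(x)
    # Concatenate per-level summaries into one user-interest vector.
    return np.concatenate(summaries)


if __name__ == "__main__":
    behaviors = np.random.default_rng(1).normal(size=(16, 8))
    user_vector = pyramid_mixer(behaviors, num_levels=3)
    print(user_vector.shape)
```

In this sketch, a sequence of 16 behaviors with 8-dimensional embeddings passes through three levels (16, 8, and 4 positions), so the output concatenates three 8-dimensional summaries. Unlike self-attention, each mixing MLP has weights tied to a fixed sequence length, which is why every level gets its own behavior-mixing MLP.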