Pyramid Mixer: Multi-dimensional Multi-period Interest Modeling for Sequential Recommendation

📅 2025-06-20
🤖 AI Summary
Conventional sequential recommendation methods rely mainly on self-attention for cross-behavior modeling, which limits how fully they capture the evolution of users' multidimensional interests. To address this, we propose Pyramid Mixer, a pyramid-style MLP-Mixer architecture that jointly models dynamic user interests along three dimensions: behaviors, feature attributes, and temporal periods. The mixer layers perform cross-behavior sequence modeling and cross-feature embedding interaction, and they are stacked hierarchically to model interests at multiple temporal granularities. In an online A/B test, the model yields a +0.106% increase in user stay duration and a +0.0113% increase in user active days. The model has been deployed on an industrial platform. To the best of our knowledge, this is the first work to introduce a pyramid-style Mixer architecture into sequential recommendation.

📝 Abstract
Sequential recommendation, a critical task in recommendation systems, predicts the next user action from the user's historical behaviors. Conventional studies mainly focus on cross-behavior modeling with self-attention-based methods and neglect comprehensive modeling of user interests along other dimensions. In this study, we propose a novel sequential recommendation model, Pyramid Mixer, which leverages the MLP-Mixer architecture to model user interests efficiently and completely. Our method learns comprehensive user interests via cross-behavior and cross-feature user sequence modeling. The mixer layers are stacked in a pyramid fashion to learn users' temporal interests across periods. Through extensive offline and online experiments, we demonstrate the effectiveness and efficiency of our method, and we obtain a +0.106% improvement in user stay duration and a +0.0113% increase in user active days in the online A/B test. Pyramid Mixer has been successfully deployed on the industrial platform, demonstrating its scalability and impact in real-world applications.
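To make the cross-behavior and cross-feature mixing concrete, here is a minimal NumPy sketch of one generic MLP-Mixer layer applied to a user's behavior sequence. It follows the standard MLP-Mixer recipe (a token-mixing MLP followed by a channel-mixing MLP, each with a residual connection) and is not taken from the paper; the function names, the hidden size, the GELU activation, and the random weights are all assumptions for illustration.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Normalize along the last axis to zero mean and unit variance.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def gelu(x):
    # tanh approximation of the GELU activation.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def mlp(x, w1, w2):
    # Two-layer perceptron applied along the last axis of x.
    return gelu(x @ w1) @ w2

def init_params(seq_len, dim, hidden, rng):
    # Hypothetical random initialization; a real model would learn these weights.
    s = 0.02
    return {
        "tok_w1": rng.normal(0, s, (seq_len, hidden)),
        "tok_w2": rng.normal(0, s, (hidden, seq_len)),
        "ch_w1": rng.normal(0, s, (dim, hidden)),
        "ch_w2": rng.normal(0, s, (hidden, dim)),
    }

def mixer_layer(x, params):
    """x has shape (seq_len, dim): one embedding per behavior in the sequence."""
    # Cross-behavior (token) mixing: transpose so the MLP acts along seq_len.
    y = layer_norm(x).T                                   # (dim, seq_len)
    x = x + mlp(y, params["tok_w1"], params["tok_w2"]).T  # back to (seq_len, dim)
    # Cross-feature (channel) mixing: the MLP acts along the embedding dimension.
    x = x + mlp(layer_norm(x), params["ch_w1"], params["ch_w2"])
    return x

rng = np.random.default_rng(0)
seq_len, dim = 8, 4
x = rng.normal(size=(seq_len, dim))
out = mixer_layer(x, init_params(seq_len, dim, hidden=16, rng=rng))
print(out.shape)
```

The layer preserves the input shape, so layers of this kind can be stacked. Because the token-mixing weights are tied to `seq_len`, a stack that changes the sequence length between levels needs separate weights per level.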
Problem

Research questions and friction points this paper is trying to address.

Model multi-dimensional user interests for sequential recommendation
Improve cross-behavior and cross-feature user sequence modeling
Enhance cross-period temporal interest learning efficiently
Innovation

Methods, ideas, or system contributions that make the work stand out.

MLP-Mixer architecture for user interest modeling
Pyramid-stacked layers for temporal interest learning
Cross-behavior and cross-feature sequence modeling
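
The summary does not say how the pyramid changes the sequence between levels, so the sketch below shows only one plausible reading: each level mixes across positions, records a summary vector, and then averages adjacent positions so the next level sees half as many, coarser periods. The pairwise average pooling, the linear residual mixing, and the names `token_mix`, `halve`, and `pyramid` are all assumptions, not the paper's design.

```python
import numpy as np

def token_mix(x, w):
    # Residual linear mixing across sequence positions; w is (seq_len, seq_len).
    return x + w @ x

def halve(x):
    # Average adjacent pairs of positions: T steps become T // 2 coarser periods.
    t, d = x.shape
    return x[: t - t % 2].reshape(t // 2, 2, d).mean(axis=1)

def pyramid(x, rng, levels=3):
    """Return one pooled interest vector per level, ordered from fine to coarse."""
    out = []
    for _ in range(levels):
        t = x.shape[0]
        w = rng.normal(0, 0.02, (t, t))  # hypothetical random weights, one set per level
        x = token_mix(x, w)
        out.append(x.mean(axis=0))       # (dim,) summary at this granularity
        x = halve(x)                     # next level sees a shorter sequence
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 4))             # 16 behaviors, 4-dimensional embeddings
summaries = pyramid(x, rng)
print([s.shape for s in summaries])
```

With 16 input positions and three levels, the levels operate on sequences of length 16, 8, and 4, which gives the stack its pyramid shape.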
Authors

Zhen Gong
Bytedance
Recommender System, Computational Advertising
Zhifang Fan
Alibaba
Natural Language Processing, Information Retrieval, Recommender System
Hui Lu
Department of Computer Science and Engineering (CSE), the University of Texas at Arlington (UTA)
Cloud Computing, Virtualization, File and Storage Systems, Computer Networks, Computer Systems
Qiwei Chen
Bytedance, Shanghai, China
Chenbin Zhang
Unknown affiliation
Lin Guan
Bytedance, Beijing, China
Yuchao Zheng
Bytedance, Hangzhou, China
Feng Zhang
Bytedance, Shanghai, China
Xiao Yang
Bytedance, Beijing, China
Zuotao Liu
Bytedance, Singapore, Singapore