UniMixer: A Unified Architecture for Scaling Laws in Recommendation Systems

📅 2026-04-01
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the scalability limitations of existing recommender systems, which stem from architectural fragmentation and the absence of a unified framework among the dominant approaches (attention mechanisms, TokenMixer, and factorization machines). To bridge this gap, we propose UniMixer, a novel parameterized feature-mixing module that reformulates the rule-based TokenMixer into a learnable, end-to-end optimizable component. UniMixer unifies the three major scalability paradigms for the first time and removes the rigid coupling between the number of heads and the number of tokens inherent in conventional TokenMixer designs. We further introduce UniMixing-Lite, a lightweight variant that enhances computational efficiency. Extensive experiments demonstrate that UniMixer substantially outperforms state-of-the-art methods in recommendation accuracy while significantly reducing both model parameters and computational overhead.
📝 Abstract
In recent years, the scaling laws of recommendation models, which govern the relationship between performance and the parameters/FLOPs of recommenders, have attracted increasing attention. Currently, there are three mainstream architectures for achieving scaling in recommendation models, namely attention-based, TokenMixer-based, and factorization-machine-based methods, which exhibit fundamental differences in both design philosophy and architectural structure. In this paper, we propose a unified scaling architecture for recommendation systems, namely **UniMixer**, to improve scaling efficiency and establish a unified theoretical framework covering the mainstream scaling blocks. By transforming the rule-based TokenMixer into an equivalent parameterized structure, we construct a generalized parameterized feature-mixing module that allows the token mixing patterns to be optimized and learned during model training. Meanwhile, the generalized parameterized token mixing removes the constraint in TokenMixer that the number of heads must equal the number of tokens. Furthermore, we establish a unified scaling-module design framework for recommender systems, which bridges the connections among attention-based, TokenMixer-based, and factorization-machine-based methods. To further boost scaling ROI, a lightweight UniMixing module, **UniMixing-Lite**, is designed, which further compresses model parameters and computational cost while significantly improving model performance. Extensive offline and online experiments verify the superior scaling abilities of **UniMixer**.
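The abstract's core idea is replacing a fixed, rule-based token-mixing pattern with a learnable one, so that the number of heads is decoupled from the number of tokens. Under the assumption that this operation resembles a per-head learnable mixing matrix applied across the token axis (all names, shapes, and the initialization here are illustrative, not taken from the paper), a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative shapes: note heads != tokens, which a rule-based TokenMixer
# with a head-per-token coupling would not allow.
batch, tokens, heads, head_dim = 2, 6, 3, 4

x = rng.standard_normal((batch, tokens, heads, head_dim))

# One learnable (tokens x tokens) mixing matrix per head, initialized near
# the identity (a rule-based "pass-through" pattern) and then trained
# end-to-end along with the rest of the model.
mix = np.stack([np.eye(tokens) for _ in range(heads)])
mix += 0.01 * rng.standard_normal(mix.shape)

def unimix(x, mix):
    """Mix tokens per head: out[b,t,h,d] = sum_s mix[h,t,s] * x[b,s,h,d]."""
    return np.einsum("hts,bshd->bthd", mix, x)

out = unimix(x, mix)
assert out.shape == x.shape  # token mixing preserves the sequence shape
```

Because the mixing pattern is an ordinary parameter tensor rather than a hard-coded rule, gradient descent can discover which tokens should attend to which, which is the "learnable, end-to-end optimizable" property the summary highlights.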
Problem

Research questions and friction points this paper is trying to address.

scaling laws
recommendation systems
unified architecture
model scaling
feature mixing
Innovation

Methods, ideas, or system contributions that make the work stand out.

UniMixer
scaling laws
parameterized token mixing
unified architecture
recommendation systems
👥 Authors

Mingming Ha (Ant Group)
Guanchen Wang (Kuaishou Technology, Beijing, China)
Linxun Chen (Kuaishou Technology, Beijing, China)
Xuan Rao (Kuaishou Technology, Beijing, China)
Yuexin Shi (Kuaishou Technology, Beijing, China)
Tianbao Ma (Kuaishou Technology, Beijing, China)
Zhaojie Liu (Kuaishou Technology, Beijing, China)
Yunqian Fan (Kuaishou Technology, Beijing, China)
Zilong Lu (Kuaishou Technology, Beijing, China)
Yanan Niu (unknown affiliation)
Han Li (Kuaishou Technology, Beijing, China)
Kun Gai (Alibaba Group)