Kunlun: Establishing Scaling Laws for Massive-Scale Recommendation Systems through Unified Architecture Design

📅 2026-02-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the absence of predictable scaling laws in current recommender systems, which the authors attribute primarily to computational inefficiency and suboptimal resource allocation, particularly when processing user history and contextual features. To overcome these limitations, they propose a unified architecture that combines efficient low-level modules, namely Generalized Dot-Product Attention (GDPA), Hierarchical Seed Pooling (HSP), and sliding-window attention, with high-level strategies such as Computation Skip and event-level personalization. This co-design substantially improves FLOPs utilization and scaling efficiency: on NVIDIA B200 GPUs, Model FLOPs Utilization rises from 17% to 37%, and scaling efficiency doubles relative to state-of-the-art methods. The proposed techniques have been deployed in Meta's core advertising models, yielding substantial production gains.

📝 Abstract
Deriving predictable scaling laws that govern the relationship between model performance and computational investment is crucial for designing and allocating resources in massive-scale recommendation systems. While such laws are established for large language models, they remain elusive for recommendation systems, especially those processing both user history and context features. We identify poor scaling efficiency as the main barrier to predictable power-law scaling, stemming from inefficient modules with low Model FLOPs Utilization (MFU) and suboptimal resource allocation. We introduce Kunlun, a scalable architecture that systematically improves model efficiency and resource allocation. Our low-level optimizations include Generalized Dot-Product Attention (GDPA), Hierarchical Seed Pooling (HSP), and Sliding Window Attention. Our high-level innovations feature Computation Skip (CompSkip) and Event-level Personalization. These advances increase MFU from 17% to 37% on NVIDIA B200 GPUs and double scaling efficiency over state-of-the-art methods. Kunlun is now deployed in major Meta Ads models, delivering significant production impact.
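The paper itself is not reproduced here, but one of the low-level modules it names, sliding-window attention, is a well-known technique that can be sketched as a banded attention mask: each query attends only to keys within a fixed window, reducing cost from quadratic to linear in sequence length. The shapes, window size, and function name below are illustrative assumptions, not Kunlun's actual configuration.

```python
import numpy as np

def sliding_window_attention(q, k, v, window):
    """Single-head attention restricted to a +/- `window` band.
    q, k, v: arrays of shape (T, d). Illustrative sketch only;
    not the paper's implementation."""
    T, d = q.shape
    scores = q @ k.T / np.sqrt(d)                        # (T, T) logits
    idx = np.arange(T)
    mask = np.abs(idx[:, None] - idx[None, :]) > window  # out-of-band pairs
    scores[mask] = -np.inf                               # block them
    # numerically stable softmax over each row
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
q = k = v = rng.standard_normal((8, 4))
out = sliding_window_attention(q, k, v, window=2)
print(out.shape)  # (8, 4)
```

With `window=2`, each of the 8 positions attends to at most 5 neighbors instead of all 8, which is the source of the compute savings the abstract attributes to this module.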
Problem

Research questions and friction points this paper is trying to address.

scaling laws
recommendation systems
computational investment
model performance
resource allocation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Scaling Laws
Model FLOPs Utilization
Unified Architecture
Computation Skip
Event-level Personalization
Authors

Bojian Hou
Meta
Machine Learning, Artificial Intelligence, Trustworthy (Gen)AI, Large Language Model, HealthTech

Xiaolong Liu
Meta
Recommender Systems, Personalization, Agent

Xiaoyi Liu
Research Scientist, Meta AI
Deep Learning, Machine Learning, Optimization Design

Jiaqi Xu
Meta Platforms, Inc., Menlo Park, CA, USA

Yasmine Badr
Student, Electrical Engineering Department, UCLA
CAD for VLSI, Electronic Design Automation

Mengyue Hang
Meta Platforms, Inc., Menlo Park, CA, USA

Sudhanshu Chanpuriya
Meta Platforms, Inc., Menlo Park, CA, USA

Junqing Zhou
Meta Platforms, Inc., Menlo Park, CA, USA

Yuhang Yang
Meta Platforms, Inc., Menlo Park, CA, USA

Han Xu
Meta Platforms, Inc., Menlo Park, CA, USA

Qiuling Suo
Meta Platforms, Inc., Menlo Park, CA, USA

Laming Chen
Facebook
Recommender System, Optimization, Compressive Sensing

Yuxi Hu
Graz University of Technology
Computer Vision, 3D Reconstruction

Jiasheng Zhang
Meta Platforms, Inc., Menlo Park, CA, USA

Huaqing Xiong
PhD, The Ohio State University
Reinforcement Learning, Nonconvex Optimization

Yuzhen Huang
Meta / The Chinese University of Hong Kong
Large-scale Machine Learning System, Network Analysis, Cluster Scheduling

Chao Chen
Meta Platforms, Inc., Menlo Park, CA, USA

Yue Dong
University of California, Riverside
Artificial Intelligence, Natural Language Processing, Machine Learning, LLM Security

Yi Yang
Meta
Natural Language Processing, Machine Learning, Artificial Intelligence

Shuo Chang
Meta Platforms, Inc., Menlo Park, CA, USA

Xiaorui Gan
Meta Platforms, Inc., Menlo Park, CA, USA

Wenlin Chen
Meta Platforms
Machine Learning, Data Mining, Artificial Intelligence

Santanu Kolay
Meta Platforms, Inc., Menlo Park, CA, USA

Darren Liu
Meta Platforms, Inc., Menlo Park, CA, USA

Jade Nie
OpenAI, San Francisco, CA, USA