External Large Foundation Model: How to Efficiently Serve Trillions of Parameters for Online Ads Recommendation

📅 2025-02-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Deploying trillion-parameter foundation models for online ads recommendation faces critical challenges: strict low-latency inference requirements and dynamically shifting data distributions in industrial settings. Method: This paper proposes ExFM, an external distillation framework featuring (1) a teacher foundation model that is built once and reused to transfer knowledge across data distributions and vertical tasks; (2) co-designed Auxiliary Heads and Student Adapters that bridge the distribution gap between the teacher and its student models; and (3) a collaborative architecture integrating a Data Augmentation System (DAS) with streaming, self-adaptive modeling across the foundation and vertical models. Results: Evaluated on both industrial and public benchmarks, ExFM significantly improves recommendation accuracy while keeping inference latency under control and reducing training overhead by over 40%. It establishes a paradigm for efficiently deploying ultra-large-scale models in real-time recommendation systems.
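
To make the summary above concrete, here is a minimal PyTorch-style sketch of external distillation with an auxiliary head, assuming a binary click-prediction task. All names (FoundationTeacher, VerticalStudent, distill_step, alpha) are illustrative assumptions rather than the paper's actual implementation: the large teacher runs outside the serving path, its predictions become frozen targets, and the small vertical model learns from both the ground-truth label and the teacher signal through an auxiliary head that is dropped at inference.

```python
# Hedged sketch only: class/function names and the loss weighting are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FoundationTeacher(nn.Module):
    """Stand-in for the large teacher FM; run offline/asynchronously, never at serving time."""
    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 1024), nn.ReLU(), nn.Linear(1024, 1))

    def forward(self, x):
        return self.net(x).squeeze(-1)  # logit for, e.g., click probability

class VerticalStudent(nn.Module):
    """Small vertical model: a main head that is served, plus an auxiliary head
    that absorbs the teacher signal during training only."""
    def __init__(self, dim: int):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(dim, 128), nn.ReLU())
        self.main_head = nn.Linear(128, 1)
        self.aux_head = nn.Linear(128, 1)  # discarded at inference

    def forward(self, x):
        h = self.backbone(x)
        return self.main_head(h).squeeze(-1), self.aux_head(h).squeeze(-1)

def distill_step(student, x, y, teacher_logit, alpha=0.5):
    """One training step: main head fits the ground-truth label, auxiliary head
    fits the pre-logged teacher prediction."""
    main_logit, aux_logit = student(x)
    loss_label = F.binary_cross_entropy_with_logits(main_logit, y)
    loss_teacher = F.binary_cross_entropy_with_logits(aux_logit, torch.sigmoid(teacher_logit))
    return loss_label + alpha * loss_teacher

if __name__ == "__main__":
    torch.manual_seed(0)
    teacher, student = FoundationTeacher(64), VerticalStudent(64)
    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    x = torch.randn(32, 64)
    y = torch.randint(0, 2, (32,)).float()
    with torch.no_grad():          # teacher is frozen/external; in practice its
        t_logit = teacher(x)       # predictions would be read from a DAS log
    loss = distill_step(student, x, y, t_logit)
    opt.zero_grad(); loss.backward(); opt.step()
    print(f"distillation loss: {loss.item():.4f}")
```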

📝 Abstract
Ads recommendation is a prominent service of online advertising systems and has been actively studied. Recent studies indicate that scaling up and advancing the design of the recommendation model can bring significant performance improvement. However, as model scale grows, such prior studies drift further from industry practice because they often neglect two fundamental challenges of industrial-scale applications. First, training and inference budgets for the served model are restricted; exceeding them incurs latency and impairs user experience. Second, large-volume data arrive in a streaming mode with dynamically shifting distributions, as new users/ads join and existing users/ads leave the system. We propose the External Large Foundation Model (ExFM) framework to address these overlooked challenges. Specifically, we develop external distillation and a data augmentation system (DAS) to control the computational cost of training and inference while maintaining high performance. We design the teacher as a foundation model (FM) that can serve multiple students as vertical models (VMs), amortizing its building cost. We propose an Auxiliary Head and a Student Adapter to mitigate the data distribution gap between the FM and the VMs caused by streaming data. Comprehensive experiments on internal industrial-scale applications and public datasets demonstrate significant performance gains from ExFM.
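
The abstract attributes the FM-to-VM distribution-gap fix to an Auxiliary Head and a Student Adapter. As a hedged illustration of the adapter idea only (the paper's actual design is not specified here), a tiny learnable transform can recalibrate the frozen teacher's logits to the student's fresh data stream before they are used as auxiliary targets; the class and parameter names below are assumptions.

```python
# Hedged sketch of a "Student Adapter"-style recalibration; names are illustrative.
import torch
import torch.nn as nn

class StudentAdapter(nn.Module):
    """Lightweight per-student transform applied to frozen teacher logits."""
    def __init__(self, hidden: int = 8):
        super().__init__()
        # A small MLP on the teacher logit; it could be as simple as an
        # affine (temperature + bias) recalibration.
        self.net = nn.Sequential(nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, teacher_logit: torch.Tensor) -> torch.Tensor:
        # teacher_logit: (batch,) -> adapted logit: (batch,)
        return self.net(teacher_logit.unsqueeze(-1)).squeeze(-1)

if __name__ == "__main__":
    adapter = StudentAdapter()
    raw = torch.randn(4)     # logged teacher logits from an older data distribution
    print(adapter(raw))      # recalibrated auxiliary targets for the student
```

The adapter's few parameters would be updated jointly with the student on the fresh stream, so the stale teacher signal keeps tracking the current traffic.
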
Problem

Research questions and friction points this paper is trying to address.

Efficiently serve trillion-parameter models
Control training and inference budgets
Mitigate data distribution shifts
Innovation

Methods, ideas, or system contributions that make the work stand out.

External distillation reduces training and inference cost
Data Augmentation System (DAS) maintains high performance while amortizing one teacher across many vertical models (see the sketch after this list)
Auxiliary Head and Student Adapter mitigate the FM-to-VM data distribution gap
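
Below is a toy sketch, under assumed names (ToyDAS, teacher_score), of how a data augmentation system could amortize the foundation model's cost: the teacher scores each logged example once, and every vertical model reads the cached score instead of re-running the trillion-parameter model. This illustrates the reuse idea only, not the paper's production system.

```python
# Toy illustration of teacher-score reuse; names and structure are assumptions.
from typing import Callable, Dict

class ToyDAS:
    """Caches teacher predictions so N student models reuse one forward pass."""
    def __init__(self, teacher_fn: Callable[[dict], float]):
        self.teacher_fn = teacher_fn
        self.cache: Dict[str, float] = {}

    def teacher_score(self, example_id: str, features: dict) -> float:
        if example_id not in self.cache:      # run the big model only once per example
            self.cache[example_id] = self.teacher_fn(features)
        return self.cache[example_id]

if __name__ == "__main__":
    das = ToyDAS(teacher_fn=lambda feats: 0.37)   # stand-in for a trillion-parameter FM
    example = {"user": "u1", "ad": "a9"}
    # Two different vertical models consume the same logged score.
    ctr_vm_target = das.teacher_score("req-001", example)
    cvr_vm_target = das.teacher_score("req-001", example)
    assert ctr_vm_target == cvr_vm_target
```
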
Authors

Mingfu Liang (Meta | Northwestern University)
Xi Liu (AI at Meta)
Rong Jin (AI at Meta)
Boyang Liu (AI at Meta)
Qiuling Suo (AI at Meta)
Qinghai Zhou (University of Illinois Urbana-Champaign)
Song Zhou (AI at Meta)
Laming Chen (Facebook)
Hua Zheng (AI at Meta)
Zhiyuan Li (AI at Meta)
Shali Jiang (AI at Meta)
Jiyan Yang (Stanford University)
Xiaozhen Xia (AI at Meta)
Fan Yang (AI at Meta)
Yasmine Badr (Electrical Engineering Department, UCLA)
Ellie Wen (AI at Meta)
Shuyu Xu (AI at Meta)
Hansey Chen (AI at Meta)
Zhengyu Zhang (AI at Meta)
Jade Nie (AI at Meta)
Chunzhi Yang (AI at Meta)
Zhichen Zeng (University of Illinois Urbana-Champaign)
Weilin Zhang (AI at Meta)
Xingliang Huang (AI at Meta)
Qianru Li (AI at Meta)
Shiquan Wang (AI at Meta)
Evelyn Lyu (AI at Meta)
Wenjing Lu (AI at Meta)
Rui Zhang (AI at Meta)
Wenjun Wang (Tianjin University)
Jason Rudy (AI at Meta)
Mengyue Hang (AI at Meta)
Kai Wang (AI at Meta)
Yinbin Ma (AI at Meta)
Shuaiwen Wang (AI at Meta)
Sihan Zeng (JPMorgan AI Research)
Tongyi Tang (AI at Meta)
Xiaohan Wei (AI at Meta)
Longhao Jin (AI at Meta)
Jamey Zhang (AI at Meta)
Marcus Chen (AI at Meta)
Jiayi Zhang (AI at Meta)
Angie Huang (AI at Meta)
Chi Zhang (AI at Meta)
Zhengli Zhao (Ph.D. in Computer Science)
Jared Yang (AI at Meta)
Qiang Jin (AI at Meta)
Xian Chen (AI at Meta)
Amit Anand Amlesahwaram (AI at Meta)
Lexi Song (AI at Meta)
Liang Luo (University of Washington)
Yuchen Hao (AI at Meta)
Nan Xiao (AI at Meta)
Yavuz Yetim (AI at Meta)
Luoshang Pan (AI at Meta)
Gaoxiang Liu (AI at Meta)
Yuxi Hu (Graz University of Technology)
Yuzhen Huang (AI at Meta)
Jackie Xu (AI at Meta)
Rich Zhu (AI at Meta)
Xin Zhang (AI at Meta)
Yiqun Liu (AI at Meta)
Hang Yin (AI at Meta)
Yuxin Chen (AI at Meta)
Buyun Zhang (AI at Meta)
Xiaoyi Liu (Research Scientist, Meta AI)
Sylvia Wang (AI at Meta)
Wenguang Mao (AI at Meta)
Zhijing Li (Facebook)
Qin Huang (AI at Meta)
Chonglin Sun (AI at Meta)
Shupin Mao (AI at Meta)
Jingzheng Qin (AI at Meta)
Peggy Yao (AI at Meta)
Jaeyoon Choi (AI at Meta)
Bin Gao (AI at Meta)
Ernest Wang (AI at Meta)
Lei Zhang (AI at Meta)
Wen-Yen Chen (Meta)
Ted Lee (AI at Meta)
Jay Zha (AI at Meta)
Yi Meng (AI at Meta)
Alex Gong (AI at Meta)
Edison Gao (AI at Meta)
Alireza Vahdatpour (AI at Meta)
Yiping Han (AI at Meta)
Yantao Yao (AI at Meta)
Toshinari Kureha (AI at Meta)
Shuo Chang (AI at Meta)
Musharaf Sultan (AI at Meta)
John Bocharov (AI at Meta)
Sagar Chordia (AI at Meta)
Xiaorui Gan (AI at Meta)
Peng Sun (AI at Meta)
Rocky Liu (AI at Meta)
Bo Long (Machine Learning)
Wenlin Chen (AI at Meta)
Santanu Kolay (AI at Meta)
Huayu Li (University of Arizona)