Request-Only Optimization for Recommendation Systems

📅 2025-07-24
🤖 AI Summary
To address the low storage and training efficiency of large-scale deep learning recommendation models (DLRMs) under trillion-parameter scales and massive log data, this paper proposes a “request-level optimization paradigm,” the first to treat user requests—not individual samples—as the fundamental training unit. This paradigm jointly optimizes data, model, and system through request-granular data storage, intrinsic feature deduplication, communication compression, and a request-aware neural network architecture. Compared to conventional sample-level training, it significantly reduces log redundancy and storage overhead, enables emerging architectures such as generative recommendation, improves model quality in trillion-FLOP-scale scenarios, and cuts training resource consumption by over 30%.
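The storage saving from request-granular logging can be illustrated with a small sketch (hypothetical field and class names; the paper does not specify its log schema). Impression-level logging repeats the request's user-side features for every candidate item shown, while request-level logging stores them once per request:

```python
from dataclasses import dataclass, field

@dataclass
class Impression:
    user_features: dict   # duplicated per impression in sample-level logging
    item_features: dict

@dataclass
class RequestLog:
    user_features: dict   # stored once for the whole request
    impressions: list = field(default_factory=list)  # item-side features only

def to_request_log(impressions):
    """Deduplicate the user features shared by all impressions of one request."""
    assert impressions, "a request has at least one impression"
    return RequestLog(
        user_features=impressions[0].user_features,
        impressions=[imp.item_features for imp in impressions],
    )

# A request that showed 3 items: user features logged once instead of 3 times.
user = {"user_id": 7, "history": list(range(100))}
imps = [Impression(user, {"item_id": i}) for i in range(3)]
log = to_request_log(imps)
print(len(log.impressions))  # 3 item records share one copy of user features
```

The longer the user history relative to the item-side features, the larger the saving, which is why the paper targets long-history, trillion-FLOP models.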


📝 Abstract
Deep Learning Recommendation Models (DLRMs) represent one of the largest machine learning applications on the planet. Industry-scale DLRMs are trained with petabytes of recommendation data to serve billions of users every day. To utilize the rich user signals in the long user history, DLRMs have been scaled up to unprecedented complexity, up to trillions of floating-point operations (TFLOPs) per example. This scale, coupled with the huge amount of training data, necessitates new storage and training algorithms to efficiently improve the quality of these complex recommendation systems. In this paper, we present a Request-Only Optimizations (ROO) training and modeling paradigm. ROO simultaneously improves the storage and training efficiency as well as the model quality of recommendation systems. We holistically approach this challenge through co-designing data (i.e., request-only data), infrastructure (i.e., request-only based data processing pipeline), and model architecture (i.e., request-only neural architectures). Our ROO training and modeling paradigm treats a user request as a unit of the training data. Compared with the established practice of treating a user impression as a unit, our new design achieves native feature deduplication in data logging, consequently saving data storage. Furthermore, by de-duplicating computations and communications across multiple impressions in a request, this new paradigm enables highly scaled-up neural network architectures to better capture user interest signals, such as Generative Recommenders (GRs) and other request-only friendly architectures.
Problem

Research questions and friction points this paper is trying to address.

Optimizes storage and training for large-scale recommendation systems
Enhances model quality with request-only data and architectures
Reduces computational overhead by deduplicating features and operations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Request-Only Optimizations (ROO) paradigm
Request-only data and processing pipeline
Request-only neural architectures for efficiency
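The computation deduplication these bullets describe can be sketched as follows (a toy stand-in, not the paper's actual architecture): the expensive user-side encoder over the long history runs once per request, and only a cheap item-side interaction runs per impression.

```python
def user_encoder(user_features):
    # Stand-in for the expensive long-history user tower.
    return sum(user_features) / len(user_features)

def item_scorer(user_repr, item_feature):
    # Cheap per-impression interaction with the shared user representation.
    return user_repr * item_feature

def score_request(user_features, item_features_list):
    user_repr = user_encoder(user_features)  # computed once per request
    return [item_scorer(user_repr, f) for f in item_features_list]

# One request with three impressions: the user tower is evaluated once,
# not three times as in impression-level training.
scores = score_request([1.0, 2.0, 3.0], [0.5, 1.0, 2.0])
print(scores)  # [1.0, 2.0, 4.0]
```

Under impression-level training the user tower would be re-run for every item in the request; amortizing it across impressions is what makes much larger user-side architectures (such as Generative Recommenders) affordable.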
👥 Authors

Liang Guo (Meta Platforms, Inc., USA)
Wei Li (Meta Platforms, Inc., USA)
Lucy Liao (Meta Platforms, Inc., USA)
Huihui Cheng (Meta Platforms, Inc., USA)
Rui Zhang (Meta Platforms, Inc., USA)
Yu Shi (Meta Platforms, Inc., USA)
Yueming Wang (Zhejiang University; brain-computer interfaces, pattern recognition, machine learning, neural signal processing)
Yanzun Huang (Meta Platforms, Inc., USA)
Keke Zhai (unknown affiliation; HPC, parallel computing)
Pengchao Wang (Meta Platforms, Inc., USA)
Timothy Shi (Meta Platforms, Inc., USA)
Xuan Cao (Meta Platforms, Inc., USA)
Shengzhi Wang (Meta Platforms, Inc., USA)
Renqin Cai (University of Virginia; fairness, transparency, robustness, information retrieval, web search, recommender systems)
Zhaojie Gong (Meta Platforms, Inc., USA)
Omkar Vichare (Meta Platforms, Inc., USA)
Rui Jian (Meta Platforms, Inc., USA)
Leon Gao (Meta Platforms, Inc., USA)
Shiyan Deng (Meta Platforms, Inc., USA)
Xingyu Liu (Meta Platforms, Inc., USA)
Xiongfei Zhang (Meta Platforms, Inc., USA)
Fu Li (Meta Platforms, Inc., USA)
Wenlei Xie (AI startup; parallel and distributed computing)
Bin Wen (Kuaishou; MLLM)
Rui Li (Meta Platforms, Inc., USA)
Xing Liu (Meta Platforms, Inc., USA)
Jiaqi Zhai (Meta Platforms, Inc., USA)