LLaTTE: Scaling Laws for Multi-Stage Sequence Modeling in Large-Scale Ads Recommendation

📅 2026-01-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of scaling sequential modeling in large-scale advertising recommendation systems under stringent latency constraints. The authors propose a scalable two-stage Transformer architecture: an upstream module asynchronously constructs rich user representations incorporating long-context and deep structural information, while a lightweight downstream model enables real-time inference. The study is the first to reveal that sequential modeling in recommendation systems follows a power-law scaling law analogous to that observed in large language models, and identifies semantic features as a critical prerequisite for effective scaling. Deployed at Meta as the largest user model to date, the approach achieves a 4.3% lift in conversion rates on Facebook Feed and Reels with minimal serving overhead.
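The two-stage pattern described above can be sketched in miniature: a heavy upstream model summarizes a user's long event history into an embedding asynchronously, and a lightweight downstream scorer reads the cached embedding at request time, keeping the expensive computation off the serving path. Everything below (names, dimensions, the mean-pooling stand-in for the upstream transformer) is illustrative, not the paper's implementation.

```python
import numpy as np

EMBED_DIM = 16
user_embedding_cache: dict[str, np.ndarray] = {}

def upstream_user_model(event_sequence: list[np.ndarray]) -> np.ndarray:
    """Stand-in for the large long-context transformer; here, mean-pooling."""
    return np.mean(event_sequence, axis=0)

def refresh_user(user_id: str, events: list[np.ndarray]) -> None:
    # Runs asynchronously/offline, so its latency never hits the request path.
    user_embedding_cache[user_id] = upstream_user_model(events)

def downstream_rank(user_id: str, ad_embeddings: np.ndarray) -> np.ndarray:
    """Real-time stage: a cheap dot product against the cached user vector."""
    u = user_embedding_cache[user_id]
    return ad_embeddings @ u  # one score per candidate ad

# Offline: summarize a long user history once.
refresh_user("u1", [np.ones(EMBED_DIM) * i for i in range(1, 101)])
# Online: rank candidates using only the lightweight stage.
scores = downstream_rank("u1", np.stack([np.ones(EMBED_DIM), np.zeros(EMBED_DIM)]))
print(scores)  # first candidate scores higher than the second
```

The key property this sketch preserves is that only `downstream_rank` runs at request time; scaling up `upstream_user_model` changes offline cost but not serving latency.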

📝 Abstract
We present LLaTTE (LLM-Style Latent Transformers for Temporal Events), a scalable transformer architecture for production ads recommendation. Through systematic experiments, we demonstrate that sequence modeling in recommendation systems follows predictable power-law scaling similar to LLMs. Crucially, we find that semantic features bend the scaling curve: they are a prerequisite for scaling, enabling the model to effectively utilize the capacity of deeper and longer architectures. To realize the benefits of continued scaling under strict latency constraints, we introduce a two-stage architecture that offloads the heavy computation of large, long-context models to an asynchronous upstream user model. We demonstrate that upstream improvements transfer predictably to downstream ranking tasks. Deployed as the largest user model at Meta, this multi-stage framework drives a 4.3% conversion uplift on Facebook Feed and Reels with minimal serving overhead, establishing a practical blueprint for harnessing scaling laws in industrial recommender systems.
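The power-law scaling the abstract claims has the familiar LLM form loss ≈ a · N^(−b), which can be checked against (model size, loss) measurements by linear regression in log-log space. The data points below are synthetic, chosen to lie exactly on a power law; only the fitting recipe is the point, not the numbers.

```python
import numpy as np

# Hypothetical (parameter count, eval loss) pairs illustrating the claim that
# recommendation sequence models follow an LLM-style power law loss = a * N^(-b).
params = np.array([1e7, 1e8, 1e9, 1e10])
loss = 0.5 * params ** -0.05  # synthetic data lying exactly on a power law

# Fit exponent b and coefficient a via linear regression in log-log space:
# log(loss) = log(a) - b * log(N)
slope, intercept = np.polyfit(np.log(params), np.log(loss), 1)
a, b = np.exp(intercept), -slope
print(f"fitted a={a:.3f}, b={b:.3f}")  # recovers a = 0.5, b = 0.05
```

On real measurements the fit is approximate, and deviations from the line (e.g. the "bend" attributed to semantic features) are exactly what a plot of this regression makes visible.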
Problem

Research questions and friction points this paper is trying to address.

scaling laws
sequence modeling
ads recommendation
latency constraints
large-scale recommender systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

scaling laws
multi-stage architecture
sequence modeling
ads recommendation
latent transformers
Authors
Lee Xiong (AI at Meta)
Zhirong Chen (Master, Institute of Computing Technology, Chinese Academy of Sciences; Computer Architecture, Machine Learning)
Rahul Mayuranath (AI at Meta)
Shangran Qiu
Arda Ozdemir
Lu Li
Yang Hu
Dave Li
Jingtao Ren
Howard Cheng (University of Lethbridge)
Fabian Souto Herrera
A. Agiza
Baruch Epshtein
Anuj Aggarwal
Julia Ulziisaikhan
Chao Wang
Dinesh Ramasamy (Meta; Recommendation systems, Machine learning, Sequence modeling)
Parshva Doshi
Sri Reddy
Arnold Overwijk (Language Models, Recommendation, Information Retrieval, Natural Language Understanding)