🤖 AI Summary
This work addresses the challenge of scaling sequential modeling in large-scale advertising recommendation systems under stringent latency constraints. The authors propose a scalable two-stage Transformer architecture: an upstream module asynchronously constructs rich user representations incorporating long-context and deep structural information, while a lightweight downstream model enables real-time inference. The study is the first to reveal that sequential modeling in recommendation systems follows a power-law scaling law analogous to that observed in large language models, and identifies semantic features as a critical prerequisite for effective scaling. Deployed at Meta as the largest user model to date, the approach achieves a 4.3% lift in conversion rates on Facebook Feed and Reels with minimal serving overhead.
📝 Abstract
We present LLaTTE (LLM-Style Latent Transformers for Temporal Events), a scalable transformer architecture for production ads recommendation. Through systematic experiments, we demonstrate that sequence modeling in recommendation systems follows predictable power-law scaling similar to LLMs. Crucially, we find that semantic features bend the scaling curve: they are a prerequisite for scaling, enabling the model to effectively utilize the capacity of deeper and longer architectures. To realize the benefits of continued scaling under strict latency constraints, we introduce a two-stage architecture that offloads the heavy computation of large, long-context models to an asynchronous upstream user model. We demonstrate that upstream improvements transfer predictably to downstream ranking tasks. Deployed as the largest user model at Meta, this multi-stage framework drives a 4.3% conversion uplift on Facebook Feed and Reels with minimal serving overhead, establishing a practical blueprint for harnessing scaling laws in industrial recommender systems.
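The "predictable power-law scaling" claim can be made concrete with a small sketch: a power law L(C) = a · C^(-b) is linear in log-log space, so fitting a line to log-loss versus log-compute recovers the exponent and allows extrapolation to larger budgets. The data and coefficients below are synthetic, chosen for illustration only, and are not taken from the paper.

```python
import numpy as np

# Synthetic (compute, loss) points following an exact power law
# L(C) = a * C^(-b) with a = 2.0, b = 0.05 (illustrative values only).
compute = np.array([1e18, 1e19, 1e20, 1e21])  # training compute (FLOPs)
loss = 2.0 * compute ** -0.05                 # synthetic loss values

# A power law is linear in log-log space: log L = log a - b * log C,
# so an ordinary least-squares line fit recovers the parameters.
slope, intercept = np.polyfit(np.log(compute), np.log(loss), 1)
b, a = -slope, np.exp(intercept)
print(f"fitted exponent b = {b:.3f}, prefactor a = {a:.3f}")
```

With clean synthetic data the fit recovers b ≈ 0.05 and a ≈ 2.0; in practice one fits such a curve to held-out loss at several model/data scales and checks whether larger runs land on the extrapolated line.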