Long-Sequence Recommendation Models Need Decoupled Embeddings

πŸ“… 2024-10-03
πŸ›οΈ arXiv.org
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
In long-sequence recommendation, a single embedding table typically serves both attention computation and user representation learning. This functional coupling causes interference that impairs interest modeling and prediction accuracy. This paper is the first to systematically identify and address the issue, proposing DARE, a Decoupled Attention and Representation Embeddings model that initializes and learns two separate embedding tables, one for attention and one for representation. Decoupling also allows the attention embedding dimension to be reduced, accelerating the behavior search stage. On public benchmarks, DARE achieves AUC gains of up to 0.9%; deployed on Tencent's advertising platform, it delivers notable online improvements while speeding up retrieval by 50%, substantially improving online serving efficiency.

πŸ“ Abstract
Lifelong user behavior sequences are crucial for capturing user interests and predicting user responses in modern recommendation systems. A two-stage paradigm is typically adopted to handle these long sequences: a subset of relevant behaviors is first searched from the original long sequences via an attention mechanism in the first stage and then aggregated with the target item to construct a discriminative representation for prediction in the second stage. In this work, we identify and characterize, for the first time, a neglected deficiency in existing long-sequence recommendation models: a single set of embeddings struggles with learning both attention and representation, leading to interference between these two processes. Initial attempts to address this issue with some common methods (e.g., linear projections -- a technique borrowed from language processing) proved ineffective, shedding light on the unique challenges of recommendation models. To overcome this, we propose the Decoupled Attention and Representation Embeddings (DARE) model, where two distinct embedding tables are initialized and learned separately to fully decouple attention and representation. Extensive experiments and analysis demonstrate that DARE provides more accurate searches of correlated behaviors and outperforms baselines with AUC gains up to 0.9% on public datasets and notable improvements on Tencent's advertising platform. Furthermore, decoupling embedding spaces allows us to reduce the attention embedding dimension and accelerate the search procedure by 50% without significant performance impact, enabling more efficient, high-performance online serving. Code in PyTorch for experiments, including model analysis, is available at https://github.com/thuml/DARE.
Problem

Research questions and friction points this paper is trying to address.

A single embedding table must learn both attention and representation, and the two objectives interfere with each other.
Common remedies (e.g., linear projections borrowed from language models) fail to resolve this interference in recommendation models.
How can attention and representation be decoupled without hurting accuracy or online serving efficiency?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decoupled Attention and Representation Embeddings (DARE)
Separate embedding tables for attention and representation
Reduced attention embedding dimension, accelerating behavior search by 50%
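The decoupling idea above can be illustrated with a minimal sketch: two independently initialized embedding tables, where a low-dimensional table scores behaviors against the target item (attention) and a higher-dimensional table is aggregated into the user representation. This is a toy, dependency-free illustration of the paradigm, not the authors' implementation; all names, dimensions, and the dot-product attention form are illustrative assumptions.

```python
import math
import random

random.seed(0)

VOCAB = 10    # toy item-id vocabulary size (assumed)
D_ATTN = 4    # small attention embedding dim -- enables faster search
D_REPR = 16   # larger representation embedding dim

# Two independently initialized tables: the core decoupling idea.
# Attention and representation never share parameters.
attn_table = [[random.gauss(0, 0.1) for _ in range(D_ATTN)] for _ in range(VOCAB)]
repr_table = [[random.gauss(0, 0.1) for _ in range(D_REPR)] for _ in range(VOCAB)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dare_forward(behavior_ids, target_id):
    """Score behaviors with the attention table, then aggregate the
    *representation* table's embeddings with those weights."""
    target_attn = attn_table[target_id]
    scores = softmax([dot(attn_table[b], target_attn) for b in behavior_ids])
    agg = [0.0] * D_REPR
    for w, b in zip(scores, behavior_ids):
        for i, v in enumerate(repr_table[b]):
            agg[i] += w * v
    return scores, agg

scores, user_repr = dare_forward([1, 2, 3], target_id=5)
```

Because the attention table is the only one consulted during behavior search, its dimension can be shrunk independently of the representation quality, which is what makes the reported 50% retrieval speedup possible.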
πŸ”Ž Similar Papers
2024-02-02 · ACM Transactions on Recommender Systems · Citations: 1
πŸ‘₯ Authors
Ningya Feng · School of Software, BNRist, Tsinghua University, China
Junwei Pan · Tencent, Yahoo Research (Computational Advertising, Recommendation Systems, Deep Learning)
Jialong Wu · School of Software, BNRist, Tsinghua University, China
Baixu Chen · Master Student, Tsinghua University (Machine Learning, Deep Learning)
Ximei Wang · Tencent Inc, China
Qian Li · Tencent Inc, China
Xian Hu · Tencent Inc, China
Jie Jiang · Tencent Inc, China
Mingsheng Long · Associate Professor, Tsinghua University (Machine Learning, Deep Learning, Transfer Learning, Scientific Machine Learning)