🤖 AI Summary
This paper addresses three challenges in industrial recommendation systems: CTR models that rely on short user sequences and therefore capture long-term behavior poorly, the absence of an integrated action-prediction task within pointwise ranking frameworks, and the difficulty of serving long-sequence models efficiently. It proposes a long-sequence CTR modeling approach tailored to Pinterest’s Homefeed. Methodologically: (1) it integrates ultra-long user behavior sequences (up to tens of thousands of items) into a production-grade CTR model for the first time, using a truncation-and-sampling strategy with a Transformer architecture; (2) it introduces a Next Action Loss as a multi-task objective that jointly optimizes CTR estimation and sequential action prediction within a pointwise ranking framework; and (3) it designs a lightweight serving architecture that combines model distillation with operator-level optimizations. Reported results include a 0.8% AUC gain, a 23% improvement in Recall@10 for action prediction, inference latency of at most 15 ms, and stable support for over 10 billion daily requests.
📝 Abstract
Modeling user action sequences has become a popular focus in industrial recommendation system research, particularly for Click-Through Rate (CTR) prediction tasks. However, industry-scale CTR models often rely on short user sequences, limiting their ability to capture long-term behavior. Additionally, these models typically lack an integrated action-prediction task within a point-wise ranking framework, reducing their predictive power. They also rarely address the infrastructure challenges involved in efficiently serving large-scale sequential models. In this paper, we introduce TransAct V2, a production model for Pinterest's Homefeed ranking system, featuring three key innovations: (1) leveraging very long user sequences to improve CTR predictions, (2) integrating a Next Action Loss function for enhanced user action forecasting, and (3) employing scalable, low-latency deployment solutions tailored to handle the computational demands of extended user action sequences.
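To make the idea of pairing a pointwise CTR objective with an auxiliary next-action objective concrete, the following is a minimal sketch and not the paper's implementation. It assumes the auxiliary term is a softmax cross-entropy that scores the user's actual next action against sampled negatives; the abstract does not specify the form of the Next Action Loss. The function names (`ctr_loss`, `next_action_loss`, `joint_loss`) and the `weight` parameter are hypothetical and chosen only for illustration.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def ctr_loss(label, logit):
    """Numerically stable binary cross-entropy on a raw logit (pointwise CTR term)."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def next_action_loss(user_vec, positive_vec, negative_vecs):
    """Assumed auxiliary term: softmax cross-entropy that ranks the embedding of
    the user's actual next action above sampled negative embeddings."""
    scores = [dot(user_vec, positive_vec)]
    scores += [dot(user_vec, neg) for neg in negative_vecs]
    m = max(scores)  # subtract the max before exponentiating for stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[0]


def joint_loss(label, logit, user_vec, positive_vec, negative_vecs, weight=0.5):
    """Hypothetical combination: CTR loss plus a weighted auxiliary loss."""
    aux = next_action_loss(user_vec, positive_vec, negative_vecs)
    return ctr_loss(label, logit) + weight * aux
```

In this sketch, `weight` controls how strongly the auxiliary task shapes the shared user representation relative to the CTR term; in practice it would be tuned, and the value 0.5 is arbitrary.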