Synergizing Implicit and Explicit User Interests: A Multi-Embedding Retrieval Framework at Pinterest

📅 2025-06-28
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Industrial recommendation systems face challenges in the retrieval stage: achieving high recall while adequately covering long-tail and diverse user interests—largely due to limited cross-tower interaction and popularity bias in dual-tower models. To address this, we propose a multi-embedding retrieval framework that integrates a differentiable clustering module (DCM) with explicit conditional retrieval (CR). DCM unsupervisedly discovers implicit interest clusters from raw user behavior sequences; CR performs condition-aware representation learning guided by explicit signals (e.g., topical attention). This enables fine-grained, complementary interest modeling. The framework extends the dual-tower architecture to support collaborative representation of heterogeneous interest sources. Deployed on Pinterest’s homepage, A/B testing demonstrates statistically significant improvements: increased user engagement rate and a 12.3% lift in content diversity metrics—validating enhanced long-tail interest coverage and generalization capability.

Technology Category

Application Category

📝 Abstract
Industrial recommendation systems are typically composed of multiple stages, including retrieval, ranking, and blending. The retrieval stage plays a critical role in generating a high-recall set of candidate items that covers a wide range of diverse user interests. Effectively covering the diverse and long-tail user interests within this stage poses a significant challenge: traditional two-tower models struggle in this regard due to limited user-item feature interaction and often bias towards top use cases. To address these issues, we propose a novel multi-embedding retrieval framework designed to enhance user interest representation by generating multiple user embeddings conditioned on both implicit and explicit user interests. Implicit interests are captured from user history through a Differentiable Clustering Module (DCM), whereas explicit interests, such as topics that the user has followed, are modeled via Conditional Retrieval (CR). These methodologies represent a form of conditioned user representation learning that involves condition representation construction and associating the target item with the relevant conditions. Synergizing implicit and explicit user interests serves as a complementary approach to achieve more effective and comprehensive candidate retrieval as they benefit on different user segments and extract conditions from different but supplementary sources. Extensive experiments and A/B testing reveal significant improvements in user engagements and feed diversity metrics. Our proposed framework has been successfully deployed on Pinterest home feed.
Problem

Research questions and friction points this paper is trying to address.

Enhancing user interest representation with multiple embeddings
Combining implicit and explicit user interests for retrieval
Improving candidate recall and diversity in recommendation systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-embedding framework for diverse user interests
Differentiable Clustering Module captures implicit interests
Conditional Retrieval models explicit user interests
🔎 Similar Papers
No similar papers found.
Z
Zhibo Fan
Pinterest, San Francisco, CA, USA
H
Hongtao Lin
Pinterest, San Francisco, CA, USA
H
Haoyu Chen
Pinterest, San Francisco, CA, USA
Bowen Deng
Bowen Deng
Postdoc at MIT | PhD at UC Berkeley
Machine LearningAI for ScienceComputational MaterialsEnergy Materials
H
Hedi Xia
Pinterest, San Francisco, CA, USA
Y
Yuke Yan
Pinterest, San Francisco, CA, USA
J
James Li
Pinterest, San Francisco, CA, USA