🤖 AI Summary
Industrial recommendation systems face a core challenge in the retrieval stage: achieving high recall while adequately covering long-tail and diverse user interests. Two-tower models struggle here, largely because of limited user-item feature interaction and popularity bias. To address this, we propose a multi-embedding retrieval framework that combines a Differentiable Clustering Module (DCM) with explicit Conditional Retrieval (CR). DCM discovers implicit interest clusters from user behavior history without supervision; CR performs condition-aware representation learning guided by explicit signals (e.g., topics the user follows). Together they enable fine-grained, complementary interest modeling. The framework extends the two-tower architecture to jointly represent these heterogeneous interest sources. Deployed on Pinterest's home feed, the framework shows statistically significant improvements in A/B testing: a higher user engagement rate and a 12.3% lift in content diversity metrics, consistent with improved long-tail interest coverage and generalization.
📝 Abstract
Industrial recommendation systems are typically composed of multiple stages, including retrieval, ranking, and blending. The retrieval stage plays a critical role in generating a high-recall set of candidate items that covers a wide range of user interests. Effectively covering diverse and long-tail user interests within this stage poses a significant challenge: traditional two-tower models struggle in this regard because of limited user-item feature interaction, and they are often biased toward top use cases. To address these issues, we propose a novel multi-embedding retrieval framework designed to enhance user interest representation by generating multiple user embeddings conditioned on both implicit and explicit user interests. Implicit interests are captured from user history through a Differentiable Clustering Module (DCM), whereas explicit interests, such as topics that the user has followed, are modeled via Conditional Retrieval (CR). Both methods are forms of conditioned user representation learning, which involves constructing condition representations and associating the target item with the relevant conditions. Combining implicit and explicit user interests yields more effective and comprehensive candidate retrieval, because the two approaches benefit different user segments and extract conditions from different but complementary sources. Extensive experiments and A/B testing show significant improvements in user engagement and feed diversity metrics. Our proposed framework has been successfully deployed on the Pinterest home feed.
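
The multi-embedding idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the softmax-based soft clustering stands in for the DCM, the "add the topic vector and normalize" step stands in for a learned, condition-aware user tower, and all names (`soft_cluster`, `condition_user`, `retrieve`), dimensions, and random data are hypothetical. It only shows how several user embeddings, some derived from history and some from explicit topics, can each retrieve candidates that are then merged.

```python
import numpy as np

def soft_cluster(history, centroids, temperature=1.0):
    """Assumed stand-in for the DCM: softly assign history items to centroids
    and return one pooled embedding per cluster, shape (k, d)."""
    logits = history @ centroids.T / temperature           # (n, k)
    logits = logits - logits.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)          # softmax over clusters
    mass = np.maximum(weights.sum(axis=0), 1e-9)           # total weight per cluster
    return (weights.T @ history) / mass[:, None]           # weighted mean per cluster

def condition_user(user_vec, condition_embs):
    """Assumed stand-in for CR: one user embedding per explicit condition.
    A real system would feed the condition into a learned user tower."""
    combined = user_vec[None, :] + condition_embs          # (c, d)
    return combined / np.linalg.norm(combined, axis=1, keepdims=True)

def retrieve(user_embs, item_embs, k):
    """Top-k items per user embedding by dot product, merged round-robin
    into one deduplicated candidate list."""
    scores = user_embs @ item_embs.T                       # (m, n_items)
    topk = np.argsort(-scores, axis=1)[:, :k]
    merged, seen = [], set()
    for rank in range(topk.shape[1]):
        for row in topk:
            idx = int(row[rank])
            if idx not in seen:
                seen.add(idx)
                merged.append(idx)
    return merged

# Hypothetical toy data: 20 history items, 3 clusters, 2 followed topics, 100 items.
rng = np.random.default_rng(0)
history = rng.normal(size=(20, 8))
centroids = rng.normal(size=(3, 8))    # would be learned parameters in practice
topic_embs = rng.normal(size=(2, 8))   # embeddings of topics the user follows
items = rng.normal(size=(100, 8))

implicit = soft_cluster(history, centroids)                  # implicit interests
explicit = condition_user(history.mean(axis=0), topic_embs)  # explicit interests
user_embs = np.vstack([implicit, explicit])                  # 5 user embeddings
candidates = retrieve(user_embs, items, k=5)
```

Merging round-robin across embeddings is one simple way to keep every interest represented near the top of the candidate list; the abstract does not say how the deployed system blends candidates.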