🤖 AI Summary
To address the sparsity, cold-start, and scalability challenges of large embedding tables in Pinterest's ad ranking system, where training the tables from scratch yielded only neutral metrics, this paper proposes a multi-faceted large-embedding-table framework. First, it introduces a multi-objective pretraining scheme that integrates contrastive learning, masked reconstruction, and graph-based collaborative modeling to enrich the embeddings beyond what end-to-end training alone achieves. Second, it designs a CPU-GPU hybrid inference architecture that overcomes GPU memory limits while keeping end-to-end latency neutral. Together, these enable efficient training and real-time serving of high-dimensional sparse features. Online A/B experiments demonstrate a 2.60% lift in CTR and a 1.34% reduction in CPC, and the framework has been fully deployed in Pinterest's production advertising system.
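The multi-objective pretraining described above combines several self-supervised signals into one weighted loss. A minimal numpy sketch, assuming an InfoNCE-style contrastive term, an MSE masked-reconstruction term, and a neighbor-agreement graph term (the specific loss forms, weights, and function names here are illustrative assumptions, not the paper's specification):

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def contrastive_loss(anchor, positive, temperature=0.1):
    """InfoNCE-style loss: row i's positive is positive[i]; all other
    rows in the batch serve as in-batch negatives."""
    a, p = l2_normalize(anchor), l2_normalize(positive)
    logits = a @ p.T / temperature                 # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

def masked_reconstruction_loss(emb, recon, mask):
    """MSE on the masked rows only (mask is a per-row 0/1 vector)."""
    diff = (emb - recon) ** 2
    return (diff * mask[:, None]).sum() / (mask.sum() * emb.shape[1] + 1e-8)

def graph_collaborative_loss(emb, neighbor_emb):
    """Pull each entity toward the aggregate embedding of its graph neighbors."""
    return np.mean((emb - neighbor_emb) ** 2)

def multi_objective_loss(batch, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three pretraining objectives."""
    w_c, w_m, w_g = weights
    return (w_c * contrastive_loss(batch["anchor"], batch["positive"])
            + w_m * masked_reconstruction_loss(batch["emb"], batch["recon"], batch["mask"])
            + w_g * graph_collaborative_loss(batch["emb"], batch["neighbor_emb"]))
```

In practice each term would be computed by a trainable encoder and the weights tuned; the sketch only shows how the three signals combine into a single pretraining objective.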
📝 Abstract
Large embedding tables are indispensable in modern recommendation systems, thanks to their ability to capture and memorize intricate details of interactions among diverse entities. As we explored integrating large embedding tables into Pinterest's ads ranking models, we encountered not only common challenges such as sparsity and scalability, but also several obstacles unique to our context. Notably, our initial attempts to train large embedding tables from scratch resulted in neutral metrics. To tackle this, we introduced a novel multi-faceted pretraining scheme that incorporates multiple pretraining algorithms. This approach greatly enriched the embedding tables and led to significant performance improvements: the multi-faceted large embedding tables bring substantial gains in both the Click-Through Rate (CTR) and Conversion Rate (CVR) domains. Moreover, we designed a CPU-GPU hybrid serving infrastructure to overcome GPU memory limits and improve scalability. The framework has been deployed in the Pinterest Ads system and achieved a 1.34% online CPC reduction and a 2.60% CTR increase with neutral end-to-end latency.
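The CPU-GPU hybrid serving idea can be illustrated schematically: the full embedding table stays in host (CPU) memory because it exceeds GPU memory, and per request only the handful of gathered rows crosses to the GPU-resident dense model. A minimal Python sketch, with the GPU stage stood in for by a plain function (class and method names are hypothetical, not Pinterest's actual API):

```python
import numpy as np

class HybridEmbeddingServer:
    """Toy model of CPU-GPU hybrid serving: the large sparse table lives
    in host memory; per request, only the looked-up rows are handed to
    the dense model, which in production would run on GPU."""

    def __init__(self, num_ids, dim, seed=0):
        rng = np.random.default_rng(seed)
        # Large sparse table: kept on CPU because it exceeds GPU memory.
        self.table = rng.normal(size=(num_ids, dim)).astype(np.float32)
        # Small dense head: stand-in for the GPU-resident ranking model.
        self.dense_w = rng.normal(size=(dim,)).astype(np.float32)

    def lookup_cpu(self, ids):
        """Gather only the requested rows (cheap host-side op)."""
        return self.table[np.asarray(ids)]

    def score_gpu(self, gathered):
        """Stand-in for the GPU dense model: linear scorer + sigmoid."""
        logits = gathered @ self.dense_w
        return 1.0 / (1.0 + np.exp(-logits))

    def serve(self, ids):
        # Only len(ids) * dim floats cross the CPU->GPU boundary per
        # request, never the full table.
        return self.score_gpu(self.lookup_cpu(ids))
```

A production system would additionally batch requests, cache hot rows on the device, and overlap host-to-device transfer with compute; the sketch only shows the memory split that keeps end-to-end latency neutral.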