Warmer for Less: A Cost-Efficient Strategy for Cold-Start Recommendations at Pinterest

📅 2025-12-19
🤖 AI Summary
Cold-start (CS) items on visual discovery platforms like Pinterest suffer from underused non-historical features, sparse labels, and low prediction scores. This paper proposes a lightweight set of four complementary techniques designed for industrial deployment. Together they increase total model parameters by only 5% and combine a cost-efficient design, a residual connection that strengthens non-historical features, a prediction score regularization term, and manifold mixup for data augmentation. Online A/B experiments show a 10% improvement in engagement with fresh content, without negatively affecting overall engagement or cost, and the system has been deployed to serve over 570 million users. The work offers a scalable approach to cold-start challenges in large-scale visual recommendation systems.

📝 Abstract
Pinterest is a leading visual discovery platform where recommender systems (RecSys) are key to delivering relevant, engaging, and fresh content to our users. In this paper, we study the problem of improving RecSys model predictions for cold-start (CS) items, which appear infrequently in the training data. Although this problem is well-studied in academia, few studies have addressed its root causes effectively at the scale of a platform like Pinterest. By investigating live traffic data, we identified several challenges of the CS problem and developed a corresponding solution for each: First, industrial-scale RecSys models must operate under tight computational constraints. Since CS items are a minority, any related improvements must be highly cost-efficient. To address this, our solutions were designed to be lightweight, collectively increasing the total parameters by only 5%. Second, CS items are represented only by non-historical (e.g., content or attribute) features, which models often treat as less important. To elevate their significance, we introduce a residual connection for the non-historical features. Third, CS items tend to receive lower prediction scores compared to non-CS items, reducing their likelihood of being surfaced. We mitigate this by incorporating a score regularization term into the model. Fourth, the labels associated with CS items are sparse, making it difficult for the model to learn from them. We apply the manifold mixup technique to address this data sparsity. Implemented together, our methods increased fresh content engagement at Pinterest by 10% without negatively impacting overall engagement and cost, and have been deployed to serve over 570 million users on Pinterest.
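The residual connection for non-historical features can be pictured with a small sketch. The abstract does not specify the architecture, so everything here is an assumption made for illustration: the single-layer tower, the layer sizes, and the names `forward`, `init_layer`, and `params` are hypothetical, and plain Python lists stand in for tensors.

```python
import random

def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def init_layer(rng, n_in, n_out, scale=0.1):
    """Small random weights and zero biases (illustrative initialisation)."""
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(historical, non_historical, params):
    """Main tower over all features plus a residual path for non-historical ones."""
    # List concatenation: the tower sees both feature groups.
    hidden = relu(linear(historical + non_historical, *params["tower"]))
    # Assumed residual branch: project content/attribute features straight into
    # the hidden space so their signal does not depend on the tower alone.
    shortcut = linear(non_historical, *params["residual"])
    combined = [h + s for h, s in zip(hidden, shortcut)]
    return linear(combined, *params["head"])[0]

rng = random.Random(0)
params = {
    "tower": init_layer(rng, 4 + 3, 5),   # 4 historical + 3 non-historical inputs
    "residual": init_layer(rng, 3, 5),    # non-historical inputs only
    "head": init_layer(rng, 5, 1),
}
# A cold-start item: no engagement history, so historical features are all zero.
cs_logit = forward([0.0] * 4, [0.5, -1.0, 2.0], params)
```

In this sketch the extra parameters are limited to one small projection, which is in the spirit of the paper's tight parameter budget, though the actual placement and size of the residual path in the production model are not stated in the abstract.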
Problem

Research questions and friction points this paper is trying to address.

Improving cold-start item predictions in recommender systems
Addressing computational constraints for cost-efficient model enhancements
Mitigating data sparsity and feature importance issues for cold-start items
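For the data sparsity point, the abstract names manifold mixup as the remedy. The sketch below shows the general technique as commonly described: hidden representations of paired examples are interpolated, and their labels are interpolated with the same coefficient. The function name `manifold_mixup`, the `alpha` default, and the choice to mix a whole batch with one shuffled copy of itself are assumptions, not details from the paper.

```python
import random

def manifold_mixup(hidden_batch, labels, alpha=2.0, rng=None):
    """Mix each hidden vector (and its label) with a randomly paired example."""
    rng = rng or random.Random()
    # One mixing coefficient per batch, drawn from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)
    # Random pairing: example i is mixed with example perm[i].
    perm = list(range(len(hidden_batch)))
    rng.shuffle(perm)
    mixed_hidden = [
        [lam * a + (1.0 - lam) * b for a, b in zip(hidden_batch[i], hidden_batch[j])]
        for i, j in enumerate(perm)
    ]
    # Labels are interpolated with the same coefficient as the representations.
    mixed_labels = [lam * labels[i] + (1.0 - lam) * labels[j] for i, j in enumerate(perm)]
    return mixed_hidden, mixed_labels, lam

# Hidden activations for three items (toy values) and their engagement labels.
hidden = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
labels = [1.0, 0.0, 0.0]
mixed_hidden, mixed_labels, lam = manifold_mixup(hidden, labels, rng=random.Random(1))
```

The mixed activations would then continue through the remaining layers and be trained against the soft labels. Because mixing happens on activations the model already computes, it adds no parameters, which fits the cost constraint described above.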
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight design with only 5% parameter increase
Residual connection for non-historical feature enhancement
Score regularization and manifold mixup for data sparsity
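The abstract does not give the form of the score regularization term, so the following is only one plausible shape for it, not the paper's formula. It assumes a binary cross-entropy loss plus a penalty on how far the mean predicted score of CS items falls below that of non-CS items. The names `score_gap_penalty`, `regularized_loss`, and `reg_weight` are hypothetical.

```python
import math

def bce(scores, labels, eps=1e-7):
    """Mean binary cross-entropy over predicted probabilities."""
    total = 0.0
    for p, y in zip(scores, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(scores)

def score_gap_penalty(scores, is_cold_start):
    """Assumed regulariser: squared shortfall of the mean CS score vs. non-CS."""
    cs = [s for s, c in zip(scores, is_cold_start) if c]
    warm = [s for s, c in zip(scores, is_cold_start) if not c]
    if not cs or not warm:
        return 0.0  # nothing to compare in this batch
    gap = sum(warm) / len(warm) - sum(cs) / len(cs)
    return max(0.0, gap) ** 2  # only penalise CS items scoring lower

def regularized_loss(scores, labels, is_cold_start, reg_weight=0.1):
    """Task loss plus the weighted score-gap penalty."""
    return bce(scores, labels) + reg_weight * score_gap_penalty(scores, is_cold_start)

scores = [0.8, 0.6, 0.2]
labels = [1.0, 0.0, 1.0]
is_cold_start = [False, False, True]
loss = regularized_loss(scores, labels, is_cold_start)
```

A one-sided penalty like this leaves the loss unchanged when CS items already score at least as high as other items on average, so it only acts on the bias the abstract describes.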
Saeed Ebrahimi
PhD student at West Virginia University
Machine Learning, Computer Vision, Biometrics
Weijie Jiang
Pinterest, San Francisco, CA, USA
Jaewon Yang
Pinterest, San Francisco, CA, USA
Olafur Gudmundsson
Pinterest, San Francisco, CA, USA
Yucheng Tu
Pinterest, San Francisco, CA, USA
Huizhong Duan
Pinterest, San Francisco, CA, USA