🤖 AI Summary
This paper identifies a "popularity bias inheritance" phenomenon in cold-start recommendation: cold-start methods trained with supervision from collaborative filtering (CF) models inherit, and can even amplify, the base model's preference for popular items. Because cold items have no interaction data, these methods estimate popularity solely from content features, which is often inaccurate and exacerbates long-tail bias. To address this, the authors propose embedding magnitude as a lightweight proxy for predicted popularity and design an efficient post-processing debiasing strategy. Experiments across three multimedia datasets confirm the ubiquity of this phenomenon. The approach substantially mitigates popularity bias, reducing it by 32.7% on average, while leaving user-oriented accuracy (e.g., Recall@K) nearly intact, thereby improving exposure fairness for tail items and enhancing recommendation diversity.
📝 Abstract
Collaborative filtering (CF) recommender systems struggle with making predictions on unseen, or 'cold', items. Systems designed to address this challenge are often trained with supervision from warm CF models in order to leverage collaborative and content information from the available interaction data. However, since they learn to replicate the behavior of CF methods, cold-start models may also imitate their predictive biases. In this paper, we show that cold-start systems can inherit popularity bias, a common cause of recommender system unfairness arising when CF models overfit to more popular items, thereby maximizing user-oriented accuracy but neglecting rarer items. We demonstrate that cold-start recommenders not only mirror the popularity biases of warm models, but are in fact affected more severely: because they cannot infer popularity from interaction data, they instead attempt to estimate it based solely on content features. This leads to significant over-prediction of certain cold items with similar content to popular warm items, even if their ground truth popularity is very low. Through experiments on three multimedia datasets, we analyze the impact of this behavior on three generative cold-start methods. We then describe a simple post-processing bias mitigation method that, by using embedding magnitude as a proxy for predicted popularity, can produce more balanced recommendations with limited harm to user-oriented cold-start accuracy.
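The post-processing idea in the abstract can be illustrated with a minimal sketch: penalize each item's predicted score in proportion to its embedding's L2 norm, which serves as the popularity proxy. This is an assumption-laden illustration, not the paper's actual algorithm; the function name `debias_scores` and the penalty weight `alpha` are hypothetical.

```python
import numpy as np

def debias_scores(scores, item_embeddings, alpha=0.5):
    """Down-weight items whose embedding magnitude (a proxy for
    predicted popularity) is large.

    scores:          (n_items,) predicted relevance for one user
    item_embeddings: (n_items, d) item embedding matrix
    alpha:           penalty strength (0 disables debiasing)

    Hypothetical sketch of a magnitude-based post-processing step;
    the paper's actual method may differ.
    """
    norms = np.linalg.norm(item_embeddings, axis=1)
    # Min-max normalize magnitudes to [0, 1] so alpha is scale-free.
    norms = (norms - norms.min()) / (norms.max() - norms.min() + 1e-12)
    # Subtract a popularity penalty from each item's score.
    return scores - alpha * norms
```

With `alpha = 0` the original ranking is preserved; increasing `alpha` trades a small amount of user-oriented accuracy for more exposure of low-magnitude (likely tail) items, mirroring the accuracy/fairness trade-off the abstract describes.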