🤖 AI Summary
To address noise caused by the uncertainty of user behavior in implicit feedback, this paper proposes PLD, a denoising method that resamples training interactions according to each user's personal loss distribution. Unlike global-threshold methods, PLD uncovers and exploits the separability of normal and noisy interactions in user-level loss distributions: during each training iteration it maintains a candidate item pool per user and performs loss-distribution-aware weighted resampling, adaptively suppressing noisy interactions. PLD applies to both pairwise Bayesian Personalized Ranking (BPR) and pointwise Binary Cross-Entropy (BCE) objectives, and the paper includes a theoretical analysis validating its effectiveness. Extensive experiments on three benchmark datasets with varying noise ratios demonstrate that PLD significantly outperforms state-of-the-art methods in Recall@10 and NDCG@10, exhibiting strong noise robustness and generalization stability.
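The summary contrasts the two loss families PLD works with. As a reference point, a minimal sketch of the standard pointwise BCE and pairwise BPR losses on raw scores (these are the textbook definitions, not code from the paper):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce_loss(score, label):
    # Pointwise binary cross-entropy on a single (user, item) score.
    p = sigmoid(score)
    return -(label * np.log(p) + (1 - label) * np.log(1 - p))

def bpr_loss(pos_score, neg_score):
    # Pairwise BPR: push the positive item's score above the sampled
    # negative item's score; the loss depends only on their difference.
    return -np.log(sigmoid(pos_score - neg_score))
```

Because BPR depends on a score *difference* involving a sampled negative, its loss values are noisier per interaction, which is consistent with the abstract's observation that the normal/noisy overlap worsens when moving from BCE to BPR.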
📝 Abstract
While implicit feedback is foundational to modern recommender systems, factors such as human error, uncertainty, and ambiguity in user behavior inevitably introduce significant noise into this feedback, adversely affecting the accuracy and robustness of recommendations. To address this issue, existing methods typically aim to reduce the training weight of noisy feedback or discard it entirely, based on the observation that noisy interactions often exhibit higher losses in the overall loss distribution. However, we identify two key issues: (1) there is a significant overlap between normal and noisy interactions in the overall loss distribution, and (2) this overlap becomes even more pronounced when transitioning from pointwise loss functions (e.g., BCE loss) to pairwise loss functions (e.g., BPR loss). This overlap leads traditional methods to misclassify noisy interactions as normal, and vice versa. To tackle these challenges, we further investigate the loss overlap and find that for a given user, there is a clear distinction between normal and noisy interactions in the user's personal loss distribution. Based on this insight, we propose a resampling strategy to Denoise using the user's Personal Loss distribution, named PLD, which reduces the probability of noisy interactions being optimized. Specifically, during each optimization iteration, we create a candidate item pool for each user and resample the items from this pool based on the user's personal loss distribution, prioritizing normal interactions. Additionally, we conduct a theoretical analysis to validate PLD's effectiveness and suggest ways to further enhance its performance. Extensive experiments conducted on three datasets with varying noise ratios demonstrate PLD's efficacy and robustness.
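The resampling step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the softmax-of-standardized-negative-loss weighting and the `temperature` parameter are assumptions standing in for whatever personalized weighting PLD actually uses.

```python
import numpy as np

def resample_items(item_ids, losses, k, temperature=1.0, rng=None):
    """Sample k candidate items for one user, favoring low-loss
    (likely normal) interactions over high-loss (likely noisy) ones.

    Weights are computed within this user's own loss distribution,
    mirroring PLD's key idea that normal and noisy interactions
    separate at the per-user level.
    """
    rng = rng if rng is not None else np.random.default_rng()
    losses = np.asarray(losses, dtype=float)
    # Standardize against the user's personal loss distribution.
    z = (losses - losses.mean()) / (losses.std() + 1e-8)
    # Lower loss -> higher resampling weight (illustrative choice).
    weights = np.exp(-z / temperature)
    probs = weights / weights.sum()
    k = min(k, len(item_ids))
    return rng.choice(item_ids, size=k, replace=False, p=probs)
```

In each optimization iteration, a training loop would call this per user on the user's candidate pool and feed only the resampled items to the BPR/BCE objective, lowering the probability that noisy interactions are optimized.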