PEARL: Unbiased Percentile Estimation via Contrastive Learning for Industrial-Scale Livestream Recommendation

📅 2026-05-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the distortion of feedback signals in recommender systems caused by imbalanced user activity levels, where the preferences of highly active users are overamplified while those of less active users are underrepresented. To mitigate this bias, the authors propose PEARL, a nonparametric contrastive percentile estimation framework that models relative user preferences rather than absolute interaction intensities. By using real contrastive interaction samples to approximate percentile relationships, PEARL enables unbiased preference learning without requiring auxiliary distribution estimation models. The framework further incorporates prediction-based percentile smoothing for sparse and discrete feedback, a generalized value-weighted formulation, and a co-training strategy. Offline experiments show consistent improvements across multiple ranking targets. In online A/B tests on a production livestream platform with a combined user base of billions, PEARL achieves gains of 2.10% in watch duration, 0.80% in consumption amount, and 1.49% in interaction rate, while reducing the report rate by 6.91%.
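
To make the contrast between relative and absolute signals concrete, here is a minimal sketch, not the paper's implementation. It assumes watch duration in seconds as the feedback signal, and `within_user_percentile` is a hypothetical helper that uses a mid-rank convention for ties; the two example users and their histories are invented for illustration.

```python
from bisect import bisect_left, bisect_right


def within_user_percentile(history, value):
    """Mid-rank percentile of `value` within one user's own history.

    Hypothetical helper: the tie convention (half credit) is an assumption.
    """
    ordered = sorted(history)
    below = bisect_left(ordered, value)           # entries strictly smaller
    ties = bisect_right(ordered, value) - below   # entries equal to value
    return (below + 0.5 * ties) / len(ordered)


# Invented watch durations (seconds) for a heavy and a light user.
heavy_user = [600, 900, 1200, 1800, 2400]
light_user = [10, 20, 30, 45, 60]

# The raw values differ by a factor of 40, yet each is that user's median.
p_heavy = within_user_percentile(heavy_user, 1200)
p_light = within_user_percentile(light_user, 30)
print(p_heavy, p_light)
```

A model trained on the raw durations would treat the heavy user's session as a far stronger signal, while the within-user percentile puts both sessions on the same scale.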

📝 Abstract

Recommender systems trained on user interaction data are susceptible to behavioral intensity imbalance, a systematic distortion arising from heterogeneous engagement patterns across users. This imbalance skews feedback signals such that observed interactions no longer faithfully reflect true preferences, causing models to disproportionately amplify signals from highly active users while underrepresenting others, which ultimately degrades recommendation quality and robustness at scale. To address this issue, we propose a nonparametric contrastive percentile approximation framework, PEARL, that models relative preference signals instead of absolute engagement magnitudes. Building upon relative advantage debiasing, PEARL leverages real contrastive interaction samples to approximate percentile relationships directly, without relying on auxiliary distribution estimation models. We provide theoretical justification demonstrating that such pairwise comparisons yield unbiased estimates of percentile-based preference signals. For broader applicability, we introduce a prediction-based bootstrapping mechanism for percentile smoothing to handle sparse and discrete feedback, alongside a generalized value-weighted formulation and a co-training strategy to enhance both modeling flexibility and representation learning. Extensive offline experiments demonstrate that PEARL effectively mitigates behavioral bias and consistently improves recommendation performance across multiple ranking targets. PEARL is deployed on a production livestream platform with a combined user base of billions, where online A/B testing confirms substantial real-world gains: +2.10% Watch Duration, +0.80% Consumption Amount, +1.49% Interaction Rate, and -6.91% Report Rate.
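
The claim that pairwise comparisons recover percentiles can be illustrated with a small simulation. This is a sketch under stated assumptions, not the paper's estimator or loss: it assumes the comparison sample is drawn uniformly from the same user's history, gives ties half credit, and uses hypothetical function names and invented data. Under those assumptions the expected comparison outcome equals the within-user percentile, so the average over many sampled pairs should converge to it.

```python
import random


def exact_percentile(history, value):
    """Mid-rank percentile of `value` within `history` (tie rule is an assumption)."""
    below = sum(1 for v in history if v < value)
    ties = sum(1 for v in history if v == value)
    return (below + 0.5 * ties) / len(history)


def pairwise_percentile_estimate(history, value, num_pairs, rng):
    """Average outcome of comparing `value` against uniformly sampled history items."""
    wins = 0.0
    for _ in range(num_pairs):
        other = rng.choice(history)  # one contrastive sample from the same user
        if value > other:
            wins += 1.0
        elif value == other:
            wins += 0.5
    return wins / num_pairs


rng = random.Random(0)                    # fixed seed for reproducibility
history = [10, 20, 30, 45, 60, 60, 90]    # invented watch durations (seconds)
target = 45

exact = exact_percentile(history, target)
estimate = pairwise_percentile_estimate(history, target, 20000, rng)
print(exact, estimate)
```

No distribution model is fitted at any point: the estimate is built only from comparisons between observed interactions, which is the property the abstract attributes to the contrastive formulation.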

Problem

Research questions and friction points this paper is trying to address.

behavioral intensity imbalance

recommendation bias

user engagement heterogeneity

preference distortion

feedback skew

Innovation

Methods, ideas, or system contributions that make the work stand out.

contrastive learning

percentile estimation

behavioral bias debiasing