A Hybrid Cross-Stage Coordination Pre-ranking Model for Online Recommendation Systems

📅 2025-02-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address sample selection bias (SSB) and insufficient long-tail coverage in the pre-ranking stage of large-scale recommendation systems, both caused by over-reliance on downstream ranking outputs, this paper proposes the Hybrid Cross-Stage Coordination Pre-ranking model (HCCP). HCCP introduces a cross-stage coordination paradigm that jointly models upstream retrieval signals and downstream ranking and re-ranking outputs. It mitigates SSB and improves long-tail precision through three components: hybrid sample construction, a Margin InfoNCE joint optimization objective, and multi-stage feature fusion, all implemented within a lightweight architecture. Deployed on JD.com’s e-commerce platform, HCCP delivers gains of up to 14.9% in UCVR and 1.3% in UCTR, outperforming state-of-the-art methods.

📝 Abstract

Large-scale recommendation systems often adopt a cascading architecture consisting of retrieval, pre-ranking, ranking, and re-ranking stages. Under strict latency requirements, pre-ranking uses lightweight models to perform a preliminary selection from the massive set of retrieved candidates. However, recent works focus solely on improving consistency with ranking and rely exclusively on downstream stages. Since the downstream input is derived from the pre-ranking output, these approaches exacerbate the sample selection bias (SSB) issue and the Matthew effect, leading to sub-optimal results. To address this limitation, we propose a novel Hybrid Cross-Stage Coordination Pre-ranking model (HCCP) that integrates information from the upstream (retrieval) and downstream (ranking, re-ranking) stages. Specifically, cross-stage coordination refers to pre-ranking's adaptability to the entire stream and its role as a more effective bridge between upstream and downstream. HCCP consists of Hybrid Sample Construction and Hybrid Objective Optimization. Hybrid sample construction captures multi-level unexposed data from the entire stream and rearranges them into the optimal guiding "ground truth" for pre-ranking learning. Hybrid objective optimization jointly optimizes consistency and long-tail precision through our proposed Margin InfoNCE loss, which is specifically designed to learn from such hybrid unexposed samples, improving overall performance and mitigating the SSB issue. The appendix provides a proof of the efficacy of the proposed loss in selecting potential positives. Extensive offline and online experiments indicate that HCCP outperforms SOTA methods by improving cross-stage coordination. It contributes gains of up to 14.9% in UCVR and 1.3% in UCTR in the JD E-commerce recommendation system. Because of code privacy constraints, we provide pseudocode for reference.
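
The abstract names a Margin InfoNCE loss but does not state its formula. As a rough illustration only, the sketch below shows one common way to add a margin to InfoNCE: the margin is subtracted from the positive score before the softmax, so the positive must beat the negatives by at least that amount to reach a low loss. The function name `margin_info_nce`, the additive form of the margin, and the default `margin` and `temperature` values are all assumptions, not the paper's definition.

```python
import math
from typing import Sequence


def margin_info_nce(
    pos_score: float,
    neg_scores: Sequence[float],
    margin: float = 0.1,       # assumed additive margin, not from the paper
    temperature: float = 0.1,  # assumed softmax temperature
) -> float:
    """InfoNCE for one positive and K negatives with an additive margin.

    Assumed form: -log( exp((s_pos - m)/t) / (exp((s_pos - m)/t) + sum_k exp(s_k/t)) )
    """
    # Penalize the positive logit by the margin before normalizing.
    pos_logit = (pos_score - margin) / temperature
    logits = [pos_logit] + [s / temperature for s in neg_scores]

    # Log-sum-exp with max subtraction for numerical stability.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))

    # Negative log-probability assigned to the positive.
    return log_denominator - pos_logit
```

With `margin=0.0` this reduces to plain InfoNCE; a larger margin raises the loss for the same scores, which pushes the model to separate the positive from the negatives more strongly.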

Problem

Research questions and friction points this paper is trying to address.

Addressing sample selection bias in pre-ranking

Improving cross-stage coordination in recommendation systems

Enhancing long-tail precision and consistency in recommendations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid Cross-Stage Coordination Pre-ranking

Hybrid Sample Construction

Margin InfoNCE loss optimization
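
To make the idea of hybrid sample construction more concrete, here is a minimal sketch of how unexposed candidates from different cascade stages might be rearranged into one ordered list. It assumes each candidate is labeled with the deepest stage it reached and is ranked first by that stage, then by a score from that stage. The stage names, the `Candidate` fields, and the ordering rule are hypothetical; the summary above does not specify the actual procedure.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical levels: a higher value means the candidate survived deeper
# into the cascade before being filtered out.
STAGE_LEVEL = {"retrieval": 0, "pre_ranking": 1, "ranking": 2, "re_ranking": 3}


@dataclass
class Candidate:
    item_id: str
    deepest_stage: str             # key of STAGE_LEVEL
    downstream_score: float = 0.0  # score from that stage, used as a tiebreaker


def build_hybrid_list(candidates: List[Candidate]) -> List[Candidate]:
    """Order candidates by stage level, then by score within the same level."""
    return sorted(
        candidates,
        key=lambda c: (STAGE_LEVEL[c.deepest_stage], c.downstream_score),
        reverse=True,
    )


candidates = [
    Candidate("a", "retrieval"),
    Candidate("b", "ranking", 0.2),
    Candidate("c", "ranking", 0.7),
    Candidate("d", "re_ranking", 0.1),
]
ordered_ids = [c.item_id for c in build_hybrid_list(candidates)]
```

Scores are only compared between candidates at the same level, since scores produced by different stages are generally not on a common scale.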

🔎 Similar Papers

A Comprehensive Survey on Retrieval Methods in Recommender Systems