🤖 AI Summary
ChoirRec targets three challenges in large-scale e-commerce CVR prediction for low-activity users: noisy behavioral signals, sparse user interactions, and model bias toward high-activity users. It uses large language models to construct semantic user groups that span activity levels; enriches sparse user embeddings with group-level priors through a hierarchical representation; and applies a dual-channel, multi-granularity architecture with adaptive fusion to transfer knowledge across users while limiting signal noise and training bias. In experiments on Taobao, ChoirRec improves offline GAUC by 1.16% and raises order volume by 7.24% in online A/B testing, which the authors present as evidence of more accurate CVR prediction for low-activity users.
📝 Abstract
Accurately predicting conversion rates (CVR) for low-activity users remains a fundamental challenge in large-scale e-commerce recommender systems. Existing approaches face three critical limitations: (i) reliance on noisy and unreliable behavioral signals; (ii) insufficient user-level information due to the lack of diverse interaction data; and (iii) a systemic training bias toward high-activity users that overshadows the needs of low-activity users. To address these challenges, we propose ChoirRec, a novel framework that leverages the semantic capabilities of Large Language Models (LLMs) to construct semantic user groups and enhance CVR prediction for low-activity users. With a dual-channel architecture designed for robust cross-user knowledge transfer, ChoirRec comprises three components: (i) a Semantic Group Generation module that utilizes LLMs to form reliable, cross-activity user clusters, thereby filtering out noisy signals; (ii) a Group-aware Hierarchical Representation module that enriches sparse user embeddings with informative group-level priors to mitigate data insufficiency; and (iii) a Group-aware Multi-granularity Module that employs a dual-channel architecture and an adaptive fusion mechanism to ensure effective learning and utilization of group knowledge. We conduct extensive offline and online experiments on Taobao, a leading industrial-scale e-commerce platform. ChoirRec improves GAUC by 1.16% in offline evaluations, while online A/B testing reveals a 7.24% increase in order volume, highlighting its substantial practical value in real-world applications.
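The abstract reports its offline result in GAUC without defining it. As a rough illustration only, the sketch below computes GAUC under a commonly used definition: AUC is computed per user, then averaged with each user's sample count as the weight, and users whose samples contain a single class are skipped. The weighting scheme, the skip rule, and the function names `auc` and `gauc` are assumptions for illustration; the paper's exact evaluation protocol is not given here.

```python
from collections import defaultdict


def auc(labels, scores):
    """Pairwise AUC; ties count as half. Returns None if only one class is present."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def gauc(user_ids, labels, scores):
    """Sample-count-weighted mean of per-user AUC (assumed definition)."""
    by_user = defaultdict(lambda: ([], []))
    for u, y, s in zip(user_ids, labels, scores):
        by_user[u][0].append(y)
        by_user[u][1].append(s)

    weighted_sum = 0.0
    total_weight = 0
    for ys, ss in by_user.values():
        user_auc = auc(ys, ss)
        if user_auc is None:
            # AUC is undefined for a user with only positives or only negatives.
            continue
        weighted_sum += len(ys) * user_auc
        total_weight += len(ys)
    return weighted_sum / total_weight if total_weight else float("nan")


# Toy data: user "a" is ranked perfectly, user "b" is ranked in reverse,
# and user "c" has no conversions and is therefore skipped.
users = ["a", "a", "a", "b", "b", "c", "c"]
labels = [1, 0, 0, 1, 0, 0, 0]
scores = [0.9, 0.2, 0.4, 0.3, 0.6, 0.5, 0.1]
result = gauc(users, labels, scores)
```

Because the metric averages within-user ranking quality, it is insensitive to score offsets between users, which is one reason it is often preferred over global AUC for personalized ranking. Note that low-activity users frequently have few samples or no positives at all, so under this definition they carry little weight or drop out entirely.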