🤖 AI Summary
To address cold-start relevance matching in emerging e-commerce markets, where human labels and user behavior data are scarce, this paper proposes the Cold-Start Relevance Matching (CSRM) framework, built on a multilingual large language model (LLM). First, it uses machine translation tasks to activate the model's cross-lingual transfer capabilities. Second, it adds retrieval-based query augmentation to improve query understanding and incorporate e-commerce knowledge. Third, it introduces a multi-round self-distillation training strategy to mitigate the impact of training label errors. Online deployment at Coupang yields a 45.8% reduction in defect ratio and a 0.866% uplift in session purchase rate.
📝 Abstract
As global e-commerce platforms continue to expand, companies are entering new markets where they encounter cold-start challenges due to limited human labels and user behavior data. In this paper, we share our experience at Coupang in delivering competitive cold-start relevance matching performance for emerging e-commerce markets. Specifically, we present a Cold-Start Relevance Matching (CSRM) framework that uses a multilingual Large Language Model (LLM) to address three challenges: (1) activating the cross-lingual transfer learning abilities of LLMs through machine translation tasks; (2) enhancing query understanding and incorporating e-commerce knowledge through retrieval-based query augmentation; (3) mitigating the impact of training label errors through a multi-round self-distillation training strategy. Our experiments demonstrate the effectiveness of CSRM-LLM and the proposed techniques, which led to a successful real-world deployment and significant online gains: a 45.8% reduction in defect ratio and a 0.866% uplift in session purchase rate.
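The abstract does not spell out how the multi-round self-distillation works, so the following is only a generic sketch of the idea, not the authors' procedure. It assumes a toy 1-D logistic regression in place of an LLM, synthetic labels with random flips in place of real relevance labels, and a hypothetical blending weight `alpha` that mixes the noisy labels with the previous round's predictions to form softer training targets. All function names and parameters here are illustrative.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logreg(xs, targets, epochs=200, lr=0.5):
    """Fit a 1-D logistic regression to (possibly soft) targets in [0, 1]."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, t in zip(xs, targets):
            err = sigmoid(w * x + b) - t  # gradient of cross-entropy w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def self_distill(xs, noisy_labels, rounds=3, alpha=0.5):
    """Assumed loop: each round's predictions soften the next round's targets."""
    targets = [float(y) for y in noisy_labels]
    history = []
    for _ in range(rounds):
        w, b = train_logreg(xs, targets)
        preds = [sigmoid(w * x + b) for x in xs]
        # Blend the original noisy labels with the current model's predictions.
        targets = [alpha * y + (1 - alpha) * p for y, p in zip(noisy_labels, preds)]
        history.append((w, b))
    return history, targets


def make_toy_data(n=200, flip_rate=0.2, seed=0):
    """Synthetic data: the clean label is 1 when x > 0; a fraction of labels is flipped."""
    rng = random.Random(seed)
    xs = [rng.uniform(-3, 3) for _ in range(n)]
    clean = [1 if x > 0 else 0 for x in xs]
    noisy = [1 - y if rng.random() < flip_rate else y for y in clean]
    return xs, clean, noisy
```

Calling `self_distill(*make_toy_data()[::2])` returns the per-round parameters and the final soft targets. In this sketch, a flipped label is pulled part of the way back toward the model's prediction instead of being trusted outright, which is one common motivation for distilling over several rounds when training labels contain errors.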