SMILE: SeMantic Ids Enhanced CoLd Item Representation for Click-through Rate Prediction in E-commerce SEarch

📅 2025-10-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Cold-start items suffer from sparse collaborative signals, which exacerbates the Matthew effect and undermines platform diversity. Existing approaches neither model the asymmetry between collaborative and content signals nor capture fine-grained item-level distinctions. To address this, we propose SMILE, a semantic ID-enhanced representation framework built on a two-stage alignment mechanism: Residual Quantization (RQ) followed by Optimized Product Quantization (OPQ). The RQ codes transfer collaborative patterns shared across items, while the OPQ codes capture item-specific differences, so collaborative and content signals can be aligned and fused at both levels. Offline experiments on large-scale industrial data show improved click-through rate (CTR) prediction for cold-start items, and online A/B testing shows statistically significant gains: +1.66% in item CTR, +1.57% in buyers, and +2.17% in order volume.

📝 Abstract
With the rise of modern search and recommendation platforms, the insufficient collaborative information available for cold-start items exacerbates the Matthew effect among existing platform items, which threatens platform diversity and has become a longstanding problem. Existing methods align items' side content with collaborative information to transfer collaborative signals from high-popularity items to cold-start items. However, these methods account for neither the asymmetry between collaboration and content nor the fine-grained differences among items. To address these issues, we propose SMILE, an item representation enhancement approach based on fused alignment of semantic IDs. Specifically, we use RQ-OPQ encoding to quantize item content and collaborative information, followed by a two-step alignment: RQ encoding transfers shared collaborative signals across items, while OPQ encoding learns the differentiated information of items. Comprehensive offline experiments on large-scale industrial datasets demonstrate the superiority of SMILE, and rigorous online A/B tests confirm statistically significant improvements: item CTR +1.66%, buyers +1.57%, and order volume +2.17%.
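
The RQ-OPQ encoding mentioned in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names (`kmeans`, `nearest`, `rq_opq_encode`), the codebook sizes, and the random input embeddings are all hypothetical, and the learned OPQ rotation is replaced here by a fixed random orthogonal matrix for brevity. The idea shown is only the general one: residual quantization assigns coarse codes level by level, and a product quantizer then encodes what is left over in the final residual.

```python
import numpy as np

def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) codebook."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        assign = nearest(x, centers)
        for j in range(k):
            members = x[assign == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def nearest(x, centers):
    """Index of the closest center (squared Euclidean) for each row of x."""
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def rq_opq_encode(x, rq_levels=2, rq_k=8, pq_subspaces=2, pq_k=4, seed=0):
    """Return (rq_codes, pq_codes) for the embeddings in x (shape n x d)."""
    residual = x.copy()
    rq_codes = []
    # Residual quantization: each level quantizes what the previous one missed.
    for level in range(rq_levels):
        codebook = kmeans(residual, rq_k, seed=seed + level)
        codes = nearest(residual, codebook)
        rq_codes.append(codes)
        residual = residual - codebook[codes]

    # Stand-in for the OPQ rotation (assumption: random, not learned).
    d = x.shape[1]
    rng = np.random.default_rng(seed)
    rotation, _ = np.linalg.qr(rng.standard_normal((d, d)))
    rotated = residual @ rotation

    # Product quantization: split dimensions into subspaces, one codebook each.
    pq_codes = []
    for i, sub in enumerate(np.split(rotated, pq_subspaces, axis=1)):
        codebook = kmeans(sub, pq_k, seed=seed + 100 + i)
        pq_codes.append(nearest(sub, codebook))

    return np.stack(rq_codes, axis=1), np.stack(pq_codes, axis=1)

# Hypothetical item embeddings (content or collaborative), 200 items x 8 dims.
items = np.random.default_rng(42).standard_normal((200, 8))
rq_codes, pq_codes = rq_opq_encode(items)
print("item 0 semantic ID:", rq_codes[0].tolist(), pq_codes[0].tolist())
```

In this sketch many items share the same RQ codes, since there are only a few coarse clusters per level, while the PQ codes on the residual tend to separate items that fall into the same coarse cluster.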

Problem

Research questions and friction points this paper is trying to address.

Addressing cold-start item representation in e-commerce CTR prediction
Mitigating asymmetry between collaborative and content information
Enhancing fine-grained item differentiation using semantic IDs

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses RQ-OPQ encoding for item quantization
Aligns collaborative signals via RQ encoding
Learns item differentiation through OPQ encoding
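
To make the split between the two code types concrete, here is a minimal sketch of how semantic IDs might feed an item representation. It is an illustration under stated assumptions, not the architecture from the paper: the embedding tables are random rather than trained, the sizes are arbitrary, and `semantic_id_parts` and `enhance` are hypothetical helpers. It only shows that items with the same RQ codes receive the same shared component, while different OPQ codes give them different item-specific components.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 16
RQ_LEVELS, RQ_K = 2, 8        # assumed: 2 RQ levels, 8 codes per level
PQ_SUBSPACES, PQ_K = 2, 4     # assumed: 2 OPQ subspaces, 4 codes each

# Hypothetical embedding tables, one per code position (random, not trained).
rq_tables = rng.standard_normal((RQ_LEVELS, RQ_K, DIM)) * 0.1
pq_tables = rng.standard_normal((PQ_SUBSPACES, PQ_K, DIM)) * 0.1

def semantic_id_parts(rq_code, pq_code):
    """Look up and sum the embeddings for each code position."""
    shared = rq_tables[np.arange(RQ_LEVELS), rq_code].sum(axis=0)
    specific = pq_tables[np.arange(PQ_SUBSPACES), pq_code].sum(axis=0)
    return shared, specific

def enhance(id_embedding, rq_code, pq_code):
    """Add both semantic-ID components to an item ID embedding (assumed fusion)."""
    shared, specific = semantic_id_parts(rq_code, pq_code)
    return id_embedding + shared + specific

# Two items in the same coarse cluster (same RQ codes) but with different OPQ codes.
shared_a, specific_a = semantic_id_parts([3, 5], [0, 1])
shared_b, specific_b = semantic_id_parts([3, 5], [3, 2])

# A cold item with an uninformative (zero) ID embedding still gets a usable vector.
cold_item = enhance(np.zeros(DIM), [3, 5], [3, 2])
print(np.allclose(shared_a, shared_b), np.allclose(specific_a, specific_b))
```

Simple addition is used here only because it is the shortest fusion to write down; the paper's alignment and fusion steps may differ.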

Authors

Qihang Zhao
Kuaishou Inc.

Zhongbo Sun
Kuaishou Technology

Xiaoyang Zheng
Kuaishou Technology

Xian Guo
Kuaishou Technology

Siyuan Wang
Kuaishou Technology

Zihan Liang
Kuaishou Technology

Mingcan Peng
Kuaishou Technology

Ben Chen
KuaiShou, Alibaba, HUST, WHU
Multimodal LLM, Generative Recommendation, Semantic Matching

Chenyi Lei
Kuaishou Technology
Recommender System, Information Retrieval, Generative Recommendation, Multimodal