🤖 AI Summary
Cold-start items suffer from sparse collaborative signals, which exacerbates the Matthew effect and undermines platform diversity. Existing approaches struggle to model the asymmetry between collaborative and content signals and to capture fine-grained item-level distinctions. To address this, we propose a semantic ID-enhanced representation framework built on RQ-OPQ encoding with a two-step alignment mechanism: Residual Quantization (RQ) followed by Optimized Product Quantization (OPQ). The RQ codes transfer collaborative signals shared across items, while the OPQ codes capture item-specific differences, enabling disentangled alignment and fusion of collaborative and content signals. Our method significantly improves click-through rate (CTR) prediction accuracy for cold-start items. Extensive offline experiments on large-scale industrial data validate its effectiveness. Online A/B testing demonstrates statistically significant improvements: +1.66% in item CTR, +1.57% in unique buyer count, and +2.17% in order volume.
📝 Abstract
On modern search and recommendation platforms, cold-start items lack sufficient collaborative information, which exacerbates the Matthew effect in favor of existing items and threatens platform diversity. This is a longstanding problem. Existing methods align items' side content with collaborative information to transfer collaborative signals from high-popularity items to cold-start items. However, these methods account for neither the asymmetry between collaboration and content nor the fine-grained differences among items. To address these issues, we propose SMILE, an item representation enhancement approach based on fused alignment of semantic IDs. Specifically, we use RQ-OPQ encoding to quantize item content and collaborative information, followed by a two-step alignment: RQ encoding transfers shared collaborative signals across items, while OPQ encoding learns differentiated information of items. Comprehensive offline experiments on large-scale industrial datasets demonstrate the superiority of SMILE, and rigorous online A/B tests confirm statistically significant improvements: item CTR +1.66%, buyers +1.57%, and order volume +2.17%.
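To make the RQ-OPQ encoding above more concrete, the following is a minimal sketch of how an item embedding might be turned into a two-part semantic ID. It is not SMILE's implementation. Several things here are assumptions made for illustration: the codebooks are random rather than learned, the OPQ rotation is replaced by an identity matrix (a real OPQ learns an orthogonal rotation), the product-quantization step is applied to the residual left over by RQ, and all function names (`rq_encode`, `pq_encode`, `semantic_id`) are hypothetical.

```python
import numpy as np

def nearest_code(x, codebook):
    """Return the index of the codebook row closest to x (squared L2 distance)."""
    return int(np.argmin(((codebook - x) ** 2).sum(axis=1)))

def rq_encode(x, codebooks):
    """Residual quantization: each level quantizes what earlier levels left over."""
    residual = np.asarray(x, dtype=float).copy()
    codes = []
    for cb in codebooks:
        idx = nearest_code(residual, cb)
        codes.append(idx)
        residual = residual - cb[idx]
    return codes, residual

def pq_encode(x, rotation, sub_codebooks):
    """Rotate x, split it into equal sub-vectors, quantize each one independently."""
    rotated = rotation @ np.asarray(x, dtype=float)
    parts = np.split(rotated, len(sub_codebooks))
    return [nearest_code(p, cb) for p, cb in zip(parts, sub_codebooks)]

def semantic_id(x, rq_codebooks, rotation, pq_codebooks):
    """Coarse RQ codes (shared across similar items) plus PQ codes on the residual.

    Assumption: the PQ stage encodes the RQ residual, i.e. the item-specific part.
    """
    rq_codes, residual = rq_encode(x, rq_codebooks)
    pq_codes = pq_encode(residual, rotation, pq_codebooks)
    return rq_codes, pq_codes

# Toy setup: random (untrained) codebooks, identity rotation as an OPQ placeholder.
rng = np.random.default_rng(0)
dim, levels, n_codes, n_sub = 8, 2, 4, 2
rq_codebooks = [rng.normal(size=(n_codes, dim)) for _ in range(levels)]
rotation = np.eye(dim)
pq_codebooks = [rng.normal(size=(n_codes, dim // n_sub)) for _ in range(n_sub)]

item_embedding = rng.normal(size=dim)
rq_codes, pq_codes = semantic_id(item_embedding, rq_codebooks, rotation, pq_codebooks)
print("RQ codes:", rq_codes, "PQ codes:", pq_codes)
```

In this sketch, two items that land in the same RQ cells share their coarse codes, which is one way to read "shared collaborative signals", while the PQ codes on the residual can still tell them apart. How SMILE trains the codebooks and aligns the two code types is not specified in the abstract and is not modeled here.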