DAS: Dual-Aligned Semantic IDs Empowered Industrial Recommender System

📅 2025-08-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Multimodal large language models (MLLMs) generate semantic IDs whose representations misalign with collaborative filtering (CF) signals, while conventional two-stage alignment incurs information loss and inflexible optimization. Method: We propose a single-stage dual-alignment semantic ID framework that jointly optimizes discrete quantization and cross-modal alignment. It introduces a multi-view contrastive alignment mechanism and a bidirectional dual-learning strategy to enable adaptive, flexible alignment of user- and ad-side semantic IDs. Furthermore, it integrates MLLM embeddings, ID-based CF debiasing, and triple co-occurrence structures (u2i, i2i, u2u) into a unified contrastive learning objective. Contribution/Results: Deployed across multiple advertising scenarios at Kuaishou, the method serves over 400 million users daily. Offline evaluations and online A/B tests demonstrate statistically significant improvements in recommendation accuracy and system efficiency.
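The summary mentions i2i co-occurrence structures without saying how they are built. As a rough illustration only, the sketch below counts how often two ads appear in the same user's history. The function name `item_cooccurrence`, the toy histories, and the counting rule are all assumptions for illustration, not the paper's construction.

```python
from collections import Counter
from itertools import combinations

def item_cooccurrence(user_histories):
    """Count, for every unordered item pair, how many users interacted with both."""
    counts = Counter()
    for items in user_histories.values():
        # De-duplicate and sort so each pair is counted once per user, in a stable order.
        for a, b in combinations(sorted(set(items)), 2):
            counts[(a, b)] += 1
    return counts

# Hypothetical toy interaction logs (user id -> ads the user engaged with).
histories = {
    "u1": ["ad_a", "ad_b", "ad_c"],
    "u2": ["ad_a", "ad_b"],
    "u3": ["ad_c"],
}
pairs = item_cooccurrence(histories)
print(pairs.most_common(1))  # [(('ad_a', 'ad_b'), 2)]
```

Pairs with high counts could then serve as positives in an i2i contrastive objective; swapping the roles of users and items would give a u2u analogue.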

📝 Abstract
Semantic IDs are discrete identifiers generated by quantizing Multi-modal Large Language Model (MLLM) embeddings, enabling efficient multi-modal content integration in recommendation systems. However, their lack of collaborative signals results in a misalignment with downstream discriminative and generative recommendation objectives. Recent studies have introduced various alignment mechanisms to address this problem, but their two-stage framework design still leads to two main limitations: (1) inevitable information loss during alignment, and (2) inflexibility in applying adaptive alignment strategies, which constrains mutual information maximization during the alignment process. To address these limitations, we propose a novel and flexible one-stage Dual-Aligned Semantic IDs (DAS) method that simultaneously optimizes quantization and alignment, preserving semantic integrity and alignment quality while avoiding the information loss typically associated with two-stage methods. DAS also achieves more efficient alignment between the semantic IDs and collaborative signals through two approaches: (1) Multi-view Contrastive Alignment: To maximize mutual information between semantic IDs and collaborative signals, we first incorporate an ID-based CF debiasing module, and then design three contrastive alignment methods: dual user-to-item (u2i), dual item-to-item/user-to-user (i2i/u2u), and dual co-occurrence item-to-item/user-to-user (i2i/u2u). (2) Dual Learning: By aligning the dual quantizations of users and ads, the constructed semantic IDs for users and ads achieve stronger alignment. Finally, we conduct extensive offline experiments and online A/B tests to evaluate DAS's effectiveness. DAS is now deployed across various advertising scenarios in the Kuaishou App, serving over 400 million users daily.
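To make "quantizing embeddings into semantic IDs" concrete, here is a minimal sketch of residual quantization, one common way to turn a dense vector into a short tuple of code indices. The abstract does not state which quantizer DAS uses, so the residual scheme, the random codebooks, and the names `nearest_code` and `semantic_id` are assumptions for illustration.

```python
import random

def nearest_code(vec, codebook):
    """Return (index, codeword) of the codeword closest to vec in squared L2 distance."""
    best_idx, best_dist = 0, float("inf")
    for idx, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vec, code))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx, codebook[best_idx]

def semantic_id(embedding, codebooks):
    """Quantize an embedding into a tuple of code indices, one index per codebook level."""
    residual = list(embedding)
    ids = []
    for codebook in codebooks:
        idx, code = nearest_code(residual, codebook)
        ids.append(idx)
        # The next level quantizes whatever the current level failed to capture.
        residual = [r - c for r, c in zip(residual, code)]
    return tuple(ids)

# Hypothetical toy setup: 2 levels, 4 random codewords per level, 3-dimensional vectors.
rng = random.Random(0)
codebooks = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)] for _ in range(2)]
item_embedding = [0.2, -0.5, 0.9]  # stands in for an MLLM embedding of an ad
print(semantic_id(item_embedding, codebooks))
```

In a trained system the codebooks would be learned rather than random. The one-stage idea described in the abstract is that they are learned together with the alignment objective instead of being frozen beforehand.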

Problem

Research questions and friction points this paper is trying to address.

Misalignment between semantic IDs and recommendation objectives
Information loss in two-stage alignment frameworks
Inflexible adaptive alignment strategies in current methods

Innovation

Methods, ideas, or system contributions that make the work stand out.

One-stage Dual-Aligned Semantic IDs method
Multi-view Contrastive Alignment techniques
Dual Learning for user-ad alignment
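
As a hedged illustration of what a user-to-item (u2i) contrastive alignment term can look like, the sketch below computes an InfoNCE loss with in-batch negatives and averages the user-to-item and item-to-user directions. This is a generic formulation, not the paper's exact loss; the names `info_nce` and `symmetric_u2i_loss`, the temperature value, and the toy vectors are assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss: positives[i] matches anchors[i]; other rows are in-batch negatives."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [dot(anchor, p) / temperature for p in positives]
        # Log-sum-exp with the max subtracted for numerical stability.
        max_logit = max(logits)
        log_denom = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)

def symmetric_u2i_loss(user_vecs, item_vecs, temperature=0.1):
    """Average of the user-to-item and item-to-user InfoNCE terms."""
    return 0.5 * (info_nce(user_vecs, item_vecs, temperature)
                  + info_nce(item_vecs, user_vecs, temperature))

# Hypothetical toy batch: row i of each list forms a positive (user, ad) pair.
users = [[1.0, 0.0], [0.0, 1.0]]
matched_ads = [[1.0, 0.0], [0.0, 1.0]]
swapped_ads = [[0.0, 1.0], [1.0, 0.0]]
print(symmetric_u2i_loss(users, matched_ads) < symmetric_u2i_loss(users, swapped_ads))  # True
```

The loss is low when each user vector is closest to its own ad and high when the pairing is scrambled, which is the property a contrastive alignment term relies on.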

Authors

Wencai Ye
Kuaishou Technology, Beijing, China
Mingjie Sun
Thinking Machines Lab
Shaoyun Shi
Tsinghua University
Recommendation, Deep Learning
Peng Wang
Kuaishou Technology, Beijing, China
Wenjin Wu
Kuaishou Technology, Beijing, China
Peng Jiang
Kuaishou Technology, Beijing, China