RQ-GMM: Residual Quantized Gaussian Mixture Model for Multimodal Semantic Discretization in CTR Prediction

📅 2026-02-13

📝 Abstract
Multimodal content is crucial for click-through rate (CTR) prediction. However, directly incorporating continuous embeddings from pre-trained models into CTR models yields suboptimal results due to misaligned optimization objectives and convergence speed inconsistency during joint training. Discretizing embeddings into semantic IDs before feeding them into CTR models offers a more effective solution, yet existing methods suffer from limited codebook utilization, reconstruction accuracy, and semantic discriminability. We propose RQ-GMM (Residual Quantized Gaussian Mixture Model), which introduces probabilistic modeling to better capture the statistical structure of multimodal embedding spaces. Through Gaussian Mixture Models combined with residual quantization, RQ-GMM achieves superior codebook utilization and reconstruction accuracy. Experiments on public datasets and online A/B tests on a large-scale short-video platform serving hundreds of millions of users demonstrate substantial improvements: RQ-GMM yields a 1.502% gain in Advertiser Value over strong baselines. The method has been fully deployed, serving daily recommendations for hundreds of millions of users.
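The abstract names the two ingredients (Gaussian Mixture Models and residual quantization) but does not spell out the algorithm. The following is a minimal sketch of how the two can be combined in general, not the authors' implementation. Everything in it is an assumption made for illustration: the spherical covariance, the plain EM loop, the hard assignment by maximum posterior, and the hypothetical helper names `fit_gmm`, `fit_rq_gmm`, `encode`, and `decode`. Each level fits a mixture to the residuals left by the previous level, and the chosen component index at each level becomes one token of the semantic ID.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def posterior(x, weights, means, variances):
    """Component responsibilities for x under a spherical GMM."""
    dim = len(x)
    logs = [
        math.log(w) - 0.5 * dim * math.log(2 * math.pi * v) - 0.5 * sq_dist(x, m) / v
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logs]
    total = sum(exps)
    return [e / total for e in exps]


def fit_gmm(points, k, iters=20, seed=0):
    """Fit a spherical GMM with EM (illustrative, assumed setup)."""
    rng = random.Random(seed)
    dim = len(points[0])
    means = [list(p) for p in rng.sample(points, k)]
    weights = [1.0 / k] * k
    variances = [1.0] * k
    for _ in range(iters):
        # E-step: soft assignment of every point to every component.
        resp = [posterior(p, weights, means, variances) for p in points]
        # M-step: re-estimate weight, mean, and variance per component.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-9
            weights[j] = nj / len(points)
            means[j] = [
                sum(r[j] * p[d] for r, p in zip(resp, points)) / nj
                for d in range(dim)
            ]
            sq = sum(r[j] * sq_dist(p, means[j]) for r, p in zip(resp, points))
            variances[j] = max(sq / (nj * dim), 1e-6)
    return weights, means, variances


def assign(x, gmm):
    """Index of the component with the highest posterior for x."""
    post = posterior(x, *gmm)
    return max(range(len(post)), key=post.__getitem__)


def subtract(a, b):
    return [x - y for x, y in zip(a, b)]


def fit_rq_gmm(points, k, levels):
    """Fit one GMM per level, each on the residuals of the previous level."""
    stages = []
    residuals = [list(p) for p in points]
    for level in range(levels):
        gmm = fit_gmm(residuals, k, seed=level)
        stages.append(gmm)
        residuals = [subtract(r, gmm[1][assign(r, gmm)]) for r in residuals]
    return stages


def encode(x, stages):
    """Map an embedding to a list of discrete codes, one per level."""
    codes, residual = [], list(x)
    for gmm in stages:
        c = assign(residual, gmm)
        codes.append(c)
        residual = subtract(residual, gmm[1][c])
    return codes


def decode(codes, stages):
    """Reconstruct an embedding as the sum of the selected component means."""
    out = [0.0] * len(stages[0][1][0])
    for c, gmm in zip(codes, stages):
        out = [o + m for o, m in zip(out, gmm[1][c])]
    return out
```

Under these assumptions, `encode(embedding, stages)` returns one integer per level, and that list is the kind of semantic ID a CTR model could consume through ordinary ID embedding tables. `decode` is only needed to measure reconstruction error.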
Problem

Research questions and friction points this paper is trying to address.

CTR prediction
multimodal embedding
semantic discretization
codebook utilization
reconstruction accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Residual Quantization
Gaussian Mixture Model
Multimodal Discretization
CTR Prediction
Semantic Codebook
Ziye Tong
Tencent
Jiahao Liu
Fudan University
Weimin Zhang
Tencent
Hongji Ruan
Beijing Jiaotong University
Derick Tang
Tencent
Zhanpeng Zeng
University of Wisconsin Madison
Transformer Efficiency
Qinsong Zeng
Tencent
Peng Zhang
Fudan University
Tun Lu
Fudan University
Ning Gu
Fudan University
Collaborative Computing, CSCW, Social Computing, Human Computer Interaction, Recommendation