PCR-CA: Parallel Codebook Representations with Contrastive Alignment for Multiple-Category App Recommendation

📅 2025-08-25
🤖 AI Summary
To address insufficient personalization in recommendation systems caused by semantic overlap among multiple-category apps, this paper proposes PCR-CA, an end-to-end CTR prediction framework. Its core contributions are threefold: (1) a parallel codebook vector quantized autoencoder (VQ-AE) that discretizes and disentangles multi-dimensional app semantics; (2) user-level and item-level contrastive alignment losses that explicitly align semantic representations with collaborative signals; and (3) a dual-attention mechanism that jointly models ID-based and semantic features. Experiments on a large-scale dataset show AUC improvements of +0.76% overall over strong baselines and +2.15% for long-tail apps. Online A/B testing shows lifts of +10.52% in CTR and +16.30% in CVR. The framework has been fully deployed on the Microsoft Store.

📝 Abstract
Modern app store recommender systems struggle with multiple-category apps, as traditional taxonomies fail to capture overlapping semantics, leading to suboptimal personalization. We propose PCR-CA (Parallel Codebook Representations with Contrastive Alignment), an end-to-end framework for improved CTR prediction. PCR-CA first extracts compact multimodal embeddings from app text, then introduces a Parallel Codebook VQ-AE module that learns discrete semantic representations across multiple codebooks in parallel, unlike hierarchical residual quantization (RQ-VAE). This design enables independent encoding of diverse aspects (e.g., gameplay, art style), better modeling multiple-category semantics. To bridge semantic and collaborative signals, we employ a contrastive alignment loss at both the user and item levels, enhancing representation learning for long-tail items. Additionally, a dual-attention fusion mechanism combines ID-based and semantic features to capture user interests, especially for long-tail apps. Experiments on a large-scale dataset show PCR-CA achieves a +0.76% AUC improvement over strong baselines, with +2.15% AUC gains for long-tail apps. Online A/B testing further validates our approach, showing a +10.52% lift in CTR and a +16.30% improvement in CVR, demonstrating PCR-CA's effectiveness in real-world deployment. The new framework has now been fully deployed on the Microsoft Store.
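
The contrast between parallel codebooks and hierarchical residual quantization can be illustrated with a small sketch. This is not the paper's implementation: it assumes, purely for illustration, that the latent vector is split into equal slices with one codebook per slice, whereas PCR-CA may use learned projections or another scheme. The function names `nearest_code`, `parallel_quantize`, and `residual_quantize` are hypothetical.

```python
def nearest_code(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


def parallel_quantize(latent, codebooks):
    """Quantize each slice of `latent` independently with its own codebook.

    Assumption (not from the paper): the latent is split into
    len(codebooks) equal-width slices, one per codebook.
    """
    width = len(latent) // len(codebooks)
    codes, quantized = [], []
    for k, codebook in enumerate(codebooks):
        chunk = latent[k * width:(k + 1) * width]
        idx = nearest_code(chunk, codebook)
        codes.append(idx)
        quantized.extend(codebook[idx])
    return codes, quantized


def residual_quantize(latent, codebooks):
    """RQ-style contrast: each level quantizes what the previous level left over."""
    residual = list(latent)
    codes = []
    for codebook in codebooks:
        idx = nearest_code(residual, codebook)
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes


# Toy data: a 4-dimensional latent and two small codebooks per scheme.
latent = [0.9, 1.1, 0.1, -0.2]
parallel_books = [
    [[0.0, 0.0], [1.0, 1.0]],  # codebook for slice 0
    [[0.0, 0.0], [1.0, 1.0]],  # codebook for slice 1
]
residual_books = [
    [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]],  # level 0
    [[0.0, 0.0, 0.0, 0.0], [0.5, 0.5, 0.5, 0.5]],  # level 1
]
codes, quantized = parallel_quantize(latent, parallel_books)
rq_codes = residual_quantize(latent, residual_books)
print(codes, quantized, rq_codes)
```

In the parallel version, each code is chosen without reference to the others, so one index could in principle track an aspect such as gameplay while another tracks art style. In the residual version, every later code depends on what the earlier levels already explained.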

Problem

Research questions and friction points this paper is trying to address.

Improving CTR prediction for multiple-category app recommendations
Addressing overlapping semantics in traditional app taxonomies
Enhancing representation learning for long-tail items

Innovation

Methods, ideas, or system contributions that make the work stand out.

Parallel Codebook VQ-AE for discrete semantic representations
Contrastive alignment loss bridges semantic and collaborative signals
Dual-attention fusion combines ID-based and semantic features
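
As a rough illustration of how a contrastive alignment loss can pull semantic and collaborative embeddings together, the sketch below uses an InfoNCE-style objective with in-batch negatives. The choice of InfoNCE, the cosine similarity, the temperature value, and the names `cosine` and `info_nce` are all assumptions made for this example; the summary above does not specify the exact loss PCR-CA uses.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(semantic, collaborative, temperature=0.1):
    """Mean InfoNCE loss; row i of each list is treated as a positive pair.

    All other rows in `collaborative` act as in-batch negatives for
    anchor i. The temperature is an illustrative value, not a tuned one.
    """
    losses = []
    for i, anchor in enumerate(semantic):
        logits = [cosine(anchor, c) / temperature for c in collaborative]
        # Log-sum-exp with max subtraction for numerical stability.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings for two items.
semantic_emb = [[1.0, 0.0], [0.0, 1.0]]
aligned_collab = [[1.0, 0.0], [0.0, 1.0]]     # matches the semantic view
misaligned_collab = [[0.0, 1.0], [1.0, 0.0]]  # positives swapped

aligned_loss = info_nce(semantic_emb, aligned_collab)
misaligned_loss = info_nce(semantic_emb, misaligned_collab)
print(aligned_loss, misaligned_loss)
```

The loss is small when each item's semantic embedding is closest to its own collaborative embedding and large when the pairing is wrong. The same function could be applied to user-level pairs, under the same assumptions.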

Bin Tan
Ph.D Student, Wuhan University
Computer Vision

Wangyao Ge
Microsoft Store, China

Yidi Wang
Microsoft Store, China

Xin Liu
Microsoft Store, USA

Jeff Burtoft
Microsoft Store, USA

Hao Fan
Zhejiang A&F University
Recommender System

Hui Wang
Microsoft Store, China