🤖 AI Summary
Multi-stage cascaded advertising recommendation pipelines suffer from objective misalignment and error propagation, while existing unified generative recommenders still fall short of industrial deployment requirements. This paper proposes GPR (Generative Pre-trained Recommender), which the authors describe as the first one-model framework to treat advertising recommendation as an end-to-end generative task, unifying user interest modeling and ad generation into a single sequence generation problem. It introduces a unified input representation that maps ads and organic content into a shared multi-level semantic ID space, a Heterogeneous Hierarchical Decoder (HHD) that separates user intent modeling from ad generation, and a multi-stage training strategy combining Multi-Token Prediction (MTP), Value-Aware Fine-Tuning, and HEPO (Hierarchy Enhanced Policy Optimization). GPR is fully deployed in the Tencent Weixin Channels (WeChat Channels) advertising system, where it is reported to deliver significant improvements in key business metrics, including GMV and CTCVR, supporting the effectiveness and scalability of the “one-model, full-pipeline” paradigm.
📝 Abstract
As an intelligent infrastructure connecting users with commercial content, advertising recommendation systems play a central role in information flow and value creation within the digital economy. However, existing multi-stage advertising recommendation systems suffer from objective misalignment and error propagation, making it difficult to achieve global optimality, while unified generative recommendation models still struggle to meet the demands of practical industrial applications. To address these issues, we propose GPR (Generative Pre-trained Recommender), the first one-model framework that redefines advertising recommendation as an end-to-end generative task, replacing the traditional cascading paradigm with a unified generative approach. To realize GPR, we introduce three key innovations spanning unified representation, network architecture, and training strategy. First, we design a unified input schema and tokenization method tailored to advertising scenarios, mapping both ads and organic content into a shared multi-level semantic ID space, thereby enhancing semantic alignment and modeling consistency across heterogeneous data. Second, we develop the Heterogeneous Hierarchical Decoder (HHD), a dual-decoder architecture that decouples user intent modeling from ad generation, achieving a balance between training efficiency and inference flexibility while maintaining strong modeling capacity. Finally, we propose a multi-stage joint training strategy that integrates Multi-Token Prediction (MTP), Value-Aware Fine-Tuning, and the Hierarchy Enhanced Policy Optimization (HEPO) algorithm, forming a complete generative recommendation pipeline that unifies interest modeling, value alignment, and policy optimization. GPR has been fully deployed in the Tencent Weixin Channels advertising system, delivering significant improvements in key business metrics including GMV and CTCVR.
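To make the idea of a shared multi-level semantic ID space more concrete, the following is a minimal sketch, not the paper's tokenizer. The abstract does not say how the IDs are produced, so residual quantization is assumed here purely for illustration, and the names `semantic_id`, `nearest`, and `CODEBOOKS`, as well as the toy two-dimensional embeddings, are hypothetical. Each level picks the nearest codeword for what remains of the embedding, so the first code is coarse and later codes refine it.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest(codebook: List[Vector], v: Vector) -> int:
    """Return the index of the codeword closest to v (squared Euclidean distance)."""
    def dist(c: Vector) -> float:
        return sum((a - b) ** 2 for a, b in zip(c, v))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def semantic_id(embedding: Vector, codebooks: List[List[Vector]]) -> List[int]:
    """Quantize an embedding into one code per level, from coarse to fine.

    Assumption: residual quantization. After each level, the chosen codeword
    is subtracted and the next level encodes the remaining residual.
    """
    residual = list(embedding)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes


# Hypothetical toy codebooks shared by ads and organic content.
CODEBOOKS = [
    [(0.0, 0.0), (1.0, 1.0)],                 # level 1: coarse clusters
    [(0.0, 0.0), (0.2, -0.2), (-0.2, 0.2)],   # level 2: refinements
]

ad_embedding = (1.2, 0.8)      # hypothetical ad
video_embedding = (0.8, 1.2)   # hypothetical organic video

print(semantic_id(ad_embedding, CODEBOOKS))     # [1, 1]
print(semantic_id(video_embedding, CODEBOOKS))  # [1, 2]
```

In this toy setup the ad and the organic video share the same first-level code and differ only at the second level, which is the kind of cross-domain alignment a shared ID space is meant to provide: a sequence model can treat both item types as tokens from one vocabulary.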