Unified Value Alignment for Generative Recommendation in Industrial Advertising

📅 2026-05-07
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This work addresses the challenge in industrial advertising recommendation where generative approaches struggle to simultaneously capture user interests and commercial value, as existing methods predominantly focus on semantic modeling while neglecting end-to-end alignment with full-funnel business objectives. To bridge this gap, the authors propose UniVA, a novel framework that unifies value signals throughout the generative recommendation pipeline. UniVA integrates commercial value directly into tokenization via a Commercial SID tokenizer, employs a Generation-as-Ranking decoder combining supervised learning with eCPM-aware reinforcement learning, and introduces a value-guided personalized beam search strategy. Extensive experiments on WeChat Channels' advertising platform demonstrate that UniVA improves offline Hit Rate@100 by 37.04% and achieves a statistically significant 1.5% increase in online GMV in A/B tests.
πŸ“ Abstract
Generative Recommendation (GR) reformulates recommendation as a next-token generation problem and has shown promise in industrial applications. However, extending GR to industrial advertising is non-trivial because the system must optimize not only user interest but also commercial value. Existing GR pipelines remain largely semantics-centric, making it difficult to align value signals across tokenization, decoding, and online serving. To address this issue, we propose UniVA, a Unified Value Alignment framework for advertising recommendation. We first introduce a Commercial SID tokenizer that injects value-related attributes into SID construction, yielding value-discriminative item representations. We then develop a Generation-as-Ranking SID Decoder jointly optimized by supervised learning and eCPM-aware reinforcement learning, which fuses value scores into next-item SID generation to perform generation and ranking in one decoding process. Finally, we design a value-guided personalized beam search that reuses generation-as-ranking logits as online value guidance and applies a personalized trie tree to constrain decoding to request-valid SID paths. Experiments on the Tencent WeChat Channels advertising platform show that UniVA achieves a 37.04% improvement in offline Hit Rate@100 over the baseline and a 1.5% GMV lift in online A/B tests.
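To make the last step of the abstract more concrete, here is a minimal Python sketch of a trie-constrained beam search with a value term added to each beam's score. It is an illustration under stated assumptions, not the authors' implementation: the names `build_trie`, `constrained_beam_search`, `log_prob_fn`, `value_fn`, and the weight `alpha` are hypothetical, the additive combination of log-probability and value score is an assumed fusion rule, and all SIDs are assumed to have the same fixed length.

```python
import math
from typing import Callable, Dict, Iterable, List, Tuple

Sid = Tuple[int, ...]
Trie = Dict[int, "Trie"]  # nested dicts: token -> sub-trie


def build_trie(valid_sids: Iterable[Sid]) -> Trie:
    """Build a prefix tree over the SIDs that are valid for one request."""
    root: Trie = {}
    for sid in valid_sids:
        node = root
        for tok in sid:
            node = node.setdefault(tok, {})
    return root


def constrained_beam_search(
    log_prob_fn: Callable[[Sid, int], float],  # assumed: log p(token | prefix)
    value_fn: Callable[[Sid], float],          # assumed: value score of a prefix
    trie: Trie,
    sid_len: int,
    beam_width: int,
    alpha: float = 1.0,                        # assumed weight on the value term
) -> List[Tuple[Sid, float]]:
    """Expand only trie-valid tokens and rank beams by log-prob plus value."""
    beams: List[Tuple[Sid, float, Trie]] = [((), 0.0, trie)]
    for _ in range(sid_len):
        candidates: List[Tuple[Sid, float, Trie]] = []
        for prefix, score, node in beams:
            # The trie restricts expansion to tokens that keep the path valid.
            for tok, child in node.items():
                new_prefix = prefix + (tok,)
                new_score = (
                    score
                    + log_prob_fn(prefix, tok)
                    + alpha * value_fn(new_prefix)
                )
                candidates.append((new_prefix, new_score, child))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return [(prefix, score) for prefix, score, _ in beams]


# Toy usage: three request-valid SIDs, a uniform token model, and a value
# bonus on a single item. All numbers here are made up for illustration.
valid_sids = [(1, 2, 3), (1, 2, 4), (5, 6, 7)]
results = constrained_beam_search(
    log_prob_fn=lambda prefix, tok: math.log(0.5),
    value_fn=lambda prefix: 1.0 if prefix == (5, 6, 7) else 0.0,
    trie=build_trie(valid_sids),
    sid_len=3,
    beam_width=2,
)
# The value bonus lifts (5, 6, 7) to the top of the beam.
```

Because candidates are drawn only from the children of the current trie node, every returned path is one of the request-valid SIDs by construction, so no post-hoc filtering is needed in this sketch.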
Problem

Research questions and friction points this paper is trying to address.

Generative Recommendation
Value Alignment
Industrial Advertising
Commercial Value
Token Generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative Recommendation
Value Alignment
Commercial SID Tokenizer
Generation-as-Ranking
eCPM-aware Reinforcement Learning
Authors
Xinxun Zhang
Wuhan University, China
Yuling Xiong
Tencent Inc., China
Jiale Zhou
MDH
requirements engineering, safety critical systems, hazard analysis, ontology
Zhengkai Guo
Tencent Inc., China
Zhennan Pang
Tencent Inc., China
Junbang Huo
Tencent Inc., China
Jingwen Wang
Tencent Inc., China
Xuyang Sun
Tencent Inc., China
Enming Zhang
Tencent Inc., China
Jiaguang Jin
Tencent Inc., China
Changping Wang
Tencent Inc., China
Yi Li
Tencent Inc., China
Jun Zhang
Tencent
AI codec, image/video generation, medical image analysis
Xiao Yan
Wuhan University
Systems for Data Processing
Jiawei Jiang
Wuhan University
Machine Learning System, Federated Learning, Graph Learning
Jie Jiang
Tencent Inc., China