🤖 AI Summary
Mainstream diffusion models assume continuous data spaces and rely on symmetric Gaussian noise, which is ill-suited to the inherently discrete user-behavior data in recommender systems and erodes personalized information. Method: We propose the first asymmetric diffusion model tailored for discrete recommendation scenarios. It introduces a generalized forward process that simulates realistic feature-missing patterns, performs reverse denoising in an asymmetric latent space, and incorporates a task-aware optimization strategy that explicitly preserves personalized representations during generative learning. Contribution/Results: Extensive offline experiments validate its effectiveness, and online A/B tests on the Douyin Music App show statistically significant improvements: +0.131% in active user days and +0.166% in average session duration. This work pioneers the integration of asymmetric diffusion mechanisms into recommender systems, establishing a novel paradigm for robust generative representation learning on discrete data.
📝 Abstract
Recently, motivated by the outstanding achievements of diffusion models, the diffusion process has been employed to strengthen representation learning in recommendation systems. Most diffusion-based recommendation models apply standard Gaussian noise in symmetric forward and reverse processes over a continuous data space. However, samples in recommendation systems inhabit a discrete data space, which is fundamentally different from a continuous one, and Gaussian noise can corrupt personalized information within latent representations. In this work, we propose a novel and effective method, named Asymmetric Diffusion Recommendation Model (AsymDiffRec), which learns the forward and reverse processes in an asymmetric manner. We define a generalized forward process that simulates the missing features in real-world recommendation samples. The reverse process is then performed in an asymmetric latent feature space. To preserve personalized information within the latent representation, a task-oriented optimization strategy is introduced. In the serving stage, a raw sample with missing features is treated as a noisy input from which a denoised, robust representation is generated for the final prediction. By equipping base models with AsymDiffRec, we conduct online A/B tests, achieving improvements of +0.131% and +0.166% in users' active days and app usage duration, respectively. Extensive offline experiments also demonstrate improvements. AsymDiffRec has been implemented in the Douyin Music App.
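The abstract's asymmetric setup can be illustrated with a minimal sketch: the forward process is discrete feature dropping rather than Gaussian noising, and the reverse process is a denoiser acting in a latent space of different shape than the input. All function names, dimensions, and the masking rate below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward_mask(x, p=0.3, rng=rng):
    """Generalized forward process (assumption): corrupt a discrete
    feature vector by randomly dropping features, mimicking the
    missing-feature patterns of real recommendation samples."""
    keep = rng.random(x.shape) > p
    return x * keep, keep

def denoise(x_noisy, W_enc, W_dec):
    """Reverse process (sketch): encode the corrupted sample into a
    latent space and decode a robust representation. The process is
    asymmetric: denoising happens in the latent space, not in the
    raw discrete space where the corruption was applied."""
    h = np.tanh(x_noisy @ W_enc)   # latent representation
    return h @ W_dec               # denoised representation

d, k = 8, 4  # raw feature dim vs. latent dim (hypothetical sizes)
W_enc = rng.normal(scale=0.1, size=(d, k))
W_dec = rng.normal(scale=0.1, size=(k, d))

x = rng.integers(0, 2, size=d).astype(float)  # discrete behavior features
x_noisy, keep = forward_mask(x)
x_hat = denoise(x_noisy, W_enc, W_dec)

# Task-oriented objective (assumption): a reconstruction term; the paper
# additionally ties the latent to the downstream prediction task so that
# personalized information is preserved.
recon_loss = np.mean((x_hat - x) ** 2)
```

At serving time, the same `denoise` pass would be applied directly to an incoming sample with naturally missing features, treating it as the noisy input to be repaired before the final prediction.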