🤖 AI Summary
This work proposes the PSAD framework to address two challenges in deploying generative reranking models in practice: achieving high generation quality and low-latency inference simultaneously is difficult, and user-item feature interactions are often insufficient. PSAD integrates a semi-autoregressive generation mechanism to balance ranking effectiveness with inference efficiency, employs online knowledge distillation to dynamically transfer knowledge from a teacher model to a lightweight scoring network, and introduces a user profiling module that captures dynamic interests and intents, thereby enhancing personalized interactions. Extensive experiments on three large-scale public datasets demonstrate that PSAD significantly outperforms state-of-the-art methods in both ranking performance and inference efficiency.
📝 Abstract
Generative models offer a promising paradigm for final-stage reranking in multi-stage recommender systems, owing to their ability to capture inter-item dependencies within reranked lists. However, their practical deployment still faces two key challenges: (1) an inherent conflict between achieving high generation quality and ensuring low-latency inference, and (2) insufficient interaction between user and item features in existing methods. To address these challenges, we propose a novel Personalized Semi-Autoregressive with online knowledge Distillation (PSAD) framework for reranking. In this framework, the teacher model adopts a semi-autoregressive generator to balance generation quality and efficiency, while its ranking knowledge is distilled online into a lightweight scoring network during joint training, enabling real-time, efficient inference. Furthermore, we propose a User Profile Network (UPN) that injects user intent and models interest dynamics, enabling deeper interactions between users and items. Extensive experiments on three large-scale public datasets demonstrate that PSAD significantly outperforms state-of-the-art baselines in both ranking performance and inference efficiency.
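The abstract's core mechanism, distilling a generative teacher's ranking knowledge into a lightweight scoring network, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the frozen linear "teacher" here merely stands in for the semi-autoregressive generator, the student is a linear scorer, and the list-wise distillation objective (KL divergence between the two softmax score distributions over a candidate list) is a common generic choice, all assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: d-dim item features for one candidate list,
# a frozen stand-in "teacher" scorer, and a lightweight linear "student".
d, n_items = 8, 5
X = rng.normal(size=(n_items, d))   # candidate item features
w_teacher = rng.normal(size=d)      # frozen teacher (stands in for the generator)
w_student = np.zeros(d)             # student scoring network, trained by distillation

def softmax(s):
    e = np.exp(s - s.max())
    return e / e.sum()

# Distill by minimizing KL(p_teacher || p_student) over the list.
# For softmax outputs with linear scores, the gradient w.r.t. the
# student weights is X^T (p_student - p_teacher).
lr = 0.05
for _ in range(2000):
    p_t = softmax(X @ w_teacher)    # teacher's list-wise score distribution
    p_s = softmax(X @ w_student)    # student's list-wise score distribution
    w_student -= lr * (X.T @ (p_s - p_t))

# After distillation the cheap student reproduces the teacher's item ordering,
# so at serving time only the student needs to run.
order_teacher = np.argsort(-(X @ w_teacher))
order_student = np.argsort(-(X @ w_student))
```

In the online variant described in the abstract, the teacher is not frozen: both models are trained jointly, with the distillation term continually transferring the teacher's evolving ranking knowledge to the student.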