DualGR: Generative Retrieval with Long and Short-Term Interests Modeling

📅 2025-11-16
🤖 AI Summary
Generative retrieval (GR) in large industrial recommender systems faces three challenges: balancing long-term and short-term user interests, noise when generating hierarchical semantic IDs (SIDs), and the lack of explicit modeling of negative feedback such as exposed-but-unclicked items. DualGR is a generative retrieval framework that addresses them with three components: (1) a Dual-Branch Long/Short-Term Router (DBR) that explicitly models long-term preferences and short-term intents with selective activation; (2) Search-based SID Decoding (S2D), which limits context-induced noise and improves computational efficiency by restricting candidate interactions to the current level-1 bucket during level-2/3 SID prediction; and (3) an Exposure-aware Next-Token Prediction Loss (ENTP-Loss), which treats exposed-but-unclicked items as hard negatives at level-1. On Kuaishou's short-video recommendation system, online A/B testing shows lifts of +0.527% in video views and +0.432% in watch time.

📝 Abstract
In large-scale industrial recommendation systems, retrieval must produce high-quality candidates from massive corpora under strict latency constraints. Recently, Generative Retrieval (GR) has emerged as a viable alternative to Embedding-Based Retrieval (EBR). GR quantizes items into a finite token space and decodes candidates autoregressively, providing a scalable path that explicitly models target-history interactions via cross-attention. However, three challenges persist: 1) how to balance users' long-term and short-term interests, 2) noise interference when generating hierarchical semantic IDs (SIDs), and 3) the absence of explicit modeling for negative feedback such as exposed items without clicks. To address these challenges, we propose DualGR, a generative retrieval framework that explicitly models dual horizons of user interests with selective activation. Specifically, DualGR uses a Dual-Branch Long/Short-Term Router (DBR) to cover both stable preferences and transient intents by explicitly modeling users' long- and short-term behaviors. Meanwhile, Search-based SID Decoding (S2D) controls context-induced noise and improves computational efficiency by constraining candidate interactions to the current coarse (level-1) bucket during fine-grained (level-2/3) SID prediction. Finally, we propose an Exposure-aware Next-Token Prediction Loss (ENTP-Loss) that treats "exposed-but-unclicked" items as hard negatives at level-1, enabling timely interest fade-out. On the large-scale Kuaishou short-video recommendation system, online A/B testing shows lifts of +0.527% in video views and +0.432% in watch time, validating DualGR as a practical and effective paradigm for industrial generative retrieval.
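The abstract does not give the formula for ENTP-Loss, so the following is only a minimal sketch of one way an exposure-aware loss at level-1 could look. It assumes a standard next-token cross-entropy term on the clicked item's level-1 token plus a penalty that pushes probability away from the level-1 tokens of exposed-but-unclicked items. The function name `entp_loss_sketch`, the penalty form, and the `neg_weight` parameter are illustrative assumptions, not the paper's definition.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entp_loss_sketch(level1_logits, clicked_token, unclicked_tokens, neg_weight=0.5):
    """Hypothetical exposure-aware loss over level-1 SID tokens.

    level1_logits: model scores for every level-1 token.
    clicked_token: level-1 token of the clicked (positive) item.
    unclicked_tokens: level-1 tokens of exposed-but-unclicked items.
    neg_weight: assumed weight of the hard-negative term.
    """
    probs = softmax(level1_logits)
    # Standard next-token prediction term for the clicked item.
    loss = -math.log(probs[clicked_token])
    # Assumed hard-negative term: penalize probability mass on exposed-but-unclicked
    # buckets, skipping any bucket that coincides with the positive.
    negatives = [t for t in set(unclicked_tokens) if t != clicked_token]
    if negatives:
        penalty = sum(-math.log(max(1.0 - probs[t], 1e-12)) for t in negatives)
        loss += neg_weight * penalty / len(negatives)
    return loss
```

Under these assumptions, adding an exposed-but-unclicked bucket can only increase the loss relative to plain next-token prediction, which is the behavior one would expect from treating such items as hard negatives.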
Problem

Research questions and friction points this paper is trying to address.

Modeling long and short-term user interests in generative retrieval
Reducing noise in hierarchical semantic ID generation
Incorporating negative feedback from exposed but unclicked items
Innovation

Methods, ideas, or system contributions that make the work stand out.

Models dual user interests with selective activation
Controls noise via search-based hierarchical ID decoding
Incorporates exposure-aware negative feedback in training
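
As a rough illustration of the search-based decoding idea, the sketch below shows one possible reading of "constraining candidate interactions to the current coarse (level-1) bucket": once a level-1 token is chosen, only history items that share that level-1 token are kept as context for level-2/3 prediction. The three-level tuple layout and the name `bucket_context` are assumptions for illustration; the paper's actual mechanism may differ.

```python
from typing import List, Tuple

# Assumed layout: a semantic ID is a (level-1, level-2, level-3) token triple.
SID = Tuple[int, int, int]


def bucket_context(history: List[SID], level1_token: int) -> List[SID]:
    """Keep only history items whose level-1 token matches the current bucket.

    The filtered list is what a fine-grained (level-2/3) decoder would attend to
    in this sketch; items from other buckets are treated as context noise.
    """
    return [sid for sid in history if sid[0] == level1_token]


history = [(3, 1, 4), (7, 2, 2), (3, 5, 9)]
same_bucket = bucket_context(history, 3)  # [(3, 1, 4), (3, 5, 9)]
```

Filtering this way shrinks the context the fine-grained decoder has to process, which is consistent with the efficiency benefit the abstract attributes to S2D.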
Zhongchao Yi
University of Science and Technology of China, Hefei, China
Kai Feng
Northwestern Polytechnical University
Computational imaging, spectral imaging, deep learning
Xiaojian Ma
University of California, Los Angeles
Computer Vision, Machine Learning, Generative Modeling, Reinforcement Learning
Yalong Wang
Kuaishou Technology, Beijing, China
Yongqi Liu
Kuaishou Technology, Beijing, China
Han Li
Kuaishou Technology, Beijing, China
Zhengyang Zhou
University of Science and Technology of China
spatiotemporal data mining, machine learning, deep learning, urban computing
Yang Wang
University of Science and Technology of China, Hefei, China