DualGR: Generative Retrieval with Long and Short-Term Interests Modeling

📅 2025-11-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Large industrial recommender systems face three challenges in generative retrieval (GR): balancing long-term and short-term user interests, noise when generating hierarchical semantic IDs (SIDs), and the lack of explicit modeling for negative feedback such as exposed-but-unclicked items. DualGR is a generative retrieval framework that models both interest horizons with selective activation. It has three components: (1) a Dual-Branch Long/Short-Term Router (DBR) that covers both stable preferences and transient intents; (2) Search-based SID Decoding (S2D), which limits context-induced noise and improves computational efficiency by constraining candidate interactions to the current coarse (level-1) bucket when predicting fine-grained (level-2/3) SIDs; and (3) an Exposure-aware Next-Token Prediction Loss (ENTP-Loss) that treats exposed-but-unclicked items as hard negatives at level-1. In online A/B testing on Kuaishou's short-video recommendation system, DualGR lifted video views by 0.527% and watch time by 0.432%.

📝 Abstract

In large-scale industrial recommendation systems, retrieval must produce high-quality candidates from massive corpora under strict latency constraints. Recently, Generative Retrieval (GR), which quantizes items into a finite token space and decodes candidates autoregressively, has emerged as a viable alternative to Embedding-Based Retrieval (EBR). It provides a scalable path that explicitly models target-history interactions via cross-attention. However, three challenges persist: 1) how to balance users' long-term and short-term interests, 2) noise interference when generating hierarchical semantic IDs (SIDs), and 3) the absence of explicit modeling for negative feedback such as exposed items without clicks. To address these challenges, we propose DualGR, a generative retrieval framework that explicitly models dual horizons of user interests with selective activation. Specifically, DualGR uses a Dual-Branch Long/Short-Term Router (DBR) to cover both stable preferences and transient intents by explicitly modeling users' long- and short-term behaviors. Meanwhile, Search-based SID Decoding (S2D) controls context-induced noise and improves computational efficiency by constraining candidate interactions to the current coarse (level-1) bucket during fine-grained (level-2/3) SID prediction. Finally, we propose an Exposure-aware Next-Token Prediction Loss (ENTP-Loss) that treats "exposed-but-unclicked" items as hard negatives at level-1, enabling timely interest fade-out. On the large-scale Kuaishou short-video recommendation system, online A/B testing shows lifts of +0.527% in video views and +0.432% in watch time, validating DualGR as a practical and effective paradigm for industrial generative retrieval.
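
The bucket constraint described for S2D can be illustrated with a minimal sketch. It assumes each item carries a three-level SID stored as a tuple and shows only the filtering step: once a level-1 token is chosen, the context used for level-2/3 prediction is restricted to history items in the same level-1 bucket. The `SID` alias, the `bucket_context` function, and the sample history are hypothetical names introduced for illustration; the abstract does not specify how the paper implements this step.

```python
from typing import List, Tuple

# Hypothetical 3-level semantic ID: (level-1, level-2, level-3) tokens.
SID = Tuple[int, int, int]


def bucket_context(history: List[SID], level1_token: int) -> List[SID]:
    """Keep only history items whose level-1 token matches the current coarse bucket.

    Assumption: fine-grained (level-2/3) prediction then attends only to
    this filtered list instead of the full behavior history.
    """
    return [sid for sid in history if sid[0] == level1_token]


# Made-up behavior history for illustration.
history: List[SID] = [(3, 10, 7), (5, 2, 1), (3, 11, 4), (8, 0, 9), (3, 10, 2)]

context = bucket_context(history, level1_token=3)
print(context)  # [(3, 10, 7), (3, 11, 4), (3, 10, 2)]
```

In this toy case the context shrinks from five items to three, which is the intuition behind both claimed effects: fewer unrelated items to attend to (less noise) and fewer interactions to compute.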

Problem

Research questions and friction points this paper is trying to address.

Modeling long- and short-term user interests in generative retrieval

Reducing noise in hierarchical semantic ID generation

Incorporating negative feedback from exposed but unclicked items

Innovation

Methods, ideas, or system contributions that make the work stand out.

Models dual user interests with selective activation

Controls noise via search-based hierarchical ID decoding

Incorporates exposure-aware negative feedback in training
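
One way to picture exposure-aware negative feedback at level-1 is the sketch below. It is an assumption, not the paper's formula: it adds to the usual next-token cross-entropy a penalty of the form -log(1 - p) for each level-1 token belonging to an exposed-but-unclicked item. The function names, the `alpha` weight, and the penalty form are all hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over level-1 token logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def exposure_aware_loss(
    logits: Sequence[float],
    target: int,
    unclicked: Sequence[int],
    alpha: float = 0.5,  # hypothetical weight on the negative term
) -> float:
    """Cross-entropy on the clicked level-1 token plus a hard-negative penalty.

    Assumed form: each exposed-but-unclicked level-1 token t contributes
    -log(1 - p_t), averaged over distinct negatives and scaled by alpha.
    """
    probs = softmax(logits)
    ce = -math.log(probs[target])
    # Never penalize the positive token, even if it also appears as a negative.
    negatives = [t for t in set(unclicked) if t != target]
    if not negatives:
        return ce
    penalty = -sum(math.log(max(1.0 - probs[t], 1e-12)) for t in negatives)
    return ce + alpha * penalty / len(negatives)


logits = [2.0, 1.0, 0.1]
base = exposure_aware_loss(logits, target=0, unclicked=[])
with_negative = exposure_aware_loss(logits, target=0, unclicked=[1])
print(with_negative > base)  # True
```

Under this assumed form, the loss grows as the model assigns more probability to an exposed-but-unclicked bucket, which pushes that bucket's score down and matches the abstract's stated goal of timely interest fade-out.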
