RankGR: Rank-Enhanced Generative Retrieval with Listwise Direct Preference Optimization in Recommendation

📅 2026-02-09
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing generative retrieval methods in recommendation systems rely solely on next-token prediction, which struggles to capture the hierarchical structure of user preferences and the deep interactions between items and behavioral sequences. This work proposes RankGR, the first approach to integrate listwise direct preference optimization (Listwise DPO) into generative retrieval. RankGR employs a two-stage collaborative mechanism: an Initial Assessment Phase (IAP) for coarse candidate filtering, followed by a Refined Scoring Phase (RSP) that leverages a lightweight scoring module for fine-grained ranking. This design balances modeling depth with inference efficiency, enabling high-concurrency real-time deployment. Experiments demonstrate that RankGR significantly improves offline metrics across multiple academic and industrial datasets and delivers substantial online gains in Taobao's "Guess You Like" scenario, stably supporting nearly 10,000 queries per second.
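The two-stage mechanism described above can be sketched as a simple cascade. This is a hypothetical illustration, not the paper's implementation: the function and parameter names (`gen_score`, `refine_score`, `lam`, `k`) are assumptions, standing in for the generative model's sequence likelihood (IAP) and the lightweight sequence-aware scoring module (RSP).

```python
def two_stage_retrieve(candidates, gen_score, refine_score, lam=50, k=10):
    """Hypothetical sketch of a two-stage cascade in the spirit of RankGR.

    IAP: rank all candidate identifiers by the generative model's score
    and keep the top-lam as a coarse filter.
    RSP: re-score only those survivors with a lightweight module that
    interacts with the user's behavior sequence, then return the top-k.
    """
    # Coarse filtering: cheap scores over the full candidate set.
    iap_survivors = sorted(candidates, key=gen_score, reverse=True)[:lam]
    # Fine-grained ranking: richer scores over only lam candidates,
    # which keeps per-request latency bounded at high concurrency.
    return sorted(iap_survivors, key=refine_score, reverse=True)[:k]
```

The efficiency argument is visible in the structure: the expensive refinement score is evaluated on only λ candidates rather than the whole corpus.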

๐Ÿ“ Abstract
Generative retrieval (GR) has emerged as a promising paradigm in recommendation systems by autoregressively decoding identifiers of target items. Despite its potential, current approaches typically rely on the next-token prediction schema, which treats each token of the next interacted items as the sole target. This narrow focus 1) limits their ability to capture the nuanced structure of user preferences, and 2) overlooks the deep interaction between decoded identifiers and user behavior sequences. In response to these challenges, we propose RankGR, a Rank-enhanced Generative Retrieval method that incorporates listwise direct preference optimization for recommendation. RankGR decomposes the retrieval process into two complementary stages: the Initial Assessment Phase (IAP) and the Refined Scoring Phase (RSP). In IAP, we incorporate a novel listwise direct preference optimization strategy into GR, thus facilitating a more comprehensive understanding of the hierarchical user preferences and more effective partial-order modeling. The RSP then refines the top-λ candidates generated by IAP with interactions towards input sequences using a lightweight scoring module, leading to more precise candidate evaluation. Both phases are jointly optimized under a unified GR model, ensuring consistency and efficiency. Additionally, we implement several practical improvements in training and deployment, ultimately achieving a real-time system capable of handling nearly ten thousand requests per second. Extensive offline performance on both research and industrial datasets, as well as the online gains on the "Guess You Like" section of Taobao, validate the effectiveness and scalability of RankGR.
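The abstract does not give the listwise DPO objective in closed form, but a common way to extend pairwise DPO to a ranked list is a Plackett-Luce likelihood over implicit rewards β·(log π − log π_ref). The sketch below is an assumption in that spirit, not RankGR's stated loss; all names (`policy_logps`, `ref_logps`, `beta`) are illustrative.

```python
import math

def listwise_dpo_loss(policy_logps, ref_logps, beta=0.1):
    """Plackett-Luce style listwise DPO loss over a ranked candidate list.

    policy_logps / ref_logps: sequence log-probabilities of each candidate
    identifier under the policy and a frozen reference model, ordered from
    most- to least-preferred. Returns the negative log-likelihood of the
    full ranking under implicit rewards beta * (log_pi - log_pi_ref).
    """
    rewards = [beta * (p, r)[0] - beta * r for p, r in zip(policy_logps, ref_logps)]
    loss = 0.0
    for i in range(len(rewards)):
        # Plackett-Luce: at each step, the i-th ranked item competes
        # against all not-yet-ranked items via a softmax.
        tail = rewards[i:]
        m = max(tail)  # stabilize the log-sum-exp
        lse = m + math.log(sum(math.exp(x - m) for x in tail))
        loss -= rewards[i] - lse
    return loss
```

When the list has length two this reduces to the standard pairwise DPO logistic loss, which is one reason this generalization is a natural candidate for partial-order modeling over hierarchical preferences.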
Problem

Research questions and friction points this paper is trying to address.

Generative Retrieval
User Preferences
Next-Token Prediction
Recommendation Systems
Listwise Preference
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative Retrieval
Listwise Direct Preference Optimization
Rank-enhanced Recommendation
Two-stage Retrieval
Real-time Recommendation System
Kairui Fu
Zhejiang University
Changfa Wu
Alibaba Group
Kun Yuan
Alibaba Group
Binbin Cao
Alibaba Group
Dunxian Huang
Alibaba Group
Yuliang Yan
Alibaba Group
Junjun Zheng
Alibaba Group
Jianning Zhang
Alibaba Group
Silu Zhou
Alibaba Group
Jian Wu
Unknown affiliation
Music Generation
Kun Kuang
Zhejiang University
Causal Inference · Data Mining · Machine Learning