GFlowGR: Fine-tuning Generative Recommendation Frameworks with Generative Flow Networks

📅 2025-06-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Generative recommendation (GR) suffers from exposure bias during fine-tuning: both single-step supervised fine-tuning (SFT) and direct preference optimization (DPO) neglect unobserved but potentially positive items. Method: This work introduces Generative Flow Networks (GFlowNets) to GR for the first time, framing recommendation as a multi-step sequential generation task. An adaptive trajectory sampler infused with collaborative filtering knowledge and a composite reward model explicitly capture the distribution of unobserved positives; heuristic weighted sampling and knowledge distillation further enhance generalization. Contribution/Results: Extensive experiments on two real-world datasets with two state-of-the-art GR backbone models show significant improvements in Recall@10 and NDCG@10, mitigating exposure bias while improving robustness and generalization.

📝 Abstract
Generative recommendations (GR), which usually include item tokenizers and generative Large Language Models (LLMs), have demonstrated remarkable success across a wide range of scenarios. The majority of existing research efforts primarily concentrate on developing powerful item tokenizers or advancing LLM decoding strategies to attain superior performance. However, the critical fine-tuning step in GR frameworks, which is essential for adapting LLMs to recommendation data, remains largely unexplored. Current approaches predominantly rely on either the next-token prediction loss of supervised fine-tuning (SFT) or recommendation-specific direct preference optimization (DPO) strategies. Both methods ignore the exploration of possible positive unobserved samples, which is commonly referred to as the exposure bias problem. To mitigate this problem, this paper treats the GR as a multi-step generation task and constructs a GFlowNets-based fine-tuning framework (GFlowGR). The proposed framework integrates collaborative knowledge from traditional recommender systems to create an adaptive trajectory sampler and a comprehensive reward model. Leveraging the diverse generation property of GFlowNets, along with sampling and heuristic weighting techniques, GFlowGR emerges as a promising approach to mitigate the exposure bias problem. Extensive empirical results on two real-world datasets and with two different GR backbones highlight the effectiveness and robustness of GFlowGR.
Problem

Research questions and friction points this paper is trying to address.

Fine-tuning generative recommendation frameworks remains largely unexplored
Current methods ignore exploration of positive unobserved samples
Exposure bias problem in generative recommendation frameworks
Innovation

Methods, ideas, or system contributions that make the work stand out.

GFlowNets-based fine-tuning for generative recommendations
Adaptive trajectory sampler with collaborative knowledge
Diverse generation to mitigate exposure bias
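The page does not include code, but the core idea of treating item generation as a multi-step GFlowNet trajectory can be illustrated with the standard trajectory-balance objective. The sketch below is an assumption-laden illustration, not the authors' implementation: function names are hypothetical, and it uses the common simplification that autoregressive token generation has a deterministic backward policy (each partial sequence has exactly one parent), so the backward term of trajectory balance vanishes.

```python
import math

def trajectory_balance_loss(log_z, log_pf_steps, reward):
    """Trajectory-balance loss for one sampled generation trajectory.

    log_z        -- learned estimate of the log-partition function
    log_pf_steps -- per-token log-probabilities from the generative model
    reward       -- positive scalar reward for the completed item sequence
    """
    # With a deterministic backward policy (unique parent per state),
    # trajectory balance reduces to matching log Z + sum of forward
    # log-probs against log R(x).
    delta = log_z + sum(log_pf_steps) - math.log(reward)
    return delta ** 2

# Toy usage: a 3-token item-identifier trajectory with reward 0.5
loss = trajectory_balance_loss(
    log_z=0.0,
    log_pf_steps=[-0.1, -0.2, -0.3],
    reward=0.5,
)
```

In GFlowGR's setting, the reward would come from the paper's composite reward model, and trajectories would be drawn by the adaptive, collaborative-knowledge-aware sampler rather than uniformly; minimizing this loss trains the model to sample sequences in proportion to their reward, which is what yields the diverse generation that mitigates exposure bias.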
Yejing Wang
City University of Hong Kong
Shengyu Zhou
Alibaba Group
Jinyu Lu
Alibaba Group
Qidong Liu
Assistant Professor, Xi'an Jiaotong University
Recommender System, Large Language Model, Intelligent Healthcare, Causal Inference, Smart Education
Xinhang Li
Tsinghua University
Recommender System, Knowledge Graph, Transfer Learning
Wenlin Zhang
City University of Hong Kong
Feng Li
Alibaba Group
Pengjie Wang
Alibaba Group
Jian Xu
Alibaba Group
Bo Zheng
Alibaba Group
Xiangyu Zhao
City University of Hong Kong