🤖 AI Summary
Generative recommendation (GR) suffers from exposure bias during fine-tuning: both single-step supervised fine-tuning (SFT) and direct preference optimization (DPO) neglect unobserved but potentially positive items.
Method: This work introduces Generative Flow Networks (GFlowNets) to GR for the first time, framing recommendation as a multi-step sequential generation task. We design an adaptive trajectory sampler infused with collaborative filtering knowledge and a composite reward model to explicitly capture the distribution of unobserved positives. Additionally, we propose heuristic weighted sampling and knowledge distillation to enhance generalization.
Contribution/Results: Extensive experiments on two real-world datasets and two state-of-the-art GR backbone models demonstrate significant improvements in Recall@10 and NDCG@10. Our approach effectively mitigates exposure bias while improving model robustness and generalization capability.
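The multi-step framing above can be made concrete with the trajectory-balance objective commonly used to train GFlowNets. The sketch below is a minimal, generic illustration, not the paper's exact formulation: the function name, the scalar-reward interface, and the toy numbers are our assumptions. Each recommended item is generated as a trajectory of token-level actions, and the squared residual drives the sampler to produce items in proportion to their reward rather than only the single observed positive.

```python
import math

def trajectory_balance_loss(log_Z, step_log_probs, reward):
    """Generic GFlowNets trajectory-balance loss (illustrative sketch):
    (log Z + sum_t log P_F(a_t | s_t) - log R(x))^2.
    Token-by-token item-ID decoding is tree-structured, so the
    backward-policy term reduces to zero and is omitted here.
    """
    log_pf = sum(step_log_probs)  # log-probability of the whole trajectory
    return (log_Z + log_pf - math.log(reward)) ** 2

# A trajectory whose sampling probability exactly matches its (normalized)
# reward incurs zero loss: here P(trajectory) = 0.5 * 0.5 = R(x) = 0.25.
steps = [math.log(0.5), math.log(0.5)]  # two decoding steps
loss = trajectory_balance_loss(log_Z=0.0, step_log_probs=steps, reward=0.25)
```

Minimizing this residual over many sampled trajectories, rather than maximizing the likelihood of one observed item, is what gives GFlowNets their diverse-generation property.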
📝 Abstract
Generative recommendation (GR), which typically combines an item tokenizer with a generative Large Language Model (LLM), has demonstrated remarkable success across a wide range of scenarios. Most existing research concentrates on developing powerful item tokenizers or advancing LLM decoding strategies to attain superior performance. However, the critical fine-tuning step in GR frameworks, which is essential for adapting LLMs to recommendation data, remains largely unexplored. Current approaches predominantly rely on either the next-token prediction loss of supervised fine-tuning (SFT) or recommendation-specific direct preference optimization (DPO) strategies. Both methods ignore unobserved samples that may be positive, a problem commonly referred to as exposure bias. To mitigate this problem, this paper treats GR as a multi-step generation task and constructs a GFlowNets-based fine-tuning framework (GFlowGR). The proposed framework integrates collaborative knowledge from traditional recommender systems to create an adaptive trajectory sampler and a comprehensive reward model. Leveraging the diverse-generation property of GFlowNets, together with sampling and heuristic weighting techniques, GFlowGR emerges as a promising approach to mitigating the exposure bias problem. Extensive empirical results on two real-world datasets with two different GR backbones highlight the effectiveness and robustness of GFlowGR.
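To illustrate how a reward model might assign non-zero credit to unobserved items, the sketch below blends an observed-interaction signal with a collaborative-filtering similarity score. All names, the mixing weight `alpha`, and the reward floor are hypothetical: the paper's actual reward model is built from collaborative knowledge, not hand-coded rules like these.

```python
def composite_reward(item, observed_positives, cf_score, alpha=0.5, floor=1e-3):
    """Illustrative composite reward (hypothetical, not the paper's model):
    observed positives get full reward; unobserved items get their
    collaborative-filtering similarity damped by alpha, so plausible but
    unclicked items still receive learning signal. A small floor keeps
    rewards strictly positive, which log-space training requires.
    """
    if item in observed_positives:
        return 1.0
    return max(alpha * cf_score(item), floor)

# Example: a CF model scores item 7 highly even though it was never clicked,
# so the sampler is still rewarded for generating it.
cf_table = {3: 0.9, 7: 0.8, 42: 0.1}
reward = composite_reward(7, observed_positives={3},
                          cf_score=lambda i: cf_table.get(i, 0.0))
```

A reward shaped this way is exactly what lets a GFlowNets-style sampler explore beyond the exposed items instead of collapsing onto the logged positives.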