Retrieval Augmented Generation with Collaborative Filtering for Personalized Text Generation

📅 2025-04-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Personalized text generation suffers from the absence of explicit user similarity labels and the difficulty of effectively leveraging collaborative signals. Method: This paper proposes CFRAG, the first framework to integrate collaborative filtering into retrieval-augmented generation (RAG). It (1) constructs unsupervised user embeddings via contrastive learning to implicitly identify similar users, and (2) introduces an LLM-feedback-driven personalized retrieval and re-ranking module to model user preferences from collaborative interaction history and optimize document retrieval. Contribution/Results: The core contribution is an end-to-end coupling mechanism that bridges collaborative user signals with RAG. CFRAG achieves significant improvements over existing personalized RAG methods on the LaMP benchmark. Ablation studies confirm that collaborative information critically enhances generation quality.

📝 Abstract
Recently, the personalization of Large Language Models (LLMs) to generate content that aligns with individual user preferences has garnered widespread attention. Personalized Retrieval-Augmented Generation (RAG), which retrieves relevant documents from the user's history to reflect their preferences and enhance LLM generation, is one commonly used approach for personalization. However, existing personalized RAG methods do not consider that the histories of similar users can also assist in personalized generation for the current user, meaning that collaborative information between users can also benefit personalized generation. Inspired by the application of collaborative filtering in recommender systems, we propose a method called CFRAG, which adapts Collaborative Filtering to RAG for personalized text generation. However, this presents two challenges: (1) how to incorporate collaborative information without explicit user similarity labels? (2) how to retrieve documents that support personalized LLM generation? For Challenge 1, we use contrastive learning to train user embeddings to retrieve similar users and introduce collaborative information. For Challenge 2, we design a personalized retriever and reranker to retrieve the top-k documents from these users' histories. We take into account the user's preference during retrieval and reranking. Then we leverage feedback from the LLM to fine-tune the personalized retriever and reranker, enabling them to retrieve documents that meet the personalized generation needs of the LLM. Experimental results on the Language Model Personalization (LaMP) benchmark validate the effectiveness of CFRAG. Further analysis confirms the importance of incorporating collaborative information.
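
The retrieval flow described in the abstract (find similar users from their embeddings, then take the top-k documents from the pooled histories) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's implementation: the user embeddings are hand-written rather than learned, `overlap_score` is a toy token-overlap stand-in for the personalized retriever and reranker, and including the target user's own history in the pool is a guess about how the candidates are formed.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_similar_users(target, user_emb, m):
    """Return the m users whose embeddings are closest to the target user's."""
    scores = [(cosine(user_emb[target], emb), u)
              for u, emb in user_emb.items() if u != target]
    scores.sort(reverse=True)
    return [u for _, u in scores[:m]]

def overlap_score(query, doc):
    """Toy relevance (hypothetical): fraction of query tokens found in the document."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def collaborative_retrieve(target, query, user_emb, histories, m=1, k=2):
    """Pool the target's and similar users' histories, then keep the top-k documents."""
    users = [target] + top_similar_users(target, user_emb, m)
    pool = [doc for u in users for doc in histories[u]]
    return sorted(pool, key=lambda doc: overlap_score(query, doc), reverse=True)[:k]

# Hand-written embeddings and histories, purely illustrative.
user_emb = {"alice": [1.0, 0.1], "bob": [0.9, 0.2], "carol": [-1.0, 0.5]}
histories = {
    "alice": ["great sci-fi movie with robots"],
    "bob": ["robots and space sci-fi novel review"],
    "carol": ["recipe for tomato soup"],
}
neighbors = top_similar_users("alice", user_emb, m=1)
docs = collaborative_retrieve("alice", "sci-fi robots review", user_emb, histories)
```

In this toy setup the document contributed by the neighboring user ranks first for the query, which is the kind of benefit the abstract attributes to collaborative information.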
Problem

Research questions and friction points this paper is trying to address.

Personalized text generation using collaborative filtering in RAG
Incorporating similar users' histories without explicit similarity labels
Retrieving documents that support personalized LLM generation
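
One common way to make a retriever favor documents that actually help the LLM is to turn per-document LLM feedback into a target distribution and minimize the KL divergence from the retriever's score distribution to it. The sketch below shows that generic technique only; the listing does not state which loss CFRAG uses, and `llm_utilities` is a hypothetical list of per-document quality scores (for example, how well the LLM's output matches the reference when that document is supplied).

```python
import math

def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same candidates."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def feedback_loss(retriever_scores, llm_utilities, temperature=1.0):
    """Zero when the retriever's score distribution matches the LLM-derived one."""
    target = softmax(llm_utilities, temperature)
    predicted = softmax(retriever_scores, temperature)
    return kl_divergence(target, predicted)

# Hypothetical feedback for three candidate documents.
llm_utilities = [0.9, 0.2, 0.1]
aligned = feedback_loss([0.9, 0.2, 0.1], llm_utilities)
misaligned = feedback_loss([0.1, 0.2, 0.9], llm_utilities)
```

A retriever whose scores agree with the feedback incurs no loss, while one that prefers the least useful document is penalized, so gradient steps on this quantity would push retrieval toward what the LLM needs.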
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses collaborative filtering for personalized RAG
Trains user embeddings via contrastive learning
Personalized retriever and reranker with LLM feedback
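
Contrastive training of user embeddings is often done with an InfoNCE-style loss, which pulls two views of the same user together and pushes other users away. The following is a minimal sketch of that loss under stated assumptions: the listing only says "contrastive learning", so the choice of InfoNCE, the temperature value, and the way positive and negative views are formed are all illustrative rather than taken from the paper.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss: negative log of the softmax probability given to the positive."""
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, neg) / temperature for neg in negatives]
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_sum - logits[0]

# Hypothetical 2-d embeddings: the anchor and its positive are two views of one user.
anchor = [1.0, 0.0]
good = info_nce(anchor, [0.9, 0.1], negatives=[[-1.0, 0.0], [0.0, 1.0]])
bad = info_nce(anchor, [-1.0, 0.0], negatives=[[0.9, 0.1], [0.0, 1.0]])
```

The loss is small when the positive view is the nearest embedding to the anchor and large when a negative is nearer, which is how similar users can end up close together without explicit similarity labels.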
Authors

Teng Shi, Renmin University of China (Recommender System, Information Retrieval)
Jun Xu, Renmin University of China, Beijing, China
Xiao Zhang, Renmin University of China, Beijing, China
Xiaoxue Zang, Kuaishou Technology (Recommender System, NLP, Dialogue, Multimodal Modeling)
Kai Zheng, Kuaishou Technology Co., Ltd., Beijing, China
Yang Song, Kuaishou Technology Co., Ltd., Beijing, China
Han Li, Kuaishou Technology Co., Ltd., Beijing, China