Comparing Retrieval-Augmentation and Parameter-Efficient Fine-Tuning for Privacy-Preserving Personalization of Large Language Models

📅 2024-09-14
🏛️ arXiv.org
📈 Citations: 10
Influential: 1
🤖 AI Summary
This study addresses user privacy preservation in large language model (LLM) personalization, systematically comparing retrieval-augmented generation (RAG) and parameter-efficient fine-tuning (PEFT) under privacy-sensitive settings. Across seven diverse datasets spanning a wide range of personalization tasks, RAG-based and PEFT-based methods improve over a non-personalized LLM by 14.92% and 1.07% on average, respectively, and combining the two raises the average gain to 15.98%. PEFT's effectiveness correlates positively with the amount of available user data, making RAG the better choice for cold-start users with limited personal history. Because personal data stays within the user's domain in both cases (retrieved into the prompt at inference time, or stored as small per-user adapter weights), the findings provide methodological guidance and empirical baselines for privacy-first LLM customization.

📝 Abstract
Privacy-preserving methods for personalizing large language models (LLMs) are relatively under-explored. There are two schools of thought on this topic: (1) generating personalized outputs by personalizing the input prompt through retrieval augmentation from the user's personal information (RAG-based methods), and (2) parameter-efficient fine-tuning of LLMs per user that considers efficiency and space limitations (PEFT-based methods). This paper presents the first systematic comparison between the two approaches on a wide range of personalization tasks using seven diverse datasets. Our results indicate that RAG-based and PEFT-based personalization methods on average yield 14.92% and 1.07% improvements over the non-personalized LLM, respectively. We find that combining RAG with PEFT elevates these improvements to 15.98%. Additionally, we identify a positive correlation between the amount of user data and PEFT's effectiveness, indicating that RAG is a better choice for cold-start users (i.e., users with limited personal data).
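To make the RAG-based approach concrete, here is a minimal sketch of personalized prompt construction: rank entries from a user's private profile against the query and prepend the top-k to the prompt, so the profile never leaves the user's domain. This is illustrative, not the paper's implementation; the helper names are hypothetical, and the token-overlap scorer stands in for a real dense retriever.

```python
def score(query: str, doc: str) -> float:
    """Token-overlap similarity (an illustrative stand-in for a dense retriever)."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def build_personalized_prompt(query: str, user_profile: list[str], k: int = 2) -> str:
    """Retrieve the top-k entries from the user's private profile and
    prepend them as context; the profile stays on the user side."""
    ranked = sorted(user_profile, key=lambda doc: score(query, doc), reverse=True)
    context = "\n".join(ranked[:k])
    return f"User context:\n{context}\n\nTask: {query}"

profile = [
    "I usually write movie reviews in a sarcastic tone.",
    "My favorite genre is science fiction.",
    "I review restaurants on weekends.",
]
prompt = build_personalized_prompt("Write a review of a science fiction movie", profile)
```

The resulting `prompt` can be sent to any off-the-shelf LLM, which is why this style of personalization works even for cold-start users: it needs only a handful of profile entries at inference time, no per-user training.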
Problem

Research questions and friction points this paper is trying to address.

Compares RAG and PEFT for privacy-preserving LLM personalization
Evaluates effectiveness of combining RAG and PEFT methods
Analyzes data volume impact on PEFT performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses retrieval-augmented generation for personalization
Employs parameter-efficient fine-tuning for user adaptation
Combines RAG and PEFT for enhanced performance
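The PEFT side can be sketched with a LoRA-style low-rank update, one common parameter-efficient technique (the paper's exact PEFT method may differ): instead of fine-tuning the full weight matrix W per user, learn a small low-rank delta B @ A, so each user stores only the tiny A and B factors. A minimal pure-Python illustration under that assumption:

```python
def matmul(X, Y):
    """Naive matrix multiply over nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def add(X, Y):
    """Elementwise matrix addition."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weight (2x2), shared by all users
A = [[0.1, 0.2]]               # per-user rank-1 factors: A is 1x2 ...
B = [[0.5], [0.5]]             # ... and B is 2x1
delta = matmul(B, A)           # 2x2 low-rank update learned from user data
W_eff = add(W, delta)          # effective per-user weight at inference time
```

Storing only A and B (2 + 2 values here, versus 4 for W; the savings grow with model size) is what makes per-user fine-tuning tractable, and combining this with RAG means the adapter handles stable user traits while retrieval supplies fresh, specific context.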