🤖 AI Summary
In federated learning (FL), large language models (LLMs) risk unintentionally memorizing participants' sensitive training data, leaving them vulnerable to privacy-leaking prompt-based attacks. This work is the first to systematically demonstrate that Low-Rank Adaptation (LoRA) substantially suppresses LLMs' memorization of specific input sequences. We propose a LoRA-based privacy-enhancing framework applicable to both FL and centralized fine-tuning settings, reducing memorization rates of sensitive clinical sequences by up to 10×. The method is compatible with standard privacy-preserving techniques—including gradient clipping, Gaussian noise injection, secure aggregation, and the Goldfish loss—enabling multi-layered privacy protection. Evaluated on Llama 2 and Llama 3 models, our approach maintains competitive performance on medical question-answering tasks while significantly strengthening record-level privacy, thus achieving a practical balance between utility and privacy.
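To make the LoRA mechanism concrete, the following is a minimal pure-Python sketch (illustrative shapes and values, not the paper's code): LoRA freezes the pretrained weight `W` and trains only a low-rank pair `(B, A)`, so the effective fine-tuned weight becomes `W + (alpha/r) * B @ A`.

```python
# Minimal pure-Python sketch of a LoRA weight update. The pretrained weight
# W stays frozen; only the low-rank factors B (d_out x r) and A (r x d_in)
# are trained, so the learned delta B @ A has rank at most r.

def matmul(X, Y):
    """Plain list-of-lists matrix multiply."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_weight(W, A, B, alpha):
    """Effective fine-tuned weight: frozen W plus scaled low-rank delta B @ A."""
    r = len(A)                      # rank = number of rows of A
    scale = alpha / r
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(wrow, drow)]
            for wrow, drow in zip(W, delta)]

# Toy dimensions: d_out = 3, d_in = 4, rank r = 1 (r << d).
W = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]
B = [[1.0], [2.0], [0.0]]           # trainable up-projection (d_out x r)
A = [[0.5, 0.0, 0.0, 0.5]]          # trainable down-projection (r x d_in)

W_eff = lora_weight(W, A, B, alpha=1.0)
# Only 7 adapter parameters are trained versus 12 in W; every row of the
# delta is a multiple of A's single row, so the update has rank <= 1.
```

This capacity constraint (far fewer trainable parameters, rank-limited updates) is one intuition for why LoRA fine-tuning memorizes individual training sequences less readily than full fine-tuning.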
📝 Abstract
Federated learning (FL) is a popular paradigm for collaborative training that avoids direct data exposure between clients. However, data privacy issues still remain: FL-trained large language models are capable of memorizing and completing phrases and sentences contained in training data when prompted with their prefixes. Thus, it is possible for adversarial and honest-but-curious clients to recover training data of other participants simply through targeted prompting. In this work, we demonstrate that a popular and simple fine-tuning strategy, low-rank adaptation (LoRA), reduces memorization during FL by up to a factor of 10. We study this effect by performing a medical question-answering fine-tuning task and injecting multiple replicas of out-of-distribution sensitive sequences drawn from an external clinical dataset. We observe a reduction in memorization for a wide variety of Llama 2 and Llama 3 models, and find that LoRA reduces memorization in centralized learning as well. Furthermore, we show that LoRA can be combined with other privacy-preserving techniques, such as gradient clipping with Gaussian noising, secure aggregation, and the Goldfish loss, to further improve record-level privacy while maintaining performance.
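The prefix-prompting attack described in the abstract suggests a simple record-level memorization metric: prompt the model with the prefix of each injected sensitive sequence and count how often it reproduces the true suffix verbatim. A hedged sketch follows; the `complete` stub stands in for an actual FL-trained LLM, and all names and token values are hypothetical.

```python
# Sketch of a record-level memorization metric: the fraction of injected
# canary sequences whose suffix the model emits exactly when prompted with
# the prefix. `complete` is a toy stand-in for a real model (hypothetical).

def memorization_rate(canaries, complete, prefix_len):
    """Fraction of canary sequences whose suffix is reproduced verbatim."""
    hits = 0
    for tokens in canaries:
        prefix, suffix = tokens[:prefix_len], tokens[prefix_len:]
        if complete(prefix, len(suffix)) == suffix:
            hits += 1
    return hits / len(canaries)

# Toy stand-in model: has memorized one canary, knows nothing of the other.
MEMORIZED = {("patient", "id"): ["4821", "diagnosis", "J45.9"]}

def complete(prefix, n_tokens):
    return MEMORIZED.get(tuple(prefix), ["unknown"] * n_tokens)

canaries = [
    ["patient", "id", "4821", "diagnosis", "J45.9"],   # recoverable by prompting
    ["patient", "name", "Doe", "dob", "1970-01-01"],   # not memorized
]
rate = memorization_rate(canaries, complete, prefix_len=2)
# rate == 0.5: one of the two injected sequences leaks via targeted prompting.
```

Comparing this rate between full fine-tuning and LoRA fine-tuning, under the same canary injection, is one way to quantify the up-to-10× reduction the paper reports.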