🤖 AI Summary
In federated learning (FL), large language models (LLMs) risk unintentionally memorizing participants' sensitive training data, leaving them vulnerable to privacy-leaking prompt-based attacks. This work is the first to systematically demonstrate that Low-Rank Adaptation (LoRA) substantially suppresses LLMs' memorization of specific input sequences. We propose a LoRA-based privacy-enhancing framework applicable to both FL and centralized fine-tuning settings, reducing memorization rates of sensitive clinical sequences by up to 10×. The method is compatible with standard privacy-preserving techniques—including gradient clipping, Gaussian noise injection, secure aggregation, and the Goldfish loss—enabling multi-layered privacy protection. Evaluated on Llama 2 and Llama 3 models, our approach maintains competitive performance on medical question-answering tasks while significantly strengthening record-level privacy, thus achieving a practical balance between utility and privacy.
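To make the LoRA mechanism concrete, the following is a minimal pure-Python sketch (illustrative shapes and values, not the paper's code): LoRA freezes the pretrained weight `W` and trains only a low-rank pair `(B, A)`, so the effective fine-tuned weight becomes `W + (alpha/r) * B @ A`.

```python
# Minimal pure-Python sketch of a LoRA weight update. The pretrained weight
# W stays frozen; only the low-rank factors B (d_out x r) and A (r x d_in)
# are trained, so the learned delta B @ A has rank at most r.

def matmul(X, Y):
    """Plain list-of-lists matrix multiply."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_weight(W, A, B, alpha):
    """Effective fine-tuned weight: frozen W plus scaled low-rank delta B @ A."""
    r = len(A)                      # rank = number of rows of A
    scale = alpha / r
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(wrow, drow)]
            for wrow, drow in zip(W, delta)]

# Toy dimensions: d_out = 3, d_in = 4, rank r = 1 (r << d).
W = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]
B = [[1.0], [2.0], [0.0]]           # trainable up-projection (d_out x r)
A = [[0.5, 0.0, 0.0, 0.5]]          # trainable down-projection (r x d_in)

W_eff = lora_weight(W, A, B, alpha=1.0)
# Only 7 adapter parameters are trained versus 12 in W; every row of the
# delta is a multiple of A's single row, so the update has rank <= 1.
```

This capacity constraint (far fewer trainable parameters, rank-limited updates) is one intuition for why LoRA fine-tuning memorizes individual training sequences less readily than full fine-tuning.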
📝 Abstract
Federated learning (FL) is a popular paradigm for collaborative training that avoids direct data exposure between clients. However, data privacy issues still remain: FL-trained large language models are capable of memorizing and completing phrases and sentences contained in training data when prompted with their prefixes. Thus, it is possible for adversarial and honest-but-curious clients to recover training data of other participants simply through targeted prompting. In this work, we demonstrate that a popular and simple fine-tuning strategy, low-rank adaptation (LoRA), reduces memorization during FL by up to a factor of 10. We study this effect by performing a medical question-answering fine-tuning task and injecting multiple replicas of out-of-distribution sensitive sequences drawn from an external clinical dataset. We observe a reduction in memorization for a wide variety of Llama 2 and Llama 3 models, and find that LoRA reduces memorization in centralized learning as well. Furthermore, we show that LoRA can be combined with other privacy-preserving techniques, such as gradient clipping with Gaussian noising, secure aggregation, and the Goldfish loss, to further improve record-level privacy while maintaining performance.
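The prefix-prompting attack described in the abstract suggests a simple record-level memorization metric: prompt the model with the prefix of each injected sensitive sequence and count how often it reproduces the true suffix verbatim. A hedged sketch follows; the `complete` stub stands in for an actual FL-trained LLM, and all names and token values are hypothetical.

```python
# Sketch of a record-level memorization metric: the fraction of injected
# canary sequences whose suffix the model emits exactly when prompted with
# the prefix. `complete` is a toy stand-in for a real model (hypothetical).

def memorization_rate(canaries, complete, prefix_len):
    """Fraction of canary sequences whose suffix is reproduced verbatim."""
    hits = 0
    for tokens in canaries:
        prefix, suffix = tokens[:prefix_len], tokens[prefix_len:]
        if complete(prefix, len(suffix)) == suffix:
            hits += 1
    return hits / len(canaries)

# Toy stand-in model: has memorized one canary, knows nothing of the other.
MEMORIZED = {("patient", "id"): ["4821", "diagnosis", "J45.9"]}

def complete(prefix, n_tokens):
    return MEMORIZED.get(tuple(prefix), ["unknown"] * n_tokens)

canaries = [
    ["patient", "id", "4821", "diagnosis", "J45.9"],   # recoverable by prompting
    ["patient", "name", "Doe", "dob", "1970-01-01"],   # not memorized
]
rate = memorization_rate(canaries, complete, prefix_len=2)
# rate == 0.5: one of the two injected sequences leaks via targeted prompting.
```

Comparing this rate between full fine-tuning and LoRA fine-tuning, under the same canary injection, is one way to quantify the up-to-10× reduction the paper reports.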