SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks

📅 2025-06-12

📈 Citations: 0

✨ Influential: 0

career value

228K/year

🤖 AI Summary

To address the vulnerability of sensitive data to membership inference attacks (MIAs) during large language model (LLM) fine-tuning, this paper proposes an influence-driven selective data obfuscation defense. First, it systematically uncovers the intrinsic mechanism by which MIAs exploit loss reduction patterns during fine-tuning to infer membership. Second, it models sample importance via gradient-based influence estimation and selectively injects controllable noise only into high-influence training samples—thereby enabling explicit privacy–utility trade-offs. Extensive experiments across six domains and multiple LLM scales demonstrate that the method reduces MIA success rates by 38.7% on average while degrading fine-tuning accuracy by less than 1.2%. It significantly outperforms existing defenses, achieving both strong privacy protection and high practical utility.

Technology Category

Application Category

📝 Abstract

Large language models (LLMs) have achieved remarkable success and are widely adopted for diverse applications. However, fine-tuning these models often involves private or sensitive information, raising critical privacy concerns. In this work, we conduct the first comprehensive study evaluating the vulnerability of fine-tuned LLMs to membership inference attacks (MIAs). Our empirical analysis demonstrates that MIAs exploit the loss reduction during fine-tuning, making them highly effective in revealing membership information. These findings motivate the development of our defense. We propose SOFT ( extbf{S}elective data extbf{O}bfuscation in LLM extbf{F}ine- extbf{T}uning), a novel defense technique that mitigates privacy leakage by leveraging influential data selection with an adjustable parameter to balance utility preservation and privacy protection. Our extensive experiments span six diverse domains and multiple LLM architectures and scales. Results show that SOFT effectively reduces privacy risks while maintaining competitive model performance, offering a practical and scalable solution to safeguard sensitive information in fine-tuned LLMs.

Problem

Research questions and friction points this paper is trying to address.

Protects LLM fine-tuning from membership inference attacks

Balances utility and privacy via selective data obfuscation

Addresses privacy risks in fine-tuned models across domains

Innovation

Methods, ideas, or system contributions that make the work stand out.

Selective data obfuscation for privacy protection

Adjustable parameter balances utility and privacy

Effective across diverse domains and LLM architectures

🔎 Similar Papers

Generated Data with Fake Privacy: Hidden Dangers of Fine-tuning Large Language Models on Generated Data