Proactive Privacy Amnesia for Large Language Models: Safeguarding PII with Negligible Impact on Model Utility

📅 2025-02-24
📈 Citations: 1
Influential: 0
🤖 AI Summary
To address the vulnerability of large language models (LLMs) to leakage of personally identifiable information (PII) under adversarial attacks, this paper proposes the first cognitive science–inspired *proactive privacy forgetting* mechanism, integrating a "precise forgetting + semantically consistent memory implantation" paradigm. Methodologically, it identifies the memory units most strongly associated with PII via gradient sensitivity analysis, employs a differentiable forgetting gate for selective erasure, and generates semantically aligned surrogate memories to preserve model functionality. Evaluated across multiple LLMs, the mechanism achieves 100% elimination of telephone number leakage risk, reduces physical address leakage risk by 9.8%–87.6%, and incurs negligible task performance degradation (<0.5%). These results significantly outperform existing approaches, which trade off privacy preservation against utility maintenance.
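The pipeline summarized above can be sketched in miniature. The selection rule below (pick the token the model finds hardest to predict as the "key memory" anchoring recall of the full PII sequence) and all function names are illustrative assumptions for exposition, not the paper's actual implementation:

```python
# Hedged sketch of the key-memory identification step in a PPA-style
# pipeline. In practice the scores would come from an LLM's per-token
# sensitivity analysis; here they are toy numbers.

def key_memory_index(token_probs):
    """Return the index of the token with the lowest conditional probability.

    Assumption (illustrative): the hardest-to-predict token anchors the
    memorized chain, so selectively forgetting it disrupts recall of the
    whole PII sequence, after which a substitute memory can be implanted.
    """
    return min(range(len(token_probs)), key=lambda i: token_probs[i])

# Toy per-token probabilities for the digit chunks of a memorized phone number.
probs = [0.91, 0.88, 0.12, 0.85, 0.79]
print(key_memory_index(probs))  # -> 2 (the least predictable chunk)
```

A full implementation would then apply selective unlearning (e.g., gradient ascent) only at that position and fine-tune on a fabricated substitute value to preserve fluency.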

📝 Abstract
With the rise of large language models (LLMs), increasing research has recognized their risk of leaking personally identifiable information (PII) under malicious attacks. Although efforts have been made to protect PII in LLMs, existing methods struggle to balance privacy protection with maintaining model utility. In this paper, inspired by studies of amnesia in cognitive science, we propose a novel approach, Proactive Privacy Amnesia (PPA), to safeguard PII in LLMs while preserving their utility. This mechanism works by actively identifying and forgetting the key memories most closely associated with PII in sequences, followed by memory implanting with suitable substitute memories to maintain the LLM's functionality. We conduct evaluations across multiple models to protect common PII, such as phone numbers and physical addresses, against prevalent PII-targeted attacks, demonstrating the superiority of our method compared with other existing defensive techniques. The results show that our PPA method entirely eliminates (100%) the risk of phone number exposure and significantly reduces the risk of physical address exposure by 9.8%–87.6%, all while maintaining comparable model utility performance.
Problem

Research questions and friction points this paper is trying to address.

- Protecting PII in large language models
- Balancing privacy with model utility
- Mitigating PII leakage under malicious attacks

Innovation

Methods, ideas, or system contributions that make the work stand out.

- Proactive Privacy Amnesia
- Memory Implanting
- PII Risk Reduction
Martin Kuo
PhD Candidate, Duke University
LLMs, Trustworthy AI, Generative AI

Jingyang Zhang
Center for Computational Evolutionary Intelligence, Duke University

Jianyi Zhang
Research Scientist @ Google DeepMind; PI @ Duke University
LLMs, Generative AI, Trustworthy AI

Minxue Tang
Duke University
Machine Learning, Deep Learning

Louis DiValentin
Accenture

Aolin Ding
Security Research Scientist, Accenture

Jingwei Sun
Center for Computational Evolutionary Intelligence, Duke University

William Chen
Carnegie Mellon University
Spoken Language Processing, Speech Recognition, Speech Translation, Machine Translation

Amin Hass
Accenture

Tianlong Chen
Assistant Professor, CS @ UNC Chapel Hill; Chief AI Scientist, hireEZ
Machine Learning, AI4Science, Computer Vision, Sparsity

Yiran Chen
Center for Computational Evolutionary Intelligence, Duke University

Hai Li
Center for Computational Evolutionary Intelligence, Duke University