Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training

📅 2025-02-21
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work exposes dynamic privacy ripple effects in large language model (LLM) training triggered by the insertion or removal of personally identifiable information (PII). Challenging the conventional assumption that memorization is a static property of a trained model, the authors identify and empirically validate three novel phenomena: assisted memorization, memorization amplification from added PII, and memorization induced by PII removal. Through controlled PII injection/removal experiments, sequence-level memorization measurement, training-trajectory analysis, and multi-stage fine-tuning comparisons, they find that: (i) semantically similar PII seen later in training can elicit memorization of earlier-seen sequences (up to one-third of memorized cases in their settings); (ii) inserting PII can amplify memorization of other PII by as much as ≈7.5×; and (iii) removing PII can, paradoxically, cause other PII to become memorized. The paper characterizes how data scheduling, temporal evolution, and distributional shift jointly govern PII memorization, and shows how first-order (direct) and second-order (indirect) privacy risks compound.

📝 Abstract
Due to the sensitive nature of personally identifiable information (PII), its owners may have the authority to control its inclusion or request its removal from large language model (LLM) training. Beyond this, PII may be added or removed from training datasets due to evolving dataset curation techniques, because they were newly scraped for retraining, or because they were included in a new downstream fine-tuning stage. We find that the amount and ease of PII memorization is a dynamic property of a model that evolves throughout training pipelines and depends on commonly altered design choices. We characterize three such novel phenomena: (1) similar-appearing PII seen later in training can elicit memorization of earlier-seen sequences in what we call assisted memorization, and this is a significant factor (in our settings, up to 1/3); (2) adding PII can increase memorization of other PII significantly (in our settings, as much as $\approx\!7.5\times$); and (3) removing PII can lead to other PII being memorized. Model creators should consider these first- and second-order privacy risks when training models to avoid the risk of new PII regurgitation.
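The memorization the abstract quantifies is typically operationalized as extractability: a training sequence counts as memorized if the model, prompted with a prefix of it, reproduces the remainder verbatim under greedy decoding. A minimal sketch of that check; the `generate_fn` interface, character-level splitting, and the helper names are illustrative assumptions, not the paper's exact protocol:

```python
def is_memorized(generate_fn, sequence: str, prefix_len: int) -> bool:
    """Extraction-style memorization check.

    Split the sequence into a prefix and a suffix, prompt the model with
    the prefix, and count the sequence as memorized if greedy decoding
    reproduces the suffix verbatim. `generate_fn(prompt, max_new_chars)`
    is a stand-in for any deterministic text generator; splitting on
    characters (rather than tokens) is a simplification.
    """
    prefix, suffix = sequence[:prefix_len], sequence[prefix_len:]
    completion = generate_fn(prefix, max_new_chars=len(suffix))
    return completion.startswith(suffix)


def memorization_rate(generate_fn, pii_sequences, prefix_len: int) -> float:
    """Fraction of PII sequences the model regurgitates verbatim."""
    if not pii_sequences:
        return 0.0
    hits = sum(is_memorized(generate_fn, s, prefix_len) for s in pii_sequences)
    return hits / len(pii_sequences)
```

Tracking this rate for a fixed set of PII records before and after other records are inserted or deleted is what surfaces the ripple effects: naively, the rate for untouched records should stay flat, but the paper finds it can rise instead.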
Problem

Research questions and friction points this paper is trying to address.

Dynamic PII memorization in LLMs
Impact of PII addition on privacy
Consequences of PII removal in training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Memorization characterized as a dynamic training property
Assisted memorization of PII sequences
Second-order privacy risks in training