Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training

📅 2025-02-21
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work exposes dynamic privacy ripple effects in large language model (LLM) training triggered by the insertion or removal of personally identifiable information (PII). Challenging the conventional assumption that memorization is a static property of a trained model, the authors identify and empirically validate three novel phenomena: assisted memorization, memorization amplification from added PII, and memorization induced by PII removal. Through controlled PII injection/removal experiments, sequence-level memorization measurement, training-trajectory analysis, and multi-stage fine-tuning comparisons, they find that: (i) semantically similar PII seen later in training can elicit memorization of earlier-seen sequences (up to one-third of memorized cases in their settings); (ii) inserting PII can amplify memorization of other PII by as much as ≈7.5×; and (iii) removing PII can, paradoxically, cause other PII to become memorized. The paper characterizes how data scheduling, temporal evolution, and distributional shift jointly govern PII memorization, and shows how first-order (direct) and second-order (indirect) privacy risks compound.

📝 Abstract
Due to the sensitive nature of personally identifiable information (PII), its owners may have the authority to control its inclusion or request its removal from large language model (LLM) training. Beyond this, PII may be added or removed from training datasets due to evolving dataset curation techniques, because they were newly scraped for retraining, or because they were included in a new downstream fine-tuning stage. We find that the amount and ease of PII memorization is a dynamic property of a model that evolves throughout training pipelines and depends on commonly altered design choices. We characterize three such novel phenomena: (1) similar-appearing PII seen later in training can elicit memorization of earlier-seen sequences in what we call assisted memorization, and this is a significant factor (in our settings, up to 1/3); (2) adding PII can increase memorization of other PII significantly (in our settings, as much as $\approx\!7.5\times$); and (3) removing PII can lead to other PII being memorized. Model creators should consider these first- and second-order privacy risks when training models to avoid the risk of new PII regurgitation.
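The memorization the abstract quantifies is typically operationalized as extractability: a training sequence counts as memorized if the model, prompted with a prefix of it, reproduces the remainder verbatim under greedy decoding. A minimal sketch of that check; the `generate_fn` interface, character-level splitting, and the helper names are illustrative assumptions, not the paper's exact protocol:

```python
def is_memorized(generate_fn, sequence: str, prefix_len: int) -> bool:
    """Extraction-style memorization check.

    Split the sequence into a prefix and a suffix, prompt the model with
    the prefix, and count the sequence as memorized if greedy decoding
    reproduces the suffix verbatim. `generate_fn(prompt, max_new_chars)`
    is a stand-in for any deterministic text generator; splitting on
    characters (rather than tokens) is a simplification.
    """
    prefix, suffix = sequence[:prefix_len], sequence[prefix_len:]
    completion = generate_fn(prefix, max_new_chars=len(suffix))
    return completion.startswith(suffix)


def memorization_rate(generate_fn, pii_sequences, prefix_len: int) -> float:
    """Fraction of PII sequences the model regurgitates verbatim."""
    if not pii_sequences:
        return 0.0
    hits = sum(is_memorized(generate_fn, s, prefix_len) for s in pii_sequences)
    return hits / len(pii_sequences)
```

Tracking this rate for a fixed set of PII records before and after other records are inserted or deleted is what surfaces the ripple effects: naively, the rate for untouched records should stay flat, but the paper finds it can rise instead.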
Problem

Research questions and friction points this paper is trying to address.

Dynamic PII memorization in LLMs
Impact of PII addition on privacy
Consequences of PII removal in training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Memorization characterized as a dynamic training property
Assisted memorization of PII sequences
Second-order privacy risks in training