Do LLMs Really Memorize Personally Identifiable Information? Revisiting PII Leakage with a Cue-Controlled Memorization Framework

๐Ÿ“… 2026-01-07
๐Ÿ›๏ธ arXiv.org
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
This work addresses a prevalent issue in current evaluations of personally identifiable information (PII) leakage from large language models (LLMs): surface-level cues in prompts are often misinterpreted as evidence of model memorization. To resolve this, the authors propose the Cue-Resistant Memorization (CRM) framework, which formally defines cue-resistant memorization as a necessary condition for genuine memorization. Through large-scale multilingual experiments spanning 32 languages, they systematically assess memorization behavior using prefix-suffix completion, associative reconstruction, cue-free generation, and membership inference. Their findings reveal a significant drop in PII reconstruction success when controlling for cues, and an extremely low true positive rate in cue-free settings, demonstrating that most apparent PII leakage stems from cue-driven generalization rather than actual memorization.

๐Ÿ“ Abstract
Large Language Models (LLMs) have been reported to "leak" Personally Identifiable Information (PII), with successful PII reconstruction often interpreted as evidence of memorization. We propose a principled revision of memorization evaluation for LLMs, arguing that PII leakage should be evaluated under low lexical cue conditions, where target PII cannot be reconstructed through prompt-induced generalization or pattern completion. We formalize Cue-Resistant Memorization (CRM) as a cue-controlled evaluation framework and a necessary condition for valid memorization evaluation, explicitly conditioning on prompt-target overlap cues. Using CRM, we conduct a large-scale multilingual re-evaluation of PII leakage across 32 languages and multiple memorization paradigms. Revisiting reconstruction-based settings, including verbatim prefix-suffix completion and associative reconstruction, we find that their apparent effectiveness is driven primarily by direct surface-form cues rather than by true memorization. When such cues are controlled for, reconstruction success diminishes substantially. We further examine cue-free generation and membership inference, both of which exhibit extremely low true positive rates. Overall, our results suggest that previously reported PII leakage is better explained by cue-driven behavior than by genuine memorization, highlighting the importance of cue-controlled evaluation for reliably quantifying privacy-relevant memorization in LLMs.
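The core idea of conditioning on prompt-target overlap cues can be illustrated with a minimal sketch. The function names, the token-overlap metric, and the threshold below are illustrative assumptions, not the paper's actual cue measure; they only show how an evaluation set might be filtered to the low-cue cases the abstract calls for.

```python
def lexical_cue_overlap(prompt: str, target_pii: str) -> float:
    """Fraction of the target PII's tokens already present in the prompt.

    A crude whitespace-token proxy for prompt-target surface-form cues;
    the paper's real cue metric may be more sophisticated.
    """
    prompt_tokens = set(prompt.lower().split())
    target_tokens = target_pii.lower().split()
    if not target_tokens:
        return 0.0
    hits = sum(1 for tok in target_tokens if tok in prompt_tokens)
    return hits / len(target_tokens)


def is_low_cue(prompt: str, target_pii: str, threshold: float = 0.2) -> bool:
    """Keep only cases where the target cannot be pieced together from
    surface cues in the prompt (threshold is a hypothetical choice)."""
    return lexical_cue_overlap(prompt, target_pii) <= threshold


# High-cue: the target name already appears verbatim in the prompt,
# so a successful completion proves nothing about memorization.
print(is_low_cue("Contact John Smith at", "John Smith"))        # False
# Low-cue: the prompt shares no tokens with the target email.
print(is_low_cue("The patient's email is", "alice@example.com"))  # True
```

Under this kind of filter, reconstruction success measured only on the low-cue subset is what the paper treats as evidence of genuine (cue-resistant) memorization.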
Problem

Research questions and friction points this paper is trying to address.

Personally Identifiable Information
memorization
Large Language Models
PII leakage
cue-controlled evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cue-Resistant Memorization
PII leakage
large language models
privacy evaluation
memorization framework
๐Ÿ”Ž Similar Papers
No similar papers found.