Undesirable Memorization in Large Language Models: A Survey

📅 2024-10-03
🏛️ arXiv.org
📈 Citations: 4
Influential: 1
🤖 AI Summary
Large language models (LLMs) exhibit undesirable memorization: the excessive retention and leakage of sensitive training-data fragments, which creates critical privacy risks and enables attacks such as membership inference. Method: The survey proposes a three-dimensional taxonomy (granularity, retrievability, desirability) to systematically characterize the phenomenon; establishes a privacy–utility trade-off framework unifying exposure, membership inference, and related metrics; and extends the analysis to emerging paradigms including retrieval-augmented generation (RAG) and diffusion language models. Through systematic literature review, attribution analysis of memorization's causes, and evaluation of defenses, the authors construct a structured overview of the field and an open-source, regularly updated literature repository. Results: The work identifies key open frontiers in governing LLM memorization, moving the field from ad hoc practice toward a rigorous, systematized science.
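The exposure metric named in the summary comes from the canary-insertion methodology of Carlini et al.: a synthetic "canary" secret is planted in the training data, and its exposure is the log-size of the candidate space minus the log of the canary's perplexity rank among those candidates. A minimal sketch (function names and toy values are illustrative, not from the paper):

```python
import math

def exposure(canary_rank: int, candidate_space_size: int) -> float:
    """Exposure of a canary: log2|R| - log2(rank).

    canary_rank: the canary's rank (1 = lowest perplexity) among all
    candidate_space_size equally plausible candidate secrets.
    Higher exposure means the model has memorized the canary more strongly.
    """
    return math.log2(candidate_space_size) - math.log2(canary_rank)

# A fully memorized canary ranks first among 1024 candidates:
print(exposure(1, 1024))     # maximal exposure for this candidate space
# A canary at median rank shows essentially no memorization signal:
print(exposure(512, 1024))
```

By construction, exposure is bounded by log2 of the candidate-space size, so the metric is comparable only across experiments that use the same candidate space.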

📝 Abstract
While recent research increasingly showcases the remarkable capabilities of Large Language Models (LLMs), it is equally crucial to examine their associated risks. Among these, privacy and security vulnerabilities are particularly concerning, posing significant ethical and legal challenges. At the heart of these vulnerabilities stands memorization, which refers to a model's tendency to store and reproduce phrases from its training data. This phenomenon has been shown to be a fundamental source of various privacy and security attacks against LLMs. In this paper, we provide a taxonomy of the literature on LLM memorization, exploring it across three dimensions: granularity, retrievability, and desirability. Next, we discuss the metrics and methods used to quantify memorization, followed by an analysis of the causes and factors that contribute to the memorization phenomenon. We then explore strategies used so far to mitigate the undesirable aspects of this phenomenon. We conclude our survey by identifying potential research topics for the near future, including methods to balance privacy and performance, and the analysis of memorization in specific LLM contexts such as conversational agents, retrieval-augmented generation, and diffusion language models. Given the rapid research pace in this field, we also maintain a dedicated repository of the references discussed in this survey, which will be regularly updated to reflect the latest developments.
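One common way the literature quantifies memorization is extractability: a training string counts as memorized if prompting the model with its prefix makes greedy decoding reproduce the true suffix verbatim. The sketch below illustrates that check; `toy_model` is a stand-in for a real LLM's greedy completion, not the paper's method:

```python
from typing import Callable

def is_extractable(complete: Callable[[str], str],
                   prefix: str, true_suffix: str) -> bool:
    """A training string is (verbatim) extractable if the model's
    greedy completion of its prefix equals the true suffix."""
    return complete(prefix) == true_suffix

# Toy stand-in for a model that has memorized one training string.
_memorized = {"My SSN is ": "123-45-6789"}

def toy_model(prefix: str) -> str:
    # Real usage would run greedy decoding on an actual LLM here.
    return _memorized.get(prefix, "")

print(is_extractable(toy_model, "My SSN is ", "123-45-6789"))  # True
print(is_extractable(toy_model, "My phone is ", "555-0100"))   # False
```

Variants in the surveyed literature differ mainly in the prefix length used and in whether approximate (near-verbatim) matches also count as memorized.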
Problem

Research questions and friction points this paper is trying to address.

Examines privacy and security risks in Large Language Models
Analyzes memorization as a source of vulnerabilities in LLMs
Explores mitigation strategies for undesirable memorization effects
Innovation

Methods, ideas, or system contributions that make the work stand out.

Taxonomy of LLM memorization across dimensions
Metrics and methods to quantify memorization
Strategies to mitigate undesirable memorization effects