Measuring memorization through probabilistic discoverable extraction

📅 2024-10-25

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

career value

213K/year

🤖 AI Summary

Large language models (LLMs) risk privacy leakage through memorization of training data, yet existing discoverable extraction methods yield only binary judgments and lack reliability under non-greedy sampling. This paper introduces the first probabilistic extractability metric, which quantifies the likelihood of generating a target sequence by modeling the model’s output distribution via repeated stochastic sampling—varying temperature, top-k, and nucleus (top-p) parameters. Moving beyond single-sequence greedy decoding, our approach reveals that conventional methods systematically underestimate memorization rates by an average factor of 3.2×. Empirical analysis demonstrates that memorization exposure is nonlinearly modulated by sampling parameters, with temperature and top-p exerting significant influence on risk magnitude. Robust evaluation across multiple models, scales, and datasets validates the metric’s accuracy, reliability, and interpretability—establishing a refined, principled benchmark for quantifying LLM memorization risks.

Technology Category

Application Category

📝 Abstract

Large language models (LLMs) are susceptible to memorizing training data, raising concerns due to the potential extraction of sensitive information. Current methods to measure memorization rates of LLMs, primarily discoverable extraction (Carlini et al., 2022), rely on single-sequence greedy sampling, potentially underestimating the true extent of memorization. This paper introduces a probabilistic relaxation of discoverable extraction that quantifies the probability of extracting a target sequence within a set of generated samples, considering various sampling schemes and multiple attempts. This approach addresses the limitations of reporting memorization rates through discoverable extraction by accounting for the probabilistic nature of LLMs and user interaction patterns. Our experiments demonstrate that this probabilistic measure can reveal cases of higher memorization rates compared to rates found through discoverable extraction. We further investigate the impact of different sampling schemes on extractability, providing a more comprehensive and realistic assessment of LLM memorization and its associated risks. Our contributions include a new probabilistic memorization definition, empirical evidence of its effectiveness, and a thorough evaluation across different models, sizes, sampling schemes, and training data repetitions.

Problem

Research questions and friction points this paper is trying to address.

Measure memorization in language models using probabilistic extraction.

Address unreliability of traditional discoverable extraction methods.

Quantify extraction risk across models and sampling schemes.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces probabilistic discoverable extraction method

Considers multiple queries for extraction probability

Evaluates across models, sampling schemes, repetitions

🔎 Similar Papers

Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon