🤖 AI Summary
This work addresses the fundamental distinction between unintended memorization and generalization in language models. We propose the first formal theoretical framework that uses counterfactual control of generalization to precisely quantify unintended memorization, and design a membership-inference-based evaluation paradigm validated through large-scale parameter sweeps (500K–1.5B parameters) and empirical Transformer experiments. Key contributions include: (1) identification of a saturation point in unintended memorization at which the "grokking" phenomenon begins; (2) an estimate of memorization capacity in GPT-style models of approximately 3.6 bits per parameter; and (3) novel scaling laws linking the memorization–generalization trade-off, training-data scale, and membership-inference risk. These results establish a rigorous theoretical foundation and empirical benchmark for measuring model memorization, assessing privacy risks, and understanding training dynamics.
📝 Abstract
We propose a new method for estimating how much a model "knows" about a datapoint and use it to measure the capacity of modern language models. Prior studies of language model memorization have struggled to disentangle memorization from generalization. We formally separate memorization into two components: *unintended memorization*, the information a model contains about a specific dataset, and *generalization*, the information a model contains about the true data-generation process. When we completely eliminate generalization, we can compute the total memorization, which provides an estimate of model capacity: our measurements estimate that GPT-style models have a capacity of approximately 3.6 bits per parameter. We train language models on datasets of increasing size and observe that models memorize until their capacity fills, at which point "grokking" begins and unintended memorization decreases as models begin to generalize. We train hundreds of transformer language models ranging from 500K to 1.5B parameters and produce a series of scaling laws relating model capacity and data size to membership inference.
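As a back-of-the-envelope illustration of what the ~3.6 bits-per-parameter estimate implies, the sketch below converts parameter counts into total memorization capacity, using the smallest and largest model sizes from the sweep. The constant and the helper function are illustrative assumptions, not part of the paper's method.

```python
# Illustrative only: the 3.6 bits/parameter figure is the paper's reported
# estimate for GPT-style models; everything else here is a hypothetical sketch.
BITS_PER_PARAM = 3.6

def capacity_bits(n_params: int) -> float:
    """Total unintended-memorization capacity implied by the estimate."""
    return BITS_PER_PARAM * n_params

# Roughly, once the training set exceeds this many bits of information,
# capacity saturates and (per the abstract) unintended memorization starts
# to fall as the model begins to generalize.
for n in (500_000, 1_500_000_000):  # smallest/largest models in the sweep
    mb = capacity_bits(n) / 8 / 1e6  # bits -> megabytes
    print(f"{n:>13,} params -> ~{mb:.1f} MB of memorized data")
```

For example, at 1.5B parameters the estimate works out to about 5.4 gigabits, i.e. on the order of hundreds of megabytes of memorized training data.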