🤖 AI Summary
Language model memorization is often treated as a homogeneous phenomenon, neglecting sample-specific characteristics and heterogeneous model–corpus interactions.
Method: We propose the *memorization heterogeneity hypothesis*, decomposing memorization into three distinct types: *recitation* (highly duplicated sequences), *reconstruction* (inherently predictable sequences), and *recollection* (sequences that are neither). Using causally inspired feature engineering, we develop a category-aware, multi-factor logistic regression model for interpretable, cross-category attribution.
Contribution/Results: This work establishes the first memorization taxonomy driven jointly by sample attributes and model–corpus co-adaptation. We identify distinct dominant factors per type (e.g., duplication rate for recitation, local entropy for recollection) and achieve an AUC of 0.89, significantly outperforming homogeneous baselines. Our framework enables a fine-grained, mechanistic understanding of memorization behavior in large language models.
📝 Abstract
Memorization in language models is typically treated as a homogeneous phenomenon, neglecting the specifics of the memorized data. We instead model memorization as the effect of a set of complex factors that describe each sample and relate it to the model and corpus. To build intuition around these factors, we break memorization down into a taxonomy: recitation of highly duplicated sequences, reconstruction of inherently predictable sequences, and recollection of sequences that are neither. We demonstrate the usefulness of our taxonomy by using it to construct a predictive model for memorization. By analyzing dependencies and inspecting the weights of the predictive model, we find that different factors influence the likelihood of memorization differently depending on the taxonomic category.
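To make the idea of a category-aware predictive model concrete, here is a minimal illustrative sketch, not the authors' code: it fits one logistic regression per taxonomic category on synthetic data, using hypothetical per-sample features (a corpus duplication count and a predictability score), so that the per-category weights reveal which factor dominates within each category.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 600

# Hypothetical per-sample features (illustrative names, not from the paper):
duplicates = rng.poisson(3.0, n).astype(float)   # times the sequence appears in the corpus
predictability = rng.normal(0.0, 1.0, n)         # e.g. mean token log-likelihood score

# Assign each sample to a taxonomic category, mirroring the taxonomy:
# recitation = highly duplicated, reconstruction = highly predictable,
# recollection = neither.
category = np.where(duplicates > 5, "recitation",
           np.where(predictability > 0.5, "reconstruction", "recollection"))

# Synthetic memorization labels: a different factor dominates each category,
# which is exactly the structure the category-aware model should recover.
logits = np.select(
    [category == "recitation", category == "reconstruction"],
    [0.4 * duplicates - 2.0, 1.5 * predictability],
    default=0.3 * predictability + 0.1 * duplicates - 1.0,
)
p_true = np.clip(1.0 / (1.0 + np.exp(-logits)), 0.1, 0.9)
memorized = (rng.random(n) < p_true).astype(float)

def fit_logistic(X, y, lr=0.1, steps=3000):
    """Plain gradient-descent logistic regression: returns [bias, w_1, ..., w_k]."""
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        pred = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (pred - y) / len(y)
    return w

# Category-aware model: one regression per category, so the learned weights
# can differ across categories instead of being averaged into one global fit.
X = np.column_stack([duplicates, predictability])
models = {}
for cat in np.unique(category):
    mask = category == cat
    models[cat] = fit_logistic(X[mask], memorized[mask])
    _, w_dup, w_pred = models[cat]
    print(f"{cat:14s} w_duplicates={w_dup:+.2f}  w_predictability={w_pred:+.2f}")
```

Inspecting the per-category weights then plays the same role as inspecting the predictive model's weights in the abstract: each category's fit exposes its own dominant factor rather than a single pooled coefficient.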