🤖 AI Summary
Deep generative models pose privacy and compliance risks due to unintended memorization of training data. To address this, we propose the Manifold Memorization Hypothesis (MMH), a unifying geometric framework for memorization grounded in the dimensional relationship between the ground-truth data manifold and the manifold learned by the model. The MMH formally defines memorization strength and rigorously distinguishes two mechanisms: overfitting-driven memorization and distribution-driven memorization. Through manifold dimension estimation, synthetic-data modeling, and systematic empirical evaluation on large-scale image models, including Stable Diffusion, we validate the MMH's explanatory power for observed memorization phenomena. We further develop scalable methods for detecting and suppressing memorized generations, demonstrating their effectiveness on both synthetic and real-world image datasets. This work provides both a theoretical foundation and practical tools for enhancing privacy safety in generative AI systems.
📝 Abstract
As deep generative models have progressed, recent work has shown them to be capable of memorizing and reproducing training datapoints when deployed. These findings call into question the usability of generative models, especially in light of the legal and privacy risks brought about by memorization. To better understand this phenomenon, we propose the manifold memorization hypothesis (MMH), a geometric framework which leverages the manifold hypothesis into a clear language in which to reason about memorization. We propose to analyze memorization in terms of the relationship between the dimensionalities of (i) the ground truth data manifold and (ii) the manifold learned by the model. This framework provides a formal standard for "how memorized" a datapoint is and systematically categorizes memorized data into two types: memorization driven by overfitting and memorization driven by the underlying data distribution. By analyzing prior work in the context of the MMH, we explain and unify assorted observations in the literature. We empirically validate the MMH using synthetic data and image datasets up to the scale of Stable Diffusion, developing new tools for detecting and preventing generation of memorized samples in the process.
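The abstract's core quantity is the local intrinsic dimension (LID) of a manifold around a datapoint, which the framework compares between data and model. As a minimal illustrative sketch (not the paper's exact estimator), the classic Levina–Bickel maximum-likelihood estimator computes LID from nearest-neighbor distances; the function name `lid_mle` and the parameter `k` here are our own choices:

```python
import numpy as np

def lid_mle(x, reference, k=20):
    """Levina-Bickel MLE of local intrinsic dimension at point x.

    x:         (d,) query point.
    reference: (n, d) samples from the manifold (n > k).
    k:         number of nearest neighbors used in the estimate.
    """
    dists = np.sort(np.linalg.norm(reference - x, axis=1))
    dists = dists[dists > 0][:k]  # drop x itself if it is in the reference set
    # MLE: (k-1) divided by the summed log-ratios of the k-th NN
    # distance to each closer neighbor distance.
    return (k - 1) / np.sum(np.log(dists[-1] / dists[:-1]))

# Toy check: Gaussian samples on a 2-D linear subspace of 10-D ambient space
# should yield an estimate near 2.
rng = np.random.default_rng(0)
plane = rng.normal(size=(2000, 2)) @ rng.normal(size=(2, 10))
est = lid_mle(plane[0], plane[1:])
```

Under the MMH, a point would be flagged as memorized when the model manifold's dimension around it is anomalously low relative to the data manifold's; this sketch only shows the dimension-estimation ingredient, applied to one sample set at a time.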