Identifying Memorization of Diffusion Models through p-Laplace Analysis

πŸ“… 2025-05-13
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work addresses the problem of latent training data memorization in diffusion models. We propose a novel method for detecting memorized samples by computing the p-Laplacian operator on the score functionβ€”a first application of p-Laplacian analysis to memorization studies in diffusion models. Our approach establishes a theoretical connection between the score function and higher-order curvature structures of the probability manifold. Leveraging numerical approximation of the p-Laplacian, theoretical derivation under Gaussian mixture models, and empirical validation on real-world image generation models, we successfully identify memorized samples with significant improvements over existing baselines. A key finding is that memorized regions exhibit pronounced curvature anomalies in the score gradient field, revealing an interpretable geometric signature of implicit memorization. This provides both new theoretical insight into the memorization mechanism of diffusion models and a principled, geometry-based diagnostic tool for detecting memorization.

Technology Category

Application Category

πŸ“ Abstract
Diffusion models, today's leading image generative models, estimate the score function, i.e. the gradient of the log probability of (perturbed) data samples, without direct access to the underlying probability distribution. This work investigates whether the estimated score function can be leveraged to compute higher-order differentials, namely p-Laplace operators. We show here these operators can be employed to identify memorized training data. We propose a numerical p-Laplace approximation based on the learned score functions, showing its effectiveness in identifying key features of the probability landscape. We analyze the structured case of Gaussian mixture models, and demonstrate the results carry-over to image generative models, where memorization identification based on the p-Laplace operator is performed for the first time.
Problem

Research questions and friction points this paper is trying to address.

Identifying memorized training data in diffusion models
Computing p-Laplace operators from score functions
Analyzing probability landscapes in image generative models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses p-Laplace operators for memorization identification
Approximates p-Laplace via learned score functions
Applies method to Gaussian and image models
πŸ”Ž Similar Papers
No similar papers found.