🤖 AI Summary
Score-based membership inference attacks (MIAs) against diffusion models pose severe threats to training data privacy. Method: We observe that the output norm of the noise predictor encodes local density relationships between a sample and the training set; leveraging this insight, we propose SimA—a single-query MIA—that establishes, for the first time, a theoretical connection between score estimation and local density. We further analyze latent diffusion models (LDMs) through the lens of information bottlenecks and demonstrate their inherent robustness against MIAs. To enhance privacy, we strengthen β-VAE latent regularization to suppress feature leakage from training data. Contribution/Results: SimA achieves high efficacy across diverse architectures—including DDPM and LDM—validated both theoretically and empirically. Our work provides a novel theoretical framework and practical tool for privacy risk assessment in diffusion models, advancing the understanding of privacy–performance trade-offs in generative modeling.
📝 Abstract
Membership inference attacks (MIAs) against diffusion models have emerged as a pressing privacy concern, as these models may inadvertently reveal whether a given sample was part of their training set. We present a theoretical and empirical study of score-based MIAs, focusing on the predicted noise vectors that diffusion models learn to approximate. We show that the expected denoiser output points toward a kernel-weighted local mean of nearby training samples, so its norm encodes proximity to the training set and thereby reveals membership. Building on this observation, we propose SimA, a single-query attack that provides a principled, efficient alternative to existing multi-query methods. SimA achieves consistently strong performance across variants of DDPM and the Latent Diffusion Model (LDM). Notably, we find that latent diffusion models are surprisingly less vulnerable than pixel-space models, owing to the strong information bottleneck imposed by their latent auto-encoder. We investigate this further by varying the regularization hyperparameter ($β$ in $β$-VAE) of the latent channel and suggest a strategy that makes LDM training more robust to MIAs. Our results solidify the theory of score-based MIAs while highlighting that the latent diffusion class of methods requires a better understanding of VAE inversion, not merely inversion of the diffusion process.
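The core intuition can be illustrated with a toy numerical sketch (not the paper's implementation): for an empirical training distribution smoothed with Gaussian noise, the MMSE denoiser is a closed-form kernel-weighted mean of training points, and the norm of the resulting predicted noise acts as a single-query membership statistic. All names, the 2-D data, and the noise level `sigma` below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-D training set (the paper works on images; this is a toy stand-in).
X_train = rng.normal(size=(200, 2))
sigma = 0.5  # noise level of the forward process at some fixed timestep


def optimal_noise_pred(x_t, X, sigma):
    """Closed-form optimal noise predictor for an empirical data distribution.

    With q(x_t | x_0) = N(x_0, sigma^2 I) over a finite training set X, the
    MMSE denoiser is a kernel-weighted local mean of training points:
        E[x_0 | x_t] = sum_i w_i x_i,  w_i ∝ exp(-||x_t - x_i||^2 / (2 sigma^2)),
    and the implied noise prediction is (x_t - E[x_0 | x_t]) / sigma.
    """
    d2 = np.sum((X - x_t) ** 2, axis=1)
    w = np.exp(-(d2 - d2.min()) / (2 * sigma**2))  # shift by min for stability
    w /= w.sum()
    local_mean = w @ X
    return (x_t - local_mean) / sigma


def membership_score(x0, X, sigma, rng):
    """Single-query statistic: norm of the predicted noise at one noisy query."""
    eps = rng.normal(size=x0.shape)
    x_t = x0 + sigma * eps
    return np.linalg.norm(optimal_noise_pred(x_t, X, sigma))


member = X_train[0]
non_member = rng.normal(size=2) + 3.0  # query far from the training density

m_scores = [membership_score(member, X_train, sigma, rng) for _ in range(200)]
n_scores = [membership_score(non_member, X_train, sigma, rng) for _ in range(200)]
print(np.mean(m_scores), np.mean(n_scores))
```

Averaged over noise draws, the member's predicted-noise norm stays near that of the injected noise, while the off-manifold query yields a much larger norm, since the denoiser must point far back toward the training set. This is the sense in which the score norm encodes local density; distinguishing members from same-distribution held-out samples is subtler and is what the paper's analysis addresses.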