🤖 AI Summary
Membership inference attacks (MIAs) against diffusion models commonly rely on the overfitting assumption, which renders them ineffective against strongly regularized models. This paper proposes PFAMI, the first black-box MIA framework for diffusion models that does not require overfitting, instead leveraging memorization-induced probabilistic trends. Its core innovation is to model the systematic fluctuation of generation probabilities within local neighborhoods of training samples, via probabilistic density estimation, neighborhood sampling, and statistical significance testing, adapting precisely to the output characteristics of diffusion models. Experiments across multiple diffusion models and datasets show that PFAMI improves the attack success rate by an average of 27.9% over state-of-the-art baselines, and it remains robust even against strongly regularized models with weak memorization. PFAMI thus establishes a new paradigm for privacy evaluation of generative models.
📝 Abstract
Membership Inference Attack (MIA) identifies whether a record appears in a machine learning model's training set by querying the model. MIAs on classic classification models are well studied, and recent work has begun to explore how to transplant MIA onto generative models. Our investigation indicates that existing MIAs designed for generative models depend mainly on overfitting in the target model. However, overfitting can be avoided with various regularization techniques, in which case existing MIAs perform poorly in practice. Unlike overfitting, memorization is essential for deep learning models to attain optimal performance, making it a more prevalent phenomenon. Memorization in generative models produces a rising trend in the probability of generating records in the neighborhood of a member record. We therefore propose the Probabilistic Fluctuation Assessing Membership Inference Attack (PFAMI), a black-box MIA that infers membership by detecting these trends through an analysis of the overall probabilistic fluctuation around a given record. Extensive experiments across multiple generative models and datasets demonstrate that PFAMI improves the attack success rate (ASR) by about 27.9% over the best baseline.
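The fluctuation-based decision rule described in the abstract can be sketched as follows. This is an illustrative toy, not the paper's implementation: the `log_prob_fn` interface, the Gaussian perturbation with `noise_scale`, and the `threshold` are all assumptions introduced here, and in the actual black-box setting the generation probability of a diffusion model would have to be approximated (e.g., via its variational bound) rather than queried directly.

```python
import numpy as np

def fluctuation_score(log_prob_fn, record, n_neighbors=32, noise_scale=0.05, seed=0):
    """Toy fluctuation statistic: compare the model-assigned log-probability
    of a record with the average over perturbed neighbors. The intuition from
    the paper is that memorized (member) records sit near a local peak of the
    generation probability, so the drop from record to neighbors is larger."""
    rng = np.random.default_rng(seed)
    x = np.asarray(record, dtype=float)
    # Sample neighbors by adding small Gaussian noise around the record.
    neighbors = x + noise_scale * rng.standard_normal((n_neighbors, x.size))
    lp_x = log_prob_fn(x[None, :])[0]
    lp_neighbors = log_prob_fn(neighbors)
    # Larger score => record is a stronger local peak => more likely a member.
    return lp_x - lp_neighbors.mean()

def infer_membership(log_prob_fn, record, threshold):
    # `threshold` is a hypothetical calibration constant (e.g., chosen on
    # known non-members); the paper's actual decision procedure may differ.
    return fluctuation_score(log_prob_fn, record) > threshold
```

As a quick sanity check with a stand-in density that peaks at a "member" point, the member's score exceeds that of a distant non-member, matching the local-maximum intuition:

```python
mu = np.zeros(4)  # pretend the model memorized this record
toy_log_prob = lambda X: -np.linalg.norm(np.asarray(X) - mu, axis=1)
member_score = fluctuation_score(toy_log_prob, mu)
nonmember_score = fluctuation_score(toy_log_prob, mu + np.array([1.0, 0.0, 0.0, 0.0]))
# member_score is clearly larger than nonmember_score
```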