🤖 AI Summary
Publishing a model trained with gradient descent risks privacy leakage, since gradients can reveal sensitive information about the training data.
Method: This paper proposes a universal privacy auditing metric grounded in gradient uniqueness, derived from information-theoretic principles to establish a mathematically rigorous upper bound on privacy leakage. Unlike existing approaches, it is agnostic to model architecture, data modality, and adversarial assumptions, and operates via a lightweight monitoring mechanism that injects no noise, in contrast to noise-based defenses such as DP-SGD.
Contribution/Results: Experimental evaluation demonstrates that the method achieves privacy guarantees comparable to DP-SGD while significantly improving test accuracy and practical utility. It enables quantitative, real-time assessment of privacy risk during training and supports controllable, privacy-aware model publishing, offering a new paradigm for balancing privacy preservation and model performance in deep learning.
📝 Abstract
Disclosing private information via publication of a machine learning model is often a concern. Intuitively, publishing a learned model should be less risky than publishing a dataset. But how much risk is there? In this paper, we present a principled disclosure metric called *gradient uniqueness* that is derived from an upper bound on the amount of information disclosure from publishing a learned model. Gradient uniqueness provides an intuitive way to perform privacy auditing. The mathematical derivation of gradient uniqueness is general, and makes no assumptions about the model architecture, dataset type, or the strategy of an attacker. We examine a simple defense based on monitoring gradient uniqueness, and find that it achieves privacy comparable to classical methods such as DP-SGD, while achieving substantially better test accuracy (utility).
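The abstract does not give the formal definition of gradient uniqueness, but the intuition (a gradient is risky when it is unlike the gradients of other training examples) can be sketched. Below is a minimal, hypothetical illustration, assuming uniqueness is scored as one minus the maximum cosine similarity between a per-example gradient and any other gradient in the batch; the function name and scoring rule are this sketch's assumptions, not the paper's actual metric.

```python
import numpy as np

def gradient_uniqueness(grads: np.ndarray) -> np.ndarray:
    """Hypothetical uniqueness score (not the paper's definition):
    for each per-example gradient (one row of `grads`), return one
    minus its maximum cosine similarity to any other row. A score
    near 1 means the gradient is unlike all others (potentially
    privacy-revealing); near 0 means it is well covered by peers."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    unit = grads / np.clip(norms, 1e-12, None)  # unit-normalize rows
    sim = unit @ unit.T                          # pairwise cosine similarities
    np.fill_diagonal(sim, -np.inf)               # exclude self-similarity
    return 1.0 - sim.max(axis=1)

# Toy batch: three near-duplicate gradients and one outlier.
g = np.array([[1.00, 0.00],
              [0.99, 0.01],
              [1.00, 0.02],
              [0.00, 1.00]])
scores = gradient_uniqueness(g)
print(scores.argmax())  # the outlier (row 3) gets the highest score
```

A monitoring-style defense in the spirit of the abstract could then flag or withhold updates whose uniqueness score exceeds a threshold, rather than adding noise to every gradient as DP-SGD does.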