Optimizing Canaries for Privacy Auditing with Metagradient Descent

📅 2025-07-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of loose and inefficient lower-bound estimation of privacy parameters in black-box auditing of differentially private (DP) learning algorithms. We propose a canary-set optimization method based on metagradient descent: by formulating canary design as a metagradient optimization problem, our approach tightens empirical lower bounds on the privacy budget ε for mechanisms such as DP-SGD. Canaries are optimized with non-private SGD on small models and transfer across architectures and training algorithms; canaries optimized on small models remain effective when auditing large-scale DP-SGD models. On image classification benchmarks, our method improves empirical privacy lower bounds by over 2× in certain instances, significantly boosting audit strength and generalizability. This establishes an efficient and transferable paradigm for automated privacy verification of DP systems.

📝 Abstract
In this work we study black-box privacy auditing, where the goal is to lower bound the privacy parameter of a differentially private learning algorithm using only the algorithm's outputs (i.e., the final trained model). For DP-SGD (the most successful method for training differentially private deep learning models), the canonical approach to auditing uses membership inference: an auditor comes up with a small set of special "canary" examples, inserts a random subset of them into the training set, and then tries to discern which of the canaries were included in the training set (typically via a membership inference attack). The auditor's success rate then provides a lower bound on the privacy parameters of the learning algorithm. Our main contribution is a method for optimizing the auditor's canary set to improve privacy auditing, leveraging recent work on metagradient optimization. Our empirical evaluation demonstrates that by using such optimized canaries, we can improve empirical lower bounds for differentially private image classification models by over 2x in certain instances. Furthermore, we demonstrate that our method is transferable and efficient: canaries optimized for non-private SGD with a small model architecture remain effective when auditing larger models trained with DP-SGD.
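As a rough illustration of how the auditor's success rate translates into a privacy lower bound, the sketch below converts a membership inference attack's true/false positive rates into an empirical ε estimate. This is not the paper's estimator (function name and setup are illustrative); a rigorous audit would use confidence intervals (e.g., Clopper-Pearson) over the attack's guesses rather than point estimates.

```python
import math

def eps_lower_bound(tpr: float, fpr: float, delta: float = 0.0) -> float:
    """Simplified empirical lower bound on epsilon from an attack's
    true/false positive rates on the canaries.

    Any attack against an (eps, delta)-DP mechanism satisfies
    tpr <= exp(eps) * fpr + delta, which rearranges to
    eps >= ln((tpr - delta) / fpr).
    """
    if fpr <= 0.0 or tpr <= delta:
        return 0.0  # no evidence of privacy leakage
    return math.log((tpr - delta) / fpr)

# e.g. an attack that flags inserted canaries at TPR=0.8 with FPR=0.1
print(eps_lower_bound(0.8, 0.1))  # ln(8) ~ 2.079
```

A stronger attack (higher TPR at fixed FPR) yields a larger empirical ε, i.e., a tighter audit; optimizing the canary set is one way to strengthen the attack.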
Problem

Research questions and friction points this paper is trying to address.

Optimizing canary sets for better privacy auditing
Lower bounding DP-SGD privacy via metagradient descent
Improving empirical privacy bounds for image classification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimizing canaries via metagradient descent
Enhancing privacy auditing with improved canaries
Transferable canaries for diverse model audits
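To give a feel for the metagradient idea, here is a deliberately tiny sketch: a scalar linear model, a single SGD step, and a hypothetical meta-objective (the drop in the canary's loss caused by training on it, a crude "memorization" signal). The metagradient differentiates through the training step with respect to the canary input. Everything here (the model, objective, and hyperparameters) is an illustrative assumption, not the paper's actual pipeline, which operates on full DP-SGD training runs.

```python
def meta_objective(x: float, w: float = 0.5, y: float = 1.0, lr: float = 0.1) -> float:
    """Loss drop on canary (x, y) after one SGD step that includes it,
    for the scalar model with loss 0.5 * (w*x - y)**2."""
    r = w * x - y
    w1 = w - lr * x * r          # one SGD step on the canary
    r1 = w1 * x - y
    return 0.5 * r * r - 0.5 * r1 * r1

def metagradient(x: float, w: float = 0.5, y: float = 1.0, lr: float = 0.1) -> float:
    """Derivative of meta_objective w.r.t. the canary input x,
    computed by chain rule *through* the training update w1."""
    r = w * x - y
    w1 = w - lr * x * r
    r1 = w1 * x - y
    dw1_dx = -lr * (r + x * w)   # d/dx of (w - lr * x * (w*x - y))
    dr1_dx = w1 + x * dw1_dx
    return r * w - r1 * dr1_dx

# Metagradient ascent: nudge the canary toward being more "memorizable".
x = 0.1
for _ in range(200):
    x += 0.05 * metagradient(x)
print(x, meta_objective(x))
```

In the paper's setting the same principle applies at scale: the metagradient of an audit objective is taken through (a proxy for) the training algorithm, and the canary pixels are updated by gradient ascent.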