The Elusive Pursuit of Replicating PATE-GAN: Benchmarking, Auditing, Debugging

📅 2024-06-20
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
This study evaluates the reproducibility, utility, and privacy guarantees of six open-source PATE-GAN implementations, three of them by (a subset of) the original authors. None reproduces the utility reported in the original paper, and all leak more privacy than intended. The authors document architecture deviations across the implementations, benchmark their utility, and run an in-depth privacy evaluation that includes DP auditing, uncovering 19 privacy violations and 5 other bugs. The accompanying codebase is publicly available.

📝 Abstract
Synthetic data created by differentially private (DP) generative models is increasingly used in real-world settings. In this context, PATE-GAN has emerged as one of the most popular algorithms, combining Generative Adversarial Networks (GANs) with the private training approach of PATE (Private Aggregation of Teacher Ensembles). In this paper, we set out to reproduce the utility evaluation from the original PATE-GAN paper, compare available implementations, and conduct a privacy audit. More precisely, we analyze and benchmark six open-source PATE-GAN implementations, including three by (a subset of) the original authors. First, we shed light on architecture deviations and empirically demonstrate that none reproduce the utility performance reported in the original paper. We then present an in-depth privacy evaluation, which includes DP auditing, and show that all implementations leak more privacy than intended. Furthermore, we uncover 19 privacy violations and 5 other bugs in these six open-source implementations. Lastly, our codebase is available from: https://github.com/spalabucr/pategan-audit.
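
For readers unfamiliar with PATE, the noisy vote aggregation it relies on can be sketched as follows. This is a minimal illustration, not code from the paper or from any of the six audited implementations: the function name `noisy_teacher_vote`, the parameter `lam`, and the choice of Laplace noise with scale 1/lam are assumptions made for the example.

```python
import random


def noisy_teacher_vote(teacher_votes, lam, rng=random):
    """Return a noisy majority label (0 or 1) for a single sample.

    teacher_votes: iterable of 0/1 votes, one per teacher (hypothetical input).
    lam: assumed noise parameter; each vote count gets Laplace noise of scale 1/lam.
    rng: the `random` module or a `random.Random` instance, for reproducibility.
    """
    votes = list(teacher_votes)
    n1 = sum(votes)          # teachers voting 1
    n0 = len(votes) - n1     # teachers voting 0

    def laplace_noise():
        # The difference of two i.i.d. Exponential(rate=lam) draws
        # follows a Laplace distribution with location 0 and scale 1/lam.
        return rng.expovariate(lam) - rng.expovariate(lam)

    # Perturb both counts independently, then take the larger one.
    return int(n1 + laplace_noise() > n0 + laplace_noise())


# Usage: ten teachers, seven of which vote 1.
label = noisy_teacher_vote([1] * 7 + [0] * 3, lam=1.0, rng=random.Random(0))
```

In this sketch a smaller `lam` adds more noise to each count, so a single teacher's vote (and the data partition behind it) has less influence on the released label, at the cost of more frequent wrong labels.
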
Problem

Research questions and friction points this paper is trying to address.

Replicating utility performance of PATE-GAN
Auditing privacy in open-source implementations
Identifying privacy violations and bugs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reproduces PATE-GAN utility evaluation
Benchmarks six open-source implementations
Conducts an in-depth privacy evaluation, including DP auditing
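
DP auditing generally works by running a membership-style attack against the trained model and converting the attack's error rates into an empirical lower bound on epsilon. The sketch below shows only that conversion step, using the standard hypothesis-testing constraint for (epsilon, delta)-DP. It is an illustration under stated assumptions, not the procedure used in the paper's codebase: the function name `empirical_epsilon` is hypothetical, and the sketch uses point estimates of the error rates, whereas a careful audit would use confidence bounds on them.

```python
import math


def empirical_epsilon(fpr, fnr, delta=0.0):
    """Lower bound on epsilon implied by an attack's error rates.

    Any (eps, delta)-DP mechanism forces
        fpr + exp(eps) * fnr >= 1 - delta   and
        fnr + exp(eps) * fpr >= 1 - delta,
    so an attack with low fpr and fnr certifies a large eps.
    """
    bound = 0.0
    for a, b in ((fpr, fnr), (fnr, fpr)):
        numerator = 1.0 - delta - a
        if numerator <= 0:
            continue  # this direction gives no information
        if b == 0:
            return math.inf  # a perfect attack is incompatible with any finite eps
        bound = max(bound, math.log(numerator / b))
    return bound


# Usage: compare the audited bound against the epsilon an implementation claims.
claimed_eps = 1.0  # hypothetical target budget
audited_eps = empirical_epsilon(fpr=0.05, fnr=0.10, delta=1e-5)
leaks_more_than_claimed = audited_eps > claimed_eps
```

If the audited lower bound exceeds the claimed epsilon, the implementation cannot satisfy the guarantee it advertises, which is the kind of evidence that points to a privacy bug.
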