Impact of Dataset Properties on Membership Inference Vulnerability of Deep Transfer Learning

📅 2024-02-07
📈 Citations: 2
✨ Influential: 0
🤖 AI Summary
This work investigates how dataset characteristics affect vulnerability to membership inference attacks (MIAs) in deep transfer learning, particularly for non-differentially private (non-DP) fine-tuned models. Method: We systematically quantify the impact of dataset size and per-class sample count on MIA success rates through empirical evaluation and theoretical modeling based on a simplified fine-tuning process. Contribution/Results: We establish, for the first time, that MIA advantage decays as a power law in the per-class sample count at a fixed false positive rate, a finding that reveals the impractically large sample counts needed to robustly protect the most vulnerable samples. This bridges a critical gap between DP-based privacy theory and real-world MIA threat models. Our results yield actionable, quantifiable guidelines for dataset-scale design in transfer learning, improving the interpretability and controllability of privacy risks in non-DP settings.

πŸ“ Abstract
Membership inference attacks (MIAs) are used to test the practical privacy of machine learning models. MIAs complement formal guarantees from differential privacy (DP) under a more realistic adversary model. We analyse the MIA vulnerability of fine-tuned neural networks both empirically and theoretically, the latter using a simplified model of fine-tuning. We show that the vulnerability of non-DP models, when measured as the attacker advantage at a fixed false positive rate, decreases according to a simple power law as the number of examples per class increases, even for the most vulnerable points; however, the dataset size needed to adequately protect the most vulnerable points is very large.
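To make the abstract's metric concrete, here is a minimal sketch of how "attacker advantage at a fixed false positive rate" can be computed for a simple loss-threshold MIA. This is an illustrative toy, not the paper's actual attack: the gamma-distributed losses, sample sizes, and the `advantage_at_fpr` helper are all assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-example losses: training-set members tend to have
# lower loss than non-members under a fine-tuned model. The gamma
# parameters below are arbitrary choices for illustration.
member_losses = rng.gamma(shape=2.0, scale=0.3, size=10_000)
nonmember_losses = rng.gamma(shape=2.0, scale=0.6, size=10_000)

def advantage_at_fpr(member_losses, nonmember_losses, fpr=0.01):
    """Attacker advantage (TPR - FPR) of a loss-threshold MIA at a fixed FPR.

    The attacker predicts 'member' when the loss is below a threshold;
    the threshold is calibrated on non-member losses to hit the target FPR.
    """
    threshold = np.quantile(nonmember_losses, fpr)
    tpr = np.mean(member_losses < threshold)
    return tpr - fpr

adv = advantage_at_fpr(member_losses, nonmember_losses, fpr=0.01)
print(f"attacker advantage at 1% FPR: {adv:.3f}")
```

Calibrating the threshold on non-member data and then measuring advantage at a small FPR mirrors the evaluation regime the abstract refers to, where the most vulnerable points dominate the low-FPR end of the ROC curve.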
Problem

Research questions and friction points this paper is trying to address.

Analyze membership inference vulnerability in fine-tuned neural networks
Study impact of dataset size on privacy protection effectiveness
Examine power law relationship between examples per class and attack vulnerability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combined empirical and theoretical analysis of MIA vulnerability in fine-tuned models
Power-law decay of attacker advantage as the number of examples per class grows
Quantification of the very large dataset sizes needed to protect the most vulnerable points
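The power-law contribution above lends itself to a short worked example: if advantage follows adv ≈ a·S^(−b) in the per-class count S, the exponent can be recovered by a straight-line fit in log-log space and then extrapolated to estimate the sample count needed for a target advantage. The constants and target below are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Hypothetical measurements: attacker advantage at fixed FPR for several
# per-class sample counts S ("shots"). Generated from an assumed power
# law adv = a * S**(-b) purely to illustrate the fitting procedure.
shots = np.array([1, 2, 4, 8, 16, 32, 64, 128], dtype=float)
a_true, b_true = 0.5, 0.7
advantage = a_true * shots ** (-b_true)

# A power law is linear in log-log space: log adv = log a - b * log S,
# so an ordinary least-squares line fit recovers the exponent b.
slope, intercept = np.polyfit(np.log(shots), np.log(advantage), deg=1)
b_hat, a_hat = -slope, np.exp(intercept)
print(f"fitted exponent b = {b_hat:.2f}, prefactor a = {a_hat:.2f}")

# Extrapolate: per-class count needed to push advantage below a target.
target = 0.01
needed = (a_hat / target) ** (1.0 / b_hat)
print(f"per-class samples needed for advantage < {target}: ~{needed:.0f}")
```

The extrapolation step is what drives the paper's headline observation: because the decay exponent is modest, pushing the advantage of the most vulnerable points down to small values requires per-class counts far beyond typical few-shot transfer-learning datasets.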
Marlon Tobaben
PhD student, University of Helsinki
Machine Learning, Deep Learning, Privacy
Hibiki Ito
Joonas Jalko
Department of Computer Science, University of Helsinki
Yuan He
Department of Computer Science, Aalto University
Antti Honkela
Professor, University of Helsinki
Machine Learning, Differential Privacy, Bayesian Inference, Bioinformatics, #UnivHelsinkiCS