🤖 AI Summary
This work investigates how dataset characteristics affect vulnerability to membership inference attacks (MIAs) in deep transfer learning, particularly for non-differentially private (non-DP) fine-tuned models.
Method: We systematically quantify the impact of dataset size and per-class sample count on MIA success rates through empirical evaluation and theoretical modeling based on a simplified fine-tuning process.
Contribution/Results: We establish, for the first time, that MIA advantage decays as a power law in the per-class sample count at a fixed false positive rate. This finding reveals the impractically large sample counts needed to robustly protect the most vulnerable samples, and it bridges a critical gap between DP-based privacy theory and real-world MIA threat models. Our results yield actionable, quantifiable guidelines for dataset-scale design in transfer learning, improving the interpretability and controllability of privacy risk in non-DP settings.
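As a loose illustration of what such a power law implies, the sketch below fits advantage ≈ c · n^(−α) to hypothetical (per-class sample count, advantage) pairs via log-log regression and extrapolates the sample count needed to push the advantage below a target. The data points, the target, and the fitted constants are illustrative assumptions, not results from the paper.

```python
import numpy as np

# Hypothetical per-class sample counts and corresponding MIA advantages
# (TPR minus FPR at a fixed false positive rate). Placeholder numbers,
# not values reported in the paper.
samples_per_class = np.array([10, 30, 100, 300, 1000], dtype=float)
mia_advantage = np.array([0.40, 0.22, 0.11, 0.06, 0.03])

# A power law, advantage ~ c * n^{-alpha}, is linear in log-log space:
# log(advantage) = log(c) - alpha * log(n).
slope, log_c = np.polyfit(np.log(samples_per_class), np.log(mia_advantage), 1)
alpha = -slope  # decay exponent

print(f"fitted decay exponent alpha ~ {alpha:.2f}, prefactor c ~ {np.exp(log_c):.2f}")

# Extrapolate: per-class samples needed to push advantage below a target level.
target = 0.01
n_needed = (np.exp(log_c) / target) ** (1.0 / alpha)
print(f"samples per class needed for advantage < {target}: ~{n_needed:.0f}")
```

Because the decay exponent is typically well below 1 in such fits, driving the advantage of the most vulnerable points close to zero requires orders of magnitude more samples per class, which is the "impractically large" regime highlighted above.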
📄 Abstract
Membership inference attacks (MIAs) are used to test the practical privacy of machine learning models. MIAs complement the formal guarantees of differential privacy (DP) under a more realistic adversary model. We analyse the MIA vulnerability of fine-tuned neural networks both empirically and theoretically, the latter using a simplified model of fine-tuning. We show that the vulnerability of non-DP models, when measured as attacker advantage at a fixed false positive rate, decreases according to a simple power law as the number of examples per class increases, even for the most vulnerable points; however, the dataset size needed for adequate protection of the most vulnerable points is very large.
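For concreteness, here is a minimal sketch of the vulnerability metric the abstract refers to: the attack's true positive rate, and hence its advantage, at a fixed false positive rate, computed from per-example membership scores. The synthetic Gaussian score distributions and the 1% FPR are illustrative assumptions, not the paper's attack or data.

```python
import numpy as np

# Hypothetical membership scores (e.g., per-example loss-based attack scores).
rng = np.random.default_rng(0)
member_scores = rng.normal(loc=1.0, scale=1.0, size=5000)      # scores on training members
non_member_scores = rng.normal(loc=0.0, scale=1.0, size=5000)  # scores on non-members

def tpr_at_fixed_fpr(member_scores, non_member_scores, fpr=0.01):
    # Choose the threshold so that the attack flags only `fpr` of non-members,
    # then measure how many true members are flagged at that threshold.
    threshold = np.quantile(non_member_scores, 1.0 - fpr)
    return float(np.mean(member_scores > threshold))

fpr = 0.01
tpr = tpr_at_fixed_fpr(member_scores, non_member_scores, fpr)
print(f"TPR at FPR={fpr}: {tpr:.3f}, advantage (TPR - FPR): {tpr - fpr:.3f}")
```

Evaluating at a small fixed FPR focuses the metric on confident identifications, which is why it tracks the risk to the most vulnerable points rather than average-case leakage.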