🤖 AI Summary
This study investigates how Batch Normalization (BN) affects the memorization of atypical samples in deep neural networks and the privacy risks that follow. It combines three empirical analyses (unintended memorization of out-of-distribution training samples, per-sample gradient norms, and membership inference attacks) with a theoretical argument, and finds that BN substantially amplifies a model's tendency to memorize outliers, which in turn worsens privacy vulnerabilities. Across multiple datasets and network architectures, models with BN consistently memorize out-of-distribution training samples more strongly and are more susceptible to membership inference attacks than models without BN. The findings indicate that although BN speeds up convergence and stabilizes training, it also introduces a privacy risk that deserves attention in practice.
📝 Abstract
Batch Normalization (BN) is widely adopted to enable faster convergence and more stable training of deep neural networks. However, its impact on privacy and memorization has remained largely unexplored. In this work, we investigate the effect of BN layers on the memorization of atypical or outlier samples and its implications for privacy leakage. We conduct an extensive empirical study using three complementary approaches: (i) unintended memorization of out-of-distribution training samples, (ii) per-sample influence measured via gradient norms, and (iii) susceptibility to membership inference attacks (MIA). Across multiple datasets and architectures, we consistently observe that BN substantially increases the memorization of outliers compared to models without BN. Critically, this amplified memorization translates directly into privacy vulnerabilities: models with BN exhibit significantly higher susceptibility to MIAs. We complement our empirical findings with a theoretical analysis showing that BN amplifies the per-step influence of outlier samples during training, providing mechanistic insight into this phenomenon. Our results highlight an underappreciated privacy risk associated with BN and provide both practical and theoretical insights into how normalization layers can amplify the influence of rare or sensitive training examples.
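To make "susceptibility to MIAs" concrete, here is a minimal sketch of one common way to score a membership inference attack: a loss-threshold attack evaluated by AUC. The abstract does not say which attack or metric the authors use, so this is an assumption chosen for illustration and not the paper's procedure. The function name `mia_auc` and the loss values are hypothetical.

```python
from typing import Sequence


def mia_auc(member_losses: Sequence[float], nonmember_losses: Sequence[float]) -> float:
    """AUC of a loss-threshold membership inference attack (illustrative).

    The attacker treats lower loss as evidence that a sample was in the
    training set. The AUC equals the probability that a randomly chosen
    member has lower loss than a randomly chosen non-member, with ties
    counted as one half. 0.5 means the attack is no better than chance.
    """
    if not member_losses or not nonmember_losses:
        raise ValueError("both loss lists must be non-empty")
    wins = 0.0
    for m in member_losses:
        for n in nonmember_losses:
            if m < n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_losses) * len(nonmember_losses))


# Hypothetical per-sample losses, made up purely for illustration.
members = [0.05, 0.10, 0.20, 0.90]      # samples seen during training
nonmembers = [0.30, 0.80, 1.20, 1.50]   # held-out samples
print(mia_auc(members, nonmembers))     # 0.875
```

Under this kind of metric, a claim such as "models with BN are more susceptible" would show up as a higher AUC for the BN model than for its BN-free counterpart when both are evaluated on the same member and non-member splits.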