Membership Inference Attacks Beyond Overfitting

📅 2025-11-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study identifies a fundamental cause of membership inference attacks (MIAs) succeeding even against non-overfitted models: intra-class anomalous samples, such as noisy or hard-to-classify instances, are disproportionately memorized and therefore more susceptible to membership leakage, challenging the conventional view that MIAs stem solely from overfitting. Method: an analytical framework combining model output divergence, intra-class distance metrics, and anomaly detection to systematically pinpoint data-level sources of vulnerability. Contribution/Results: a sample-aware defense mechanism targeting these vulnerable instances, shifting the privacy-preserving focus from model regularization toward awareness of data characteristics. Experiments show that even models with strong generalization can significantly leak membership information for specific training samples. The open-sourced code enables reproducible privacy research.

📝 Abstract
Membership inference attacks (MIAs) against machine learning (ML) models aim to determine whether a given data point was part of the model training data. These attacks may pose significant privacy risks to individuals whose sensitive data were used for training, which motivates the use of defenses such as differential privacy, often at the cost of high accuracy losses. MIAs exploit the differences in the behavior of a model when making predictions on samples it has seen during training (members) versus those it has not seen (non-members). Several studies have pointed out that model overfitting is the major factor contributing to these differences in behavior and, consequently, to the success of MIAs. However, the literature also shows that even non-overfitted ML models can leak information about a small subset of their training data. In this paper, we investigate the root causes of membership inference vulnerabilities beyond traditional overfitting concerns and suggest targeted defenses. We empirically analyze the characteristics of the training data samples vulnerable to MIAs in models that are not overfitted (and hence able to generalize). Our findings reveal that these samples are often outliers within their classes (e.g., noisy or hard to classify). We then propose potential defensive strategies to protect these vulnerable samples and enhance the privacy-preserving capabilities of ML models. Our code is available at https://github.com/najeebjebreel/mia_analysis.
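The abstract notes that MIAs exploit behavioral differences between members and non-members, typically visible as lower loss on training samples. A minimal way to see this is a generic loss-threshold attack (in the spirit of prior MIA work, not the analysis pipeline of this paper): predict "member" whenever a sample's loss falls below a chosen threshold. The sketch below uses synthetic loss values; all names and numbers are illustrative.

```python
import numpy as np

def loss_threshold_mia(member_losses, nonmember_losses, threshold):
    """Generic loss-threshold membership inference: predict 'member'
    when per-sample loss is below the threshold. Returns the attack's
    true-positive rate (members flagged) and false-positive rate
    (non-members wrongly flagged)."""
    tpr = (member_losses < threshold).mean()
    fpr = (nonmember_losses < threshold).mean()
    return tpr, fpr

# Toy data: members tend to have lower loss than non-members,
# which is exactly the gap a threshold attack exploits.
rng = np.random.default_rng(0)
member_losses = rng.exponential(scale=0.2, size=1000)
nonmember_losses = rng.exponential(scale=1.0, size=1000)
tpr, fpr = loss_threshold_mia(member_losses, nonmember_losses, threshold=0.5)
```

When the member and non-member loss distributions overlap heavily (a well-generalized model), `tpr` approaches `fpr` and the attack degrades toward random guessing for most samples; the paper's point is that a small subset of outlier samples still retains a large gap.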
Problem

Research questions and friction points this paper addresses.

Investigating root causes of membership inference vulnerabilities beyond overfitting
Analyzing characteristics of vulnerable training samples in generalized models
Proposing defensive strategies to protect outliers and enhance privacy
Innovation

Methods, ideas, or system contributions that make the work stand out.

An empirical analysis of MIA vulnerability beyond overfitting, combining output divergence, intra-class distances, and anomaly detection
The finding that vulnerable training samples are typically intra-class outliers (noisy or hard to classify)
Sample-aware defensive strategies that target vulnerable instances rather than regularizing the whole model
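The innovation above hinges on locating intra-class outliers. One simple proxy for the paper's intra-class distance analysis (a hypothetical illustration, not the authors' exact procedure; function and variable names are our own) is to flag samples whose distance to their class centroid in feature space is anomalously large:

```python
import numpy as np

def flag_intraclass_outliers(features, labels, z_thresh=2.0):
    """Flag samples unusually far from their class centroid.
    A sample is flagged when the z-score of its distance to the
    centroid, computed within its own class, exceeds z_thresh."""
    flags = np.zeros(len(features), dtype=bool)
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        centroid = features[idx].mean(axis=0)
        dists = np.linalg.norm(features[idx] - centroid, axis=1)
        z = (dists - dists.mean()) / (dists.std() + 1e-12)
        flags[idx] = z > z_thresh
    return flags

# Toy example: one tight class cluster with a single injected outlier.
rng = np.random.default_rng(1)
feats = rng.normal(0.0, 0.1, size=(50, 2))
feats[0] = [5.0, 5.0]                 # anomalous, hard-to-fit sample
labels = np.zeros(50, dtype=int)
flags = flag_intraclass_outliers(feats, labels)
```

Samples flagged this way are candidates for the targeted, sample-aware defenses the paper proposes (e.g., extra regularization or noise on only those instances), instead of applying a blanket mechanism such as differential privacy to the entire training set.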