Membership Inference Attacks Beyond Overfitting

📅 2025-11-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study identifies a fundamental cause of membership inference attacks (MIAs) succeeding even against non-overfitted models: intra-class anomalous samples, such as noisy or hard-to-classify instances, are disproportionately memorized and therefore more susceptible to membership leakage, challenging the conventional view that MIAs stem solely from overfitting. Method: an analytical framework combining model output divergence, intra-class distance metrics, and anomaly detection to systematically pinpoint data-level sources of vulnerability. Contribution/Results: a sample-aware defense mechanism targeting these vulnerable instances, shifting the privacy-preserving focus from model regularization toward awareness of data characteristics. Experiments show that even models with strong generalization can significantly leak membership information for specific training samples. The open-sourced code enables reproducible privacy research.

📝 Abstract
Membership inference attacks (MIAs) against machine learning (ML) models aim to determine whether a given data point was part of the model training data. These attacks may pose significant privacy risks to individuals whose sensitive data were used for training, which motivates the use of defenses such as differential privacy, often at the cost of high accuracy losses. MIAs exploit the differences in the behavior of a model when making predictions on samples it has seen during training (members) versus those it has not seen (non-members). Several studies have pointed out that model overfitting is the major factor contributing to these differences in behavior and, consequently, to the success of MIAs. However, the literature also shows that even non-overfitted ML models can leak information about a small subset of their training data. In this paper, we investigate the root causes of membership inference vulnerabilities beyond traditional overfitting concerns and suggest targeted defenses. We empirically analyze the characteristics of the training data samples vulnerable to MIAs in models that are not overfitted (and hence able to generalize). Our findings reveal that these samples are often outliers within their classes (e.g., noisy or hard to classify). We then propose potential defensive strategies to protect these vulnerable samples and enhance the privacy-preserving capabilities of ML models. Our code is available at https://github.com/najeebjebreel/mia_analysis.
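The abstract notes that MIAs exploit behavioral differences between members and non-members, typically visible as lower loss on training samples. A minimal way to see this is a generic loss-threshold attack (in the spirit of prior MIA work, not the analysis pipeline of this paper): predict "member" whenever a sample's loss falls below a chosen threshold. The sketch below uses synthetic loss values; all names and numbers are illustrative.

```python
import numpy as np

def loss_threshold_mia(member_losses, nonmember_losses, threshold):
    """Generic loss-threshold membership inference: predict 'member'
    when per-sample loss is below the threshold. Returns the attack's
    true-positive rate (members flagged) and false-positive rate
    (non-members wrongly flagged)."""
    tpr = (member_losses < threshold).mean()
    fpr = (nonmember_losses < threshold).mean()
    return tpr, fpr

# Toy data: members tend to have lower loss than non-members,
# which is exactly the gap a threshold attack exploits.
rng = np.random.default_rng(0)
member_losses = rng.exponential(scale=0.2, size=1000)
nonmember_losses = rng.exponential(scale=1.0, size=1000)
tpr, fpr = loss_threshold_mia(member_losses, nonmember_losses, threshold=0.5)
```

When the member and non-member loss distributions overlap heavily (a well-generalized model), `tpr` approaches `fpr` and the attack degrades toward random guessing for most samples; the paper's point is that a small subset of outlier samples still retains a large gap.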
Problem

Research questions and friction points this paper addresses.

Investigating root causes of membership inference vulnerabilities beyond overfitting
Analyzing characteristics of vulnerable training samples in generalized models
Proposing defensive strategies to protect outliers and enhance privacy
Innovation

Methods, ideas, or system contributions that make the work stand out.

An empirical analysis of MIA vulnerability beyond overfitting, combining output divergence, intra-class distances, and anomaly detection
The finding that vulnerable training samples are typically intra-class outliers (noisy or hard to classify)
Sample-aware defensive strategies that target vulnerable instances rather than regularizing the whole model
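The innovation above hinges on locating intra-class outliers. One simple proxy for the paper's intra-class distance analysis (a hypothetical illustration, not the authors' exact procedure; function and variable names are our own) is to flag samples whose distance to their class centroid in feature space is anomalously large:

```python
import numpy as np

def flag_intraclass_outliers(features, labels, z_thresh=2.0):
    """Flag samples unusually far from their class centroid.
    A sample is flagged when the z-score of its distance to the
    centroid, computed within its own class, exceeds z_thresh."""
    flags = np.zeros(len(features), dtype=bool)
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        centroid = features[idx].mean(axis=0)
        dists = np.linalg.norm(features[idx] - centroid, axis=1)
        z = (dists - dists.mean()) / (dists.std() + 1e-12)
        flags[idx] = z > z_thresh
    return flags

# Toy example: one tight class cluster with a single injected outlier.
rng = np.random.default_rng(1)
feats = rng.normal(0.0, 0.1, size=(50, 2))
feats[0] = [5.0, 5.0]                 # anomalous, hard-to-fit sample
labels = np.zeros(50, dtype=int)
flags = flag_intraclass_outliers(feats, labels)
```

Samples flagged this way are candidates for the targeted, sample-aware defenses the paper proposes (e.g., extra regularization or noise on only those instances), instead of applying a blanket mechanism such as differential privacy to the entire training set.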