Privacy Amplification by Missing Data

📅 2026-02-02
🤖 AI Summary
This work challenges the conventional view of missing data as a defect in high-sensitivity domains such as healthcare and finance, and argues that missingness can instead act as a privacy-enhancing mechanism. The authors formally model missing data within the differential privacy framework and show theoretically that, under specific missingness mechanisms such as missing completely at random (MCAR), incomplete datasets amplify the privacy guarantees of differentially private algorithms. According to the authors, this is the first such result; it reframes missing data as a potential asset rather than a liability and gives a theoretical foundation for privacy-preserving analysis of incomplete data.

📝 Abstract
Privacy preservation is a fundamental requirement in many high-stakes domains such as medicine and finance, where sensitive personal data must be analyzed without compromising individual confidentiality. At the same time, these applications often involve datasets with missing values due to non-response, data corruption, or deliberate anonymization. Missing data is traditionally viewed as a limitation because it reduces the information available to analysts and can degrade model performance. In this work, we take an alternative perspective and study missing data from a privacy preservation standpoint. Intuitively, when features are missing, less information is revealed about individuals, suggesting that missingness could inherently enhance privacy. We formalize this intuition by analyzing missing data as a privacy amplification mechanism within the framework of differential privacy. We show, for the first time, that incomplete data can yield privacy amplification for differentially private algorithms.
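
One way to build intuition for the amplification claim is the standard amplification-by-subsampling bound from the differential privacy literature. The sketch below is an analogy under stated assumptions, not the bound derived in this paper. It assumes that each individual's entire record goes missing independently with probability `p_missing` (a missing-completely-at-random pattern that is hidden from the adversary), so an ε-DP mechanism sees the record with probability `q = 1 - p_missing`, which matches Poisson subsampling. The function name `amplified_epsilon` and its parameters are hypothetical.

```python
import math


def amplified_epsilon(epsilon: float, keep_prob: float) -> float:
    """Standard Poisson-subsampling amplification bound.

    If a mechanism is epsilon-DP and each record is included independently
    with probability keep_prob, the composed mechanism satisfies
    log(1 + keep_prob * (exp(epsilon) - 1))-DP.

    Assumption (not from the paper): a record that is entirely missing
    completely at random is treated like a record dropped by subsampling.
    """
    if epsilon < 0:
        raise ValueError("epsilon must be non-negative")
    if not 0.0 <= keep_prob <= 1.0:
        raise ValueError("keep_prob must lie in [0, 1]")
    # log1p/expm1 keep the computation accurate for small epsilon.
    return math.log1p(keep_prob * math.expm1(epsilon))


if __name__ == "__main__":
    base_epsilon = 1.0
    for p_missing in (0.0, 0.2, 0.5, 0.9):
        eps = amplified_epsilon(base_epsilon, keep_prob=1.0 - p_missing)
        print(f"p_missing={p_missing:.1f} -> effective epsilon={eps:.4f}")
```

For small ε the amplified budget is roughly `q * ε`, so in this simplified setting heavier missingness yields a proportionally smaller effective privacy budget. Feature-level missingness, which the abstract also mentions, is not covered by this record-level analogy.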
Problem

Research questions and friction points this paper is trying to address.

privacy amplification
missing data
differential privacy
privacy preservation
incomplete data
Simon Roburin
Postdoctoral researcher, Sorbonne Université
Machine Learning, Deep Learning, Computer Vision, Optimization
Rafael Pinot
Sorbonne Université, Université Paris Cité, CNRS, Laboratoire de Probabilités, Statistique et Modélisation, LPSM, F-75005 Paris, France
Erwan Scornet
Professor, Sorbonne Université
Statistics, Machine Learning