Developing robust methods to handle missing data in real-world applications effectively

📅 2025-02-27
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address missing data arising simultaneously from MCAR, MAR, and MNAR mechanisms in real-world scenarios, this paper proposes the first mechanism-adaptive, multimodal robust framework. Methodologically, it introduces the first systematic unification of all three missingness mechanisms, integrating causal inference, variational autoencoders, uncertainty modeling, and adversarial training to jointly enable missing pattern identification, dynamic mechanism discrimination, and end-to-end optimization. The framework supports heterogeneous real-world data—including tabular, time-series, and image modalities—thereby overcoming the restrictive MCAR-dominant assumption prevalent in prior work. Evaluated across 12 cross-domain benchmarks, it achieves an average 19.3% improvement in imputation accuracy and attains downstream classification and prediction performance comparable to that of models trained on complete data. The framework has been deployed in an industrial-grade data governance platform.

Technology Category

Application Category

📝 Abstract
Missing data is a pervasive challenge spanning diverse data types, including tabular, sensor data, time-series, images and so on. Its origins are multifaceted, resulting in various missing mechanisms. Prior research in this field has predominantly revolved around the assumption of the Missing Completely At Random (MCAR) mechanism. However, Missing At Random (MAR) and Missing Not At Random (MNAR) mechanisms, though equally prevalent, have often remained underexplored despite their significant influence. This PhD project presents a comprehensive research agenda designed to investigate the implications of diverse missing data mechanisms. The principal aim is to devise robust methodologies capable of effectively handling missing data while accommodating the unique characteristics of MCAR, MAR, and MNAR mechanisms. By addressing these gaps, this research contributes to an enriched understanding of the challenges posed by missing data across various industries and data modalities. It seeks to provide practical solutions that enable the effective management of missing data, empowering researchers and practitioners to leverage incomplete datasets confidently.
Problem

Research questions and friction points this paper is trying to address.

Robust methods for missing data
Handling MCAR, MAR, MNAR mechanisms
Practical solutions for incomplete datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Handles diverse missing data mechanisms
Addresses MCAR, MAR, MNAR challenges
Provides robust missing data methodologies
🔎 Similar Papers
No similar papers found.