Conformal Inference For Missing Data under Multiple Robust Learning

📅 2025-10-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses conformal inference under missing-at-random (MAR) data. It proposes CM–MRL, a method that integrates multiply robust empirical likelihood reweighting with split conformal calibration to guarantee both marginal and conditional coverage. CM–MRL calibrates the distribution of the score function on fully observed samples, so coverage remains valid even when some working models are misspecified. The theoretical guarantees are that the estimator is asymptotically normal and that the coverage error converges to zero uniformly; an empirical process analysis further supports its robustness. In numerical experiments across diverse missingness mechanisms, CM–MRL consistently outperforms existing methods, achieving higher coverage accuracy and shorter prediction intervals. By unifying robust estimation with conformal prediction, CM–MRL offers an approach to uncertainty quantification in missing-data settings that has both rigorous theoretical foundations and practical advantages.

📝 Abstract

We develop a novel approach to conformal inference with missing data in machine learning, focusing on data that are Missing at Random (MAR). We propose a new procedure, Conformal prediction for Missing data under Multiple Robust Learning (CM–MRL), that combines split conformal calibration with a multiple robust empirical-likelihood (EL) reweighting scheme. The method performs a double calibration: it reweights the complete-case scores by EL so that their distribution matches the full calibration distribution implied by MAR, even when some working models are misspecified. We characterize the asymptotic behavior of our estimators through empirical process theory and establish reliable coverage for our prediction intervals, both marginally and conditionally. We further show an interval-length dominance result. Several numerical experiments demonstrate the effectiveness of the proposed method in the presence of missing data.
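
To make the reweighting idea in the abstract concrete, here is a minimal sketch of split conformal calibration that uses only the complete cases and a weighted score quantile. It is not the paper's CM–MRL procedure: the `weights` argument is a hypothetical stand-in for the multiple robust EL weights (for illustration it could hold inverse estimated response probabilities), the function names are invented for this example, and the usual finite-sample correction of split conformal prediction is omitted for brevity.

```python
import numpy as np

def weighted_quantile(scores, weights, level):
    """Smallest score whose cumulative normalized weight reaches `level`."""
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(scores)
    sorted_scores = scores[order]
    cum_weight = np.cumsum(weights[order]) / np.sum(weights)
    idx = np.searchsorted(cum_weight, level, side="left")
    return sorted_scores[min(idx, len(sorted_scores) - 1)]

def conformal_interval_mar(pred_cal, y_cal, observed, weights, pred_new, alpha=0.1):
    """Prediction interval from reweighted complete-case calibration scores.

    pred_cal : model predictions on the calibration set
    y_cal    : calibration responses (entries may be NaN where missing)
    observed : indicator of which calibration responses are observed
    weights  : per-sample weights (assumed stand-in for the EL weights)
    pred_new : prediction at the new point
    """
    mask = np.asarray(observed, dtype=bool)
    y_obs = np.asarray(y_cal, dtype=float)[mask]
    p_obs = np.asarray(pred_cal, dtype=float)[mask]
    # Absolute-residual conformity scores on complete cases only.
    scores = np.abs(y_obs - p_obs)
    q = weighted_quantile(scores, np.asarray(weights, dtype=float)[mask], 1.0 - alpha)
    return pred_new - q, pred_new + q
```

With uniform weights this reduces to an unweighted complete-case quantile. Non-uniform weights shift the quantile toward the scores of observations that are under-represented among the complete cases, which is the role the abstract assigns to the EL reweighting.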

Problem

Research questions and friction points this paper is trying to address.

Develops conformal inference for missing-at-random (MAR) data

Combines split conformal calibration with robust reweighting

Ensures reliable prediction intervals under model misspecification

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multiple robust learning for missing data

Double calibration with empirical-likelihood reweighting

Split conformal prediction with robust coverage

🔎 Similar Papers

FedMAC: Tackling Partial-Modality Missing in Federated Learning with Cross-Modal Aggregation and Contrastive Regularization