Deep learning with missing data

📅 2025-04-21
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
For multivariate nonparametric regression with missing covariates, this paper proposes Pattern-Embedded Neural Networks (PENNs), the first deep learning framework to explicitly model missingness patterns. PENNs jointly learn representations of observation indicator vectors and imputed data features in parallel, enabling end-to-end robust prediction. Theoretically, PENNs achieve near-optimal convergence rates under arbitrary missingness mechanisms—overcoming the error accumulation bottleneck inherent in conventional “impute-then-model” paradigms. Leveraging a multi-branch architecture and approximation analysis over composite Hölder classes, we derive finite-sample excess risk bounds. Empirical evaluation across synthetic, semi-synthetic, and real-world datasets demonstrates that PENNs consistently outperform standard neural networks, reducing prediction error by 30%–70%. The implementation and tutorials are publicly available.

Technology Category

Application Category

📝 Abstract
In the context of multivariate nonparametric regression with missing covariates, we propose Pattern Embedded Neural Networks (PENNs), which can be applied in conjunction with any existing imputation technique. In addition to a neural network trained on the imputed data, PENNs pass the vectors of observation indicators through a second neural network to provide a compact representation. The outputs are then combined in a third neural network to produce final predictions. Our main theoretical result exploits an assumption that the observation patterns can be partitioned into cells on which the Bayes regression function behaves similarly, and belongs to a compositional H""older class. It provides a finite-sample excess risk bound that holds for an arbitrary missingness mechanism, and in combination with a complementary minimax lower bound, demonstrates that our PENN estimator attains in typical cases the minimax rate of convergence as if the cells of the partition were known in advance, up to a poly-logarithmic factor in the sample size. Numerical experiments on simulated, semi-synthetic and real data confirm that the PENN estimator consistently improves, often dramatically, on standard neural networks without pattern embedding. Code to reproduce our experiments, as well as a tutorial on how to apply our method, is publicly available.
Problem

Research questions and friction points this paper is trying to address.

Handling missing data in multivariate nonparametric regression
Improving prediction accuracy with pattern embedding
Achieving minimax convergence rates for missing data
Innovation

Methods, ideas, or system contributions that make the work stand out.

PENNs combine imputation and observation indicators
PENNs use three neural networks for prediction
PENNs achieve minimax rate with missing data
🔎 Similar Papers
No similar papers found.
T
Tianyi Ma
Statistical Laboratory, University of Cambridge
Tengyao Wang
Tengyao Wang
Professor in Statistics at London School of Economics
statistical theory and methodology
R
R. Samworth
Statistical Laboratory, University of Cambridge