Cellwise and Casewise Robust Multivariate Regression with Inference

📅 2026-05-08
🤖 AI Summary
This study addresses the lack of robustness of multivariate linear regression when the data are high-dimensional, contain missing values, and are contaminated by both cellwise and casewise outliers. The authors propose the cellMR estimator, which builds on a cellwise robust covariance estimator and uses ridge regularization for numerical stability, so that it handles both types of outliers while accommodating missingness and high dimensionality. They further develop cellBoot, a bootstrap procedure based on indirect inference that yields asymptotically valid confidence intervals robust to both contamination mechanisms. Theoretical results derive the influence functions of the regression estimator and prove the asymptotic validity of the cellBoot confidence intervals. Simulations and a real genomics application illustrate the strong finite-sample performance of the proposed methods.
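To see why cellwise contamination matters in high dimensions, the following minimal sketch computes the fraction of rows that contain at least one contaminated cell. It assumes that each cell is contaminated independently with probability `eps`; that independence assumption and the function name `contaminated_row_fraction` are illustrative and do not come from the paper.

```python
def contaminated_row_fraction(eps: float, d: int) -> float:
    """Probability that a row of d cells holds at least one contaminated cell,
    assuming each cell is contaminated independently with probability eps."""
    return 1.0 - (1.0 - eps) ** d


# With 1% contaminated cells, the share of affected rows grows quickly with d.
for d in (5, 50, 500):
    print(f"d={d:4d}  affected rows ~ {contaminated_row_fraction(0.01, d):.3f}")
```

Under this assumption, a method that can only downweight or discard entire rows would have few clean rows left once the dimension is large, which is the setting that cellwise robust methods target.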
📝 Abstract
Multivariate linear regression is a fundamental statistical task, but classical estimators such as ordinary least squares are highly sensitive to outliers. These may occur as casewise outliers, which affect entire observations, or as outlying cells, which are individual contaminated entries in the predictor and/or response matrix. Moreover, modern datasets frequently contain missing values and are high-dimensional. To address these challenges, we propose the cellwise multivariate regression (cellMR) estimator, a robust regression method that simultaneously accommodates casewise and cellwise outliers, missing data, and high dimensionality. The approach builds on a cellwise robust covariance estimator and uses ridge regularization for numerical stability. We further introduce cellBoot, a novel bootstrap-based inference procedure tailored to the cellMR framework. Relying on indirect inference, cellBoot provides asymptotically valid confidence intervals that are robust to casewise and cellwise contamination. We derive influence functions of the regression estimator and prove the asymptotic validity of the cellBoot confidence intervals. Simulations and a real genomics application illustrate the strong finite-sample performance of the proposed methods.
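The abstract does not spell out the estimator, so the following is only a sketch of one common way to turn a joint location and covariance estimate of the predictors and responses into ridge-regularized regression coefficients. The function name `ridge_regression_from_cov`, the penalty `lam`, and the simulated data are assumptions made for illustration. The sample mean and sample covariance are used here as non-robust stand-ins; a cellwise robust covariance estimate would take their place in a robust procedure.

```python
import numpy as np


def ridge_regression_from_cov(mu, sigma, p, lam=0.0):
    """Plug-in multivariate regression from a joint location/scatter estimate.

    mu    : (p + q,) location estimate of the stacked vector (x, y)
    sigma : (p + q, p + q) covariance estimate of (x, y)
    p     : number of predictors (the first p coordinates)
    lam   : ridge penalty added to the predictor block (hypothetical name)
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    s_xx = sigma[:p, :p]                      # predictor block
    s_xy = sigma[:p, p:]                      # predictor-response block
    # Solve (S_xx + lam * I) B = S_xy instead of forming an explicit inverse.
    slopes = np.linalg.solve(s_xx + lam * np.eye(p), s_xy)   # shape (p, q)
    intercept = mu[p:] - slopes.T @ mu[:p]                   # shape (q,)
    return intercept, slopes


# Simulated clean data, used only to exercise the function.
rng = np.random.default_rng(0)
n, p, q = 500, 3, 2
B_true = np.array([[1.0, -1.0], [0.5, 0.0], [0.0, 2.0]])
X = rng.normal(size=(n, p))
Y = 1.0 + X @ B_true + 0.1 * rng.normal(size=(n, q))

Z = np.hstack([X, Y])
mu_hat = Z.mean(axis=0)                # non-robust stand-in for a robust location
sigma_hat = np.cov(Z, rowvar=False)    # non-robust stand-in for a robust covariance

intercept, slopes = ridge_regression_from_cov(mu_hat, sigma_hat, p, lam=0.0)
```

With `lam=0.0` and the sample mean and covariance as inputs, this reproduces ordinary least squares with an intercept. A positive `lam` keeps the linear system well conditioned when the predictor block is close to singular, which is the numerical-stability role the abstract attributes to ridge regularization.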
Problem

Research questions and friction points this paper is trying to address.

multivariate regression
outliers
missing data
high dimensionality
robust inference
Innovation

Methods, ideas, or system contributions that make the work stand out.

cellwise robustness
multivariate regression
ridge regularization
bootstrap inference
high-dimensional data