Cellwise Robust Twoblock Dimension Reduction

📅 2026-04-16

🤖 AI Summary
This study addresses the problem that traditional robust methods, which downweight or remove entire observations, become inefficient and eventually break down when contamination is scattered across individual cells of a multivariate data set. The authors propose Cellwise Robust Twoblock (CRTB), a cellwise robust two-block dimension reduction approach that flags anomalous cells with a column-wise pre-filter and imputes them from the model inside an iteratively reweighted M-estimation loop, so that the clean cells of partially contaminated rows are retained. CRTB is presented as the first cellwise robust method for simultaneous dimension reduction of multivariate predictor and response blocks, and it comes in a dense and a sparse, variable-selecting variant. It resists settings in which more than 50% of the rows contain contaminated cells, while retaining comparable efficiency on clean data. A simulation study shows that it recovers the cellwise outlier pattern and, in the sparse setting, the correct set of informative variables; two real-data examples yield highly interpretable results.
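
To make the column-wise pre-filter concrete, the following is a minimal sketch of one common way to flag suspicious cells: robust z-scores built from each column's median and MAD. The function name `flag_cells`, the cutoff of 3, and the median/MAD rule itself are assumptions chosen for illustration; the summary does not say which pre-filter CRTB actually uses.

```python
import numpy as np

def flag_cells(X, cutoff=3.0):
    """Return a boolean mask of cells with a large robust z-score.

    Hypothetical helper: a median/MAD rule applied column by column,
    not necessarily the pre-filter used by CRTB.
    """
    X = np.asarray(X, dtype=float)
    med = np.median(X, axis=0)
    # 1.4826 makes the MAD consistent with the standard deviation
    # for normally distributed data.
    mad = 1.4826 * np.median(np.abs(X - med), axis=0)
    mad = np.where(mad > 0, mad, 1.0)  # guard against constant columns
    z = (X - med) / mad
    return np.abs(z) > cutoff

# Toy data: clean Gaussian columns with two planted cellwise outliers.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[3, 1] = 25.0
X[10, 4] = -30.0

flags = flag_cells(X)
rows_hit = int(flags.any(axis=1).sum())  # rows with at least one flagged cell
```

Only the flagged cells would be treated as contaminated; the remaining cells in rows 3 and 10 stay available to the model, which is the point of working cellwise rather than casewise.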

📝 Abstract
Cellwise Robust Twoblock (CRTB) is introduced as the first cellwise robust method for simultaneous dimension reduction of multivariate predictor and response blocks, in both a dense and a sparse variable-selecting variant. Classical robust methods protect against casewise outliers by downweighting or removing entire observations, a strategy that becomes inefficient -- and eventually breaks down -- when contamination is scattered across individual cells rather than concentrated in whole rows. CRTB combines a column-wise pre-filter for cellwise outlier detection with model-based imputation of flagged cells inside an iteratively reweighted M-estimation loop, retaining the clean cells of partially contaminated rows instead of discarding the observation. An efficient algorithm is provided that uses the classical twoblock SVD as a warm start and converges in a handful of IRLS iterations at a moderate computational cost. The method resists settings where more than $50\%$ of rows contain contaminated cells while retaining comparable efficiency on clean data. A simulation study confirms these properties and shows that CRTB additionally recovers the underlying cellwise outlier pattern with high fidelity and, in the sparse setting, the correct set of informative variables. Two examples illustrate CRTB's practical utility: in each, CRTB yields results that are highly interpretable in the respective domain despite the presence of cellwise outliers and, as a by-product, identifies the contaminated cells with high fidelity.
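
The warm start mentioned in the abstract can be pictured with a short sketch. It assumes that the "classical twoblock SVD" amounts to a singular value decomposition of the sample cross-covariance between the two centered blocks, with the leading left and right singular vectors serving as weights for the predictor and response blocks. That reading, the function name `cross_cov_svd`, and the parameters `n_x` and `n_y` are assumptions for illustration; the abstract does not spell out the algorithm.

```python
import numpy as np

def cross_cov_svd(X, Y, n_x=2, n_y=2):
    """Non-robust two-block reduction via SVD of the cross-covariance.

    Hypothetical sketch of a warm start; not the authors' implementation.
    """
    Xc = X - X.mean(axis=0)               # center the predictor block
    Yc = Y - Y.mean(axis=0)               # center the response block
    S_xy = Xc.T @ Yc / (X.shape[0] - 1)   # p x q sample cross-covariance
    U, s, Vt = np.linalg.svd(S_xy, full_matrices=False)
    W = U[:, :n_x]                        # predictor-block weights
    V = Vt[:n_y].T                        # response-block weights
    return W, V, Xc @ W, Yc @ V           # weights and block scores

# Toy data: 6 predictors, 3 responses driven by the first two predictors.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 6))
B = np.zeros((6, 3))
B[:2, :] = rng.normal(size=(2, 3))
Y = X @ B + 0.1 * rng.normal(size=(100, 3))

W, V, T, U_scores = cross_cov_svd(X, Y)
```

Because the means and the cross-covariance are not robust, a few extreme cells can distort this starting point, which is presumably why it serves only as an initial value that the reweighting iterations then refine.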

Problem

Research questions and friction points this paper is trying to address.

cellwise outliers
dimension reduction
robust statistics
multivariate data
outlier detection

Innovation

Methods, ideas, or system contributions that make the work stand out.

cellwise robustness
twoblock dimension reduction
iteratively reweighted M-estimation
model-based imputation
sparse variable selection