🤖 AI Summary
Traditional PCA fails for matrix-variate data under heavy-tailed distributions or outlier contamination, both because vectorization distorts the matrix structure and because PCA is inherently sensitive to anomalies. To address this, we propose highly robust factored PCA (HRFPCA). Methodologically, HRFPCA is the first method to embed the matrix minimum covariance determinant (MMCD) estimator into a factored PCA framework, jointly modeling row- and column-wise covariances under the matrix normal distribution and replacing the contamination-prone maximum likelihood estimator (MLE). It further integrates score–orthogonal distance analysis (SODA) for interpretable outlier localization and classification. HRFPCA attains a breakdown point of nearly 50%, preserves the intrinsic two-dimensional matrix structure, and balances strong robustness with computational efficiency. Experiments on both synthetic and real-world datasets demonstrate that HRFPCA significantly outperforms state-of-the-art methods in outlier detection accuracy, robustness to contamination, and generalization capability.
📝 Abstract
Principal component analysis (PCA) is a classical and widely used method for dimensionality reduction, with applications in data compression, computer vision, pattern recognition, and signal processing. However, PCA is designed for vector-valued data and encounters two major challenges when applied to matrix-valued data with heavy-tailed distributions or outliers: (1) vectorization disrupts the intrinsic matrix structure, leading to information loss and the curse of dimensionality, and (2) PCA is highly sensitive to outliers. Factored PCA (FPCA) addresses the first issue through probabilistic modeling, using a matrix normal distribution that explicitly represents row and column covariances via a separable covariance structure, thereby preserving the two-way dependency and matrix form of the data. Building on FPCA, we propose highly robust FPCA (HRFPCA), a robust extension that replaces maximum likelihood estimators with the matrix minimum covariance determinant (MMCD) estimators. This modification enables HRFPCA to retain FPCA's ability to model matrix-valued data while achieving a breakdown point close to 50%, substantially improving resistance to outliers. Furthermore, HRFPCA produces the score–orthogonal distance analysis (SODA) plot, which effectively visualizes and classifies matrix-valued outliers. Extensive simulations and real-data analyses demonstrate that HRFPCA consistently outperforms competing methods in robustness and outlier detection, underscoring its effectiveness and broad applicability.
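To make the separable covariance structure concrete, the sketch below samples matrix-normal data and checks numerically that the covariance of the vectorized samples equals the Kronecker product of the column and row covariances, which is the structure FPCA exploits. It then illustrates the robustness idea with a deliberately simplified stand-in: since the MMCD estimator has no off-the-shelf implementation here, scikit-learn's vector-space `MinCovDet` is applied to the vectorized (and contaminated) data, so robust distances flag the injected outliers. Note the stand-in discards the matrix structure that MMCD and HRFPCA preserve, and all covariance matrices and contamination settings are illustrative choices, not values from the paper.

```python
import numpy as np
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(0)
p, q, n = 3, 4, 2000

# Toy row/column covariances (illustrative, not from the paper)
Sigma_r = np.array([[2.0, 0.5, 0.0],
                    [0.5, 1.0, 0.3],
                    [0.0, 0.3, 1.5]])
Sigma_c = np.diag([1.0, 2.0, 0.5, 1.0])

A = np.linalg.cholesky(Sigma_r)          # Sigma_r = A A^T
B = np.linalg.cholesky(Sigma_c)          # Sigma_c = B B^T

# Matrix-normal sampling: X_i = A Z_i B^T with Z_i having iid N(0, 1) entries
Z = rng.standard_normal((n, p, q))
X = A @ Z @ B.T                          # matmul broadcasts over the sample axis

# Separable covariance: cov(vec(X)) = Sigma_c ⊗ Sigma_r (column-major vec)
V = X.transpose(0, 2, 1).reshape(n, p * q)   # vec of each sample
emp = np.cov(V, rowvar=False)
theo = np.kron(Sigma_c, Sigma_r)
print("max |empirical - theoretical|:", np.max(np.abs(emp - theo)))

# Contaminate 5% of the vectorized samples with a large mean shift
n_out = n // 20
V_cont = V.copy()
V_cont[:n_out] += 8.0

# Vector-space stand-in for the robust step: MCD on vectorized data
mcd = MinCovDet(random_state=0).fit(V_cont)
d2 = mcd.mahalanobis(V_cont)             # squared robust distances
print("median robust d^2, outliers:", np.median(d2[:n_out]))
print("median robust d^2, inliers: ", np.median(d2[n_out:]))
```

The contaminated points receive much larger robust distances than the clean ones, which is the same diagnostic logic the SODA plot organizes along its score- and orthogonal-distance axes.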