Online robust covariance matrix estimation and outlier detection

📅 2026-01-07
🤖 AI Summary
This work addresses robust covariance estimation in an online setting, where data arrive sequentially and the proportion of contaminated observations grows with the size of the dataset, leaving traditional estimators vulnerable to bias and masking effects. The authors propose a method that estimates the geometric median and the median covariance matrix simultaneously in a streaming fashion; to their knowledge, no existing robust covariance estimation or outlier detection method had previously been implemented online. The Mahalanobis distance of each incoming observation is computed from the current estimates, so outliers can be flagged in real time while the masking effect is mitigated. The method is designed to remain computationally light enough for real-time use. Simulations on synthetic data are used to demonstrate how well it recovers the covariance structure and identifies outliers under contamination.
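
To make the idea of a streaming geometric median concrete, here is a minimal sketch of one standard way to compute it online: an averaged stochastic gradient (Robbins-Monro) recursion. This is not taken from the paper and may differ from the authors' algorithm. The function name, the zero initialisation, and the step-size parameters `c` and `alpha` are illustrative assumptions.

```python
import math


def online_geometric_median(stream, dim, c=1.0, alpha=0.66):
    """Averaged stochastic gradient estimate of the geometric median.

    Assumptions (not from the paper): zero initialisation and step sizes
    c / n**alpha with alpha in (1/2, 1).
    """
    m = [0.0] * dim      # raw stochastic-gradient iterate
    m_bar = [0.0] * dim  # running average of the raw iterates

    for n, x in enumerate(stream, start=1):
        diff = [xi - mi for xi, mi in zip(x, m)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm > 0.0:
            # Move a bounded distance toward x: the direction is the unit
            # vector (x - m) / ||x - m||, so a far outlier pulls no harder
            # than a nearby point.
            step = c / n ** alpha
            m = [mi + step * d / norm for mi, d in zip(m, diff)]
        # Online average of the iterates m_1, ..., m_n.
        m_bar = [mb + (mi - mb) / n for mb, mi in zip(m_bar, m)]

    return m_bar
```

Each observation is processed once and then discarded, so memory use does not grow with the stream. Because every update has bounded length, a small fraction of distant outliers shifts the estimate far less than it would shift a running mean.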

📝 Abstract
Robust estimation of the covariance matrix and detection of outliers remain major challenges in statistical data analysis, particularly when the proportion of contaminated observations increases with the size of the dataset. Outliers can severely bias parameter estimates and induce a masking effect, whereby some outliers conceal the presence of other outliers, further complicating their detection. Although many approaches have been proposed for covariance estimation and outlier detection, to our knowledge, none of these methods has been implemented in an online setting. In this paper, we focus on online covariance matrix estimation and outlier detection. Specifically, we propose a new method for estimating the geometric median and the variance simultaneously and online, which allows us to calculate the Mahalanobis distance for each incoming data point before deciding whether it should be considered an outlier. To mitigate the masking effect, robust estimation techniques for the mean and variance are required. Our approach uses the geometric median for robust estimation of the location and the median covariance matrix for robust estimation of the dispersion parameters. The new online methods proposed for parameter estimation and outlier detection allow real-time identification of outliers as data are observed sequentially. The performance of our methods is demonstrated on simulated datasets.
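
The decision step described in the abstract, computing a Mahalanobis distance for an incoming point and then deciding whether it is an outlier, can be sketched as follows. This is a hedged illustration, not the authors' procedure. The helper names are hypothetical, and `location` and `dispersion` stand in for whatever robust estimates are available at that moment. The abstract does not state the cutoff rule, so the chi-square threshold is an assumption that holds for Gaussian inliers; the closed form used here is valid only in dimension 2.

```python
import math


def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [list(row) + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * z[k] for k in range(i + 1, n))
        z[i] = (aug[i][n] - s) / aug[i][i]
    return z


def mahalanobis_sq(x, location, dispersion):
    """Squared Mahalanobis distance (x - m)^T S^{-1} (x - m)."""
    diff = [xi - mi for xi, mi in zip(x, location)]
    z = solve(dispersion, diff)  # z = S^{-1} (x - m), without forming S^{-1}
    return sum(d * zi for d, zi in zip(diff, z))


def chi2_threshold_2d(level):
    """Upper quantile of the chi-square law with 2 degrees of freedom.

    Assumption: Gaussian inliers in dimension 2, where
    P(chi2_2 > t) = exp(-t / 2), so t = -2 * log(level).
    """
    return -2.0 * math.log(level)


def is_outlier(x, location, dispersion, threshold):
    """Flag x when its squared distance exceeds the chosen threshold."""
    return mahalanobis_sq(x, location, dispersion) > threshold
```

In a streaming loop, each new point would be tested against the current estimates before those estimates are updated. Using robust estimates for `location` and `dispersion` matters here: if earlier outliers had inflated the dispersion matrix, later outliers would get small distances and go undetected, which is the masking effect.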
Problem

Research questions and friction points this paper is trying to address.

online robust estimation
covariance matrix
outlier detection
masking effect
geometric median
Innovation

Methods, ideas, or system contributions that make the work stand out.

online robust estimation
geometric median
median covariance matrix
Mahalanobis distance
outlier detection
Authors
Paul Guillot
Sorbonne Université, Université Paris Cité, CNRS, Laboratoire de Probabilités, Statistique et Modélisation, LPSM, F-75005 Paris, France
Antoine Godichon-Baggioni
Laboratoire de Probabilités, Statistique et Modélisation
Stéphane Robin
Sorbonne Université, Université Paris Cité, CNRS, Laboratoire de Probabilités, Statistique et Modélisation, LPSM, F-75005 Paris, France
Laure Sansonnet
Sorbonne Université, Université Paris Cité, CNRS, Laboratoire de Probabilités, Statistique et Modélisation, LPSM, F-75005 Paris, France