A Stability Framework for Parameter Selection in the Minimum Covariance Determinant Problem

📅 2024-01-25
📈 Citations: 0
Influential: 0

🤖 AI Summary
Problem: The subset size of the Minimum Covariance Determinant (MCD) estimator is usually chosen empirically, without theoretical guidance, which leaves the estimate sensitive to the underlying inlier/outlier structure. Method: We propose a stability-driven approach to automatic parameter selection: (i) quantify the instability of the MCD estimate via bootstrap resampling; (ii) construct an instability path to adaptively identify a suitable subset size; and (iii) combine statistical-depth-based initialization with concentration-step refinement to improve robustness and interpretability. Contribution/Results: This work is described as the first to systematically incorporate stability analysis into MCD hyperparameter tuning, without requiring prior assumptions, in order to uncover intrinsic data structure. On multiple real-world datasets, the method is reported to achieve higher anomaly-detection accuracy and greater parameter robustness than classical MCD and state-of-the-art variants, while providing an interpretable, data-driven justification for the chosen subset size.
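
For readers unfamiliar with the estimator, the standard MCD objective and the concentration step mentioned in the summary can be written as follows. The notation (x_i, n, p, h, H) is introduced here for illustration and is not taken from the paper.

```latex
% Data x_1, ..., x_n in R^p; subset size h (the parameter to be selected).
\[
\hat{H} = \arg\min_{H \subseteq \{1,\dots,n\},\ |H| = h} \det \hat{\Sigma}_H,
\qquad
\hat{\mu}_H = \frac{1}{h}\sum_{i \in H} x_i,
\qquad
\hat{\Sigma}_H = \frac{1}{h}\sum_{i \in H} (x_i - \hat{\mu}_H)(x_i - \hat{\mu}_H)^{\top}.
\]
% Concentration step: keep the h points closest to the current fit.
% The determinant does not increase from H to H'.
\[
d_i^2 = (x_i - \hat{\mu}_H)^{\top} \hat{\Sigma}_H^{-1} (x_i - \hat{\mu}_H),
\qquad
H' = \{\, i : d_i^2 \text{ is among the } h \text{ smallest} \,\},
\qquad
\det \hat{\Sigma}_{H'} \le \det \hat{\Sigma}_H .
\]
```

Because each concentration step can only lower the determinant and there are finitely many subsets, iterating it converges, though possibly to a local minimum; this is why the choice of initial subset matters.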

📝 Abstract
The Minimum Covariance Determinant (MCD) method is a widely adopted tool for robust estimation and outlier detection. In this paper, we introduce MCD model selection based on the notion of stability. Our best subset method leverages prior best practices such as statistical depths for initialization and concentration steps for subset refinement. Our contribution lies in constructing a bootstrap procedure to estimate the instability of the best subset algorithm. The instability path offers insights into a dataset's inlier/outlier structure and facilitates a suitable choice of the subset size. We rigorously benchmark the proposed framework against existing MCD variants and illustrate its practical utility on several real-world datasets.
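
The bootstrap idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several pieces are assumptions made for illustration: the function names are hypothetical, the initializer is a simple median/MAD stand-in rather than a statistical depth, and instability is measured as the average of 2p(1-p) over points, where p is the fraction of bootstrap fits that label a point an inlier (the chance that two fits disagree about it).

```python
import numpy as np


def mahalanobis_sq(X, mu, cov):
    """Squared Mahalanobis distance of every row of X to (mu, cov)."""
    diff = X - mu
    # pinv guards against singular covariances caused by bootstrap duplicates.
    return np.einsum("ij,jk,ik->i", diff, np.linalg.pinv(cov), diff)


def subset_moments(X, idx):
    """Mean and sample covariance of the rows of X selected by idx."""
    return X[idx].mean(axis=0), np.atleast_2d(np.cov(X[idx], rowvar=False))


def initial_subset(X, h):
    """Stand-in initializer (assumption, not a statistical depth): the h points
    closest to the coordinatewise median after scaling each coordinate by its MAD."""
    med = np.median(X, axis=0)
    mad = np.median(np.abs(X - med), axis=0) + 1e-12
    dist = np.linalg.norm((X - med) / mad, axis=1)
    return np.sort(np.argsort(dist)[:h])


def fit_subset(X, h, max_iter=100):
    """Refine the initial subset with concentration steps until it stops changing.
    Returns (sorted subset indices, subset mean, subset covariance)."""
    idx = initial_subset(X, h)
    for _ in range(max_iter):
        mu, cov = subset_moments(X, idx)
        new_idx = np.sort(np.argsort(mahalanobis_sq(X, mu, cov))[:h])
        if np.array_equal(new_idx, idx):
            break
        idx = new_idx
    mu, cov = subset_moments(X, idx)
    return idx, mu, cov


def instability(X, h, n_boot=50, seed=0):
    """Assumed instability measure: mean over points of 2 * p * (1 - p), where p is
    the fraction of bootstrap fits that place the point among the h inliers."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    labels = np.zeros((n_boot, n), dtype=bool)
    for b in range(n_boot):
        sample = X[rng.integers(0, n, size=n)]  # resample rows with replacement
        _, mu, cov = fit_subset(sample, h)
        # Label the ORIGINAL points using the fit from the bootstrap sample.
        labels[b, np.argsort(mahalanobis_sq(X, mu, cov))[:h]] = True
    p_in = labels.mean(axis=0)
    return float(np.mean(2.0 * p_in * (1.0 - p_in)))


def instability_path(X, h_grid, n_boot=50, seed=0):
    """Instability evaluated at each candidate subset size in h_grid."""
    return {h: instability(X, h, n_boot=n_boot, seed=seed) for h in h_grid}


# Toy usage: 80 inliers around the origin plus 20 clustered outliers.
rng = np.random.default_rng(1)
X_demo = np.vstack([rng.normal(size=(80, 2)),
                    rng.normal(loc=8.0, scale=0.5, size=(20, 2))])
for h, value in instability_path(X_demo, [60, 70, 80, 90], n_boot=20).items():
    print(h, round(value, 4))
```

Under this assumed measure each value lies between 0 and 0.5. A subset size at which the value is low suggests that the inlier/outlier split is reproducible across resamples; how the paper turns the path into a final choice is not specified in the abstract.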

Problem

Research questions and friction points this paper is trying to address.

Develop stability-based model selection for MCD
Estimate algorithm instability via bootstrap procedure
Guide subset size choice using instability path

Innovation

Methods, ideas, or system contributions that make the work stand out.

Stability-based MCD model selection framework
Bootstrap procedure for instability estimation
Instability path for subset size selection