🤖 AI Summary
Distance correlation, although its influence function is bounded, has a breakdown value of zero and an unbounded sensitivity function, leaving it highly sensitive to outliers.
Method: The paper proves that the breakdown value of the usual distance correlation is zero and that its sensitivity function is unbounded, even though that sensitivity function converges to the bounded influence function as the sample size grows. Building on this analysis, it constructs a more robust distance correlation based on a new data transformation combining rank-based preprocessing and truncation, yielding an estimator with a positive breakdown value and good statistical power.
Contribution/Results: The proposed estimator has a bounded influence function and a positive breakdown value. Monte Carlo simulations demonstrate that it retains good testing power in the presence of multiple outliers, and an application to genetic data illustrates its robustness and interpretability. Comparing the classical distance correlation with its robust version provides additional insight.
📝 Abstract
Distance correlation is a popular measure of dependence between random variables. It has some robustness properties, but not all. We prove that the influence function of the usual distance correlation is bounded, but that its breakdown value is zero. Moreover, it has an unbounded sensitivity function, converging to the bounded influence function for increasing sample size. To address this sensitivity to outliers we construct a more robust version of distance correlation, which is based on a new data transformation. Simulations indicate that the resulting method is quite robust, and has good power in the presence of outliers. We illustrate the method on genetic data. Comparing the classical distance correlation with its more robust version provides additional insight.
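To make the quantities in the abstract concrete, here is a minimal NumPy sketch of the sample distance correlation for univariate data, together with a rank-transformed variant in the spirit of the rank-based preprocessing mentioned above. This is an illustration only: the paper's actual transformation also involves truncation and differs from this simple rank map, and the function names are our own.

```python
import numpy as np

def _double_centered_dist(x):
    """Doubly centered pairwise distance matrix of a 1-D sample."""
    d = np.abs(x[:, None] - x[None, :])
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())

def distance_correlation(x, y):
    """Sample distance correlation (biased V-statistic version)."""
    A = _double_centered_dist(np.asarray(x, dtype=float))
    B = _double_centered_dist(np.asarray(y, dtype=float))
    dcov2 = (A * B).mean()          # squared distance covariance
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return np.sqrt(dcov2 / denom) if denom > 0 else 0.0

def rank_distance_correlation(x, y):
    """Illustrative robustified variant: distance correlation of
    normalized ranks. NOT the paper's exact transformation, which
    additionally uses truncation."""
    n = len(x)
    rx = np.argsort(np.argsort(x)) / (n - 1)  # ranks mapped to [0, 1]
    ry = np.argsort(np.argsort(y)) / (n - 1)
    return distance_correlation(rx, ry)
```

Because ranks are bounded, a single wild observation can move each transformed coordinate by at most a fixed amount, which conveys the intuition behind seeking a positive breakdown value, while the raw distance correlation works on unbounded distances.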