🤖 AI Summary
This paper addresses the failure of classical canonical correlation analysis (CCA) in high-dimensional, heavy-tailed settings, where sample covariance matrices become unstable and yield inconsistent estimates. To resolve this, the authors propose a robust sparse CCA method. Its core contribution is to combine, for the first time, the spatial-sign covariance matrix with an ℓ₁ sparsity penalty, which enables consistent estimation at the optimal convergence rate under elliptical symmetry. Theoretically, the paper establishes the statistical optimality of the resulting estimator in high dimensions. Computationally, the method is tractable via efficient convex optimization. Simulations and a real-data application show that the approach outperforms existing sparse CCA methods under heavy-tailed distributions, combining robustness with high estimation accuracy.
📝 Abstract
This paper proposes a robust high-dimensional sparse canonical correlation analysis (CCA) method for investigating linear relationships between two high-dimensional random vectors, focusing on elliptically symmetric distributions. Traditional CCA methods, which rely on sample covariance matrices, struggle in high-dimensional settings, particularly when the data are heavy-tailed. To address this, we use the spatial-sign covariance matrix as a robust estimator and combine it with a sparsity-inducing penalty to estimate canonical correlations efficiently. Theoretical analysis shows that our method is consistent and robust under mild conditions, converging at an optimal rate even in the presence of heavy tails. Simulation studies demonstrate that our approach outperforms existing sparse CCA methods, particularly under heavy-tailed distributions. A real-world application further confirms the method's robustness and efficiency in practice. Our work thus offers a solution for high-dimensional CCA that improves on traditional methods in both stability and performance.
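
To make the central ingredient concrete, here is a minimal sketch of a sample spatial-sign covariance matrix: each observation is centered and rescaled to unit length, so only its direction enters the estimate and a single extreme observation cannot dominate. The function name `spatial_sign_covariance` and the coordinate-wise median used for centering are illustrative assumptions, not taken from the paper, and the sparse CCA step itself is not shown.

```python
import numpy as np


def spatial_sign_covariance(X, center=None):
    """Sample spatial-sign covariance matrix of the rows of X (n x p).

    `center` is the location estimate. If omitted, the coordinate-wise
    median is used; this is an assumption made for simplicity, and other
    robust location estimates could be substituted.
    """
    X = np.asarray(X, dtype=float)
    if center is None:
        center = np.median(X, axis=0)
    Z = X - center
    norms = np.linalg.norm(Z, axis=1)
    keep = norms > 0  # drop observations that coincide with the center
    U = Z[keep] / norms[keep, None]  # spatial signs: unit-length rows
    return U.T @ U / U.shape[0]


# Illustrative heavy-tailed sample: Student t with 2 degrees of freedom.
rng = np.random.default_rng(0)
X = rng.standard_t(df=2, size=(200, 5))
S = spatial_sign_covariance(X)
print(S.shape, np.trace(S))
```

Because every spatial sign has unit norm, the resulting matrix always has trace 1, so it captures the shape of the scatter rather than its overall scale.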