High Dimensional Sparse Canonical Correlation Analysis for Elliptical Symmetric Distributions

📅 2025-04-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses the failure of classical canonical correlation analysis (CCA) in high-dimensional, heavy-tailed settings, where sample covariance matrices become unstable and yield inconsistent estimates. To resolve this, it proposes a robust sparse CCA method. The core innovation is coupling, for the first time, the spatial-sign covariance matrix with an ℓ₁ sparsity penalty, which enables consistent estimation at an optimal convergence rate under elliptically symmetric distributions. Theoretically, the paper establishes the statistical optimality of the resulting estimator in high dimensions. Computationally, the method is tractable via convex optimization. Simulations and real-data experiments show that the approach outperforms existing sparse CCA methods under heavy-tailed distributions, combining robustness with high estimation accuracy.

📝 Abstract

This paper proposes a robust high-dimensional sparse canonical correlation analysis (CCA) method for investigating linear relationships between two high-dimensional random vectors, focusing on elliptical symmetric distributions. Traditional CCA methods, based on sample covariance matrices, struggle in high-dimensional settings, particularly when data exhibit heavy-tailed distributions. To address this, we introduce the spatial-sign covariance matrix as a robust estimator, combined with a sparsity-inducing penalty to efficiently estimate canonical correlations. Theoretical analysis shows that our method is consistent and robust under mild conditions, converging at an optimal rate even in the presence of heavy tails. Simulation studies demonstrate that our approach outperforms existing sparse CCA methods, particularly under heavy-tailed distributions. A real-world application further confirms the method's robustness and efficiency in practice. Our work provides a novel solution for high-dimensional canonical correlation analysis, offering significant advantages over traditional methods in terms of both stability and performance.
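For readers unfamiliar with the spatial-sign covariance matrix named in the abstract, here is a minimal sketch of its standard textbook definition: each centered observation is projected onto the unit sphere, and the outer products of those unit vectors are averaged. This is not the authors' code. The function name `spatial_sign_covariance`, the coordinate-wise median used as a default center (the spatial median is the more usual choice), and the simulated multivariate t data are all assumptions made for illustration.

```python
import numpy as np


def spatial_sign_covariance(X, center=None):
    """Sample spatial-sign covariance matrix of the rows of X.

    Hypothetical helper for illustration only. If no center is given,
    the coordinate-wise median is used as a simple stand-in for a
    robust location estimate (an assumption, not the paper's choice).
    """
    X = np.asarray(X, dtype=float)
    if center is None:
        center = np.median(X, axis=0)
    Z = X - center
    norms = np.linalg.norm(Z, axis=1)
    keep = norms > 0                      # drop rows that sit exactly on the center
    U = Z[keep] / norms[keep, None]       # spatial signs: unit-length directions
    return U.T @ U / U.shape[0]           # average of outer products u u^T


# Heavy-tailed elliptical sample: multivariate t with 2 degrees of freedom,
# built as a Gaussian vector divided by sqrt(chi-square / df).
rng = np.random.default_rng(0)
n, p, df = 200, 5, 2
gauss = rng.standard_normal((n, p))
scale = np.sqrt(rng.chisquare(df, size=n) / df)
X = gauss / scale[:, None]

S = spatial_sign_covariance(X)
```

Because every spatial sign has unit length, the resulting matrix always has trace 1 regardless of how extreme individual observations are, which is one intuitive reason it stays stable where the sample covariance does not.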

Problem

Research questions and friction points this paper is trying to address.

Robust high-dimensional sparse CCA for elliptical distributions

Overcoming limitations of traditional CCA with heavy-tailed data

Spatial-sign covariance matrix enhances estimation stability and performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Robust sparse CCA for elliptical distributions

Spatial-sign covariance for heavy-tailed data

Sparsity penalty for efficient correlation estimation
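To illustrate how an ℓ₁ sparsity penalty produces exact zeros, the sketch below implements soft-thresholding, the standard proximal operator of the ℓ₁ norm. This page does not describe the paper's actual optimization algorithm, so this is only a generic building block commonly used in ℓ₁-penalized problems; the function name `soft_threshold`, the example vector, and the threshold value are assumptions.

```python
import numpy as np


def soft_threshold(v, lam):
    """Proximal operator of lam * ||v||_1 (hypothetical helper).

    Shrinks each entry toward zero by lam and sets entries whose
    magnitude is at most lam exactly to zero.
    """
    v = np.asarray(v, dtype=float)
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)


# Illustrative loading vector: small entries are zeroed, large ones shrunk.
v = np.array([0.9, -0.05, 0.3, -0.6, 0.02])
sparse_v = soft_threshold(v, 0.1)
```

Entries with magnitude below the threshold are removed entirely, which is what makes the estimated canonical directions sparse rather than merely small.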

🔎 Similar Papers

Improving Numerical Stability of Normalized Mutual Information Estimator on High Dimensions