🤖 AI Summary
This paper addresses high-dimensional multivariate independence testing. It proposes nonparametric tests based on the maximum distance correlation (MaxDCOR) and the average distance correlation (AvgDCOR). The approach works with both the Euclidean distance and the Gaussian kernel as the underlying metric and, crucially, systematically constructs the corresponding test statistics in high dimensions for the first time. The authors establish statistical consistency and derive a fast chi-square approximation for the null distribution, thereby circumventing the high computational cost and limited asymptotic theory of classical distance correlation. Experiments show that the proposed method achieves over 30% higher detection power than standard distance correlation under sparse strong dependence. It also outperforms existing approaches in both synthetic multivariate dependence settings and real-world cancer–peptide plasma data, capturing complex, high-order multivariate dependence structures.
📝 Abstract
This paper introduces and investigates the use of maximum and average distance correlations for multivariate independence testing. We characterize their consistency properties in high-dimensional settings with respect to the number of marginally dependent dimensions, assess the advantages of each test statistic, examine their respective null distributions, and present a fast chi-square-based testing procedure. The resulting tests are non-parametric and applicable to both the Euclidean distance and the Gaussian kernel as the underlying metric. To better understand the practical use cases of the proposed tests, we evaluate the empirical performance of the maximum distance correlation, the average distance correlation, and the original distance correlation across various multivariate dependence scenarios, and we conduct a real-data experiment on various cancer types and peptide levels in human plasma.
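To make the statistics concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes one plausible reading of the abstract: the bias-corrected (U-centered) squared distance correlation is computed with the Euclidean distance for every pair of one column of X and one column of Y, and the maximum and the average are taken over that matrix. The function names (`unbiased_dcor`, `max_avg_dcor`, `chi2_upper_tail_pvalue`) are hypothetical. The p-value helper assumes the chi-square form known for a single distance correlation statistic (n times the statistic plus 1 compared against a chi-square distribution with one degree of freedom); the null distributions the paper derives for the maximum and average statistics are not reproduced here.

```python
import math
import numpy as np


def _u_center(d):
    """U-center a pairwise distance matrix (bias-corrected centering)."""
    n = d.shape[0]
    row = d.sum(axis=1, keepdims=True) / (n - 2)
    col = d.sum(axis=0, keepdims=True) / (n - 2)
    total = d.sum() / ((n - 1) * (n - 2))
    u = d - row - col + total
    np.fill_diagonal(u, 0.0)  # diagonal is defined as zero
    return u


def unbiased_dcor(x, y):
    """Bias-corrected squared distance correlation (Euclidean distance).

    Requires at least 4 samples. Can be slightly negative under independence.
    """
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    n = x.shape[0]
    a = _u_center(np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1))
    b = _u_center(np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1))
    scale = n * (n - 3)
    dcov_xy = (a * b).sum() / scale
    dvar_x = (a * a).sum() / scale
    dvar_y = (b * b).sum() / scale
    denom = math.sqrt(max(dvar_x * dvar_y, 0.0))
    return dcov_xy / denom if denom > 0 else 0.0


def max_avg_dcor(X, Y):
    """Assumed definition: max and mean of dcor over all column pairs."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    mat = np.array([[unbiased_dcor(X[:, s], Y[:, t])
                     for t in range(Y.shape[1])]
                    for s in range(X.shape[1])])
    return mat.max(), mat.mean(), mat


def chi2_upper_tail_pvalue(stat, n):
    """Assumed chi-square approximation for ONE dcor statistic:
    p = P(chi2_1 > n * stat + 1). Not valid as-is for the max statistic."""
    value = n * stat + 1.0
    if value <= 0:
        return 1.0
    # Survival function of chi-square with 1 degree of freedom.
    return math.erfc(math.sqrt(value / 2.0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    Y = rng.normal(size=(100, 3))
    Y[:, 0] = X[:, 2] + 0.1 * rng.normal(size=100)  # one dependent pair
    mx, avg, _ = max_avg_dcor(X, Y)
    print("max:", mx, "avg:", avg)
```

In this toy setup only one of the fifteen column pairs is dependent, so the maximum stays large while the average is diluted by the independent pairs. That is the kind of sparse-dependence trade-off between the two statistics that the abstract alludes to.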