A consistent test of spherical symmetry for multivariate and high-dimensional data via data augmentation

📅 2024-03-19
🤖 AI Summary
This paper addresses the problem of testing spherical symmetry for multivariate distributions in high-dimensional settings ($d > n$), proposing a consistent, nonparametric test that does not require prespecifying the center of symmetry. Methodologically, it introduces: (1) a nonnegative measure $\zeta(P)$ that is zero if and only if $P$ is spherically symmetric; (2) a consistent estimator of $\zeta(P)$ built upon data augmentation; and (3) a novel resampling-based calibration algorithm that controls the Type-I error while keeping the test consistent against general alternatives. Theoretically, the test is consistent even when $d$ is larger than $n$, is minimax rate optimal over a suitable class of alternatives, and is Pitman efficient when $\lim n(1-\delta_n)^2 > 0$. Moreover, the analysis reveals a phase transition in asymptotic power with respect to the contamination proportion $\delta_n$. Empirical studies show that the test outperforms some state-of-the-art methods.

📝 Abstract
We develop a test for spherical symmetry of a multivariate distribution $P$ that works even when the dimension of the data $d$ is larger than the sample size $n$. We propose a non-negative measure $\zeta(P)$ such that $\zeta(P)=0$ if and only if $P$ is spherically symmetric. We construct a consistent estimator of $\zeta(P)$ using the data augmentation method and investigate its large sample properties. The proposed test based on this estimator is calibrated using a novel resampling algorithm. Our test controls the Type-I error, and it is consistent against general alternatives. We also study its behaviour for a sequence of alternatives $(1-\delta_n) F+\delta_n G$, where $\zeta(G)=0$ but $\zeta(F)>0$, and $\delta_n \in [0,1]$. When $\limsup \delta_n<1$, for any $G$, the power of our test converges to unity as $n$ increases. However, if $\limsup \delta_n=1$, the asymptotic power of our test depends on $\lim n(1-\delta_n)^2$. We establish this by proving the minimax rate optimality of our test over a suitable class of alternatives and showing that it is Pitman efficient when $\lim n(1-\delta_n)^2>0$. Moreover, our test is provably consistent for high-dimensional data even when $d$ is larger than $n$. Our numerical results amply demonstrate the superiority of the proposed test over some state-of-the-art methods.
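
To make the contaminated alternatives $(1-\delta_n) F+\delta_n G$ concrete, here is a minimal Python sketch that draws from such a mixture. The specific choices are assumptions for illustration and are not taken from the paper: $G$ is the standard normal $N(0, I_d)$ (spherically symmetric), $F$ is a normal with unequal coordinate variances (not spherically symmetric), and the function names `sample_contaminated` and `local_delta` are hypothetical. The choice $\delta_n = 1 - c/\sqrt{n}$ is one sequence with $\limsup \delta_n = 1$ for which $n(1-\delta_n)^2 = c^2$ stays fixed.

```python
import numpy as np


def sample_contaminated(n, d, delta, rng):
    """Draw n points in R^d from the mixture (1 - delta) * F + delta * G.

    Illustrative assumptions (not from the paper):
      G = N(0, I_d), which is spherically symmetric;
      F = N(0, diag(scales**2)) with unequal scales, which is not.
    """
    from_g = rng.random(n) < delta        # True -> the point comes from G
    x = rng.standard_normal((n, d))       # start with draws from G
    scales = np.linspace(1.0, 3.0, d)     # unequal coordinate scales for F
    x[~from_g] *= scales                  # rescale the points assigned to F
    return x, from_g


def local_delta(n, c):
    """delta_n = 1 - c / sqrt(n), so that n * (1 - delta_n)**2 == c**2."""
    return 1.0 - c / np.sqrt(n)


rng = np.random.default_rng(0)
for n in (100, 10_000):
    delta_n = local_delta(n, c=2.0)
    x, from_g = sample_contaminated(n, d=5, delta=delta_n, rng=rng)
    # The share of points from F shrinks with n, while n * (1 - delta_n)**2 stays at c**2.
    print(n, delta_n, n * (1 - delta_n) ** 2, int((~from_g).sum()))
```

Under this sequence the expected number of points drawn from $F$ is $n(1-\delta_n) = c\sqrt{n}$, so the non-spherical component becomes a vanishing fraction of the sample even though its absolute count grows.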
Problem

Research questions and friction points this paper is trying to address.

Testing spherical symmetry in high-dimensional data
Developing a nonparametric measure for distribution asymmetry
Ensuring test consistency when dimension exceeds sample size
Innovation

Methods, ideas, or system contributions that make the work stand out.

Nonparametric test for spherical symmetry in high dimensions
Data augmentation method for consistent asymmetry estimation
Novel resampling algorithm for test calibration
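
The intuition behind a data-augmentation approach can be sketched in Python under stated assumptions. If $P$ is spherically symmetric about the origin, then $X$ and $\|X\| U$ have the same distribution, where $U$ is uniform on the unit sphere and independent of $X$. The sketch below builds such an augmented sample and compares it with the original one using the energy distance. This is a stand-in chosen for illustration: it is not the paper's $\zeta(P)$, its estimator, or its resampling calibration, and the names `augment` and `energy_distance` are hypothetical.

```python
import numpy as np


def augment(x, rng):
    """Keep each norm ||x_i|| but replace the direction by a uniform draw on the unit sphere."""
    u = rng.standard_normal(x.shape)
    u /= np.linalg.norm(u, axis=1, keepdims=True)   # normalized Gaussians are uniform on the sphere
    return np.linalg.norm(x, axis=1, keepdims=True) * u


def energy_distance(x, y):
    """Plug-in (V-statistic) energy distance between two samples.

    Used here only as a generic two-sample discrepancy; it is NOT zeta(P).
    """
    def mean_dist(a, b):
        diff = a[:, None, :] - b[None, :, :]
        return np.sqrt((diff ** 2).sum(axis=-1)).mean()

    return 2.0 * mean_dist(x, y) - mean_dist(x, x) - mean_dist(y, y)


rng = np.random.default_rng(0)
n, d = 300, 2

spherical = rng.standard_normal((n, d))                           # N(0, I_d)
elliptical = rng.standard_normal((n, d)) * np.array([1.0, 5.0])   # unequal scales

stat_spherical = energy_distance(spherical, augment(spherical, rng))
stat_elliptical = energy_distance(elliptical, augment(elliptical, rng))
print(stat_spherical, stat_elliptical)
```

The discrepancy should be close to zero for the spherical sample and clearly larger for the elliptical one. Turning such a statistic into a test still requires a calibration step to obtain a critical value, which is the role the abstract assigns to the resampling algorithm.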
Bilol Banerjee
Theoretical Statistics and Mathematics Unit, Indian Statistical Institute, Kolkata
Anil K. Ghosh
Indian Statistical Institute, Kolkata
Robust & nonparametric statistics, Statistical learning, Inference for high