Uniform-over-dimension location tests for multivariate and high-dimensional data

📅 2025-12-10
🤖 AI Summary
Existing two-sample location tests for high-dimensional data rely on restrictive assumptions, such as a fixed dimension or a specific asymptotic relationship between dimension and sample size (e.g., $p/n \to \text{const}$), whose validity often cannot be verified in practice. Method: We propose a dimension-uniform asymptotic test, grounded in a novel uniform-over-dimension central limit theorem, integrating functional CLT and adaptive standardization to construct a test statistic that does not depend on the relative growth rate of $p$ and $n$. Contribution/Results: The asymptotic validity of the proposed test holds uniformly over the dimension of the observations. Simulation studies and real-data analyses show that it outperforms Hotelling's $T^2$ and several popular high-dimensional tests, particularly in small-sample, high-dimensional settings, offering an assumption-lean alternative for modern multivariate inference.

📝 Abstract
Asymptotic methods for hypothesis testing in high-dimensional data usually require the dimension of the observations to increase to infinity, often with an additional relationship between the dimension (say, $p$) and the sample size (say, $n$). On the other hand, multivariate asymptotic testing methods are valid for fixed dimension only, and their implementations typically require the sample size to be large compared to the dimension to yield desirable results. In practical scenarios, it is usually not possible to determine whether the dimension of the data conforms to the conditions required for the validity of the high-dimensional asymptotic methods for hypothesis testing, or whether the sample size is large enough compared to the dimension of the data. In this work, we first describe the notion of uniform-over-$p$ convergence and subsequently develop a uniform-over-dimension central limit theorem. An asymptotic test for the two-sample equality of locations is developed, which now holds uniformly over the dimension of the observations. Using simulated and real data, it is demonstrated that the proposed test exhibits better performance compared to several popular tests in the literature for high-dimensional data as well as the usual scaled two-sample tests for multivariate data, including the Hotelling's $T^2$ test for multivariate Gaussian data.
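To make the sample-size requirement of the classical multivariate approach concrete, here is a minimal Python sketch of the textbook two-sample Hotelling's $T^2$ statistic, the baseline named in the abstract. It is not the test proposed in the paper. The function name `hotelling_two_sample` and its return layout are hypothetical choices for this illustration, and the sketch assumes Gaussian data with a common covariance matrix.

```python
import numpy as np


def hotelling_two_sample(x, y):
    """Textbook two-sample Hotelling T^2 (illustrative baseline, not the paper's test).

    x, y: arrays of shape (n1, p) and (n2, p).
    Returns (t2, f_stat, df1, df2); under Gaussian data with equal covariances
    and equal means, f_stat follows an F(df1, df2) distribution.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, p = x.shape
    n2, p_y = y.shape
    if p != p_y:
        raise ValueError("x and y must have the same number of columns")

    dof = n1 + n2 - 2
    df2 = dof - p + 1  # equals n1 + n2 - p - 1
    if df2 <= 0:
        # The pooled covariance is singular here, so T^2 is undefined.
        raise ValueError("Hotelling's T^2 requires p <= n1 + n2 - 2")

    diff = x.mean(axis=0) - y.mean(axis=0)
    # Pooled sample covariance; atleast_2d covers the p = 1 case.
    pooled = (
        (n1 - 1) * np.atleast_2d(np.cov(x, rowvar=False))
        + (n2 - 1) * np.atleast_2d(np.cov(y, rowvar=False))
    ) / dof

    # T^2 = (n1 n2 / (n1 + n2)) * diff' S^{-1} diff, via a linear solve.
    t2 = (n1 * n2 / (n1 + n2)) * float(diff @ np.linalg.solve(pooled, diff))
    f_stat = df2 / (dof * p) * t2
    return t2, f_stat, p, df2
```

The p-value is the upper tail of the $F(p,\, n_1 + n_2 - p - 1)$ distribution at `f_stat` (for instance via `scipy.stats.f.sf`). The explicit check on `df2` shows where the fixed-dimension method breaks down: once $p$ exceeds $n_1 + n_2 - 2$ the pooled covariance cannot be inverted, which is the regime where a test valid uniformly over $p$ is useful.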
Problem

Research questions and friction points this paper is trying to address.

Develops a uniform-over-dimension central limit theorem for hypothesis testing.
Creates a two-sample location test valid across varying data dimensions.
Addresses limitations of existing multivariate and high-dimensional asymptotic tests.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uniform-over-dimension central limit theorem development
Two-sample location test valid across all dimensions
Improved performance over existing high-dimensional and multivariate tests
Authors
Ritabrata Karmakar
Indian Statistical Institute
Joydeep Chowdhury
King Abdullah University of Science and Technology
Subhajit Dutta
ISI Kolkata (on leave from IIT Kanpur)
Statistical Pattern Recognition, Machine Learning
Marc G. Genton
King Abdullah University of Science and Technology