🤖 AI Summary
This paper addresses the problem of testing the independent and identically distributed (IID) assumption, including for random objects residing in general metric spaces. We propose a fully nonparametric test whose statistic is built on a newly introduced off-diagonal sequential U-process. Our key theoretical contributions are: (i) a Gaussian approximation for this process, with non-asymptotic coupling and Kolmogorov distance bounds for its maximum and for the bootstrapped version; and (ii) a jackknife multiplier bootstrap that produces critical values without specifying an alternative hypothesis. Because the test does not target a specific alternative such as structural breaks or serial dependence, it can detect general types of violations of the IID assumption. Simulations and real-data applications illustrate its usefulness and versatility relative to existing methods.
📝 Abstract
We propose a simple and intuitive test for what is arguably the most prevalent hypothesis in statistics, namely that data are independent and identically distributed (IID), based on a newly introduced off-diagonal sequential U-process. This IID test is fully nonparametric and applicable to random objects in general spaces. It requires no specific alternative, such as structural breaks or serial dependence, which allows it to detect general types of violations of the IID assumption. An easy-to-implement jackknife multiplier bootstrap is tailored to produce critical values for the test. Under mild conditions, we establish a Gaussian approximation for the proposed U-process and derive non-asymptotic coupling and Kolmogorov distance bounds for its maximum and its bootstrapped version, providing rigorous theoretical guarantees. Simulations and real data applications demonstrate the usefulness and versatility of the test in comparison with existing methods.
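To make the idea of bootstrapping the maximum of a sequential U-type process more concrete, here is a minimal Python sketch. It is not the paper's statistic: the abstract does not define the off-diagonal sequential U-process, so the sketch uses a hypothetical stand-in, the cross-block sum S_k = n^{-3/2} Σ_{i ≤ k < j} (h(X_i, X_j) − θ̂), with the pairwise distance as the kernel h. The bootstrap is a generic jackknife multiplier bootstrap applied to the first-order (Hájek) linearization of that toy process. The function names `cross_block_process` and `jmb_test` are illustrative assumptions.

```python
import numpy as np

def cross_block_process(H):
    """Toy process (an assumption, not the paper's definition):
    S_k = n^{-3/2} * sum_{i <= k < j} (H[i, j] - theta_hat), k = 1..n-1,
    where H is the symmetric matrix of kernel values h(X_i, X_j)."""
    n = H.shape[0]
    theta_hat = H[np.triu_indices(n, 1)].mean()  # U-statistic over all pairs
    C = H - theta_hat
    S = np.array([C[:k, k:].sum() for k in range(1, n)])
    return S / n**1.5, theta_hat

def jmb_test(H, n_boot=500, seed=0):
    """Generic jackknife multiplier bootstrap for max_k |S_k| (sketch)."""
    rng = np.random.default_rng(seed)
    n = H.shape[0]
    S, theta_hat = cross_block_process(H)
    stat = np.abs(S).max()

    # Jackknife estimate of the Hajek projection g(X_i) = E[h(X_i, X') | X_i] - theta
    g_hat = (H.sum(axis=1) - np.diag(H)) / (n - 1) - theta_hat

    # Gaussian multipliers applied to the estimated projections
    xi = rng.standard_normal((n_boot, n))
    w = xi * g_hat
    P = np.cumsum(w, axis=1)[:, :-1]       # partial sums over i <= k, k = 1..n-1
    W = w.sum(axis=1, keepdims=True)       # total sum, shape (n_boot, 1)
    k = np.arange(1, n)

    # Linearization of S_k: each left point appears in (n-k) cross pairs,
    # each right point in k, and centering by theta_hat removes 2k(n-k)/n * W.
    S_star = ((n - k) * P + k * (W - P) - 2.0 * k * (n - k) / n * W) / n**1.5
    boot_max = np.abs(S_star).max(axis=1)

    p_value = (1 + np.sum(boot_max >= stat)) / (n_boot + 1)
    return stat, p_value

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x_iid = rng.standard_normal(60)
    x_shift = np.concatenate([rng.standard_normal(30),
                              10 + rng.standard_normal(30)])
    for name, x in [("iid", x_iid), ("mean shift", x_shift)]:
        H = np.abs(x[:, None] - x[None, :])  # pairwise distances as kernel values
        print(name, jmb_test(H))
```

Because the sketch only needs the matrix of pairwise kernel values, the same code applies to any data for which a distance can be computed, which mirrors the abstract's point about random objects in general spaces. The toy process is nearly degenerate for splits close to n/2, so it should be read as an illustration of the bootstrap mechanics rather than as a test with verified size or power.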