🤖 AI Summary
This paper addresses the underexplored problem of heteroskedasticity testing in high-dimensional linear regression. We extend the Newey–Powell (1987) test to the high-dimensional regime for the first time. Under the asymptotic framework where $p/n \to c \in (0,\infty)$, we rigorously derive the limiting distribution of the test statistic using Approximate Message Passing (AMP), thereby overcoming limitations of classical low-dimensional asymptotics. The proposed method is both computationally feasible and theoretically rigorous. Simulation studies demonstrate that the test maintains strong power and accurate size control in high dimensions. Empirical applications reveal that the “International Economic Growth” dataset satisfies the homoskedasticity assumption, whereas the “Supermarket Sales” dataset exhibits statistically significant heteroskedasticity. To our knowledge, this work provides the first heteroskedasticity test for high-dimensional regression with formal theoretical guarantees, enabling robust statistical inference in modern large-$p$, large-$n$ settings.
📝 Abstract
Homoscedastic regression error is a common assumption in many high-dimensional regression models and theories. Although heteroscedastic error commonly exists in real-world datasets, testing heteroscedasticity remains largely underexplored under high-dimensional settings. We consider the heteroscedasticity test proposed in Newey and Powell (1987), whose asymptotic theory has been well-established for the low-dimensional setting. We show that the Newey-Powell test can be developed for high-dimensional data. For asymptotic theory, we consider the setting where the number of dimensions grows with the sample size at a linear rate. The asymptotic analysis for the test statistic utilizes the Approximate Message Passing (AMP) algorithm, from which we obtain the limiting distribution of the test. The numerical performance of the test is investigated through an extensive simulation study. As real-data applications, we present the analysis based on "international economic growth" data (Belloni et al., 2011), which is found to be homoscedastic, and "supermarket" data (Lan et al., 2016), which is found to be heteroscedastic.
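
For intuition only, the sketch below illustrates the classical low-dimensional idea behind asymmetric least squares (expectile) regression, on which the Newey and Powell (1987) test is built: under homoscedastic errors the slope coefficients should not change with the asymmetry level $\tau$, whereas under heteroscedasticity they drift. This is not the high-dimensional test statistic or the AMP analysis of the paper. The function names (`expectile_regression`, `slope_gap`), the choice $\tau = 0.75$, and the simulated data are all assumptions made for illustration, and no standardization or critical value is computed.

```python
import numpy as np

def expectile_regression(X, y, tau, n_iter=100, tol=1e-10):
    """Asymmetric least squares fit via iteratively reweighted least squares.

    Minimizes sum_i |tau - 1{r_i < 0}| * r_i^2 with r_i = y_i - b0 - x_i' b.
    Hypothetical helper for illustration, not the authors' implementation.
    """
    Xd = np.column_stack([np.ones(len(y)), X])      # prepend an intercept column
    beta = np.linalg.lstsq(Xd, y, rcond=None)[0]    # start from the OLS fit
    for _ in range(n_iter):
        resid = y - Xd @ beta
        w = np.where(resid < 0, 1.0 - tau, tau)     # asymmetric weights
        XtW = Xd.T * w                              # X' W with W = diag(w)
        new_beta = np.linalg.solve(XtW @ Xd, XtW @ y)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta

def slope_gap(X, y, tau=0.75):
    """Difference between slopes at level tau and at tau = 0.5 (the OLS fit)."""
    b_tau = expectile_regression(X, y, tau)
    b_mid = expectile_regression(X, y, 0.5)
    return b_tau[1:] - b_mid[1:]                    # drop the intercept

# Simulated data (assumed for illustration): one covariate, n = 5000.
rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(1.0, 3.0, size=(n, 1))
eps = rng.standard_normal(n)

y_homo = 1.0 + 2.0 * x[:, 0] + eps                  # constant error scale
y_het = 1.0 + 2.0 * x[:, 0] + x[:, 0] * eps         # error scale grows with x

gap_homo = slope_gap(x, y_homo)[0]
gap_het = slope_gap(x, y_het)[0]
print("slope gap, homoscedastic errors:  ", gap_homo)
print("slope gap, heteroscedastic errors:", gap_het)
```

In this toy setup the gap stays near zero for the homoscedastic sample and is clearly positive for the heteroscedastic one. A formal test needs the limiting distribution of a suitably normalized statistic, which is what the paper derives for the regime where the dimension grows linearly with the sample size.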