🤖 AI Summary
Variable screening for ultrahigh-dimensional linear regression is computationally expensive, and many existing screening rules rely on strong marginal correlation assumptions. This paper proposes a deterministic screening method that ranks predictors by the absolute values of their sample ridge partial correlations. The method builds ridge regularization into the definition of partial correlation, combining the ridge-regularized estimates of the regression coefficients with the ridge-regularized partial variances of the predictors. This yields the sure screening property without strong assumptions on the marginal correlations. The procedure is simple to compute. A simulation study and a real-data analysis show that it performs competitively with existing screening procedures such as SIS and DC-SIS. Publicly available software implementing the method accompanies the article.
📝 Abstract
Variable selection in ultrahigh-dimensional linear regression is challenging due to its high computational cost. Therefore, a screening step is usually conducted before variable selection to significantly reduce the dimension. Here we propose a novel and simple screening method based on ordering the absolute sample ridge partial correlations. The proposed method takes into account not only the ridge-regularized estimates of the regression coefficients but also the ridge-regularized partial variances of the predictor variables, providing the sure screening property without strong assumptions on the marginal correlations. A simulation study and a real data analysis show that the proposed method performs competitively with existing screening procedures. Publicly available software implementing the proposed screening method accompanies the article.
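
To make the idea of "ordering the absolute sample ridge partial correlations" concrete, here is a minimal NumPy sketch. The abstract does not give the formula, so the sketch rests on an assumption: that the ridge partial correlation between the response and predictor j can be read off the inverse of the ridge-regularized sample correlation matrix of (y, X), in the same way ordinary partial correlations are read off a precision matrix. The function names `ridge_partial_correlations` and `screen`, the penalty `lam`, and the cutoff `d` are hypothetical and are not taken from the accompanying software; the paper's exact definition and tuning may differ.

```python
import numpy as np


def ridge_partial_correlations(X, y, lam=1.0):
    """Ridge partial correlations between y and each column of X.

    ASSUMPTION (not stated in the abstract): with Z = [y, X] standardized
    and P = (Z'Z / n + lam * I)^{-1}, the ridge partial correlation of
    predictor j is -P[0, j] / sqrt(P[0, 0] * P[j, j]).
    Columns of X are assumed to be non-constant.
    """
    n, p = X.shape
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    Z = np.column_stack([ys, Xs])  # n x (p + 1), response in column 0

    # Woodbury identity avoids inverting a (p + 1) x (p + 1) matrix:
    #   P = (I - Z' K^{-1} Z) / lam,  with  K = Z Z' + n * lam * I_n  (n x n).
    K = Z @ Z.T + n * lam * np.eye(n)
    W = np.linalg.solve(K, Z)  # K^{-1} Z, shape n x (p + 1)

    # Diagonal of P: ridge-regularized partial precisions.
    diag_P = (1.0 - np.einsum("ij,ij->j", Z, W)) / lam
    # Off-diagonal entries P[j, 0] for j >= 1 (tied to the ridge coefficients).
    off_P = -(Z.T @ W[:, 0])[1:] / lam

    return -off_P / np.sqrt(diag_P[0] * diag_P[1:])


def screen(X, y, d, lam=1.0):
    """Indices of the d predictors with the largest |ridge partial correlation|."""
    rho = ridge_partial_correlations(X, y, lam=lam)
    return np.argsort(-np.abs(rho))[:d]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 100, 500  # more predictors than observations
    X = rng.standard_normal((n, p))
    y = 3.0 * X[:, :3].sum(axis=1) + rng.standard_normal(n)  # 3 active predictors
    print(sorted(screen(X, y, d=20).tolist()))
```

Under this assumed definition the work is dominated by one n-by-n linear solve, so the cost grows linearly in the number of predictors p. A variable selection method would then be run only on the d retained columns.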