🤖 AI Summary
Conventional multivariate spatial autoregressive (MSAR) models become computationally intractable when both the responses and the covariates are high-dimensional. To address this, the paper proposes the factor-augmented spatial autoregression (FSAR) model, a special case of MSAR that imposes a factor structure on the high-dimensional random error vector. The fixed-dimensional latent factors are estimated consistently by the diversified projections method and then fed back into the SAR model as exogenous covariates, so that each component of the high-dimensional response can be modeled separately. For each component, variable selection is carried out with a SCAD-penalized estimator whose tuning parameter is chosen by a novel BIC-type criterion. The SCAD estimator is shown to be uniformly selection consistent when the tuning parameter is selected appropriately. Extensive numerical studies examine the finite-sample performance of the method. The core idea is to bring an estimable latent factor structure into the high-dimensional MSAR framework so that the problem decouples across response components.
📝 Abstract
We study one particular type of multivariate spatial autoregression (MSAR) model with diverging dimensions in both responses and covariates. In this regime, the usual MSAR models are no longer applicable because of their high computational cost. To address this issue, we propose a factor-augmented spatial autoregression (FSAR) model. FSAR is a special case of MSAR but with a novel factor structure imposed on the high-dimensional random error vector. The latent factors of FSAR are assumed to be of a fixed dimension. Therefore, they can be estimated consistently by the diversified projections method \citep{fan2022learning}, as long as the dimension of the multivariate response is diverging. Once the fixed-dimensional latent factors are consistently estimated, they are fed back into the original SAR model and serve as exogenous covariates. This yields the FSAR model. Thereafter, different components of the high-dimensional response can be modeled separately. To handle the high-dimensional covariates, a smoothly clipped absolute deviation (SCAD) type penalized estimator is developed for each response component. We show theoretically that the resulting SCAD estimator is uniformly selection consistent, as long as the tuning parameter is selected appropriately. For practical selection of the tuning parameter, a novel BIC method is developed. Extensive numerical studies are conducted to demonstrate the finite sample performance of the proposed method.
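For readers unfamiliar with the SCAD penalty mentioned above, the following is a minimal sketch of the standard penalty function as it is usually defined in the literature. It is not the paper's estimator: the function name `scad_penalty` is hypothetical, and the default shape parameter `a = 3.7` is the conventional choice rather than a value stated in the abstract. The tuning parameter `lam` corresponds to the quantity the abstract says is selected by a BIC-type method.

```python
def scad_penalty(theta: float, lam: float, a: float = 3.7) -> float:
    """Evaluate the standard SCAD penalty at a single coefficient.

    theta : coefficient value
    lam   : tuning parameter (lambda), must be non-negative
    a     : shape parameter, must exceed 2 (3.7 is a conventional default,
            assumed here; the abstract does not specify it)
    """
    if lam < 0 or a <= 2:
        raise ValueError("require lam >= 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Small coefficients: identical to the lasso penalty.
        return lam * t
    if t <= a * lam:
        # Intermediate coefficients: quadratic piece that tapers the penalty.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so no additional shrinkage.
    return lam * lam * (a + 1) / 2
```

The penalty is continuous at both breakpoints and flat beyond `a * lam`, which is why large coefficients are not over-shrunk in the way they are under a lasso penalty.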