🤖 AI Summary
This paper addresses variable selection in high-dimensional nonparametric additive models under outliers and inherent sparsity. We propose a robust and sparse estimation method that integrates the density power divergence (DPD) loss with nonconvex penalties (SCAD or MCP), coupled with B-spline basis expansions for flexible function estimation. Our key contribution is the first incorporation of DPD into sparse nonparametric additive modeling, enabling simultaneous robustness against outliers and structured sparsity recovery. Theoretically, we extend the analysis to sub-Weibull error distributions and rigorously establish that the estimator achieves the optimal convergence rate even under heavy-tailed errors. Extensive simulations and real-data applications demonstrate that the proposed method significantly outperforms conventional $L_2$-based approaches: it maintains high true positive rates while substantially reducing false positives, thus delivering both statistical robustness and accurate sparse structure identification.
📄 Abstract
Additive models belong to the class of structured nonparametric regression models that do not suffer from the curse of dimensionality. Identifying the nonzero additive components when the true model is assumed to be sparse is an important and well-studied problem. Most existing methods rely on the $L_2$ loss function, which is sensitive to outliers in the data. We propose a new variable selection method for additive models that is robust to outliers. The proposed method employs a nonconcave penalty for variable selection and combines B-spline basis expansions with the density power divergence loss function for estimation. The loss function produces an M-estimator that downweights the effect of outliers. Our asymptotic results are derived under the sub-Weibull assumption, which allows the error distribution to have an exponentially heavy tail. Under regularity conditions, we show that the proposed method achieves the optimal convergence rate. In addition, our results include the convergence rates for sub-Gaussian and sub-exponential distributions as special cases. We numerically validate our theoretical findings using simulations and real data analysis.
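To illustrate the downweighting mechanism described above, the following is a minimal sketch of the per-observation DPD loss for a Gaussian error model and the implicit M-estimation weights it induces. This is an illustration under assumed Gaussian errors with known scale `sigma` and a fixed tuning parameter `alpha`, not the paper's full penalized B-spline estimator; the function names are hypothetical.

```python
import numpy as np

def dpd_loss(residuals, sigma=1.0, alpha=0.5):
    """Per-observation density power divergence (DPD) loss under a Gaussian
    error model. Up to an additive constant, minimizing this over the
    regression fit amounts to maximizing mean(exp(-alpha * r^2 / (2 sigma^2))),
    so gross outliers contribute almost nothing to the objective."""
    c = 1.0 / ((2.0 * np.pi) ** (alpha / 2.0) * sigma ** alpha)
    return c * (1.0 / np.sqrt(1.0 + alpha)
                - (1.0 + 1.0 / alpha)
                * np.exp(-alpha * residuals ** 2 / (2.0 * sigma ** 2)))

def dpd_weights(residuals, sigma=1.0, alpha=0.5):
    """Implicit weights of the resulting M-estimator: they decay
    exponentially in the squared residual, so outliers are downweighted.
    As alpha -> 0 the weights tend to 1, recovering the L2 (least-squares) fit."""
    return np.exp(-alpha * residuals ** 2 / (2.0 * sigma ** 2))

r = np.array([0.1, 1.0, 10.0])  # inlier, moderate residual, gross outlier
w = dpd_weights(r)              # weights decrease sharply for the outlier
```

The tuning parameter `alpha` trades efficiency against robustness: small values approach the efficient but outlier-sensitive $L_2$ fit, while larger values downweight outliers more aggressively.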