🤖 AI Summary
This paper addresses variable selection in high-dimensional nonparametric additive models under outliers and inherent sparsity. We propose a robust and sparse estimation method that integrates the density power divergence (DPD) loss with nonconvex penalties (SCAD or MCP), coupled with B-spline basis expansions for flexible function estimation. Our key contribution is the first incorporation of DPD into sparse nonparametric additive modeling, enabling simultaneous robustness against outliers and structured sparsity recovery. Theoretically, we extend the analysis to sub-Weibull error distributions and rigorously establish that the estimator achieves the optimal convergence rate even under heavy-tailed errors. Extensive simulations and real-data applications demonstrate that the proposed method significantly outperforms conventional $L_2$-based approaches: it maintains high true positive rates while substantially reducing false positives, thus delivering both statistical robustness and accurate sparse structure identification.
📄 Abstract
Additive models belong to the class of structured nonparametric regression models that do not suffer from the curse of dimensionality. Identifying the nonzero additive components when the true model is assumed to be sparse is an important and well-studied problem. Most existing methods rely on the $L_2$ loss function, which is sensitive to outliers in the data. We propose a new variable selection method for additive models that is robust to outliers. The proposed method employs a nonconcave penalty for variable selection and combines B-spline basis expansions with the density power divergence loss function for estimation. The loss function produces an M-estimator that downweights the effect of outliers. Our asymptotic results are derived under the sub-Weibull assumption, which allows the error distribution to have an exponentially heavy tail. Under regularity conditions, we show that the proposed method achieves the optimal convergence rate. In addition, our results include the convergence rates for sub-Gaussian and sub-exponential distributions as special cases. We numerically validate our theoretical findings using simulations and real data analysis.
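To illustrate the downweighting mechanism described above, the following is a minimal sketch of the per-observation DPD loss for a Gaussian error model and the implicit M-estimation weights it induces. This is an illustration under assumed Gaussian errors with known scale `sigma` and a fixed tuning parameter `alpha`, not the paper's full penalized B-spline estimator; the function names are hypothetical.

```python
import numpy as np

def dpd_loss(residuals, sigma=1.0, alpha=0.5):
    """Per-observation density power divergence (DPD) loss under a Gaussian
    error model. Up to an additive constant, minimizing this over the
    regression fit amounts to maximizing mean(exp(-alpha * r^2 / (2 sigma^2))),
    so gross outliers contribute almost nothing to the objective."""
    c = 1.0 / ((2.0 * np.pi) ** (alpha / 2.0) * sigma ** alpha)
    return c * (1.0 / np.sqrt(1.0 + alpha)
                - (1.0 + 1.0 / alpha)
                * np.exp(-alpha * residuals ** 2 / (2.0 * sigma ** 2)))

def dpd_weights(residuals, sigma=1.0, alpha=0.5):
    """Implicit weights of the resulting M-estimator: they decay
    exponentially in the squared residual, so outliers are downweighted.
    As alpha -> 0 the weights tend to 1, recovering the L2 (least-squares) fit."""
    return np.exp(-alpha * residuals ** 2 / (2.0 * sigma ** 2))

r = np.array([0.1, 1.0, 10.0])  # inlier, moderate residual, gross outlier
w = dpd_weights(r)              # weights decrease sharply for the outlier
```

The tuning parameter `alpha` trades efficiency against robustness: small values approach the efficient but outlier-sensitive $L_2$ fit, while larger values downweight outliers more aggressively.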