SPPCSO: Adaptive Penalized Estimation Method for High-Dimensional Correlated Data

2026-03-06
AI Summary
This study addresses multicollinearity in high-dimensional correlated data, which often causes unstable model estimation, degraded predictive performance, and unreliable variable selection. To address these issues, the authors propose the Single-Parametric Principal Component Selection Operator (SPPCSO), which integrates principal component information into an adaptive $L_{1}$ regularization framework. SPPCSO uses single-parametric principal component regression to guide the adaptive adjustment of the shrinkage factor, balancing variable selection against coefficient estimation. The authors show that the method is selection consistent and has a smaller estimation error bound than traditional penalized estimation approaches, and their experiments in high-noise, high-dimensional settings indicate that it identifies true signal variables while eliminating redundant ones. Applied to gene expression data, SPPCSO identifies disease-associated genes.

Abstract
With the rise of high-dimensional correlated data, multicollinearity poses a significant challenge to model stability, often leading to unstable estimation and reduced predictive accuracy. This work proposes the Single-Parametric Principal Component Selection Operator (SPPCSO), an innovative penalized estimation method that integrates single-parametric principal component regression and $L_{1}$ regularization to adaptively adjust the shrinkage factor by incorporating principal component information. This approach achieves a balance between variable selection and coefficient estimation, ensuring model stability and robust estimation even in high-dimensional, high-noise environments. The primary contribution lies in addressing the instability of traditional variable selection methods when applied to high-noise, high-dimensional correlated data. Theoretically, our method exhibits selection consistency and achieves a smaller estimation error bound compared to traditional penalized estimation approaches. Extensive numerical experiments demonstrate that SPPCSO not only delivers stable and reliable estimation in high-noise settings but also accurately distinguishes signal variables from noise variables in group-effect structured data with highly correlated noise variables, effectively eliminating redundant variables and achieving more stable variable selection. Furthermore, SPPCSO successfully identifies disease-associated genes in gene expression data analysis, showcasing strong practical value. The results indicate that SPPCSO serves as an ideal tool for high-dimensional variable selection, offering an efficient and interpretable solution for modeling correlated data.
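The abstract does not give the SPPCSO objective or algorithm, so the following is only an illustrative sketch of the general idea it describes: a principal component regression fit supplies per-coefficient weights for an $L_{1}$ penalty. It is a generic adaptive-lasso-style construction, not the authors' estimator. The function names, the weight formula `1 / (|beta_init|^gamma + eps)`, and the parameters `n_components`, `gamma`, and `lam` are all assumptions made for illustration.

```python
import numpy as np


def pcr_estimate(X, y, n_components):
    """Principal component regression mapped back to the original coefficients.

    Assumes the leading `n_components` singular values of X are nonzero.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    k = n_components
    # beta = V_k diag(1 / s_k) U_k^T y
    return Vt[:k].T @ ((U[:, :k].T @ y) / s[:k])


def soft_threshold(z, t):
    """Scalar soft-thresholding operator."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def weighted_lasso(X, y, lam, weights, n_iter=500, tol=1e-10):
    """Coordinate descent for (1/2n)||y - X b||^2 + lam * sum_j w_j |b_j|."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.copy()  # residual y - X @ beta, with beta = 0 initially
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            old = beta[j]
            # Correlation of column j with the partial residual.
            rho = X[:, j] @ resid / n + col_sq[j] * old
            new = soft_threshold(rho, lam * weights[j]) / col_sq[j]
            if new != old:
                resid -= X[:, j] * (new - old)
                beta[j] = new
                max_change = max(max_change, abs(new - old))
        if max_change < tol:
            break
    return beta


def pc_guided_adaptive_lasso(X, y, lam, n_components, gamma=1.0, eps=1e-8):
    """Hypothetical PC-guided adaptive L1 fit (illustration only, not SPPCSO)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    beta_init = pcr_estimate(Xc, yc, n_components)
    # Coefficients that look large in the PCR fit are penalized less.
    weights = 1.0 / (np.abs(beta_init) ** gamma + eps)
    return weighted_lasso(Xc, yc, lam, weights)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 200, 10
    # Correlated design: a shared latent factor plus independent noise.
    latent = rng.normal(size=(n, 1))
    X = 0.8 * latent + 0.6 * rng.normal(size=(n, p))
    beta_true = np.zeros(p)
    beta_true[:2] = [3.0, -2.0]
    y = X @ beta_true + rng.normal(size=n)
    print(pc_guided_adaptive_lasso(X, y, lam=0.1, n_components=5))
```

The initial estimate is taken from principal component regression rather than ordinary least squares because the latter is unstable under strong collinearity, which is the setting the abstract targets. How SPPCSO actually ties the shrinkage factor to a single parameter is not stated here and is not reproduced by this sketch.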
Problem

Research questions and friction points this paper is trying to address.

high-dimensional data
multicollinearity
variable selection
model stability
correlated data
Innovation

Methods, ideas, or system contributions that make the work stand out.

SPPCSO
penalized estimation
principal component regression
high-dimensional correlated data
variable selection
Ying Hu
Professor of Mathematics, Université Rennes
stochastic analysis, control and optimization, mathematical finance
Hu Yang
College of Mathematics and Statistics, Chongqing University, Chongqing, 401331, China