🤖 AI Summary
This study addresses the limitations of traditional sparse principal component analysis (SPCA) in high-dimensional settings, where uniform penalization across all variables undermines interpretability and stability by failing to differentiate variable importance. To overcome this, the authors propose SP-SPCA, a novel approach that introduces a single adaptive balancing parameter within an L2-regularized framework to differentially modulate penalty strengths across variables, thereby flexibly trading off sparsity against explained variance. Coupled with an efficient optimization algorithm, SP-SPCA maintains computational efficiency while significantly enhancing feature selection accuracy, model stability, and result interpretability. Experiments on both synthetic and real-world datasets—including crime and financial market data—demonstrate that SP-SPCA more accurately recovers sparse loading structures, effectively eliminates noise variables, and retains higher cumulative variance with fewer selected variables, outperforming existing SPCA methods.
📝 Abstract
Sparse principal component analysis (SPCA) addresses the poor interpretability and variable redundancy that principal component analysis (PCA) often suffers in high-dimensional data. However, SPCA typically imposes uniform penalties on variables and does not account for differences in variable importance, which may lead to unstable performance in highly noisy or structurally complex settings. We propose SP-SPCA, a method that introduces a single equilibrium parameter into the regularization framework to adaptively adjust variable penalties. This modification of the L2 penalty provides flexible control over the trade-off between sparsity and explained variance while maintaining computational efficiency. Simulation studies show that the proposed method consistently outperforms standard sparse principal component methods in identifying sparse loading patterns, filtering noise variables, and preserving cumulative variance, especially in high-dimensional and noisy settings. Empirical applications to crime and financial market data further demonstrate its practical utility. In real data analyses, the method selects fewer but more relevant variables, thereby reducing model complexity while maintaining explanatory power. Overall, the proposed approach offers a robust and efficient alternative for sparse modeling in complex high-dimensional data, with clear advantages in stability, feature selection, and interpretability.
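To make the core idea concrete, here is a minimal sketch of how per-variable penalty modulation can be grafted onto a sparse PCA routine. This is an illustration under assumptions, not the paper's SP-SPCA algorithm: it uses thresholded power iteration with per-variable penalties `lam * w**gamma`, where the weights `w` come from an initial dense PCA loading and a single exponent `gamma` (standing in for the balancing parameter) controls how strongly penalty strength varies across variables. The function name `adaptive_sparse_pc` and all parameter choices are hypothetical.

```python
import numpy as np

def soft_threshold(x, t):
    # Elementwise soft-thresholding: sign(x) * max(|x| - t, 0)
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def adaptive_sparse_pc(X, lam=0.2, gamma=1.0, n_iter=200):
    """First sparse loading via thresholded power iteration.

    Hypothetical illustration of variable-specific penalties
    lam * w_j**gamma; NOT the paper's exact SP-SPCA procedure.
    """
    X = X - X.mean(axis=0)                 # center columns
    C = X.T @ X / len(X)                   # sample covariance
    v0 = np.linalg.eigh(C)[1][:, -1]       # dense leading PCA loading
    # Adaptive weights: variables with small initial loadings
    # receive stronger penalties (cf. adaptive-lasso-style weighting)
    w = 1.0 / (np.abs(v0) + 1e-6)
    w /= w.mean()
    v = v0.copy()
    for _ in range(n_iter):
        u = soft_threshold(C @ v, lam * w**gamma)
        if np.linalg.norm(u) == 0:
            break                          # penalty too strong: all zeroed
        v = u / np.linalg.norm(u)
    return v

# Toy data: 3 correlated signal variables plus 7 pure-noise variables
rng = np.random.default_rng(0)
signal = rng.normal(size=(200, 1)) @ np.ones((1, 3))
X = np.hstack([signal, np.zeros((200, 7))]) + 0.5 * rng.normal(size=(200, 10))
v = adaptive_sparse_pc(X)
print(np.nonzero(np.abs(v) > 1e-8)[0])    # indices of surviving loadings
```

With `gamma = 0` all variables share one uniform penalty (standard SPCA behaviour); increasing `gamma` sharpens the contrast between lightly penalized signal variables and heavily penalized noise variables, which is the trade-off the abstract describes.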