AI Summary
This study addresses the limited robustness and interpretability of feature selection in medical prediction tasks by proposing a structured sparse feature selection method that integrates SHAP values with group $L_{2,1}$ regularization. The approach first uses a tree-based model to compute per-feature SHAP importance scores, which are then aggregated into group-level importances. These group importances guide a group $L_{2,1}$-regularized logistic regression that selects a compact, non-redundant subset of features. To the best of the authors' knowledge, this work is the first to combine SHAP-based attribution with group-sparse regularization, improving the stability and interpretability of the selected features and reducing their redundancy while maintaining competitive predictive performance.
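The aggregation step can be illustrated with a minimal sketch. The summary does not say how feature-level SHAP scores are combined into group scores, so the rule below (mean absolute SHAP value per feature, summed within each group, then normalized) is an assumption, and `group_importance` is a hypothetical helper name. The SHAP matrix is taken as a plain array, for example one computed beforehand from a tree model.

```python
import numpy as np


def group_importance(shap_values, groups):
    """Aggregate per-sample SHAP values into one score per feature group.

    shap_values: array of shape (n_samples, n_features).
    groups: length-n_features sequence giving each feature's group id.
    Returns (group_ids, scores), with scores normalized to sum to 1.
    """
    shap_values = np.asarray(shap_values, dtype=float)
    groups = np.asarray(groups)

    # Global per-feature importance: mean absolute SHAP value over samples.
    feature_scores = np.abs(shap_values).mean(axis=0)

    # Assumed aggregation rule: sum the feature scores inside each group.
    group_ids = np.unique(groups)
    scores = np.array([feature_scores[groups == g].sum() for g in group_ids])

    # Normalize so groups are comparable; skip if every score is zero.
    total = scores.sum()
    if total > 0:
        scores = scores / total
    return group_ids, scores
```

Summing within a group favors larger groups; averaging instead would be an equally plausible choice when group sizes differ widely.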
Abstract
Feature selection remains a major challenge in medical prediction, where existing approaches such as LASSO often lack robustness and interpretability. We introduce GRASP, a novel framework that couples Shapley-value-driven attribution with group $L_{2,1}$ regularization to extract compact and non-redundant feature sets. GRASP first distills group-level importance scores from a pretrained tree model via SHAP, then enforces structured sparsity through group $L_{2,1}$-regularized logistic regression, yielding stable and interpretable selections. Extensive comparisons with LASSO, SHAP, and deep-learning-based methods show that GRASP consistently delivers comparable or superior predictive accuracy while identifying fewer, less redundant, and more stable features.
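To make the structured-sparsity step concrete, here is a minimal sketch of a group-penalized logistic regression fitted by proximal gradient descent. It is not the GRASP implementation: the solver, the function names, and the `group_weights` argument (per-group penalty weights, which one might set larger for groups with low importance) are assumptions made for illustration. For a binary model with a single weight vector, the group $L_{2,1}$ penalty reduces to the sum of the Euclidean norms of the per-group weight blocks.

```python
import numpy as np


def group_soft_threshold(w, thresh):
    """Proximal operator of thresh * ||w||_2: shrink the block or zero it."""
    norm = np.linalg.norm(w)
    if norm <= thresh:
        return np.zeros_like(w)
    return (1.0 - thresh / norm) * w


def fit_group_sparse_logreg(X, y, groups, group_weights,
                            lam=0.1, lr=0.1, n_iter=500):
    """Minimize mean logistic loss + lam * sum_g weight_g * ||w_g||_2.

    groups: length-n_features array of group ids.
    group_weights: one penalty weight per group, ordered like np.unique(groups).
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    group_ids = np.unique(groups)

    for _ in range(n_iter):
        # Gradient step on the smooth logistic loss.
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w = w - lr * (X.T @ (p - y)) / n_samples
        b = b - lr * np.mean(p - y)

        # Proximal step: block-wise shrinkage removes whole groups at once.
        for g, weight in zip(group_ids, group_weights):
            idx = groups == g
            w[idx] = group_soft_threshold(w[idx], lr * lam * weight)
    return w, b
```

Groups whose weight block ends at exactly zero are dropped from the model, so the surviving groups form the selected feature set. Raising the penalty weight of a group makes it more likely to be removed.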