Fast Calculation of Feature Contributions in Boosting Trees

📅 2024-07-03
🤖 AI Summary
To address the computational intractability of feature-level R² attribution for tree models under quadratic loss, this paper proposes Q-SHAP, described as the first algorithm to compute exact Shapley values under quadratic loss in polynomial time. Methodologically, Q-SHAP combines Shapley value theory, tree-structure-aware optimization, and dynamic programming with pruning into a contribution allocation mechanism tailored to quadratic loss. This addresses a limitation of existing fast Shapley algorithms for trees, which decompose the predicted value rather than a quadratic loss. The summary lists three core contributions: (1) an R² attribution framework that is computed instance-wise and aggregates to a global decomposition; (2) a substantial computational speedup, enabling real-time analysis of large-scale boosting models; and (3) accuracy gains in synthetic experiments, with R² estimation error reduced by 12.6%–38.4% relative to state-of-the-art baselines.

📝 Abstract
Recently, several fast algorithms have been proposed to decompose a predicted value into Shapley values, enabling individualized feature contribution analysis in tree models. While such local decomposition offers valuable insights, it underscores the need for a global evaluation of feature contributions. Although coefficients of determination ($R^2$) allow for comparative assessment of individual features, individualizing $R^2$ is challenged by the underlying quadratic losses. To address this, we propose Q-SHAP, an efficient algorithm that reduces the computational complexity of calculating Shapley values for quadratic losses to polynomial time. Our simulations show that Q-SHAP not only improves computational efficiency but also enhances the accuracy of feature-specific $R^2$ estimates.
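To make the cost that the abstract refers to concrete, the sketch below computes a Shapley decomposition of $R^2$ by brute force, enumerating all $2^p$ feature subsets. It is not Q-SHAP and is not taken from the paper: `predict` is a hypothetical stand-in for a fitted tree ensemble, and the value function (an $R^2$ computed from predictions averaged over background rows for the features outside the subset) is one assumed choice among several.

```python
import itertools
import math

import numpy as np


def predict(X):
    # Hypothetical stand-in for a fitted tree ensemble (assumption, not from the paper):
    # two single-split "stumps" plus one interaction split.
    return (2.0 * (X[:, 0] > 0)
            + 1.0 * (X[:, 1] > 0.5)
            + 0.5 * ((X[:, 0] > 0) & (X[:, 2] > 0)))


def restricted_prediction(X, subset):
    """Average prediction when only the features in `subset` are fixed at each row's values."""
    keep = np.array(subset, dtype=int)
    out = np.empty(X.shape[0])
    for i in range(X.shape[0]):
        Z = X.copy()              # every row of X serves as a background sample
        Z[:, keep] = X[i, keep]   # fix the features in the subset at row i's values
        out[i] = predict(Z).mean()
    return out


def r2_value(X, y, subset):
    """Assumed value function: R^2 of the subset-restricted predictions (a quadratic loss)."""
    sse = np.sum((y - restricted_prediction(X, subset)) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1.0 - sse / sst


def shapley_r2(X, y):
    """Brute-force Shapley values of the R^2 value function; cost grows as 2^p."""
    p = X.shape[1]
    values = {S: r2_value(X, y, S)
              for k in range(p + 1)
              for S in itertools.combinations(range(p), k)}
    phi = np.zeros(p)
    for j in range(p):
        others = [k for k in range(p) if k != j]
        for size in range(p):
            weight = (math.factorial(size) * math.factorial(p - size - 1)
                      / math.factorial(p))
            for S in itertools.combinations(others, size):
                with_j = tuple(sorted(S + (j,)))
                phi[j] += weight * (values[with_j] - values[S])
    return phi, values


rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = predict(X) + 0.1 * rng.normal(size=40)
phi, values = shapley_r2(X, y)
print("feature-specific R^2 shares:", phi)
print("sum of shares:", phi.sum())
```

The shares sum to the difference between the value of the full feature set and that of the empty set, which is the property that lets a global $R^2$ be split across features. With three features the enumeration is trivial, but the number of subsets doubles with every added feature, which is the cost a polynomial-time algorithm avoids.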
Problem

Research questions and friction points this paper is trying to address.

Calculating Shapley values efficiently in boosting trees
Lack of a global evaluation of feature contributions
Individualizing R² estimation under quadratic losses
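As a reminder of why the quadratic loss causes friction, the standard Shapley value of feature $j$ among $p$ features, for a value function $v$ defined on feature subsets $S \subseteq N = \{1, \dots, p\}$, can be written as follows. The notation here is an assumption for illustration and may differ from the paper's.

$$
\phi_j(v) = \sum_{S \subseteq N \setminus \{j\}} \frac{|S|!\,(p - |S| - 1)!}{p!}\,\bigl[v(S \cup \{j\}) - v(S)\bigr]
$$

When $v(S)$ is the predicted value $f_S(x)$, it is linear in the model output. When $v(S)$ is built from a quadratic loss such as $\sum_i \bigl(y_i - f_S(x_i)\bigr)^2$, expanding the square introduces the term $f_S(x_i)^2$, so $v$ is no longer linear in the model output and algorithms designed for the predicted value do not apply directly.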
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fast Shapley value decomposition in boosting trees
Polynomial-time Q-SHAP for quadratic losses
More accurate feature-specific R² estimates