Fast Calculation of Feature Contributions in Boosting Trees

📅 2024-07-03

📈 Citations: 0

✨ Influential: 0

career value

187K/year

🤖 AI Summary

To address the computational intractability of feature-level R² attribution for tree models under quadratic loss, this paper proposes Q-SHAP—the first algorithm enabling exact Shapley value computation under quadratic loss in polynomial time. Methodologically, Q-SHAP integrates Shapley value theory, tree-structure-aware optimization, and dynamic programming with pruning to design a quadratic-loss-sensitive contribution allocation mechanism, thereby overcoming the fundamental limitation of classical Shapley methods—restricted applicability to linear loss functions. Its core contributions are threefold: (1) a globally decomposable, instance-wise R² attribution framework; (2) substantial computational speedup, enabling real-time analysis of large-scale boosting models; and (3) empirically validated accuracy gains—R² estimation error reduced by 12.6%–38.4% over state-of-the-art baselines in synthetic experiments.

Technology Category

Application Category

📝 Abstract

Recently, several fast algorithms have been proposed to decompose predicted value into Shapley values, enabling individualized feature contribution analysis in tree models. While such local decomposition offers valuable insights, it underscores the need for a global evaluation of feature contributions. Although coefficients of determination ($R^2$) allow for comparative assessment of individual features, individualizing $R^2$ is challenged by the underlying quadratic losses. To address this, we propose Q-SHAP, an efficient algorithm that reduces the computational complexity of calculating Shapley values for quadratic losses to polynomial time. Our simulations show that Q-SHAP not only improves computational efficiency but also enhances the accuracy of feature-specific $R^2$ estimates.

Problem

Research questions and friction points this paper is trying to address.

Efficiently calculate Shapley values in boosting trees

Global evaluation of feature contributions lacking

Individualized R-squared estimation with quadratic losses

Innovation

Methods, ideas, or system contributions that make the work stand out.

Fast Shapley value decomposition in boosting trees

Polynomial-time Q-SHAP for quadratic losses

Accurate feature-specific R² estimation enhancement

🔎 Similar Papers

Breast Cancer Classification Using Gradient Boosting Algorithms Focusing on Reducing the False Negative and SHAP for Explainability