Local MDI+: Local Feature Importances for Tree-Based Models

📅 2025-06-10
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Tree models suffer from insufficient local interpretability in high-stakes applications; existing perturbation-based methods (e.g., LIME, TreeSHAP) disregard model structure and exhibit poor stability, while global approaches (e.g., MDI+) fail to capture individual heterogeneity. Method: We propose LMDI+, the first sample-level feature importance method grounded in node-wise linear equivalence transformations of trees. It rigorously extends the MDI+ framework to the local setting, performing instance-level attribution solely based on internal tree structure. Leveraging theoretical equivalence between decision trees and piecewise linear models over node subspaces, LMDI+ integrates random forest structural analysis with a local weighting mechanism. Contribution/Results: LMDI+ enables counterfactual generation and homogeneous subgroup discovery. Evaluated on 12 real-world datasets, it improves downstream task performance by an average of 10% and achieves significantly higher feature ranking stability than LIME and TreeSHAP.

Technology Category

Application Category

📝 Abstract
Tree-based ensembles such as random forests remain the go-to for tabular data over deep learning models due to their prediction performance and computational efficiency. These advantages have led to their widespread deployment in high-stakes domains, where interpretability is essential for ensuring trustworthy predictions. This has motivated the development of popular local (i.e. sample-specific) feature importance (LFI) methods such as LIME and TreeSHAP. However, these approaches rely on approximations that ignore the model's internal structure and instead depend on potentially unstable perturbations. These issues are addressed in the global setting by MDI+, a feature importance method which exploits an equivalence between decision trees and linear models on a transformed node basis. However, the global MDI+ scores are not able to explain predictions when faced with heterogeneous individual characteristics. To address this gap, we propose Local MDI+ (LMDI+), a novel extension of the MDI+ framework to the sample specific setting. LMDI+ outperforms existing baselines LIME and TreeSHAP in identifying instance-specific signal features, averaging a 10% improvement in downstream task performance across twelve real-world benchmark datasets. It further demonstrates greater stability by consistently producing similar instance-level feature importance rankings across multiple random forest fits. Finally, LMDI+ enables local interpretability use cases, including the identification of closer counterfactuals and the discovery of homogeneous subgroups.
Problem

Research questions and friction points this paper is trying to address.

Improving local feature importance for tree-based models
Addressing instability in existing LFI methods like LIME
Enhancing interpretability for heterogeneous individual predictions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extends MDI+ to sample-specific feature importance
Uses node basis equivalence in tree models
Improves stability and accuracy over LIME
🔎 Similar Papers
No similar papers found.