Feature Importance Depends on Properties of the Data: Towards Choosing the Correct Explanations for Your Data and Decision Trees based Models

📅 2025-02-11
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing local interpretability methods (e.g., LIME, SHAP, TreeInterpreter) yield unstable and unreliable feature importance estimates for decision tree-based models, and their fidelity under diverse data conditions is poorly understood. Method: We conduct a systematic robustness evaluation on both controlled synthetic and real-world binary classification datasets, quantifying the effects of redundancy, nonlinearity, class imbalance, correlation, noise, and distribution skew with rigorous statistical significance testing. Contribution/Results: Our study provides the first empirical evidence that the magnitude and sign of feature importance estimates are strongly governed by intrinsic data properties, not solely by model architecture or hyperparameters. We demonstrate that mainstream methods frequently produce contradictory rankings and degrade substantially under high feature redundancy or strong feature interactions. Based on these findings, we propose a novel "data-aware explanation method selection" paradigm and deliver the first data-driven, empirically grounded guideline for selecting trustworthy interpretability methods, advancing reliability and transparency in AI systems.
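To make the evaluation setup concrete, here is a minimal sketch of one experimental condition, written with scikit-learn rather than the authors' released code: it synthesizes a binary classification dataset whose redundancy, class imbalance, and label noise are explicitly controlled, then fits the decision tree to be explained. All parameter values are illustrative assumptions, not the paper's exact settings.

```python
# Illustrative sketch (not the authors' protocol): synthesize a binary
# classification dataset with controlled data properties, then fit a
# decision tree to be explained.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Knobs mirroring the data properties studied in the paper (values assumed):
#   n_redundant -> feature redundancy (linear combinations of informative ones)
#   weights     -> class imbalance
#   flip_y      -> label noise
#   class_sep   -> difficulty of the decision boundary
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=4,
    n_redundant=3,
    weights=[0.8, 0.2],   # 80/20 class imbalance
    flip_y=0.05,          # 5% noisy labels
    class_sep=0.8,
    random_state=0,
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# The tree-based model whose predictions the explanation methods attribute.
model = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_train, y_train)
print(f"Test accuracy: {model.score(X_test, y_test):.3f}")
```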

📝 Abstract
In order to ensure the reliability of the explanations of machine learning models, it is crucial to establish their advantages and limitations, and the cases in which each of these methods outperforms the others. However, the current understanding of when and how each explanation method can be used is insufficient. To fill this gap, we perform a comprehensive empirical evaluation by synthesizing multiple datasets with the desired properties. Our main objective is to assess the quality of the feature importance estimates provided by local explanation methods, which are used to explain predictions made by decision tree-based models. By analyzing the results obtained from synthetic datasets as well as publicly available binary classification datasets, we observe notable disparities in the magnitude and sign of the feature importance estimates generated by these methods. Moreover, we find that these estimates are sensitive to specific properties present in the data. Although some model hyperparameters do not significantly influence feature importance assignment, it is important to recognize that each explanation method has limitations in specific contexts. Our assessment highlights these limitations and provides valuable insight into the suitability and reliability of different explanation methods in various scenarios.
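The disparities in magnitude and sign described above can be observed directly with a short comparison such as the sketch below, which reuses the `model` and `X_test` from the previous snippet and calls the public shap and treeinterpreter packages. It illustrates the kind of comparison performed, not the paper's evaluation code; the branch handling shap's return value is an assumption covering the output shapes used by different shap versions.

```python
# Minimal sketch (assumes `model` and `X_test` from the snippet above):
# compare per-instance feature attributions from SHAP's TreeExplainer and
# from treeinterpreter, two of the methods studied in the paper.
import numpy as np
import shap
from treeinterpreter import treeinterpreter as ti

x = X_test[:1]  # one instance to explain

# SHAP: exact TreeSHAP attributions for tree models. The return type of
# shap_values() varies across shap versions (list per class, 2-D or 3-D array).
explainer = shap.TreeExplainer(model)
sv = explainer.shap_values(x)
if isinstance(sv, list):                  # older shap: one array per class
    shap_attr = sv[1][0]
else:
    sv = np.asarray(sv)
    shap_attr = sv[0, :, 1] if sv.ndim == 3 else sv[0]

# TreeInterpreter: decomposes prediction = bias + sum of feature contributions.
_, _, contributions = ti.predict(model, x)
ti_attr = contributions[0][:, 1]          # contributions toward the positive class

# Compare the induced rankings and signs; disagreement between these two
# vectors is exactly the kind of instability the paper quantifies.
print("SHAP ranking:           ", np.argsort(-np.abs(shap_attr)))
print("TreeInterpreter ranking:", np.argsort(-np.abs(ti_attr)))
print("Sign agreement:", np.mean(np.sign(shap_attr) == np.sign(ti_attr)))
```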
Problem

Research questions and friction points this paper is trying to address.

Evaluate feature importance in decision trees
Assess reliability of local explanation methods
Identify data properties affecting explanation accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthesized datasets with controlled properties for evaluation
Assessed local explanation methods (LIME, SHAP, TreeInterpreter)
Identified the data properties that drive feature importance estimates
🔎 Similar Papers
No similar papers found.
👥 Authors
CE Ayad
LIX, École Polytechnique, IP Paris, Palaiseau, France
Thomas Bonnier
MRM, Société Générale, Paris, France
Benjamin Bosch
MRM, Société Générale, Paris, France
Sonali Parbhoo
Assistant Professor, Imperial College London
Topics: Bayesian Inference, Causality, Reinforcement Learning, Interpretability, Healthcare
Jesse Read
École Polytechnique
Topics: Multi-label Classification, Data-Stream Learning, Machine Learning, Artificial Intelligence, Data Science