🤖 AI Summary
Existing feature importance methods suffer from the lack of ground-truth benchmarks and insufficient theoretical foundations. To address this, we propose a local feature importance measure grounded in counterfactual distributional divergence. The method generates positive and negative counterfactual samples, models their conditional distributions via kernel density estimation, and quantifies each feature's perturbation effect on the decision boundary using distributional distances such as the Wasserstein distance, thereby ranking feature importance. This work rigorously grounds counterfactual distributional divergence as a principled basis for importance, satisfying the formal properties required of a valid metric. Experiments across multiple benchmark datasets show that the approach outperforms established baselines such as SHAP and LIME on faithfulness metrics, including comprehensiveness and sufficiency, yielding more accurate, interpretable, and theoretically sound model explanations.
📝 Abstract
Assessing the importance of individual features in Machine Learning is critical to understanding a model's decision-making process. While numerous methods exist, the lack of a definitive ground truth for comparison highlights the need for alternative, well-founded measures. This paper introduces a novel post-hoc local feature importance method called Counterfactual Importance Distribution (CID). We generate two sets of counterfactuals, positive and negative, model their distributions using Kernel Density Estimation, and rank features according to a distributional dissimilarity measure. This measure, grounded in a rigorous mathematical framework, satisfies the key properties required of a valid metric. We showcase the effectiveness of our method by comparing it with well-established local feature importance explainers. Our method not only offers perspectives complementary to existing approaches, but also improves performance on faithfulness metrics (both comprehensiveness and sufficiency), resulting in more faithful explanations of the system. These results highlight its potential as a valuable tool for model analysis.
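To make the pipeline described above concrete, here is a minimal sketch of the core ranking step, under stated assumptions: it takes pre-computed positive and negative counterfactual sets as NumPy arrays, fits a 1D KDE per feature in each set, and scores features by the 1-Wasserstein distance between KDE resamples. The function name `cid_feature_ranking`, the choice of `scipy.stats.gaussian_kde`, and the use of resampling to compare the smoothed distributions are illustrative choices, not the paper's exact implementation.

```python
import numpy as np
from scipy.stats import gaussian_kde, wasserstein_distance

def cid_feature_ranking(cf_pos, cf_neg, n_samples=1000, seed=0):
    """Sketch of the CID idea: rank features by the divergence between
    per-feature KDEs of positive vs. negative counterfactual sets.

    cf_pos, cf_neg: arrays of shape (n_counterfactuals, n_features).
    Returns per-feature divergence scores and the induced ranking.
    """
    rng = np.random.default_rng(seed)
    n_features = cf_pos.shape[1]
    scores = np.empty(n_features)
    for j in range(n_features):
        # Fit a 1D KDE to this feature's values in each counterfactual set.
        kde_pos = gaussian_kde(cf_pos[:, j])
        kde_neg = gaussian_kde(cf_neg[:, j])
        # Compare the smoothed distributions via the 1-Wasserstein distance
        # on KDE resamples (one possible distributional dissimilarity).
        s_pos = kde_pos.resample(n_samples, seed=rng).ravel()
        s_neg = kde_neg.resample(n_samples, seed=rng).ravel()
        scores[j] = wasserstein_distance(s_pos, s_neg)
    # Higher divergence means perturbing that feature shifts the
    # counterfactual distributions more, i.e. higher local importance.
    ranking = np.argsort(scores)[::-1]
    return scores, ranking
```

For example, if feature 0 separates the two counterfactual sets while feature 1 is distributed identically in both, feature 0 receives the larger score and is ranked first.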