🤖 AI Summary
Existing feature attribution methods (e.g., KernelSHAP, LIME) rely on global data distributions, which leads to inaccurate characterization of local model behavior and distorted explanations. To address this, we propose VARSHAP, a model-agnostic local feature attribution method that is the first to use the reduction of prediction variance as the core metric for Shapley values. VARSHAP satisfies the efficiency, symmetry, and additivity axioms of Shapley values. It estimates conditional variances via Monte Carlo sampling, which removes the need for surrogate models or distributional assumptions and makes it robust to shifts in the data distribution. Experiments on synthetic and real-world datasets show that VARSHAP improves attribution accuracy by 12–23% over KernelSHAP and LIME. Qualitative evaluations indicate closer alignment with local decision logic, mitigating the local explanation bias caused by dependence on the global distribution.
📝 Abstract
Existing feature attribution methods such as SHAP often depend on the global data distribution and therefore fail to capture true local model behavior. This paper introduces VARSHAP, a novel model-agnostic local feature attribution method that uses the reduction of prediction variance as the key measure of feature importance. Building on the Shapley value framework, VARSHAP satisfies the key Shapley axioms but, unlike SHAP, is resilient to global data distribution shifts. Experiments on synthetic and real-world datasets demonstrate that VARSHAP outperforms popular methods such as KernelSHAP and LIME, both quantitatively and qualitatively.
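To make the idea of variance reduction as a Shapley payoff more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical value function v(S) = -Var[f(x')], where the features in S are held at the explained instance x and the remaining features receive Gaussian noise of scale `sigma`. The Gaussian perturbation, the sign convention, and the names `variance_shapley`, `sigma`, and `n_samples` are all illustrative assumptions; the abstract does not specify these details.

```python
import itertools
import math

import numpy as np


def variance_shapley(predict, x, sigma=0.5, n_samples=2000, seed=0):
    """Exact Shapley values for the assumed game v(S) = -Var[f(x')].

    Features in S are fixed to their values in x; all other features get
    Gaussian noise (an assumed local perturbation, chosen for illustration).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    d = x.shape[0]
    # Reuse one noise matrix for every coalition (common random numbers).
    noise = rng.normal(0.0, sigma, size=(n_samples, d))

    def value(subset):
        perturbed = x + noise  # fresh array, shape (n_samples, d)
        idx = list(subset)
        if idx:
            perturbed[:, idx] = x[idx]  # "known" features stay fixed
        return -np.var(predict(perturbed))  # Monte Carlo variance estimate

    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for k in range(d):
            # Standard Shapley weight for coalitions of size k.
            weight = (math.factorial(k) * math.factorial(d - k - 1)
                      / math.factorial(d))
            for subset in itertools.combinations(others, k):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi


def linear_model(X):
    # Toy model: feature 1 has no influence on the prediction.
    return 3.0 * X[:, 0] + 0.0 * X[:, 1] + 1.0 * X[:, 2]


phi = variance_shapley(linear_model, x=[1.0, 2.0, 3.0])
```

Under these assumptions a positive attribution means that fixing the feature reduces the variance of the prediction, and the efficiency axiom implies that the attributions sum to the total prediction variance under full perturbation. The exact enumeration evaluates every coalition, so its cost grows as 2^d and it is only practical for a handful of features.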