🤖 AI Summary
Existing SHAP values in explainable AI (XAI) suffer from theoretical deficiencies, violating causal consistency and explanation fidelity and thereby misleading decision-makers. Method: This work first systematically identifies the root cause as a semantic inconsistency in the conventional SHAP formulation. It then proposes a novel, semantically rigorous, and causally grounded SHAP definition; reconstructs the Shapley value’s semantic foundation via cooperative game theory; designs a hierarchical feature-perturbation scheme; and integrates Monte Carlo variance reduction for efficient, high-precision estimation. Contribution/Results: Experiments demonstrate substantial improvements in attribution accuracy and alignment with human judgment. The method explicitly exposes and corrects systematic biases of standard SHAP across critical scenarios, including model-agnostic benchmarks, establishing a more reliable and trustworthy foundation for XAI-based attribution.
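To make the estimation step concrete, here is a minimal sketch of Monte Carlo Shapley estimation via permutation sampling, with antithetic (reversed) permutations as a simple variance-reduction device. This is an illustrative baseline, not the paper's algorithm; the names `shapley_mc` and `value_fn` are assumptions, and `value_fn` stands in for whatever characteristic function the chosen SHAP definition prescribes.

```python
import numpy as np

def shapley_mc(value_fn, n_features, n_samples=500, seed=0):
    """Monte Carlo Shapley estimate via permutation sampling.

    value_fn: maps a frozenset of feature indices to a real payoff,
              e.g. the model's expected output when only those features
              are revealed. (Illustrative assumption; the paper proposes
              its own, corrected characteristic function.)
    """
    rng = np.random.default_rng(seed)
    phi = np.zeros(n_features)
    count = 0
    for _ in range(n_samples):
        perm = rng.permutation(n_features)
        # Antithetic pair: the reversed permutation reuses the same
        # randomness and tends to reduce estimator variance.
        for order in (perm, perm[::-1]):
            coalition = set()
            prev = value_fn(frozenset(coalition))
            for i in order:
                coalition.add(i)
                cur = value_fn(frozenset(coalition))
                phi[i] += cur - prev  # marginal contribution of feature i
                prev = cur
            count += 1
    return phi / count

# Toy usage: a 3-player additive game, whose exact Shapley values
# are simply the per-feature weights [1, 2, 3].
weights = np.array([1.0, 2.0, 3.0])
print(shapley_mc(lambda S: sum(weights[j] for j in S), 3))
```

Each sampled permutation contributes one marginal-contribution term per feature, so averaging over many permutations yields an unbiased estimate of the exact Shapley value.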
📝 Abstract
Recent work demonstrated the existence of critical flaws in the current use of Shapley values in explainable AI (XAI), i.e., the so-called SHAP scores. These flaws are significant in that the scores provided to a human decision-maker can be misleading. Although these negative results might appear to indicate that Shapley values ought not to be used in XAI, this paper argues otherwise. Concretely, this paper proposes a novel definition of SHAP scores that overcomes existing flaws. Furthermore, the paper outlines a practically efficient solution for the rigorous estimation of the novel SHAP scores. Preliminary experimental results confirm our claims, and further underscore the flaws of the current SHAP scores.
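For context, the flaws at issue concern not the Shapley value itself but the characteristic function that conventional SHAP scores plug into it. The standard definitions are sketched below (notation is ours; the paper's corrected definition of the characteristic function is not reproduced here):

```latex
% Standard Shapley value of feature i over feature set F:
\phi_i(v) \;=\; \sum_{S \subseteq F \setminus \{i\}}
  \frac{|S|!\,\bigl(|F| - |S| - 1\bigr)!}{|F|!}\,
  \bigl( v(S \cup \{i\}) - v(S) \bigr)

% Conventional SHAP scores instantiate v, for a model f and an
% instance x, as the expected model output given the revealed
% features (the choice the paper argues is flawed):
v(S) \;=\; \mathbb{E}\bigl[\, f(X) \mid X_S = x_S \,\bigr]
```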