🤖 AI Summary
This paper addresses the inconsistency and limited mechanistic understanding of feature attribution methods (e.g., LIME, SHAP, Integrated Gradients) used to explain black-box models. It proposes a unified analytical framework that couples functional ANOVA (fANOVA) decompositions with cooperative game theory, specifically Shapley values and higher-order interactions. The framework characterizes the contributions of individual features and of feature groups, both locally and globally, together with their interaction mechanisms. Through theoretical analysis and experiments on synthetic and real-world datasets, the paper characterizes where mainstream methods agree and where they differ in terms of distribution dependence, interaction sensitivity, and other properties, which makes the methods directly comparable. The core contribution is a principled, unified paradigm for analyzing interpretability methods that strengthens their theoretical grounding.
📝 Abstract
Feature-based explanations, using perturbations or gradients, are a prevalent tool for understanding the decisions of black-box machine learning models. Yet the differences between these methods remain largely unknown, which limits their applicability for practitioners. In this work, we introduce a unified framework for local and global feature-based explanations using two well-established concepts: functional ANOVA (fANOVA) from statistics, and the notion of value and interaction from cooperative game theory. We introduce three fANOVA decompositions that determine the influence of feature distributions, and use game-theoretic measures, such as the Shapley value and interactions, to specify the influence of higher-order interactions. Our framework combines these two dimensions to uncover similarities and differences between a wide range of explanation techniques for features and groups of features. We then empirically showcase the usefulness of our framework on synthetic and real-world datasets.
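To make the two dimensions concrete, the following is a minimal sketch of how a masking scheme (the distribution side) and the Shapley value (the game-theoretic side) combine into one attribution. It is not the paper's implementation: the toy `model`, the fixed `baseline` used to replace absent features, and the helper names `value` and `shapley_values` are all assumptions chosen for illustration, and the baseline masking stands in for only one possible way of removing features.

```python
from itertools import combinations
from math import factorial


def model(x):
    # Hypothetical toy model with one pairwise interaction (x[0] * x[2]).
    return x[0] + 2.0 * x[1] + x[0] * x[2]


def value(subset, x, baseline):
    # Coalition value: features in `subset` keep their value from x,
    # all other features are replaced by the baseline (assumed masking scheme).
    z = [x[i] if i in subset else baseline[i] for i in range(len(x))]
    return model(z)


def shapley_values(x, baseline):
    # Exact Shapley values by enumerating every coalition (feasible for small n only).
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = set(coalition)
                gain = value(s | {i}, x, baseline) - value(s, x, baseline)
                phi[i] += weight * gain
    return phi


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(x, baseline)
print(phi)
```

In this toy setting the attributions come out to approximately `[2.5, 4.0, 1.5]`: the two features in the interaction term `x[0] * x[2]` share its contribution of 3 equally, and the attributions sum to `model(x) - model(baseline)`. Swapping the masking in `value` for a different way of removing features would change the coalition values and therefore the attributions, which illustrates the kind of distribution dependence the abstract refers to.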