Unifying Feature-Based Explanations with Functional ANOVA and Cooperative Game Theory

📅 2024-12-22

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0


🤖 AI Summary

This paper addresses the inconsistency and lack of mechanistic understanding among feature attribution methods (e.g., LIME, SHAP, Integrated Gradients) used to explain black-box models. It proposes a unified analytical framework that integrates functional ANOVA (fANOVA) with cooperative game theory. The framework couples fANOVA decompositions with Shapley values and higher-order interactions, enabling systematic characterization of the contributions of individual features and feature groups, both locally and globally, as well as their interaction mechanisms. Through theoretical analysis and experiments on synthetic and real-world datasets, the paper characterizes the consistencies and discrepancies among mainstream methods in terms of distribution dependence, interaction sensitivity, and other key properties, which makes the methods comparable with one another. The core contribution is a principled, unified analytical paradigm for interpretability methods that strengthens their theoretical grounding and practical reliability.


📝 Abstract

Feature-based explanations, using perturbations or gradients, are a prevalent tool to understand decisions of black box machine learning models. Yet the differences between these methods remain largely unknown, which limits their applicability for practitioners. In this work, we introduce a unified framework for local and global feature-based explanations using two well-established concepts: functional ANOVA (fANOVA) from statistics, and the notion of value and interaction from cooperative game theory. We introduce three fANOVA decompositions that determine the influence of feature distributions, and use game-theoretic measures, such as the Shapley value and interactions, to specify the influence of higher-order interactions. Our framework combines these two dimensions to uncover similarities and differences between a wide range of explanation techniques for features and groups of features. We then empirically showcase the usefulness of our framework on synthetic and real-world datasets.
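To make the game-theoretic side of the abstract concrete, the following is a minimal sketch of an exact Shapley value computation for a local explanation. It is not the paper's implementation: the toy `model`, the all-zeros `baseline`, and the choice to "remove" a feature by replacing it with its baseline value are all assumptions made for illustration. The abstract mentions three fANOVA decompositions, and this sketch does not claim to reproduce any of them.

```python
from itertools import combinations
from math import factorial


def model(x):
    # Hypothetical toy model: one main effect plus one pairwise interaction.
    return 2.0 * x[0] + x[1] * x[2]


def value(coalition, x, baseline):
    # Assumed value function: features inside the coalition keep the values of
    # the explained point x; all other features are set to the baseline.
    z = [x[i] if i in coalition else baseline[i] for i in range(len(x))]
    return model(z)


def shapley_values(x, baseline):
    # Exact Shapley values by enumerating every coalition (exponential in the
    # number of features, so only practical for small examples).
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                gain = value(s | {i}, x, baseline) - value(s, x, baseline)
                phi[i] += weight * gain
    return phi


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(x, baseline)
print(phi)
```

In this toy setting the first feature receives its full main effect, while the pure interaction between the second and third features is split equally between them. The attributions sum to `model(x) - model(baseline)`. The Shapley value alone does not reveal that the second and third features act only jointly, which is the kind of information higher-order interaction measures are meant to expose.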

Problem

Research questions and friction points this paper is trying to address.

Unify feature-based explanations using fANOVA and game theory

Compare differences between explanation methods for black box models

Analyze feature influence and interactions in machine learning models

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unifies feature explanations with fANOVA and game theory

Uses game-theoretic measures, such as the Shapley value and interactions, to specify the influence of higher-order interactions

Combines local and global explanation techniques

🔎 Similar Papers

The FIX Benchmark: Extracting Features Interpretable to eXperts