🤖 AI Summary
Shapley values suffer from theoretical limitations and poor robustness in model interpretation. To address this, we depart from the conventional Shapley-centric paradigm and introduce a generalized feature attribution framework grounded in the Weber set and the Harsanyi set, two fundamental solution concepts from cooperative game theory. We establish a “value function–aggregation rule” separation principle and develop a three-step, axiomatically consistent, and reproducible attribution methodology. The framework combines theoretical rigor with computational flexibility and aims for attributions that remain meaningful and robust as methodological trends shift. Unlike existing empirically driven XAI methods, our approach advances explainability research toward a principle-driven paradigm. It provides a theoretically grounded pathway for integrating cooperative game theory into XAI, accompanied by an extensible toolkit of interpretable, mathematically justified attribution mechanisms.
📝 Abstract
Cooperative game theory has become a cornerstone of post-hoc interpretability in machine learning, largely through the use of Shapley values. Yet, despite their widespread adoption, Shapley-based methods often rest on axiomatic justifications whose relevance to feature attribution remains debatable. In this paper, we revisit cooperative game theory from an interpretability perspective and argue for a broader and more principled use of its tools. We highlight two general families of efficient allocations, the Weber and Harsanyi sets, that extend beyond Shapley values and offer richer interpretative flexibility. We present an accessible overview of these allocation schemes, clarify the distinction between value functions and aggregation rules, and introduce a three-step blueprint for constructing reliable and theoretically grounded feature attributions. Our goal is to move beyond fixed axioms and provide the XAI community with a coherent framework to design attribution methods that are both meaningful and robust to shifting methodological trends.
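To make the two allocation families concrete, the sketch below uses their standard textbook definitions rather than anything taken from the paper itself. A Weber-set allocation is a weighted average of marginal-contribution vectors over player orderings, and a Harsanyi-set allocation shares each coalition's Harsanyi dividend among its members. The toy value function `v`, the function names, and the `share` callback are all illustrative assumptions, and the paper's three-step blueprint is not reproduced here. With uniform weights over orderings and equal dividend splits, both constructions should recover the Shapley values.

```python
from fractions import Fraction
from itertools import combinations, permutations


def subsets(players):
    """Yield every subset of `players` as a frozenset."""
    players = list(players)
    for r in range(len(players) + 1):
        for combo in combinations(players, r):
            yield frozenset(combo)


def marginal_vector(v, order):
    """Marginal contribution of each player when joining in the given order."""
    seen = frozenset()
    out = {}
    for i in order:
        out[i] = v[seen | {i}] - v[seen]
        seen = seen | {i}
    return out


def weber_allocation(v, players, order_weights=None):
    """Weighted average of marginal vectors (an element of the Weber set).

    `order_weights` maps orderings (tuples) to non-negative weights summing
    to 1; the default is uniform, which gives the Shapley values.
    """
    if order_weights is None:
        orders = list(permutations(players))
        order_weights = {o: Fraction(1, len(orders)) for o in orders}
    phi = {i: Fraction(0) for i in players}
    for order, w in order_weights.items():
        m = marginal_vector(v, order)
        for i in players:
            phi[i] += w * m[i]
    return phi


def harsanyi_dividends(v, players):
    """Moebius transform of v: d(S) = sum over T subset of S of (-1)^(|S|-|T|) v(T)."""
    return {
        S: sum((-1) ** (len(S) - len(T)) * v[T] for T in subsets(sorted(S)))
        for S in subsets(players)
    }


def harsanyi_allocation(v, players, share=None):
    """Share each coalition's dividend among its members (Harsanyi set).

    `share(i, S)` must be non-negative and sum to 1 over i in S; the default
    equal split 1/|S| gives the Shapley values.
    """
    if share is None:
        share = lambda i, S: Fraction(1, len(S))
    d = harsanyi_dividends(v, players)
    phi = {i: Fraction(0) for i in players}
    for S, dividend in d.items():
        for i in S:
            phi[i] += share(i, S) * dividend
    return phi


# Hypothetical 3-player game (think of players as features); values are made up.
players = ["a", "b", "c"]
v = {frozenset(k): val for k, val in {
    "": 0, "a": 1, "b": 2, "c": 0, "ab": 4, "ac": 1, "bc": 3, "abc": 6,
}.items()}

shapley_via_weber = weber_allocation(v, players)
shapley_via_harsanyi = harsanyi_allocation(v, players)
single_order = weber_allocation(v, players, {("a", "b", "c"): Fraction(1)})
```

Changing `order_weights` or `share` moves the allocation away from Shapley while keeping it efficient, meaning the attributions still sum to the value of the full coalition. The enumeration over all orderings and subsets is exponential in the number of players, so this form is only practical for very small games.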