🤖 AI Summary
Existing feature-attribution methods based on weak abductive explanations (WAXps) compute feature importance solely from WAXp sets, neglecting the information carried by the complementary (non-WAXp) sets, in particular their relationship with formal explanations (XPs) and adversarial examples (AExs). This work incorporates non-WAXp sets into feature importance computation, proposing two novel game-theoretic scores based on the Shapley value and the Banzhaf index. Exploiting the connection between XPs and AExs, the new scores quantify how effective each feature is at excluding adversarial examples, yielding a more complete attribution for high-stakes uses of machine learning models. The paper also identifies properties of the proposed scores and studies their computational complexity.
📝 Abstract
Feature attribution methods based on game theory are ubiquitous in the field of eXplainable Artificial Intelligence (XAI). Recent works proposed rigorous feature attribution using logic-based explanations, specifically targeting high-stakes uses of machine learning (ML) models. Typically, such works exploit weak abductive explanations (WAXps) as the characteristic function to assign importance to features. However, one possible downside is that the contribution of non-WAXp sets is neglected. In fact, non-WAXp sets can also convey important information, because of the relationship between formal explanations (XPs) and adversarial examples (AExs). Accordingly, this paper leverages the Shapley value and the Banzhaf index to devise two novel feature importance scores. The new scores take non-WAXp sets into account when computing feature contributions, and they quantify how effective each feature is at excluding AExs. Furthermore, the paper identifies properties of the proposed scores and studies their computational complexity.
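To make the game-theoretic setup concrete, here is a minimal sketch of how a WAXp-style characteristic function can feed the Shapley value and the Banzhaf index. This is not the paper's method: `is_waxp`, the exhaustive sufficiency check, and the toy majority model are illustrative assumptions; the characteristic function simply assigns value 1 to a coalition of features whose fixed values are sufficient for the prediction (a weak abductive explanation) and 0 otherwise.

```python
from itertools import combinations, product
from math import factorial

def is_waxp(fixed, model, instance, domain):
    """Hypothetical WAXp test: fixing the features in `fixed` to their values
    in `instance` forces the model's prediction for every completion of the
    remaining (free) features. Checked here by brute-force enumeration; real
    implementations would query a formal reasoning oracle instead."""
    free = [i for i in range(len(instance)) if i not in fixed]
    target = model(instance)
    for vals in product(*(domain[i] for i in free)):
        point = list(instance)
        for i, v in zip(free, vals):
            point[i] = v
        if model(tuple(point)) != target:
            return False
    return True

def shapley_scores(model, instance, domain):
    """Exact Shapley values of the 0/1 WAXp characteristic function."""
    n = len(instance)
    v = lambda S: 1.0 if is_waxp(set(S), model, instance, domain) else 0.0
    scores = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for k in range(n):
            for S in combinations(others, k):
                # Standard Shapley weight |S|! (n - |S| - 1)! / n!
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi += w * (v(S + (i,)) - v(S))
        scores.append(phi)
    return scores

def banzhaf_scores(model, instance, domain):
    """Exact Banzhaf indices: average marginal contribution over all
    coalitions not containing feature i."""
    n = len(instance)
    v = lambda S: 1.0 if is_waxp(set(S), model, instance, domain) else 0.0
    scores = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                total += v(S + (i,)) - v(S)
        scores.append(total / 2 ** (n - 1))
    return scores

# Toy example: majority vote over three Boolean features at instance (1,1,1).
# Any pair of features fixed to 1 already forces the prediction, so exactly
# the coalitions of size >= 2 are WAXps.
maj = lambda x: int(x[0] + x[1] + x[2] >= 2)
print(shapley_scores(maj, (1, 1, 1), [(0, 1)] * 3))  # symmetric: each 1/3
print(banzhaf_scores(maj, (1, 1, 1), [(0, 1)] * 3))  # symmetric: each 1/2
```

Both enumerations are exponential in the number of features, which is consistent with the abstract's focus on the computational complexity of such scores; the example only illustrates the definitions on a tractable toy model.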