Proxy-Based Approximation of Shapley and Banzhaf Interactions

📅 2026-05-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing estimators of higher-order interaction effects trade speed against accuracy. This work proposes ProxySHAP, which combines the sample efficiency of tree-based proxy models with a residual correction that gives a principled path to consistency when approximating Shapley and Banzhaf interaction indices. For tree ensembles, it derives a polynomial-time generalization of interventional TreeSHAP that computes interaction indices exactly, avoiding the exponential dependence on tree depth found in prior methods. Theoretically, it characterizes the conditions under which Maximum Sample Reuse (MSR) corrects proxy bias without its variance scaling exponentially with interaction size. Empirically, ProxySHAP achieves the lowest approximation error in both small- and large-budget regimes, outperforming ProxySPEX and KernelSHAP-IQ, and it performs better on downstream explainability tasks.
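For reference, the two indices named in the summary have standard textbook definitions. The notation below is an assumption of this note and is not taken from the paper: a player set N with n = |N|, a value function ν on subsets of N, and an interaction set S ⊆ N.

```latex
% Discrete derivative of \nu with respect to S, evaluated at T \subseteq N \setminus S
\Delta_S(T) = \sum_{L \subseteq S} (-1)^{|S|-|L|}\, \nu(T \cup L)

% Shapley interaction index
I^{\mathrm{SII}}(S) = \sum_{T \subseteq N \setminus S}
  \frac{(n-|S|-|T|)!\,|T|!}{(n-|S|+1)!}\, \Delta_S(T)

% Banzhaf interaction index
I^{\mathrm{BII}}(S) = \frac{1}{2^{\,n-|S|}} \sum_{T \subseteq N \setminus S} \Delta_S(T)
```

For |S| = 1 these reduce to the ordinary Shapley and Banzhaf values. Both sums range over exponentially many subsets T, which is why sampling-based estimators are needed in practice.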
📝 Abstract
Shapley and Banzhaf interactions capture the complex dynamics inherent in modern machine learning applications. However, current estimators for these higher-order interactions trade off between speed and accuracy. To overcome this limitation, we introduce ProxySHAP. ProxySHAP reconciles the high sample efficiency of tree-based proxy models with a principled path to consistency via residual correction. On a theoretical level, we derive a polynomial-time generalization of interventional TreeSHAP to compute exact interaction indices for tree ensembles, successfully bypassing exponential tree-depth dependencies in prior methods. Furthermore, we formally analyze the residual adjustment strategy, characterizing the specific conditions under which Maximum Sample Reuse (MSR) corrects proxy bias without its variance scaling exponentially with interaction size. Extensive benchmarking demonstrates that ProxySHAP sets a new state-of-the-art standard for approximation quality, including in large-scale applications with thousands of features. By achieving the lowest error in both small- and large-budget regimes, ProxySHAP significantly outperforms the prior best estimators ProxySPEX and KernelSHAP-IQ, while also delivering superior performance on downstream explainability tasks.
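The proxy-plus-residual idea rests on linearity: an interaction index of the game equals the index of the proxy plus the index of the residual (game minus proxy). The following is a minimal sketch of that decomposition for the Banzhaf interaction index, not the paper's implementation. Everything in it is a hypothetical stand-in: the four-feature toy `game`, the hand-written `proxy` (where the paper fits a tree-based model), brute-force enumeration on the proxy (where the paper uses its polynomial-time TreeSHAP generalization), and a plain per-interaction Monte Carlo estimator on the residual (where the paper analyzes MSR).

```python
import itertools
import random

def discrete_derivative(game, S, T):
    """Delta_S(T): signed sum of game(T | L) over all subsets L of S."""
    total = 0.0
    for r in range(len(S) + 1):
        for L in itertools.combinations(S, r):
            sign = (-1) ** (len(S) - r)
            total += sign * game(frozenset(T) | frozenset(L))
    return total

def banzhaf_interaction_exact(game, n, S):
    """Average Delta_S(T) over every T drawn from the features outside S."""
    rest = [i for i in range(n) if i not in S]
    total = 0.0
    for r in range(len(rest) + 1):
        for T in itertools.combinations(rest, r):
            total += discrete_derivative(game, S, T)
    return total / 2 ** len(rest)

def banzhaf_interaction_mc(game, n, S, num_samples, rng):
    """Monte Carlo version: each outside feature joins T with probability 1/2."""
    rest = [i for i in range(n) if i not in S]
    total = 0.0
    for _ in range(num_samples):
        T = [i for i in rest if rng.random() < 0.5]
        total += discrete_derivative(game, S, T)
    return total / num_samples

N_FEATURES = 4  # hypothetical toy problem size

def game(T):
    """Hypothetical expensive model: main effects plus two interactions."""
    return (1.0 * (0 in T) + 0.5 * (2 in T)
            + 2.0 * (0 in T and 1 in T)
            + 0.1 * (1 in T and 2 in T and 3 in T))

def proxy(T):
    """Hypothetical cheap proxy that misses the small three-way term."""
    return 1.0 * (0 in T) + 0.5 * (2 in T) + 2.0 * (0 in T and 1 in T)

def residual(T):
    """What the proxy fails to explain."""
    return game(T) - proxy(T)

def proxy_corrected_estimate(S, num_samples=200, seed=0):
    """Exact index on the proxy plus a sampled correction on the residual."""
    rng = random.Random(seed)
    exact_part = banzhaf_interaction_exact(proxy, N_FEATURES, S)
    correction = banzhaf_interaction_mc(residual, N_FEATURES, S, num_samples, rng)
    return exact_part + correction

if __name__ == "__main__":
    for S in [(0, 1), (2, 3)]:
        print(S, banzhaf_interaction_exact(game, N_FEATURES, S),
              proxy_corrected_estimate(S))
```

In this toy setup the proxy already captures the pair (0, 1), so the sampled correction for that pair is zero up to floating-point rounding. For the pair (2, 3) the proxy contributes nothing and the whole estimate comes from sampling the residual. The smaller the residual, the less variance the sampled part carries, which is the intuition behind pairing a good proxy with a correction step.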
Problem

Research questions and friction points this paper is trying to address.

Shapley interaction
Banzhaf interaction
approximation
sample efficiency
high-order interactions
Innovation

Methods, ideas, or system contributions that make the work stand out.

ProxySHAP
Shapley interactions
Banzhaf interactions
TreeSHAP
residual correction