🤖 AI Summary
This paper addresses three key challenges in Shapley value computation: high computational complexity, lack of uncertainty quantification, and reliance on data distribution assumptions for estimating marginal contributions. To address these, we propose the Variational Shapley Network (VSN), a novel architecture that yields probabilistic feature attributions via a single forward pass. Our core contribution is the first formulation of Shapley values as random variables, enabled by feature-specific latent spaces and learnable baselines—thereby eliminating exhaustive subset enumeration and assumptions about the underlying data distribution. By integrating masked neural networks with variational inference, VSN performs end-to-end variational approximation of Shapley values. Extensive experiments on synthetic and real-world datasets demonstrate VSN’s efficiency (a single O(1) forward pass), high attribution fidelity, and well-calibrated uncertainty estimates—substantially outperforming conventional sampling-based methods and surrogate models.
📝 Abstract
Shapley values have emerged as a foundational tool in machine learning (ML) for elucidating model decision-making processes. Despite their widespread adoption and unique ability to satisfy essential explainability axioms, computational challenges persist in their estimation when ($i$) evaluating a model over all possible subsets of input features, ($ii$) estimating model marginals, and ($iii$) addressing variability in explanations. We introduce a novel, self-explaining method that significantly simplifies the computation of Shapley values, requiring only a single forward pass. Recognizing the deterministic treatment of Shapley values as a limitation, we incorporate a probabilistic framework to capture the inherent uncertainty in explanations. Unlike alternatives, our technique does not rely directly on the observed data space to estimate marginals; instead, it uses adaptable baseline values derived from a latent, feature-specific embedding space, generated by a novel masked neural network architecture. Evaluations on simulated and real datasets underscore our technique's robust predictive and explanatory performance.
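To make challenge ($i$) concrete, the following minimal sketch computes exact Shapley values by brute-force subset enumeration — the $O(2^n)$ procedure whose cost motivates single-forward-pass approximations like the one proposed here. The function names and the toy additive game are illustrative assumptions, not part of the paper's method.

```python
from itertools import combinations
from math import factorial

def exact_shapley(value_fn, n_features):
    """Exact Shapley values via full subset enumeration.

    value_fn: maps a frozenset of feature indices to a scalar model value.
    For each feature, every subset of the remaining features is evaluated,
    so the total cost grows as O(2^n) -- the bottleneck discussed above.
    """
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                # Shapley kernel weight |S|! (n - |S| - 1)! / n!
                w = factorial(len(S)) * factorial(n_features - len(S) - 1) / factorial(n_features)
                phi[i] += w * (value_fn(frozenset(S) | {i}) - value_fn(frozenset(S)))
    return phi

# Toy additive "model": a coalition's value is the sum of its feature weights.
# For additive games, each Shapley value recovers the feature's own weight.
weights = {0: 1.0, 1: 2.0, 2: 3.0}
v = lambda S: sum(weights[j] for j in S)
print(exact_shapley(v, 3))  # → [1.0, 2.0, 3.0]
```

For $n$ features this requires $2^{n-1}$ coalition evaluations per feature, which is what sampling-based estimators approximate and what a learned, amortized explainer avoids entirely.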