🤖 AI Summary
This work proposes an efficient method for computing Shapley values in cooperative games whose characteristic function has a product structure, which arises in machine learning explainability for models such as kernel methods with product kernels and tree-based models. The key result is an exact one-dimensional integral representation of the Shapley value: the weighted sum over exponentially many coalitions becomes the integral of a polynomial over [0, 1], so the computation reduces to numerical integration. The method, QuadraSHAP, combines Gauss–Legendre quadrature, numerically stable log-space evaluation, and a parallel implementation based on associative scan primitives. The quadrature is exact once the number of nodes is at least half the number of features; with fewer nodes, the approximation error decays geometrically in the number of nodes. In the reported experiments, a few hundred nodes yield highly precise estimates even with thousands of features, and the method is the fastest numerically stable one across all tested configurations.
📝 Abstract
We study the efficient computation of Shapley values for \emph{product games} -- cooperative games in which the coalition value factorizes as a product of per-player terms. Such games arise in machine learning explainability whenever the value function inherits a multiplicative structure from the underlying model, as in kernel methods with product kernels and tree-based models. Our key result is that the Shapley value of each player in a product game admits an exact one-dimensional integral representation: the weighted sum over exponentially many feature coalitions collapses to the integral of a degree-$(d-1)$ polynomial over $[0,1]$, where $d$ is the total number of features. This yields a Gauss--Legendre quadrature scheme that is \emph{provably exact} whenever the number of nodes satisfies $m_q \geq \lceil d/2 \rceil$, and otherwise provides a \emph{near-exact} approximation with error provably decaying geometrically in $m_q$. In practice, a few hundred nodes can achieve highly precise estimates even with thousands of features. Building on this formulation, we derive a numerically stable implementation via log-space evaluation, together with an efficient parallel implementation based on associative scan primitives that achieves $O(d\,m_q)$ total work and $O(\log d)$ parallel time. Experiments show that \textsc{QuadraSHAP} is the fastest numerically stable method across all tested configurations.
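To make the integral representation concrete, the sketch below assumes the simplest product game, $v(S) = \prod_{j \in S} a_j$ with $v(\emptyset) = 1$ and all $a_j > 0$; the paper's general definition may differ. Under that assumption, the Beta-integral identity $\int_0^1 t^{s}(1-t)^{d-1-s}\,dt = s!\,(d-1-s)!/d!$ gives $\phi_i = (a_i - 1)\int_0^1 \prod_{j \neq i} \bigl(1 + t\,(a_j - 1)\bigr)\,dt$, whose integrand is a polynomial of degree $d-1$. The function names `shapley_brute_force` and `shapley_quadrature` are illustrative and are not taken from the paper, and the leave-one-out products are formed by a plain log-space subtraction rather than the associative-scan implementation the abstract describes.

```python
import itertools
import math

import numpy as np


def shapley_brute_force(a):
    """Exact Shapley values by enumerating all coalitions (exponential in d)."""
    d = len(a)
    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for k in range(d):
            # Shapley weight for coalitions of size k that exclude player i.
            weight = math.factorial(k) * math.factorial(d - k - 1) / math.factorial(d)
            for S in itertools.combinations(others, k):
                v_S = math.prod(a[j] for j in S)  # empty product is 1
                # Marginal contribution: v(S + {i}) - v(S) = (a_i - 1) * v(S).
                phi[i] += weight * (a[i] - 1.0) * v_S
    return phi


def shapley_quadrature(a, m_q):
    """Shapley values via Gauss-Legendre quadrature of the 1-D integral.

    Assumes every a_j > 0, so each factor 1 + t * (a_j - 1) is positive
    on [0, 1] and its logarithm is well defined.
    """
    a = np.asarray(a, dtype=float)
    x, w = np.polynomial.legendre.leggauss(m_q)  # nodes and weights on [-1, 1]
    t = 0.5 * (x + 1.0)                          # map nodes to [0, 1]
    w = 0.5 * w                                  # rescale weights accordingly
    log_factors = np.log1p(np.outer(t, a - 1.0))  # shape (m_q, d)
    # Leave-one-out products in log-space: subtract player i's own factor.
    log_loo = log_factors.sum(axis=1, keepdims=True) - log_factors
    return (a - 1.0) * (w @ np.exp(log_loo))


a = np.array([0.7, 1.2, 0.9, 1.5, 0.6, 1.1])       # d = 6 players
exact = shapley_brute_force(a)
approx = shapley_quadrature(a, m_q=math.ceil(len(a) / 2))
print("max abs difference:", np.max(np.abs(exact - approx)))
```

With $d = 6$ the integrand has degree 5, so $m_q = 3$ nodes already integrate it exactly up to floating-point rounding, matching the $m_q \geq \lceil d/2 \rceil$ condition. The quadrature routine does $O(d\,m_q)$ work, whereas the enumeration visits every coalition.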