🤖 AI Summary
This work addresses the off-manifold artifacts that heuristic baselines introduce into Shapley-value-based explanations in interpretable AI. The authors propose an axiomatic Aumann-Shapley attribution framework grounded in optimal generative flows, in which the integration path is defined as the Wasserstein-2 geodesic between the baseline and the input. By formulating baseline selection as a variational problem, they derive a unique gradient-based path-integral representation that satisfies both efficiency and reparameterization invariance. Theoretical analysis establishes stability bounds on the advection approximation error, yielding a vanishing flow consistency error and strict manifold adherence. Experiments demonstrate that the proposed method significantly outperforms existing approaches on semantic-alignment and structure-aware total-variation metrics.
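The gradient-based path-integral representation can be sketched in its classical straight-path form (as in integrated gradients). This is an illustrative approximation only: the paper's method replaces the linear path with a Wasserstein-2 geodesic, and the toy model and function names here are hypothetical.

```python
import numpy as np

def path_integral_attribution(f_grad, x, baseline, steps=50):
    """Aumann-Shapley-style attribution via a gradient path integral.

    Illustrative sketch: uses the classical straight-line path, whereas
    the paper uses a Wasserstein-2 geodesic.
    phi_i = (x_i - b_i) * E_t[grad_i f(baseline + t * (x - baseline))].
    """
    alphas = (np.arange(steps) + 0.5) / steps  # midpoint rule on [0, 1]
    diff = x - baseline
    grads = np.stack([f_grad(baseline + a * diff) for a in alphas])
    return diff * grads.mean(axis=0)

# Hypothetical toy model f(x) = x0^2 + 3*x1 with analytic gradient [2*x0, 3]
f = lambda z: z[0] ** 2 + 3 * z[1]
f_grad = lambda z: np.array([2 * z[0], 3.0])
x, b = np.array([1.0, 2.0]), np.zeros(2)

phi = path_integral_attribution(f_grad, x, b)
# Efficiency axiom: attributions sum to f(x) - f(baseline)
assert np.isclose(phi.sum(), f(x) - f(b))
```

The midpoint-rule average over path gradients makes the efficiency check exact here because the gradient varies linearly along the path.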
📝 Abstract
Shapley-based attribution is critical for post-hoc XAI but suffers from off-manifold artifacts due to heuristic baselines. While generative methods attempt to address this, they often introduce geometric inefficiency and discretization drift. We propose a formal theory of on-manifold Aumann-Shapley attributions driven by optimal generative flows. We prove a representation theorem establishing the gradient line integral as the unique functional satisfying efficiency and geometric axioms, notably reparameterization invariance. To resolve path ambiguity, we select the kinetic-energy-minimizing Wasserstein-2 geodesic transporting a prior to the data distribution. This yields a canonical attribution family that recovers classical Shapley values for additive models and admits provable stability bounds against flow approximation errors. By reframing baseline selection as a variational problem, our method experimentally outperforms competing approaches, achieving strict manifold adherence via vanishing Flow Consistency Error and superior semantic alignment as measured by Structure-Aware Total Variation. Our code is available at https://github.com/cenweizhang/OTFlowSHAP.
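The abstract states that the attribution family recovers classical Shapley values for additive models. A minimal numerical check of that claim, assuming a hypothetical additive model `f(x) = sum_i f_i(x_i)` (for which the path-integral attribution has the closed form `f_i(x_i) - f_i(b_i)`, since features do not interact):

```python
from itertools import combinations
from math import factorial

import numpy as np

# Hypothetical additive model: f(x) = sin(x0) + exp(x1) + x2^3
f_parts = [np.sin, np.exp, lambda t: t ** 3]
f = lambda z: sum(fp(zi) for fp, zi in zip(f_parts, z))

x = np.array([0.4, -1.2, 2.0])
b = np.zeros(3)

# Closed-form Aumann-Shapley attribution for the additive case
phi = np.array([fp(xi) - fp(bi) for fp, xi, bi in zip(f_parts, x, b)])

# Classical Shapley values via exhaustive enumeration of coalitions
n = len(x)

def v(S):
    # Coalition value: members take their input value, the rest stay at b
    z = b.copy()
    z[list(S)] = x[list(S)]
    return f(z)

shap = np.zeros(n)
for i in range(n):
    others = [j for j in range(n) if j != i]
    for r in range(n):
        for S in combinations(others, r):
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            shap[i] += weight * (v(S + (i,)) - v(S))

# For additive models the two coincide exactly
assert np.allclose(phi, shap)
```

Because every marginal contribution `v(S + (i,)) - v(S)` equals `f_i(x_i) - f_i(b_i)` regardless of the coalition, the Shapley average collapses to the path-integral closed form.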