🤖 AI Summary
This work addresses the interpretability of predictive models—such as binary neural networks and Boolean networks—under multivariate Bernoulli inputs. Method: We propose an L² oblique projection analysis framework grounded in the Hoeffding decomposition. Theoretically, we establish, for the first time, the explicit structure of the Hoeffding decomposition under Bernoulli distributions: each subspace in the decomposition is one-dimensional, making the functional decomposition explicit, enabling exact reverse engineering, and yielding closed-form solutions. This structure permits explicit derivation of global sensitivity metrics, including Sobol' indices and Shapley effects. Computationally, the framework integrates the Hoeffding decomposition, L² oblique projection, and variance attribution theory. Results: Numerical experiments demonstrate its effectiveness and scalability in high-dimensional, sparse binary input settings. To our knowledge, this is the first unified framework for model interpretation under discrete, finite-support inputs that simultaneously ensures theoretical rigor and computational feasibility.
📝 Abstract
Explaining the behavior of predictive models with random inputs can be achieved through a decomposition into sub-models with more easily interpretable features. Arising from the uncertainty quantification community, recent results have demonstrated the existence and uniqueness of a generalized Hoeffding decomposition for such predictive models when the stochastic input variables are correlated, based on concepts of oblique projection onto L² subspaces. This article focuses on the case where the input variables have Bernoulli distributions and provides a complete description of this decomposition. We show that in this case the underlying L² subspaces are one-dimensional and that the functional decomposition is explicit. This leads to a complete interpretability framework and, in theory, allows reverse engineering. Explicit expressions for indicators of the influence of inputs on the output prediction (exemplified by Sobol' indices and Shapley effects) can be derived. Illustrated by numerical experiments, this type of analysis proves useful for addressing decision-support problems based on binary decision diagrams, Boolean networks, or binary neural networks. The article outlines perspectives for exploring high-dimensional settings and, beyond the case of binary inputs, extending these findings to models with finite countable inputs.
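To make the kind of analysis described above concrete, the sketch below computes first-order Sobol' indices for a Boolean function of *independent* Bernoulli inputs by exact enumeration over {0,1}^d. This is an illustrative simplification, not the paper's method: it covers only the independent-input case (where the classical Hoeffding decomposition applies, without oblique projection), and the function and parameter names are invented for the example.

```python
import itertools
import numpy as np

def sobol_first_order(f, p):
    """First-order Sobol' indices of Y = f(x_1, ..., x_d) with
    independent Bernoulli(p_i) inputs, by exact enumeration
    over the 2^d points of {0,1}^d (small d only)."""
    d = len(p)
    points = list(itertools.product([0, 1], repeat=d))
    # Probability of each binary point under the product measure.
    probs = np.array([
        np.prod([p[i] if x[i] else 1.0 - p[i] for i in range(d)])
        for x in points
    ])
    vals = np.array([f(x) for x in points], dtype=float)
    mean = probs @ vals
    var = probs @ (vals - mean) ** 2
    indices = []
    for i in range(d):
        # Conditional means E[Y | X_i = b] for b in {0, 1}.
        cond = []
        for b in (0, 1):
            mask = np.array([x[i] == b for x in points])
            w = probs[mask]
            cond.append((w @ vals[mask]) / w.sum())
        # X_i is Bernoulli(p_i), so Var(E[Y | X_i]) has the
        # closed form p_i (1 - p_i) (m_1 - m_0)^2.
        var_cond = p[i] * (1.0 - p[i]) * (cond[1] - cond[0]) ** 2
        indices.append(var_cond / var)
    return indices

# Example: Y = X1 OR X2 with p = (1/2, 1/2); by symmetry both
# first-order indices equal 1/3, the remaining 1/3 being interaction.
s = sobol_first_order(lambda x: x[0] | x[1], [0.5, 0.5])
```

For correlated Bernoulli inputs, the paper's framework replaces the orthogonal projections implicit here with oblique projections onto the (one-dimensional) L² subspaces, but the enumeration-over-{0,1}^d structure of the computation is the same.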