🤖 AI Summary
This work addresses the interpretability of binarized neural networks (BNNs) by expressing the activation threshold test of a hidden neuron, at inference time, as a Sugeno integral on binary inputs. The formulation yields an explicit set-function representation of each neuron's decision together with an equivalent set of if-then rules, and a Sugeno-integral expression is also given for the last-layer score. Because the Sugeno integral comes from fuzzy measure theory, the set function makes the importance of inputs and their interactions explicit. The authors also discuss how the framework can be adapted to richer input interactions and extended beyond the binary inputs induced by BNNs.
📝 Abstract
In this article, we establish a precise connection between binarized neural networks (BNNs) and Sugeno integrals. The advantage of the Sugeno integral is that it provides a framework for representing the importance of inputs and their interactions, while being equivalent to a set of if-then rules. For a hidden BNN neuron at inference time, we show that the activation threshold test can be written as a Sugeno integral on binary inputs. This yields an explicit set-function representation of each neuron decision, and an associated rule-based representation. We also provide a Sugeno-integral expression for the last-layer score. Finally, we discuss how the same framework can be adapted to support richer input interactions and how it can be extended beyond the binary case induced by binarized neural networks.
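To make the threshold-to-integral claim concrete, here is a minimal sketch under stated assumptions, not the article's own construction. It assumes a neuron with weights and inputs in {-1, +1} that fires when the weighted sum reaches a threshold `theta`, recodes each input as an agreement bit `z_i` (1 when the input matches the weight sign), and uses a 0/1 set function that equals 1 on any subset of at least `t = ceil((n + theta) / 2)` indices. All function names and the agreement-bit encoding are hypothetical choices made for illustration.

```python
from itertools import combinations, product
from math import ceil


def sugeno_integral(z, mu):
    """Max-min form: max over non-empty subsets A of min(mu(A), min_{i in A} z_i)."""
    n = len(z)
    best = 0  # contribution of the empty set, since mu(empty) = 0
    for r in range(1, n + 1):
        for subset in combinations(range(n), r):
            value = min(mu(frozenset(subset)), min(z[i] for i in subset))
            best = max(best, value)
    return best


def neuron_fires(x, w, theta):
    """Assumed neuron model: fire when the +/-1 weighted sum reaches theta."""
    return sum(wi * xi for wi, xi in zip(w, x)) >= theta


def make_capacity(n, theta):
    """0/1 set function that is 1 on subsets with at least t elements.

    With k agreement bits set, the weighted sum equals 2k - n, so the
    neuron fires exactly when k >= ceil((n + theta) / 2).
    """
    t = ceil((n + theta) / 2)
    if not 1 <= t <= n:
        raise ValueError("threshold makes the neuron constant; no 0/1 capacity")
    return lambda subset: 1 if len(subset) >= t else 0


def check_equivalence(w, theta):
    """Brute-force check over all +/-1 inputs of the same length as w."""
    n = len(w)
    mu = make_capacity(n, theta)
    for x in product((-1, 1), repeat=n):
        z = tuple(1 if xi == wi else 0 for xi, wi in zip(x, w))
        if neuron_fires(x, w, theta) != (sugeno_integral(z, mu) == 1):
            return False
    return True


if __name__ == "__main__":
    print(check_equivalence((1, -1, 1, 1), 1))
```

In this sketch the rule-based reading is direct: every minimal subset on which the set function equals 1 gives one rule of the form "if all inputs in this subset agree with their weights, then the neuron fires". The same `sugeno_integral` function accepts values in [0, 1], which hints at how the binary case might be relaxed.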