🤖 AI Summary
This work addresses the limited interpretability of neural network decisions in image classification by studying the explainability of the recently proposed linear-min-max neural networks. At initialization, the model can be interpreted as k-medoids with the infinity norm as the distance; it is then trained via subgradient descent. Because a single most activated neuron is responsible for the value of the output, the decision process can be traced. Leveraging this property, the authors introduce a pixel fragility measure that determines whether changes to a single pixel may be responsible for a change in the classification output. Experiments on the PneumoniaMNIST dataset show that the resulting explanations compare favorably to SHAP and Integrated Gradients.
📝 Abstract
We investigate the explainability properties of the recently proposed linear-min-max neural networks. At initialization, they can be interpreted as k-medoids with the infinity norm as the distance. They are then trained using subgradient descent to better fit the data. The model has been shown to be a universal approximator, yet we can still trace the decision process because a single most activated neuron is responsible for the value of the output. Using this property, we design a pixel fragility measure that determines whether changes to a single pixel may be responsible for a change in the classification output. Experiments on the PneumoniaMNIST dataset show that this explanation of the network's output compares favorably to SHAP and Integrated Gradients.
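The idea of a single winning unit and a single-pixel test can be illustrated with a toy sketch. This is not the authors' architecture or their fragility measure, neither of which is specified here. It only assumes the initialization-time reading given above (nearest medoid under the infinity norm) and a brute-force check of whether overwriting one pixel flips the predicted label. The function names, the candidate values `0.0` and `1.0`, and the three-pixel data are all hypothetical.

```python
import numpy as np

def linf_distances(x, medoids):
    # Infinity-norm (Chebyshev) distance from x to each medoid.
    return np.max(np.abs(medoids - x), axis=1)

def predict(x, medoids, labels):
    # The closest medoid is the single "winning" unit; it alone fixes the output.
    winner = int(np.argmin(linf_distances(x, medoids)))
    return labels[winner], winner

def fragile_pixels(x, medoids, labels, candidates=(0.0, 1.0)):
    # Hypothetical brute-force check (an assumption, not the paper's measure):
    # flag pixel i if overwriting it with some candidate value changes the label.
    base_label, _ = predict(x, medoids, labels)
    flagged = []
    for i in range(x.size):
        for v in candidates:
            x_mod = x.copy()
            x_mod[i] = v
            if predict(x_mod, medoids, labels)[0] != base_label:
                flagged.append(i)
                break
    return flagged

# Toy data: two medoids that differ only in the first two "pixels".
medoids = np.array([[0.1, 0.1, 0.5],
                    [0.9, 0.9, 0.5]])
labels = np.array([0, 1])
x = np.array([0.2, 0.2, 0.5])

label, winner = predict(x, medoids, labels)
flagged = fragile_pixels(x, medoids, labels)
print(label, winner, flagged)  # 0 0 [0, 1]
```

In this toy case the third pixel is never flagged: both medoids agree on it, so no value in the candidate set can move the input closer to the other medoid than to the current one. A trained network would no longer be a plain nearest-medoid rule, so this only conveys the intuition at initialization.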