🤖 AI Summary
This work addresses the poor interpretability of deep neural networks and the limited expressiveness of directly enforced monotonic architectures by decomposing a trained ReLU network, via a difference-of-convex (DC) decomposition, into the difference of two monotone convex functions. Building on this decomposition, the authors introduce two new saliency methods, SplitCAM and SplitLRP, which exploit monotonicity without modifying the original model architecture; they further show that training a model directly as the difference of two monotone networks yields strong self-explainability. Evaluated on ImageNet-S with VGG16 and ResNet18 backbones, the proposed saliency methods outperform state-of-the-art methods across all Quantus evaluation metric categories, supporting the effectiveness of DC decomposition for faithful explanations.
📝 Abstract
It has been demonstrated in various contexts that monotonicity leads to better explainability in neural networks. However, not every function can be well approximated by a monotone neural network. We demonstrate that monotonicity can still be used in two ways to boost explainability. First, we adapt a decomposition of a trained ReLU network into the difference of two monotone, convex parts, thereby overcoming numerical obstacles from an inherent blowup of the weights in this procedure. Our proposed saliency methods, SplitCAM and SplitLRP, improve on state-of-the-art results on both VGG16 and ResNet18 networks on ImageNet-S across all Quantus saliency metric categories. Second, we show that training a model as the difference of two monotone neural networks results in a system with strong self-explainability properties.
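The decomposition mentioned in the abstract can be sketched for a plain ReLU MLP. The sketch below (our own illustration, not the authors' adapted algorithm; function and variable names are assumptions) uses the standard DC splitting of each weight matrix, W = W₊ − W₋, together with the identity ReLU(g − h) = max(g, h) − h, to propagate a pair of monotone convex functions whose difference equals the original network:

```python
import numpy as np

def relu_net_dc(weights, biases, x):
    """Evaluate a ReLU MLP f(x) as g(x) - h(x), where g and h are
    built to be convex and coordinate-wise monotone.

    A minimal sketch of the textbook DC splitting; the paper's
    adaptation (which controls the weight blowup) is not reproduced.
    `weights`/`biases` are per-layer arrays; hidden layers use ReLU,
    the final layer is affine.
    """
    g, h = x.copy(), np.zeros_like(x)          # f = g - h holds at the input
    for i, (W, b) in enumerate(zip(weights, biases)):
        Wp, Wn = np.maximum(W, 0.0), np.maximum(-W, 0.0)   # W = Wp - Wn
        # Affine step: W(g - h) + b = (Wp g + Wn h + b) - (Wp h + Wn g).
        # Nonnegative combinations preserve convexity and monotonicity.
        g, h = Wp @ g + Wn @ h + b, Wp @ h + Wn @ g
        if i < len(weights) - 1:
            # ReLU step: ReLU(g - h) = max(g, h) - h; a pointwise max of
            # convex monotone functions is again convex and monotone.
            g = np.maximum(g, h)
    return g, h
```

Note that both branches accumulate the magnitudes |W| of every layer, so g and h individually grow much faster than their difference; this is the weight blowup the abstract says must be handled numerically.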