🤖 AI Summary
To address the limited interpretability of logic nodes in DiffLogic networks, this paper proposes eXpLogic, a circuit-analysis-inspired method that produces saliency maps explaining which input patterns activate a given function. It introduces the SwitchDist metric, which quantifies how much an input must change before the model's class prediction switches. The explanations support attribution of false positive and false negative predictions, discovery of common input patterns that activate particular outputs, and class-specific network reduction. In the reported experiments, eXpLogic saliency maps are generally better than Vanilla Gradients and Integrated Gradients at predicting which inputs will change the class score. Using these maps, network size is reduced by 87% and inference time by 8%, with a limited impact (-3.8%) on class-specific predictions. More broadly, the work illustrates how constraining a DNN to logic-gate nodes can make its decisions easier to explain.
📝 Abstract
Constraining deep neural networks (DNNs) to learn individual logic types per node, as performed using the DiffLogic network architecture, opens the door to model-specific explanation techniques that quell the complexity inherent to DNNs. Inspired by principles of circuit analysis from computer engineering, this work presents an algorithm (eXpLogic) for producing saliency maps which explain input patterns that activate certain functions. The eXpLogic explanations: (1) show the exact set of inputs responsible for a decision, which helps interpret false-negative and false-positive predictions, (2) highlight common input patterns that activate certain outputs, and (3) help reduce the network size to improve class-specific inference. To evaluate the eXpLogic saliency map, we introduce a metric that quantifies how much an input changes before switching a model's class prediction (the SwitchDist) and use this metric to compare eXpLogic against the Vanilla Gradients (VG) and Integrated Gradients (IG) methods. Generally, we show that eXpLogic saliency maps are better at predicting which inputs will change the class score. These maps help reduce the network size and inference time by 87% and 8%, respectively, while having a limited impact (-3.8%) on class-specific predictions. The broader value of this work to machine learning is in demonstrating how certain DNN architectures promote explainability, which is relevant to healthcare, defense, and law.
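The abstract describes SwitchDist only in words, so the following Python sketch shows one way such a metric could be computed. It is an illustration under stated assumptions, not the paper's definition: it assumes binary inputs, perturbs them by flipping bits in order of decreasing saliency, and reports the number of flips needed before the predicted class changes. The names `switch_dist`, `toy_predict`, `good_map`, and `poor_map` are hypothetical, and the toy model is a hand-written logic expression rather than a trained DiffLogic network.

```python
from typing import Callable, List, Optional, Sequence


def switch_dist(
    predict: Callable[[Sequence[int]], int],
    x: Sequence[int],
    saliency: Sequence[float],
) -> Optional[int]:
    """Count the bit flips, most salient input first, needed to change the class.

    Assumption: inputs are binary and "how much an input changes" is measured
    as the number of flipped bits. Returns None if the prediction never switches.
    """
    original_class = predict(x)
    perturbed: List[int] = list(x)
    # Visit input indices from highest to lowest saliency score.
    order = sorted(range(len(x)), key=lambda i: saliency[i], reverse=True)
    for n_flipped, i in enumerate(order, start=1):
        perturbed[i] = 1 - perturbed[i]  # flip one binary input
        if predict(perturbed) != original_class:
            return n_flipped
    return None


def toy_predict(x: Sequence[int]) -> int:
    """Hypothetical logic-gate model: class 1 iff (x0 AND x1) OR x2."""
    return int((x[0] and x[1]) or x[2])


x = [1, 1, 0, 1]
good_map = [1.0, 0.9, 0.1, 0.0]  # ranks x0 first, which the decision depends on
poor_map = [0.2, 0.1, 0.0, 1.0]  # ranks the irrelevant x3 first

good_dist = switch_dist(toy_predict, x, good_map)
poor_dist = switch_dist(toy_predict, x, poor_map)
print(good_dist, poor_dist)
```

Under this reading, a map that ranks the truly responsible inputs first switches the prediction after fewer flips, so a smaller value suggests a more faithful explanation. Whether the paper normalizes the count or perturbs inputs differently is not stated in the abstract.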