🤖 AI Summary
This work addresses the dual challenges of interpretability and robustness in black-box deep neural networks (DNNs). To detect anomalies, including adversarial examples, out-of-distribution samples, and noisy inputs, the method identifies discriminative activation paths through the network's layers, adapting software-engineering-style path analysis to DNNs. Specifically, it uses genetic evolution to discover the neuron activation paths most critical for discriminating a target class. To improve both feature diversity and decision robustness, the approach combines random subspace sampling with a multi-path voting ensemble. Experiments across multiple benchmark models and datasets indicate that the method improves detection accuracy and generalization under adversarial attacks, distributional shifts, and input noise, all within a unified framework. It also provides human-interpretable insight into model behavior while remaining practical to apply.
📝 Abstract
Deep neural networks (DNNs) are notoriously hard to understand and difficult to defend. Extracting representative paths (including the neuron activation values and the connections between neurons) from DNNs using software engineering approaches has recently been shown to be a promising way to interpret the decision-making process of black-box DNNs, as the extracted paths are often effective in capturing essential features. With this in mind, this work investigates a novel approach that extracts critical paths from DNNs and then applies them to anomaly detection, based on the observation that outliers and adversarial inputs do not usually induce the same activation pattern on those paths as normal (in-distribution) inputs. In our approach, we first identify critical detection paths via genetic evolution and mutation. Since different paths in a DNN often capture different features for the same target class, we ensemble the detection results from multiple paths by integrating random subspace sampling and a voting mechanism. Our experimental results suggest that our method not only outperforms state-of-the-art methods but is also suitable for detecting a broad range of anomaly types with high accuracy.
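The ensemble step described above (random subspace sampling plus voting) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Euclidean-distance detector, the max-distance threshold, and the toy activation data are all assumptions made for illustration, and the genetic search for critical paths is omitted (a path is simply given as a list of neuron indices).

```python
import math
import random

def fit_detector(normal_acts, neuron_ids):
    """Fit one detector on a subset of path neurons (hypothetical design).

    Stores the mean activation of normal inputs on those neurons and, as the
    threshold, the largest distance of any normal input from that mean.
    """
    center = [sum(a[i] for a in normal_acts) / len(normal_acts) for i in neuron_ids]
    dists = [math.dist([a[i] for i in neuron_ids], center) for a in normal_acts]
    return {"ids": neuron_ids, "center": center, "threshold": max(dists)}

def flags_anomaly(detector, acts):
    """Return True if the input lies farther from the center than any normal input."""
    point = [acts[i] for i in detector["ids"]]
    return math.dist(point, detector["center"]) > detector["threshold"]

def build_ensemble(normal_acts, path, n_detectors=5, subspace_size=3, seed=0):
    """Random subspace sampling: each detector sees a random subset of the path."""
    rng = random.Random(seed)
    return [
        fit_detector(normal_acts, rng.sample(path, subspace_size))
        for _ in range(n_detectors)
    ]

def is_anomalous(ensemble, acts):
    """Majority vote over the individual detectors."""
    votes = sum(flags_anomaly(d, acts) for d in ensemble)
    return votes > len(ensemble) / 2

# Toy data (assumed): activations of 6 neurons on one path for 20 normal inputs.
normal_acts = [[1.0 + 0.01 * ((k + j) % 5) for j in range(6)] for k in range(20)]
path = [0, 1, 2, 3, 4, 5]

ensemble = build_ensemble(normal_acts, path)
normal_verdict = is_anomalous(ensemble, normal_acts[0])   # an in-distribution input
shifted_verdict = is_anomalous(ensemble, [5.0] * 6)       # a strongly shifted input
```

In this sketch each detector sees only part of the path, so a single unusual neuron cannot dominate the decision; the majority vote is one simple way to combine the per-subspace verdicts, and other aggregation rules would fit the same structure.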