🤖 AI Summary
To address the vulnerability of CNNs to texture-based adversarial attacks and their overreliance on fragile texture features, this paper proposes the Defective CNN, built around a novel defective convolutional layer: it embeds "defective neurons" with constant activation values into standard convolutions to suppress texture sensitivity and push the model toward more robust shape-based representations. Unlike conventional defenses, the method requires no adversarial training; it achieves intrinsic robustness against black-box and transfer-based attacks through structured regularization. Evaluated on mainstream transfer-attack benchmarks, it achieved state-of-the-art performance at the time, significantly outperforming standard CNNs, while preserving inference efficiency and architectural compatibility. The core contribution is a paradigm shift: leveraging structural defects as a mechanism for inducing robustness, which offers an adversarial-training-free, highly interpretable alternative for adversarial defense.
📝 Abstract
Robustness of convolutional neural networks (CNNs) has gained importance on account of adversarial examples, i.e., inputs with well-designed perturbations that are imperceptible to humans but cause the model to predict incorrectly. Recent research suggests that the noise in adversarial examples breaks the textural structure of an image, which eventually leads to wrong predictions. To mitigate the threat of such adversarial attacks, we propose defective convolutional networks, which make predictions relying less on textural information and more on shape information by properly integrating defective convolutional layers into standard CNNs. A defective convolutional layer contains defective neurons whose activations are set to be a constant function. As defective neurons carry no information and differ greatly from the standard neurons in their spatial neighborhoods, textural features cannot be accurately extracted, so the model has to rely on other features for classification, such as shape. We present extensive evidence to justify our proposal and demonstrate that defective CNNs can defend against black-box attacks better than standard CNNs. In particular, they achieve state-of-the-art performance against transfer-based attacks without any adversarial training being applied.
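The core mechanism described above, replacing the activations of a fixed subset of neurons with a constant, can be sketched as a post-convolution masking step. The snippet below is a minimal NumPy illustration, not the authors' implementation: the function names, the zero constant, and the 50% defect rate are assumptions for the example, and the binary defect mask is sampled once and then held fixed, mirroring the idea that defective neurons stay constant regardless of the input.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_defect_mask(shape, keep_prob=0.5):
    """Sample a fixed binary mask once: 1 = normal neuron, 0 = defective neuron.

    In a defective convolutional layer, this mask would be drawn at
    construction time and reused for every forward pass (train and test).
    """
    return (rng.random(shape) < keep_prob).astype(np.float32)

def defective_activation(feature_map, mask, constant=0.0):
    """Defective neurons output `constant` regardless of the input;
    normal neurons pass their activation through unchanged."""
    return np.where(mask == 1.0, feature_map, constant)

# Toy feature map in (channels, height, width) layout, standing in for
# the output of a convolutional layer.
fm = rng.standard_normal((4, 8, 8)).astype(np.float32)
mask = make_defect_mask(fm.shape, keep_prob=0.5)
out = defective_activation(fm, mask, constant=0.0)
```

Because the defective positions never vary with the input, fine-grained local (textural) patterns in their neighborhoods are destroyed, while coarser shape cues carried by the surviving neurons remain available to later layers.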