🤖 AI Summary
This work proposes a lightweight neuro-symbolic approach to address two key limitations of deep neural networks in safety-critical applications: vulnerability to adversarial attacks and poor interpretability. The method encodes symbolic logic rules over visual attributes such as shape and color, and integrates them into training by jointly optimizing a semantic loss and a symbolic logic loss. This integration enhances both robustness and interpretability without compromising clean-sample accuracy, and requires neither complex architectural modifications nor extensive data augmentation. Evaluated on the GTSRB dataset, the approach achieves adversarial accuracies under FGSM and PGD attacks that are 18.1% and 17.35% higher, respectively, than those of standard adversarial-training baselines, roughly a threefold larger robustness gain than conventional adversarial training delivers.
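The joint objective described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the function and variable names (`neuro_symbolic_loss`, `rule_matrix`, `lam`) are hypothetical, and the symbolic term here simply penalizes disagreement between predicted attributes and the attributes implied by each class under a class-to-attribute rule table.

```python
import torch
import torch.nn.functional as F

def neuro_symbolic_loss(class_logits, attr_logits, y, attr_targets,
                        rule_matrix, lam=1.0):
    """Sketch of a joint semantic + symbolic logic objective.

    class_logits: (B, C) class predictions (e.g. 43 GTSRB classes)
    attr_logits:  (B, A) attribute predictions (shape, color, ...)
    attr_targets: (B, A) ground-truth attribute labels in {0, 1}
    rule_matrix:  (C, A) which attributes each class implies
    """
    # Standard classification loss on class labels.
    task = F.cross_entropy(class_logits, y)
    # Semantic loss: supervise the appearance attributes directly.
    sem = F.binary_cross_entropy_with_logits(attr_logits, attr_targets)
    # Symbolic logic loss: the class distribution implies an expected
    # attribute vector via the rules; penalize deviation from it.
    implied_attrs = class_logits.softmax(-1) @ rule_matrix   # (B, A)
    logic = F.mse_loss(attr_logits.sigmoid(), implied_attrs)
    return task + sem + lam * logic
```

The weighting `lam` trades off rule enforcement against the primary task; the actual loss design and weighting in the paper may differ.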
📝 Abstract
Adversarial vulnerability and lack of interpretability are critical limitations of deep neural networks, especially in safety-sensitive settings such as autonomous driving. We introduce \DesignII, a neuro-symbolic framework that integrates symbolic rule supervision into neural networks to enhance both adversarial robustness and explainability. Domain knowledge is encoded as logical constraints over appearance attributes such as shape and color, and enforced through semantic and symbolic logic losses applied during training. Using the GTSRB dataset, we evaluate robustness against FGSM and PGD attacks at a standard $\ell_\infty$ perturbation budget of $\varepsilon = 8/255$. Relative to clean training, standard adversarial training provides only modest robustness improvements ($\sim$10 percentage points). In contrast, our FGSM-Neuro-Symbolic and PGD-Neuro-Symbolic models achieve substantially larger gains, improving adversarial accuracy by 18.1\% and 17.35\% over their corresponding adversarial-training baselines without reducing clean-sample accuracy. Measured against the same clean-training baseline, this is roughly a three-fold larger robustness gain than standard adversarial training provides. Compared to transformer-based defenses such as LNL-MoEx, which require heavy architectures and extensive data augmentation, our PGD-Neuro-Symbolic variant attains comparable or superior robustness using a ResNet18 backbone trained for only 10 epochs. These results show that symbolic reasoning offers an effective path toward robust and interpretable AI.
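For reference, the FGSM attack used in the evaluation is the standard single-step sign-gradient perturbation at $\varepsilon = 8/255$. A minimal sketch, assuming inputs normalized to $[0,1]$ (the function name `fgsm_attack` is illustrative, not from the paper):

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=8/255):
    """One-step FGSM: move x by eps in the sign of the loss gradient."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Sign-gradient step, then clamp back to the valid [0, 1] image range.
    x_adv = x_adv + eps * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

PGD is the iterative variant of the same step, with a projection back onto the $\ell_\infty$ ball of radius $\varepsilon$ after each iteration.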