DeepDefense: Layer-Wise Gradient-Feature Alignment for Building Robust Neural Networks

📅 2025-11-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Deep neural networks suffer from poor robustness against imperceptible adversarial perturbations. To address this, we propose Gradient-Feature Alignment (GFA), a novel defense framework that, for the first time, interprets adversarial attacks through radial/tangential decomposition of input gradients. GFA enforces cross-layer alignment between gradients and features to suppress abrupt loss variations along the tangential direction, thereby smoothing the loss landscape and stabilizing decision boundaries. The GFA regularizer jointly leverages multi-layer features and input gradients, is architecture-agnostic, and incurs no additional optimization overhead. On CIFAR-10, GFA improves robust accuracy by up to 15.2% against APGD and 24.7% against FGSM over standard adversarial training. Moreover, it increases the minimum perturbation magnitude required by strong iterative attacks—such as DeepFool—by 20–30×, demonstrating significantly enhanced robustness without compromising clean accuracy.

📝 Abstract
Deep neural networks are known to be vulnerable to adversarial perturbations: small, carefully crafted input changes that lead to incorrect predictions. In this paper, we propose DeepDefense, a novel defense framework that applies Gradient-Feature Alignment (GFA) regularization across multiple layers to suppress adversarial vulnerability. By aligning input gradients with internal feature representations, DeepDefense promotes a smoother loss landscape in tangential directions, thereby reducing the model's sensitivity to adversarial noise. We provide theoretical insights into how an adversarial perturbation can be decomposed into radial and tangential components, and demonstrate that alignment suppresses loss variation in tangential directions, where most attacks are effective. Empirically, our method achieves significant improvements in robustness against both gradient-based and optimization-based attacks. For example, on CIFAR-10, CNN models trained with DeepDefense outperform standard adversarial training by up to 15.2% under APGD attacks and 24.7% under FGSM attacks. Against optimization-based attacks such as DeepFool and EADEN, perturbation magnitudes 20 to 30 times larger are required to cause misclassification, indicating stronger decision boundaries and a flatter loss landscape. Our approach is architecture-agnostic, simple to implement, and highly effective, offering a promising direction for improving the adversarial robustness of deep learning models.
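The radial/tangential decomposition the abstract refers to can be sketched with plain linear algebra: any perturbation δ at an input x splits into a component along x (radial) and an orthogonal remainder (tangential). This is a minimal illustration only; the function name and the choice of the input direction as the radial axis are assumptions, not the paper's implementation.

```python
import numpy as np

def radial_tangential_split(x, delta):
    """Decompose perturbation `delta` at input `x` into a radial component
    (the projection of delta onto x) and a tangential component
    (the remainder, orthogonal to x)."""
    x = np.asarray(x, dtype=float).ravel()
    delta = np.asarray(delta, dtype=float).ravel()
    u = x / np.linalg.norm(x)          # unit vector in the radial direction
    radial = np.dot(delta, u) * u      # component of delta along x
    tangential = delta - radial        # orthogonal remainder
    return radial, tangential

# The two components sum back to delta, and the tangential part is orthogonal to x.
r, t = radial_tangential_split([3.0, 4.0], [0.1, 0.2])
```

Under the paper's reading, most attack budget lives in the tangential part, which is why suppressing loss variation in tangential directions matters.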
Problem

Research questions and friction points this paper is trying to address.

Enhancing neural network robustness against adversarial attacks through gradient-feature alignment
Reducing model sensitivity to adversarial noise by smoothing loss landscape
Improving decision boundaries against gradient-based and optimization-based attacks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Layer-wise gradient-feature alignment regularization
Suppresses adversarial vulnerability across multiple layers
Promotes smoother loss landscape in tangential directions
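One way such a layer-wise alignment regularizer could be scored is as a cosine-similarity penalty between the input gradient and each layer's features, summed over layers. The sketch below assumes the per-layer features have already been mapped to the gradient's shape; the paper's exact cross-layer projection and weighting are not reproduced, and `alignment_penalty` is a hypothetical name.

```python
import numpy as np

def alignment_penalty(input_grad, layer_features, eps=1e-12):
    """Sum over layers of (1 - cosine similarity) between the input
    gradient and that layer's feature vector: 0 when perfectly aligned,
    2 per layer when exactly opposed."""
    g = np.asarray(input_grad, dtype=float).ravel()
    g = g / (np.linalg.norm(g) + eps)
    total = 0.0
    for f in layer_features:
        f = np.asarray(f, dtype=float).ravel()
        f = f / (np.linalg.norm(f) + eps)
        total += 1.0 - float(np.dot(g, f))
    return total
```

In training, a term like this would be added to the task loss so that gradient descent pushes input gradients and internal features toward alignment across layers.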
Ci Lin
School of Electrical Engineering and Computer Science, University of Ottawa
Tet Yeap
Professor of Electrical Engineering and Computer Science, University of Ottawa
Iluju Kiringa
Associate Professor, University of Ottawa
Biwei Zhang
School of Electrical Engineering and Computer Science, University of Ottawa