🤖 AI Summary
To address teacher model degradation and inefficient robust knowledge transfer in adversarial distillation, this paper proposes AdaGAT, a novel adaptive guidance adversarial training framework. Its core innovation is a pair of separate loss functions that dynamically regulate the training state of a learnable teacher (guide) model, enabling it to maintain strong robustness throughout co-training with the student and to participate actively in gradient-based optimization. Combining adversarial distillation with this adaptive guidance mechanism, AdaGAT is evaluated with WideResNet-34-10 as the target model on CIFAR-10, CIFAR-100, and TinyImageNet. Experiments show that AdaGAT significantly improves student robustness against strong adversarial attacks, including PGD and AutoAttack, with average gains of 2.1–4.7 percentage points while preserving high natural accuracy, and that it consistently outperforms state-of-the-art adversarial distillation baselines across all benchmarks.
📝 Abstract
Adversarial distillation (AD) is a knowledge distillation technique that transfers robustness from a teacher deep neural network (DNN) to a lightweight target (student) DNN, enabling the target model to perform better than it would if trained independently. Some previous works use a small, learnable teacher (guide) model to improve the robustness of a student model. Because such a guide model starts learning from scratch, keeping it in an optimal state for effective knowledge transfer during co-training is challenging. We therefore propose a novel Adaptive Guidance Adversarial Training (AdaGAT) method, which dynamically adjusts the training state of the guide model to instill robustness in the target model. Specifically, we develop two separate loss functions as part of AdaGAT, allowing the guide model to participate more actively in backpropagation and reach its optimal state. We evaluated our approach via extensive experiments on three datasets, CIFAR-10, CIFAR-100, and TinyImageNet, with WideResNet-34-10 as the target model. Our observations reveal that adjusting the guide model within an appropriate accuracy range enhances the target model's robustness against various adversarial attacks compared to a variety of baseline models.
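To make the AD pipeline the abstract describes concrete, the sketch below shows the generic shape of one training step: craft an L∞ PGD adversarial example against the student, then form one loss for the student (cross-entropy plus distillation toward the guide) and a separate loss that keeps the learnable guide itself trained. It uses a toy linear softmax classifier in plain numpy; the models, weight names, loss weighting, and the specific guide loss are illustrative assumptions, not AdaGAT's exact formulation (the paper's adaptive adjustment is not fully specified in the abstract).

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_div(p, q, eps=1e-12):
    # KL(p || q) between two probability vectors
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

def pgd_attack(x, y, W, steps=10, eps=8 / 255, alpha=2 / 255):
    """L_inf PGD against a linear softmax classifier (logits = x @ W):
    ascend the cross-entropy gradient, then project back into the eps-ball."""
    x_adv = x.copy()
    for _ in range(steps):
        p = softmax(x_adv @ W)
        grad = W @ (p - y)  # dCE/dx for this linear model (dCE/dlogits = p - y)
        x_adv = np.clip(x_adv + alpha * np.sign(grad), x - eps, x + eps)
    return x_adv

# Toy setup: 16-dim input, 4 classes; separate student and guide weights
# (hypothetical stand-ins for the student and learnable guide networks).
d, k = 16, 4
x = rng.normal(size=d)
y = np.eye(k)[1]  # one-hot ground-truth label
W_student = rng.normal(scale=0.1, size=(d, k))
W_guide = rng.normal(scale=0.1, size=(d, k))

# Craft the adversarial example against the student, as in AD.
x_adv = pgd_attack(x, y, W_student)
p_student = softmax(x_adv @ W_student)
p_guide = softmax(x_adv @ W_guide)

# Loss 1 (target/student): hard-label CE plus distillation toward the guide.
ce_student = -float(np.sum(y * np.log(p_student + 1e-12)))
loss_target = 0.5 * ce_student + 0.5 * kl_div(p_guide, p_student)

# Loss 2 (guide): keep the guide accurate on the adversarial example so it
# stays in a useful state during co-training -- an assumed stand-in for
# AdaGAT's adaptive adjustment of the guide.
loss_guide = -float(np.sum(y * np.log(p_guide + 1e-12)))
```

In a real implementation both losses would be backpropagated through their respective networks each step; the point here is only the data flow: one attack against the student, two separate losses so the guide also learns rather than staying frozen.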