🤖 AI Summary
To address the prevalent decision-boundary oscillation and overfitting in adversarial training of deep neural networks, this paper proposes Parameter Interpolation Adversarial Training (PIAT). PIAT introduces two key innovations: (1) cross-epoch model parameter interpolation to smooth the evolution of the decision boundary, and (2) a normalized mean squared error (NMSE) loss that aligns the relative, rather than absolute, magnitudes of logits between clean and adversarial samples, enhancing feature consistency. The framework integrates seamlessly into standard adversarial training pipelines and is compatible with both CNNs and Vision Transformers. Evaluated on CIFAR-10/100 and ImageNet, PIAT consistently improves robust accuracy by 2.3–5.7%, accelerates convergence, and suppresses training instability. These results demonstrate PIAT's dual advantages in adversarial robustness and generalization performance.
📄 Abstract
Though deep neural networks exhibit superior performance on various tasks, they remain plagued by adversarial examples. Adversarial training has been demonstrated to be the most effective defense against adversarial attacks. However, in existing adversarial training methods, model robustness exhibits pronounced oscillation and overfitting during training, degrading defense efficacy. To address these issues, we propose a novel framework called Parameter Interpolation Adversarial Training (PIAT). After each epoch, PIAT tunes the model parameters by interpolating between the parameters of the previous and current epochs. This makes the model's decision boundary change more gradually and alleviates overfitting, helping the model converge better and achieve higher robustness. In addition, we suggest using the Normalized Mean Square Error (NMSE) to further improve robustness by aligning the relative, rather than absolute, magnitudes of logits between clean and adversarial examples. Extensive experiments on several benchmark datasets demonstrate that our framework markedly improves the robustness of both Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs).
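The two mechanisms described above can be sketched in a few lines. This is a minimal NumPy illustration, not the paper's implementation: the interpolation coefficient `gamma`, the function names, and the exact NMSE formulation (unit-L2 normalization of logits before MSE) are assumptions for clarity.

```python
import numpy as np

def interpolate_params(prev_epoch, curr_epoch, gamma=0.6):
    """Hypothetical PIAT-style update: after an epoch, blend the previous-epoch
    and current-epoch parameter dicts so the decision boundary evolves smoothly.
    gamma is an assumed interpolation coefficient, not a value from the paper."""
    return {k: gamma * prev_epoch[k] + (1.0 - gamma) * curr_epoch[k]
            for k in curr_epoch}

def nmse_loss(clean_logits, adv_logits, eps=1e-12):
    """Assumed NMSE: normalize each logit vector to unit L2 norm before taking
    the mean squared error, so only the *relative* magnitudes of the logits
    are compared between clean and adversarial examples."""
    c = clean_logits / (np.linalg.norm(clean_logits, axis=-1, keepdims=True) + eps)
    a = adv_logits / (np.linalg.norm(adv_logits, axis=-1, keepdims=True) + eps)
    return float(np.mean((c - a) ** 2))
```

Note that `nmse_loss` is zero whenever the adversarial logits are a positive scaling of the clean logits, which is exactly the "relative rather than absolute magnitude" alignment the abstract describes; a plain MSE on raw logits would penalize such scalings.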