Zero-Sacrifice Persistent-Robustness Adversarial Defense for Pre-Trained Encoders

📅 2026-02-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the vulnerability of pre-trained encoders to downstream-agnostic adversarial examples (DAEs), attacks crafted without knowledge of the downstream task. Existing defenses rely on task-specific adversarial fine-tuning, which often causes poor generalization, catastrophic forgetting, and degraded benign performance. To overcome these limitations, the authors propose ZePAD, a novel framework that, for the first time, enables a single tuning procedure to provide persistent defense against DAEs across diverse downstream tasks while simultaneously detecting adversarial inputs, without requiring any additional detection modules. ZePAD employs a dual-branch architecture: a Multi-Pattern Adversarial Enhancement Branch strengthens adversarial robustness, while a Benign Memory Preservation Branch preserves original performance; both are trained using self-supervised encoders and local data. Extensive experiments across 11 self-supervised methods and 6 datasets demonstrate that ZePAD improves benign accuracy by up to 29.20% and adversarial robustness by up to 73.86%, achieving "zero-sacrifice" robust defense.

📝 Abstract
The widespread use of publicly available pre-trained encoders from self-supervised learning (SSL) has exposed a critical vulnerability: their susceptibility to downstream-agnostic adversarial examples (DAEs), which are crafted without knowledge of the downstream tasks yet capable of misleading downstream models. While several defense methods have been explored recently, they rely primarily on task-specific adversarial fine-tuning, which inevitably limits generalizability, causes catastrophic forgetting, and deteriorates benign performance. Different from previous works, we propose a more rigorous defense goal that requires only a single tuning for diverse downstream tasks to defend against DAEs while preserving benign performance. To achieve this goal, we introduce Zero-Sacrifice Persistent-Robustness Adversarial Defense (ZePAD), which is inspired by the inherent sensitivity of neural networks to data characteristics. Specifically, ZePAD adopts a dual-branch structure: a Multi-Pattern Adversarial Enhancement Branch (MPAE-Branch) uses two adversarially fine-tuned encoders to strengthen adversarial resistance, while a Benign Memory Preservation Branch (BMP-Branch) is trained on local data to ensure that adversarial robustness does not compromise benign performance. Surprisingly, we find that ZePAD can directly detect DAEs by evaluating branch confidence, without introducing any adversarial-example identification task during training. Notably, by enriching feature diversity, our method enables a single adversarial fine-tuning to defend against DAEs across downstream tasks, thereby achieving persistent robustness. Extensive experiments on 11 SSL methods and 6 datasets validate its effectiveness. In certain cases, it achieves a 29.20% improvement in benign performance and a 73.86% gain in adversarial robustness, highlighting its zero-sacrifice property.
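The abstract's claim that DAEs can be detected "by evaluating branch confidence" suggests a simple inference-time rule: route each input through both branches and flag it when the benign-memory branch loses confidence. The paper does not spell out this rule, so the sketch below is a hypothetical reconstruction; the function names (`mpae_branch`, `bmp_branch`, `zepad_inference`) and the fixed threshold `tau` are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def zepad_inference(x, mpae_branch, bmp_branch, tau=0.5):
    """Hypothetical sketch of ZePAD-style dual-branch inference.

    mpae_branch, bmp_branch: callables mapping an input to class logits
    (standing in for the adversarially enhanced branch and the benign
    memory branch, respectively).
    tau: assumed confidence threshold; an input is flagged as a
    potential DAE when the benign branch's top-class probability
    falls below it.
    """
    p_robust = softmax(mpae_branch(x))
    p_benign = softmax(bmp_branch(x))
    is_adversarial = bool(p_benign.max() < tau)
    # Trust the robust branch for flagged inputs, the benign branch
    # otherwise, so clean accuracy is not sacrificed for robustness.
    pred = int(np.argmax(p_robust if is_adversarial else p_benign))
    return pred, is_adversarial

# Toy usage with stand-in branches (real branches would be encoders
# plus downstream heads):
confident_benign = lambda x: np.array([5.0, 0.0, 0.0, 0.0])
flat_benign = lambda x: np.zeros(4)          # low confidence everywhere
robust_head = lambda x: np.array([2.0, 0.0, 0.0, 0.0])

print(zepad_inference(None, robust_head, confident_benign))  # clean input
print(zepad_inference(None, robust_head, flat_benign))       # flagged input
```

The design point the sketch illustrates is that detection is a free by-product of the two branches: no separate detector is trained, only the confidence gap between the benign-memory branch and the adversarially enhanced branch is inspected at test time.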
Problem

Research questions and friction points this paper is trying to address.

adversarial defense
pre-trained encoders
downstream-agnostic adversarial examples
persistent robustness
benign performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

persistent robustness
zero-sacrifice defense
downstream-agnostic adversarial examples
dual-branch architecture
self-supervised learning
Zhuxin Lei
School of Cyber Science and Engineering, Sichuan University; Key Laboratories of the Ministry of Education, Sichuan University; Tianfu Jiangxi Laboratory
Ziyuan Yang
The Chinese University of Hong Kong
CV · Medical Imaging · Security & Privacy · Efficient Learning
Yi Zhang
Sichuan University
Medical Imaging