CLIP-Guided Backdoor Defense through Entropy-Based Poisoned Dataset Separation

📅 2025-07-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Deep neural networks (DNNs) are highly vulnerable to backdoor attacks—including sophisticated variants such as clean-label and clean-image attacks—while existing defenses suffer from high computational overhead and insufficient robustness. To address this, we propose an efficient backdoor defense framework. Our approach is the first to leverage a publicly available CLIP model, exploiting its cross-modal semantic understanding to detect potentially poisoned samples. We further integrate entropy-based analysis for unsupervised separation of contaminated data and employ logits-guided lightweight retraining to eliminate backdoors. Evaluated across four benchmark datasets under eleven diverse attacks, our method reduces attack success rates to below 1%, with clean accuracy degradation no greater than 0.3%. It significantly outperforms state-of-the-art defenses, achieving superior robustness, low computational cost, and strong generalization capability.
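The entropy-based separation step described above can be sketched as follows. This is a minimal illustration assuming CLIP zero-shot logits have already been computed for each training sample; the function name `entropy_split`, the normalized-entropy criterion, and the threshold value are assumptions for illustration, not the paper's exact procedure:

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax over class logits.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def entropy_split(clip_logits, threshold=0.5):
    """Split samples by the entropy of CLIP's zero-shot class distribution.

    Low entropy: CLIP is confident, so the sample is treated as likely clean.
    High entropy: the prediction is ambiguous, so the sample is flagged
    as suspicious. (Illustrative criterion, not the paper's exact rule.)
    """
    p = softmax(clip_logits)                    # (N, C) class probabilities
    h = -(p * np.log(p + 1e-12)).sum(axis=1)    # per-sample entropy in nats
    h = h / np.log(p.shape[1])                  # normalize to [0, 1]
    likely_clean = h <= threshold
    return likely_clean, h
```

A sample with a sharply peaked CLIP distribution gets near-zero normalized entropy and lands in the likely-clean split, while a near-uniform distribution gets entropy close to 1 and is flagged for the retraining stage.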

📝 Abstract
Deep Neural Networks (DNNs) are susceptible to backdoor attacks, where adversaries poison training data to implant a backdoor into the victim model. Current backdoor defenses on poisoned data often suffer from high computational costs or low effectiveness against advanced attacks like clean-label and clean-image backdoors. To address these issues, we introduce CLIP-Guided backdoor Defense (CGD), an efficient and effective method that mitigates various backdoor attacks. CGD utilizes a publicly accessible CLIP model to identify inputs that are likely to be clean or poisoned. It then retrains the model with these inputs, using CLIP's logits as guidance to effectively neutralize the backdoor. Experiments on 4 datasets and 11 attack types demonstrate that CGD reduces attack success rates (ASRs) to below 1% while maintaining clean accuracy (CA) with a maximum drop of only 0.3%, outperforming existing defenses. Additionally, we show that clean-data-based defenses can be adapted to poisoned data using CGD. CGD also exhibits strong robustness, maintaining low ASRs even when a weaker CLIP model is used or when CLIP itself is compromised by a backdoor. These findings underscore CGD's efficiency, effectiveness, and applicability in real-world backdoor defense scenarios. Code: https://github.com/binyxu/CGD.
Problem

Research questions and friction points this paper is trying to address.

Defends DNNs against diverse backdoor attacks efficiently
Uses CLIP to separate poisoned and clean training data
Maintains high clean accuracy while neutralizing backdoors effectively
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses a publicly available CLIP model to separate poisoned from clean training data
Retrains the victim model with CLIP-logits guidance
Maintains high clean accuracy and low attack success rates
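The logits-guided retraining idea above can be sketched as a distillation-style objective: cross-entropy on likely-clean labels plus a KL term pulling the student toward CLIP's softened logits. The loss form, the `alpha`/`tau` weighting, and the function name `clip_guided_loss` are assumptions for illustration; the paper's exact objective may differ:

```python
import numpy as np

def clip_guided_loss(student_logits, clip_logits, labels, alpha=0.5, tau=2.0):
    """Hypothetical CLIP-guided retraining loss: label cross-entropy
    blended with KL distillation toward CLIP's (temperature-softened) logits."""
    def softmax(z):
        z = z - z.max(axis=1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=1, keepdims=True)

    # Supervised term on the (likely clean) labels.
    p_s = softmax(student_logits)
    ce = -np.log(p_s[np.arange(len(labels)), labels] + 1e-12).mean()

    # Distillation term: KL(teacher || student) at temperature tau.
    q = softmax(clip_logits / tau)      # teacher (CLIP) distribution
    p = softmax(student_logits / tau)   # student distribution
    kl = (q * (np.log(q + 1e-12) - np.log(p + 1e-12))).sum(axis=1).mean()

    return (1 - alpha) * ce + alpha * (tau ** 2) * kl
```

When the student's logits already match CLIP's, the KL term vanishes and only the label term remains; disagreement with CLIP raises the loss, which is the sense in which CLIP's logits guide the retraining.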
Binyan Xu (The Chinese University of Hong Kong)
Fan Yang (The Chinese University of Hong Kong)
Xilin Dai (Zhejiang University)
Di Tang (Sun Yat-sen University)
Kehuan Zhang (The Chinese University of Hong Kong)
Categories: Security of Computer Systems · Web · Mobile · Cloud · Embedded System