Mellivora Capensis: A Backdoor-Free Training Framework on the Poisoned Dataset without Auxiliary Data

📅 2024-05-21
📈 Citations: 1
Influential: 0
🤖 AI Summary
Backdoor attacks in online data collection scenarios pose severe security risks by hijacking models, yet existing defenses rely on clean or auxiliary data, suffer from poor generalizability, and are vulnerable to adaptive attacks. Method: We propose the first robust training framework requiring neither clean nor auxiliary data. Leveraging the newly discovered anomalous perturbation robustness of poisoned samples, we establish a clean-data-agnostic backdoor immunity paradigm. Our approach integrates theory-driven perturbation robustness modeling, trigger-perturbation disentanglement, unsupervised confidence estimation, and adaptive weighted training for end-to-end defense. Results: Evaluated against mainstream attacks—including BadNets, Blend, and SIG—our method achieves >98% backdoor removal rate with <1.2% task accuracy degradation, significantly outperforming state-of-the-art defenses across diverse benchmarks and threat models.
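The perturbation-robustness signal at the core of this pipeline can be sketched in a few lines. This is a minimal illustration under assumptions, not the authors' implementation: the toy identity-logit model, the Gaussian noise scale, and the sigmoid weighting function are all hypothetical stand-ins. It scores each sample by how often its predicted label survives random input noise (poisoned samples are observed to be unusually stable) and then down-weights high-scoring samples during training.

```python
import numpy as np

def perturbation_robustness(model, x, n_trials=50, sigma=0.2, seed=0):
    """Fraction of Gaussian-perturbed copies of x whose predicted label
    matches the clean prediction. Per the paper's key observation,
    poisoned samples keep their trigger-driven label under perturbation
    far more often than clean ones, so a high score is suspicious."""
    rng = np.random.default_rng(seed)
    base = int(np.argmax(model(x)))
    hits = 0
    for _ in range(n_trials):
        noisy = x + rng.normal(0.0, sigma, size=x.shape)
        hits += int(np.argmax(model(noisy)) == base)
    return hits / n_trials

def adaptive_weights(scores, temperature=5.0):
    """Map per-sample robustness scores to loss weights: the more robust
    (suspicious) a sample, the smaller its weight. The sigmoid form and
    temperature are illustrative choices, not the paper's exact scheme."""
    s = np.asarray(scores, dtype=float)
    return 1.0 / (1.0 + np.exp(temperature * (s - s.mean())))

# Toy 2-class "model": identity logits, purely for illustration.
def toy_model(x):
    return x

# A sample far from the decision boundary mimics a poisoned one (its
# label survives noise); one near the boundary mimics a borderline
# clean sample and flips under perturbation.
robust = perturbation_robustness(toy_model, np.array([5.0, 0.0]))
fragile = perturbation_robustness(toy_model, np.array([0.55, 0.5]), seed=1)
weights = adaptive_weights([robust, fragile])
```

In the full framework this unsupervised score would feed a confidence estimate per training sample, and the weighted loss would suppress the influence of likely-poisoned points end to end.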

📝 Abstract
The efficacy of deep learning models is profoundly influenced by the quality of their training data. Given considerations of data diversity, data scale, and annotation expense, model trainers frequently resort to acquiring datasets from online repositories. Although economically pragmatic, this strategy exposes models to substantial security vulnerabilities: untrusted entities can clandestinely embed triggers within the dataset, hijacking any model trained on it through backdoor attacks, which constitutes a grave security concern. Despite the proliferation of countermeasure research, inherent limitations constrain the effectiveness of existing defenses in practice, including the requirement for substantial quantities of clean samples, inconsistent defense performance across attack scenarios, and inadequate resilience against adaptive attacks. In this paper, we therefore address the challenges of backdoor attack countermeasures in real-world scenarios, fortifying the security of the data-collection training paradigm. Concretely, we first explore the inherent relationship between potential perturbations and the backdoor trigger, and demonstrate through theoretical analysis and experiments the key observation that poisoned samples are more robust to perturbation than clean ones. Based on these explorations, we propose a robust and clean-data-free backdoor defense framework, namely Mellivora Capensis (MeCa), which enables the model trainer to train a clean model on a poisoned dataset.
Problem

Research questions and friction points this paper is trying to address.

Addresses backdoor attack vulnerabilities in deep learning training datasets
Eliminates need for clean auxiliary data during backdoor defense
Enables secure model training directly on poisoned datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Backdoor-free training without auxiliary data
Leverages poisoned samples' robustness to perturbation
Enables clean model training on poisoned datasets