Filter, Obstruct and Dilute: Defending Against Backdoor Attacks on Semi-Supervised Learning

๐Ÿ“… 2025-02-09
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
Semi-supervised learning (SSL) is highly vulnerable to data-poisoning backdoor attacks: even minimal contamination of the training data can yield attack success rates of up to 90%. This work first systematically shows that backdoor attacks on SSL succeed by exploiting spurious correlations between triggers and target classes. Building on this insight, we propose a novel end-to-end defense framework, Backdoor Invalidator (BI), comprising three stages: *filter*, *obstruct*, and *dilute*. Specifically, Gaussian filtering suppresses trigger-responsive activations; complementary learning decouples semantic features from backdoor features; and trigger mix-up augmentation dilutes spurious trigger–target associations. The method comes with a theoretical generalization guarantee and incurs no clean-accuracy degradation. Across multiple state-of-the-art backdoor attacks, it reduces the average attack success rate from 84.7% to 1.8%, significantly outperforming existing defenses.

๐Ÿ“ Abstract
Recent studies have verified that semi-supervised learning (SSL) is vulnerable to data-poisoning backdoor attacks. Even a tiny fraction of contaminated training data is sufficient for adversaries to manipulate up to 90% of the test outputs in existing SSL methods. Given the emerging threat of backdoor attacks designed for SSL, this work aims to protect SSL against such risks, marking it as one of the few known efforts in this area. Specifically, we begin by identifying that the spurious correlations between the backdoor triggers and the target class implanted by adversaries are the primary cause of manipulated model predictions during the test phase. To disrupt these correlations, we utilize three key techniques: Gaussian filtering, complementary learning, and trigger mix-up, which collectively filter, obstruct, and dilute the influence of backdoor attacks in both data pre-processing and feature learning. Experimental results demonstrate that our proposed method, Backdoor Invalidator (BI), significantly reduces the average attack success rate from 84.7% to 1.8% across different state-of-the-art backdoor attacks. It is also worth mentioning that BI does not sacrifice accuracy on clean data and is supported by a theoretical guarantee of its generalization capability.
Problem

Research questions and friction points this paper is trying to address.

Defends against backdoor attacks in semi-supervised learning.
Reduces spurious correlations caused by backdoor triggers.
Maintains accuracy on clean data during defense.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian Filter for data pre-processing
Complementary learning to obstruct attacks
Trigger mix-up to dilute backdoor influence
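The two data-level techniques above can be sketched in a few lines. The following is a minimal NumPy illustration, not the paper's implementation: the kernel size, sigma, and Beta mixing coefficient are illustrative assumptions. Gaussian filtering acts as a low-pass step that attenuates high-frequency trigger patterns, while mix-up forms convex combinations of inputs so that any trigger is spread across differently labeled samples, diluting the trigger–target correlation.

```python
import numpy as np

def gaussian_kernel1d(sigma=1.0, radius=2):
    # 1-D Gaussian kernel, normalized to sum to 1.
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def gaussian_filter_image(img, sigma=1.0, radius=2):
    # Separable Gaussian blur over the rows and columns of an (H, W) image:
    # a low-pass pre-processing step that suppresses high-frequency
    # trigger patterns. Parameters here are illustrative, not the paper's.
    k = gaussian_kernel1d(sigma, radius)
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    blurred = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, blurred)
    return blurred

def trigger_mixup(x, x_other, alpha=0.5):
    # Convex combination of two inputs with a Beta-distributed coefficient,
    # as in standard mix-up. In the paper's spirit, this dilutes the
    # spurious association between a trigger patch and the target class.
    lam = np.random.beta(alpha, alpha)
    return lam * x + (1 - lam) * x_other
```

Because the blur is a separable convolution, it costs O(HW·r) per pass rather than O(HW·r²) for a full 2-D kernel; the mix-up step adds no model changes and composes with any SSL training loop.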
๐Ÿ”Ž Similar Papers
No similar papers found.
Xinrui Wang: College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics; MIIT Key Laboratory of Pattern Analysis and Machine Intelligence
Chuanxing Geng: Nanjing University of Aeronautics and Astronautics
Wenhai Wan: School of Computer Science and Technology, Huazhong University of Science and Technology
Shao-yuan Li: College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics; MIIT Key Laboratory of Pattern Analysis and Machine Intelligence
Songcan Chen: Nanjing University of Aeronautics & Astronautics