SifterNet: A Generalized and Model-Agnostic Trigger Purification Approach

📅 2025-05-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of defending CNNs and Vision Transformers (ViTs) against backdoor attacks in black-box settings. We propose a universal trigger purification method that requires no knowledge of the target model, no clean samples, and no model retraining. Our approach brings the Ising model into backdoor defense and uses the memorization-association mechanism of Hopfield networks to build a lightweight, model-agnostic framework for modeling trigger patterns and purifying them adaptively. Evaluated on multiple benchmark datasets, the method consistently outperforms state-of-the-art defenses: it achieves a 12.6%–23.4% improvement in trigger removal success rate while preserving the original classification accuracy. To our knowledge, this is the first black-box backdoor purification paradigm grounded in statistical-physics modeling, opening a theoretically principled direction for securing vision models against backdoors.

📝 Abstract
To resist backdoor attacks on convolutional neural networks and vision Transformer-based large models, this paper proposes a generalized, model-agnostic trigger-purification approach based on the classic Ising model. Existing trigger detection and removal studies usually require detailed knowledge of the target model in advance, access to a large number of clean samples, or even authorization to retrain the model. These requirements are highly inconvenient in practice, especially when the target model is inaccessible. An ideal countermeasure should eliminate the implanted trigger regardless of what the target model is. To this end, we propose SifterNet, a lightweight black-box defense that leverages the memorization-association functionality of the Hopfield network to purify the triggers in input samples. The main novelty of the approach lies in introducing the ideas of the Ising model. Extensive experiments validate its effectiveness in terms of trigger purification and accuracy, and on several commonly used datasets SifterNet significantly outperforms state-of-the-art baselines.
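For readers unfamiliar with the two models named in the abstract, the block below states their standard textbook forms. It is background only and an assumption about notation: the abstract does not give the exact energy function or update rule used by SifterNet, and the symbols (spins s_i, couplings J_ij, external field h_i, stored patterns ξ^μ, pattern count P) are the conventional ones, not taken from the paper.

```latex
% Ising energy of a spin configuration s, with s_i in {-1, +1}
E(\mathbf{s}) = -\sum_{i<j} J_{ij}\, s_i s_j \;-\; \sum_i h_i s_i

% Hopfield network: couplings learned from P stored patterns (Hebbian rule)
J_{ij} = \frac{1}{P} \sum_{\mu=1}^{P} \xi_i^{\mu} \xi_j^{\mu}, \qquad J_{ii} = 0

% Asynchronous recall: each update never increases E (taking h_i = 0)
s_i \leftarrow \operatorname{sgn}\!\Big( \sum_{j \neq i} J_{ij}\, s_j \Big)
```

A Hopfield network is an Ising system whose couplings encode memories, so a corrupted input relaxes toward the nearest stored pattern as its energy decreases. That relaxation is the memorization-association behavior the abstract refers to.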
Problem

Research questions and friction points this paper is trying to address.

Resisting backdoor attacks in CNN and Transformer models
Purifying triggers without target model knowledge
Achieving high accuracy without model retraining
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Ising model for trigger purification
Leverages Hopfield network memorization-association
Model-agnostic black-box defense approach
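The memorization-association idea listed above can be illustrated with a classical binary Hopfield network. The following is a minimal sketch, not the SifterNet architecture: the function names (`train_hopfield`, `energy`, `recall`), the 16-element toy patterns, and the three-entry "trigger" are all hypothetical choices made for illustration. It stores two clean patterns, stamps a small patch onto one of them, and lets the network relax back to the stored memory while the Ising-style energy drops.

```python
import random


def train_hopfield(patterns):
    """Hebbian rule: W[i][j] = (1/P) * sum_p p[i] * p[j], with a zero diagonal."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j] / len(patterns)
    return weights


def energy(weights, state):
    """Ising-style energy E = -1/2 * sum_ij W[i][j] * s_i * s_j (no external field)."""
    n = len(state)
    return -0.5 * sum(weights[i][j] * state[i] * state[j]
                      for i in range(n) for j in range(n))


def recall(weights, state, max_sweeps=10, seed=0):
    """Asynchronous sign updates, repeated until a full sweep flips nothing."""
    rng = random.Random(seed)
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        order = list(range(n))
        rng.shuffle(order)  # visit units in random order
        for i in order:
            field = sum(weights[i][j] * state[j] for j in range(n))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


# Two orthogonal toy "clean" patterns with entries in {-1, +1} (hypothetical data).
pattern_a = [1] * 8 + [-1] * 8
pattern_b = [1, -1] * 8
weights = train_hopfield([pattern_a, pattern_b])

# Simulate a trigger by flipping a small patch of pattern_a.
triggered = list(pattern_a)
for idx in (0, 1, 2):
    triggered[idx] = -triggered[idx]

purified = recall(weights, triggered)
print("recovered clean pattern:", purified == pattern_a)
print("energy before:", energy(weights, triggered))
print("energy after: ", energy(weights, purified))
```

Asynchronous updates are used because, with symmetric weights and a zero diagonal, each single-unit update cannot raise the energy, so the state settles into a stored pattern or another local minimum. Note that this toy network never queries a classifier, which mirrors the black-box, model-agnostic setting described above; how SifterNet actually builds and trains its network is not specified in this summary.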
Authors
Shaoye Luo
Institute of Computing Technology, Chinese Academy of Sciences; University of the Chinese Academy of Sciences
Xinxin Fan
IoTeX - Building MachineFi For Web3
Applied Cryptography, Blockchain, Web3, IoT Security, Confidential Computing
Quanliang Jing
Institute of Computing Technology, Chinese Academy of Sciences; University of the Chinese Academy of Sciences
Chi Lin
School of Software Technology, Dalian University of Technology
Wireless Sensor Networks, Cyber Physical System, Pervasive Computing
Mengfan Li
Institute of Computing Technology, Chinese Academy of Sciences; University of the Chinese Academy of Sciences
Yunfeng Lu
Beihang University
Yongjun Xu
Institute of Computing Technology, Chinese Academy of Sciences; University of the Chinese Academy of Sciences