SNEAKDOOR: Stealthy Backdoor Attacks against Distribution Matching-based Dataset Condensation

📅 2026-03-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the vulnerability of distribution-matching-based dataset condensation to backdoor attacks. Existing attacks typically struggle to simultaneously achieve a high attack success rate, maintain clean-sample accuracy, and remain visually or statistically stealthy. To overcome these limitations, the authors propose Sneakdoor, a novel approach that, for the first time, integrates input-aware triggers with local feature geometric alignment. By exploiting the inherent fragility of class decision boundaries, Sneakdoor achieves dual-stage concealment, both during dataset condensation and at model inference. Extensive experiments across multiple benchmark datasets demonstrate that the method significantly improves the imperceptibility of both trigger samples and synthesized data while preserving high attack effectiveness and clean test accuracy.
📝 Abstract
Dataset condensation aims to synthesize compact yet informative datasets that retain the training efficacy of full-scale data, offering substantial gains in efficiency. Recent studies reveal that the condensation process can be vulnerable to backdoor attacks, where malicious triggers are injected into the condensation dataset, manipulating model behavior during inference. While prior approaches have made progress in balancing attack success rate and clean test accuracy, they often fall short in preserving stealthiness, especially in concealing the visual artifacts of condensed data or the perturbations introduced during inference. To address this challenge, we introduce Sneakdoor, which enhances stealthiness without compromising attack effectiveness. Sneakdoor exploits the inherent vulnerability of class decision boundaries and incorporates a generative module that constructs input-aware triggers aligned with local feature geometry, thereby minimizing detectability. This joint design enables the attack to remain imperceptible to both human inspection and statistical detection. Extensive experiments across multiple datasets demonstrate that Sneakdoor achieves a compelling balance among attack success rate, clean test accuracy, and stealthiness, substantially improving the invisibility of both the synthetic data and triggered samples while maintaining high attack efficacy. The code is available at https://github.com/XJTU-AI-Lab/SneakDoor.
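The abstract's key idea, a generative module that produces input-aware triggers rather than a fixed patch, can be illustrated with a minimal sketch. This is not the paper's implementation: a fixed random linear map stands in for the learned generator, and `eps` (an assumed L-infinity budget) stands in for whatever stealthiness constraint the authors actually use.

```python
import numpy as np

def make_trigger_generator(input_dim: int, eps: float = 8 / 255, seed: int = 0):
    """Toy input-aware trigger generator (illustrative only).

    A fixed random linear map plays the role of the learned generative
    module; `eps` bounds the perturbation so the trigger stays subtle.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((input_dim, input_dim)) / np.sqrt(input_dim)

    def generate(x: np.ndarray) -> np.ndarray:
        # The trigger depends on the input itself (input-aware):
        # different inputs receive different perturbations, unlike a
        # fixed backdoor patch that is identical across samples.
        delta = np.tanh(x @ W)              # raw perturbation in (-1, 1)
        return np.clip(delta * eps, -eps, eps)

    return generate

def poison(x: np.ndarray, generate) -> np.ndarray:
    """Apply the trigger and keep pixel values in [0, 1]."""
    return np.clip(x + generate(x), 0.0, 1.0)
```

Because the perturbation is a bounded function of the input, two different images yield two different triggers, which is what makes fixed-pattern detectors (and human inspection) less effective against this family of attacks.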
Problem

Research questions and friction points this paper is trying to address.

backdoor attacks
dataset condensation
stealthiness
distribution matching
data synthesis
Innovation

Methods, ideas, or system contributions that make the work stand out.

backdoor attack
dataset condensation
stealthiness
input-aware trigger
distribution matching
He Yang
Xi'an Jiaotong University
Federated Learning · Deep Learning · Privacy & Security
Dongyi Lv
School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China; National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, Xi’an Jiaotong University, Xi’an, China
Song Ma
School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China; National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, Xi’an Jiaotong University, Xi’an, China
Wei Xi
School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China; National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, Xi’an Jiaotong University, Xi’an, China
Jizhong Zhao
School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China; National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, Xi’an Jiaotong University, Xi’an, China