AI Summary
Split learning is vulnerable to backdoor attacks, in which malicious clients inject hidden triggers by manipulating their embedding vectors, thereby compromising model security. To counter this threat, this work proposes SecureSplit, the first defense mechanism designed specifically for split learning. SecureSplit combines a dimensionality transformation that amplifies the divergence between benign and poisoned embeddings with an adaptive filtering strategy based on majority voting that identifies and removes contaminated data. Extensive experiments across four datasets, five distinct backdoor attacks, and seven baseline methods demonstrate that SecureSplit significantly enhances model robustness under diverse and challenging attack scenarios.
Abstract
Split Learning (SL) offers a framework for privacy-preserving collaborative model training in which participants jointly train on the same set of samples while each holds a distinct subset of the features. However, SL is susceptible to backdoor attacks, in which malicious clients subtly alter their embeddings to insert hidden triggers that compromise the final trained model. To address this vulnerability, we introduce SecureSplit, a defense mechanism tailored to SL. SecureSplit applies a dimensionality transformation strategy to accentuate subtle differences between benign and poisoned embeddings, facilitating their separation. Building on this enhanced distinction, we develop an adaptive filtering approach that uses a majority-based voting scheme to remove contaminated embeddings while preserving clean ones. Rigorous experiments across four datasets (CIFAR-10, MNIST, CINIC-10, and ImageNette), five backdoor attack scenarios, and seven alternative defenses confirm the effectiveness of SecureSplit under various challenging conditions.
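The abstract does not spell out the transformation or the voting rule, so the following is only a minimal sketch of the general idea: project embeddings into several lower-dimensional random views to accentuate outlier structure, flag embeddings that sit far from the robust center of each view, and reject an embedding only when a majority of views flag it. The function name, the random-projection choice, and the median-plus-MAD threshold are all illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def majority_vote_filter(embeddings, n_views=5, seed=0):
    """Illustrative sketch (not the paper's algorithm) of a
    SecureSplit-style filter: low-dimensional views + majority vote.

    embeddings : (n, d) array of client embedding vectors.
    Returns a boolean mask: True = keep (judged clean).
    """
    rng = np.random.default_rng(seed)
    n, d = embeddings.shape
    votes = np.zeros(n, dtype=int)
    for _ in range(n_views):
        # Assumed dimensionality transformation: a random linear
        # projection to a lower dimension, intended to accentuate
        # the divergence between clean and poisoned embeddings.
        proj = rng.normal(size=(d, max(2, d // 4)))
        z = embeddings @ proj
        # Distance of each projected embedding from the robust center.
        center = np.median(z, axis=0)
        dist = np.linalg.norm(z - center, axis=1)
        # Robust per-view threshold: median + 3 * MAD of the distances.
        mad = np.median(np.abs(dist - np.median(dist)))
        votes += (dist > np.median(dist) + 3.0 * mad).astype(int)
    # Majority vote: reject only if more than half the views flagged it.
    return votes <= n_views // 2
```

On a toy batch where a few embeddings are shifted far from the clean cluster, the shifted points accumulate flags in most views and are filtered out, while clean points survive the vote even if an individual view occasionally flags them.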