FL-PBM: Pre-Training Backdoor Mitigation for Federated Learning

📅 2026-03-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the critical threat posed by backdoor attacks during the pre-training phase of federated learning, which severely compromise model integrity. To counter this, the authors propose FL-PBM, a novel defense mechanism that proactively sanitizes poisoned data during client-side pre-training. FL-PBM establishes a detection baseline by injecting benign triggers, then leverages PCA for feature extraction and Gaussian Mixture Model (GMM) clustering to identify suspicious samples. It further employs targeted blurring to neutralize potential backdoor triggers. Notably, FL-PBM is the first approach to achieve lossless and efficient backdoor defense in federated pre-training through benign-trigger guidance. Experimental results demonstrate that FL-PBM reduces backdoor attack success rates by up to 95% compared to FedAvg, outperforms the state-of-the-art defenses RDFL and LPSF by 30%–80%, and maintains clean-data accuracy above 90%.
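The PCA-plus-GMM detection step described above can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the flattened feature matrix `X`, the two-cluster assumption, and the "minority cluster is suspicious" heuristic are all assumptions of this sketch.

```python
# Hypothetical sketch of the detection stage: project features with PCA,
# cluster them with a Gaussian Mixture Model, and flag the minority cluster
# as potentially poisoned (poisoned samples are assumed to be rare).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture


def flag_suspicious(X, n_components=2, n_clusters=2, seed=0):
    """Return a boolean mask marking samples in the smaller GMM cluster."""
    Z = PCA(n_components=n_components, random_state=seed).fit_transform(X)
    gmm = GaussianMixture(n_components=n_clusters, random_state=seed).fit(Z)
    labels = gmm.predict(Z)
    # Heuristic (an assumption of this sketch): the minority cluster holds
    # the suspicious samples.
    counts = np.bincount(labels, minlength=n_clusters)
    return labels == np.argmin(counts)


# Toy demo: 95 clean samples plus 5 strongly shifted "triggered" samples.
rng = np.random.default_rng(0)
clean = rng.normal(0.0, 1.0, size=(95, 10))
poisoned = rng.normal(6.0, 1.0, size=(5, 10))
X = np.vstack([clean, poisoned])
mask = flag_suspicious(X)
print(mask.sum())  # expect the 5 shifted samples to be flagged
```

In practice the features would come from the client's local images (e.g. flattened pixels or intermediate activations), and the cluster count and PCA dimensionality would need tuning per dataset.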
📝 Abstract
Backdoor attacks pose a significant threat to the integrity and reliability of Artificial Intelligence (AI) models, enabling adversaries to manipulate model behavior by injecting poisoned data with hidden triggers. These attacks can lead to severe consequences, especially in critical applications such as autonomous driving, healthcare, and finance. Detecting and mitigating backdoor attacks is crucial across all phases of a model's lifecycle: pre-training, in-training, and post-training. In this paper, we propose Pre-Training Backdoor Mitigation for Federated Learning (FL-PBM), a novel defense mechanism that proactively filters poisoned data on the client side before model training in a federated learning (FL) environment. The approach consists of four stages: (1) inserting a benign trigger into the data to establish a controlled baseline; (2) applying Principal Component Analysis (PCA) to extract discriminative features and assess the separability of the data; (3) performing Gaussian Mixture Model (GMM) clustering to identify potentially malicious samples based on their distribution in the PCA-transformed space; and (4) applying a targeted blurring technique to disrupt potential backdoor triggers. Together, these steps ensure that suspicious data is detected early and sanitized effectively, minimizing the influence of backdoor triggers on the global model. Experimental evaluations on image-based datasets demonstrate that FL-PBM reduces attack success rates by up to 95% compared to baseline federated learning (FedAvg) and by 30%–80% relative to state-of-the-art defenses (RDFL and LPSF). At the same time, it maintains over 90% clean-model accuracy in most experiments, achieving stronger mitigation without degrading model performance.
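The targeted-blurring stage (stage 4) can likewise be sketched. The fixed corner patch standing in for the suspected trigger location and the naive 3×3 box blur are both assumptions of this sketch; the paper's actual kernel and region selection may differ.

```python
# Hypothetical sketch of targeted blurring: smooth only a suspected trigger
# region so a pixel-pattern trigger is disrupted while the rest of the image
# is left untouched (pure NumPy, no SciPy dependency).
import numpy as np


def box_blur(patch):
    """Naive 3x3 box blur with edge replication."""
    p = np.pad(patch, 1, mode="edge")
    h, w = patch.shape
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0


def blur_region(img, box):
    """Blur only the pixels inside box = (top, left, height, width)."""
    t, l, h, w = box
    out = img.astype(float).copy()
    out[t:t + h, l:l + w] = box_blur(out[t:t + h, l:l + w])
    return out


# Toy image with a 3x3 white-square "trigger" in the top-left corner;
# blurring the surrounding 4x4 region smears the trigger pattern out.
img = np.zeros((8, 8))
img[0:3, 0:3] = 1.0
cleaned = blur_region(img, (0, 0, 4, 4))
```

After blurring, the sharp trigger edge is attenuated (e.g. `cleaned[2, 2]` drops well below 1.0) while pixels outside the region, such as `cleaned[5, 5]`, are unchanged.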
Problem

Research questions and friction points this paper is trying to address.

backdoor attacks
federated learning
pre-training mitigation
data poisoning
model integrity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Federated Learning
Backdoor Mitigation
Pre-Training Defense
PCA-GMM Clustering
Trigger Sanitization