🤖 AI Summary
This work addresses the challenge in over-the-air federated learning (OTA-FL) systems where wireless channel superposition prevents the parameter server from observing individual client updates, making it difficult to distinguish benign gradient drift caused by non-IID data from stealthy backdoor attacks. To tackle this issue, the authors propose a two-stage robust aggregation framework. First, a modality-aware, multi-metric trust scoring mechanism is introduced, coupled with trust-based multiple access (TBMA) to classify clients. Subsequently, the parameter server performs layer-wise anomaly detection and longitudinal reputation evaluation to dynamically identify and suppress malicious updates. Experimental results demonstrate that the proposed method effectively defends against sophisticated backdoor attacks, including Neurotoxin and cosine similarity-constrained attacks, across multiple benchmark datasets while maintaining competitive performance on the main learning task.
📝 Abstract
Over-the-air federated learning (OTA-FL) improves communication efficiency by exploiting the superposition property of wireless channels, but this same property also creates a critical security vulnerability: the parameter server (PS) cannot access individual local updates, making it difficult to identify and exclude poisoned gradients. The challenge is further exacerbated under non-independent and identically distributed (non-IID) training data, where benign gradient drift can closely resemble malicious updates. In this paper, we propose a two-stage robust aggregation framework for defending against backdoor attacks in OTA-FL. Under our scheme, each client is first assigned a modality-aware multi-indicator trust score, where the specific indicators are selected according to the data modality (e.g., waveform, text, image) and model architecture to capture the most discriminative footprint of backdoor updates. Based on this score, the PS then performs trust-based multiple access (TBMA) to separate clients into trusted, suspicious, and malicious categories. Suspicious clients are further examined through PS-side layer-wise inspection and a longitudinal reputation mechanism. Experimental results on several datasets demonstrate that the proposed methodology effectively suppresses stealthy backdoor attacks, including bounded-scaling attacks, Euclidean-constrained attacks, cosine-constrained attacks, and Neurotoxin, while maintaining competitive main-task accuracy.
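To make the two ideas in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It illustrates (1) why the PS sees only a superposed sum under over-the-air aggregation and (2) how a per-client trust score could split clients into trusted, suspicious, and malicious groups. The function names `ota_aggregate` and `categorize`, the thresholds `low` and `high`, the toy updates, and the score values are all hypothetical; the abstract does not specify how scores are computed, what thresholds are used, or how TBMA is realized on the channel.

```python
import random


def ota_aggregate(updates, noise_std=0.0, seed=0):
    """Simulate over-the-air aggregation (illustrative assumption).

    The server receives only the element-wise sum of all transmitted
    updates plus channel noise, never the individual vectors.
    """
    rng = random.Random(seed)
    dim = len(updates[0])
    return [
        sum(u[i] for u in updates) + rng.gauss(0.0, noise_std)
        for i in range(dim)
    ]


def categorize(trust_scores, low=0.3, high=0.7):
    """Split clients into three groups by trust score.

    The thresholds `low` and `high` are hypothetical placeholders,
    not values taken from the paper.
    """
    groups = {"trusted": [], "suspicious": [], "malicious": []}
    for client_id, score in trust_scores.items():
        if score >= high:
            groups["trusted"].append(client_id)
        elif score >= low:
            groups["suspicious"].append(client_id)
        else:
            groups["malicious"].append(client_id)
    return groups


# Toy local updates and made-up trust scores for three clients.
updates = {"a": [1.0, 2.0], "b": [0.5, -1.0], "c": [10.0, 10.0]}
trust_scores = {"a": 0.9, "b": 0.5, "c": 0.1}

groups = categorize(trust_scores)

# Only the trusted group transmits in this aggregation slot, so the
# superposed signal excludes the suspicious and malicious clients.
trusted_sum = ota_aggregate([updates[c] for c in groups["trusted"]])
```

In this sketch, grouping clients before transmission is what lets the server treat each superposed sum differently, since it cannot separate individual contributions afterwards. The layer-wise inspection and reputation tracking described for suspicious clients are not modeled here.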