🤖 AI Summary
In federated learning, anomalous or corrupted clients (for example, clients with faulty sensors or non-representative data distributions) can significantly degrade global model performance, and detecting them is difficult because raw client data is inaccessible. This paper proposes WAFFLE, a framework for *pre-training* anomaly detection: each client locally computes a compact, task-agnostic representation using the Wavelet Scattering Transform (WST) or the Fourier Transform, and a lightweight detector trained on a distilled public dataset labels clients from these embeddings before training starts, requiring neither raw data nor model updates. Both transforms enable effective detection; WST additionally offers non-invertibility and stability to local deformations, which makes it particularly well suited to federated settings. Experiments on benchmark datasets show that WAFFLE improves detection accuracy and downstream classification performance compared to existing FL anomaly detection algorithms.
📝 Abstract
Federated Learning (FL) enables the training of machine learning models across decentralized clients while preserving data privacy. However, the presence of anomalous or corrupted clients, such as those with faulty sensors or non-representative data distributions, can significantly degrade model performance. Detecting such clients without accessing raw data remains a key challenge. We propose WAFFLE (Wavelet and Fourier representations for Federated Learning), a detection algorithm that labels malicious clients *before training*, using locally computed compressed representations derived from either the Wavelet Scattering Transform (WST) or the Fourier Transform. Both approaches provide low-dimensional, task-agnostic embeddings suitable for unsupervised client separation. A lightweight detector, trained on a distilled public dataset, performs the labeling with minimal communication and computational overhead. While both transforms enable effective detection, WST offers theoretical advantages, such as non-invertibility and stability to local deformations, that make it particularly well suited to federated scenarios. Experiments on benchmark datasets show that our method improves detection accuracy and downstream classification performance compared to existing FL anomaly detection algorithms, validating its effectiveness as a pre-training alternative to online detection strategies.
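To make the pre-training idea concrete, the following is a minimal sketch of the Fourier variant, not the authors' implementation. Each simulated client compresses its local signals into a short vector of averaged low-frequency magnitudes, and only that vector is shared. The abstract describes a detector trained on a distilled public dataset; since its details are not given here, the sketch substitutes a simple median/MAD outlier rule as a stand-in. All names (`fourier_embedding`, `flag_anomalous`, `make_client`), the synthetic sine data, and the parameter values are assumptions made for illustration.

```python
import numpy as np


def fourier_embedding(signals: np.ndarray, n_coeffs: int = 16) -> np.ndarray:
    """Compress local signals of shape (n_samples, length) into one vector.

    Keeps only the magnitudes of the first n_coeffs frequency bins (phase is
    discarded) and averages their log over the client's samples.
    """
    spectrum = np.abs(np.fft.rfft(signals, axis=1))
    low_freq = spectrum[:, :n_coeffs]
    return np.log1p(low_freq).mean(axis=0)


def flag_anomalous(embeddings: np.ndarray, z_thresh: float = 3.5) -> np.ndarray:
    """Stand-in detector (assumption): flag clients far from the median embedding.

    Uses a robust z-score of each client's distance to the coordinate-wise
    median, scaled by the median absolute deviation (MAD).
    """
    center = np.median(embeddings, axis=0)
    dist = np.linalg.norm(embeddings - center, axis=1)
    med = np.median(dist)
    mad = np.median(np.abs(dist - med)) + 1e-12  # avoid division by zero
    robust_z = 0.6745 * (dist - med) / mad
    return robust_z > z_thresh


def make_client(rng: np.random.Generator, corrupted: bool,
                n_samples: int = 64, length: int = 128) -> np.ndarray:
    """Hypothetical client data: a noisy sine; corrupted clients add heavy noise."""
    t = np.arange(length)
    data = np.sin(2 * np.pi * 4 * t / length)
    data = data + 0.1 * rng.standard_normal((n_samples, length))
    if corrupted:
        data = data + 2.0 * rng.standard_normal((n_samples, length))  # faulty sensor
    return data


rng = np.random.default_rng(0)
corrupted_ids = {3, 7}  # simulated faulty clients

# Each client computes its embedding locally; raw signals never leave the client.
embeddings = np.stack([
    fourier_embedding(make_client(rng, i in corrupted_ids)) for i in range(10)
])

# The server labels clients from the embeddings alone, before any training round.
flags = flag_anomalous(embeddings)
print("flagged clients:", np.flatnonzero(flags))
```

In this toy setting the two corrupted clients lie far from the rest in embedding space, so the rule should flag them. With so few clients the MAD estimate is noisy, so the threshold is a tuning choice rather than a recommendation. Swapping `fourier_embedding` for a wavelet scattering transform would follow the same client-side pattern.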