AI Summary
This work addresses the significant performance degradation of federated learning under data heterogeneity and client-side label noise, where conventional loss-based noise identification methods often prove unreliable. The authors propose a novel "representation geometry-first" principle, introducing spherical representations and a von Mises–Fisher (vMF) mixture model into federated noisy-label learning for the first time. By leveraging self-supervision to construct label-agnostic spherical features, the method iteratively fits a vMF mixture model to capture semantic clusters. It further integrates a soft semantic-to-label mapping mechanism with personalized noise absorption matrices to robustly identify noisy samples. Extensive experiments demonstrate that this approach consistently outperforms state-of-the-art methods across diverse noisy and heterogeneous settings, substantially enhancing both accuracy and robustness in federated learning.
Abstract
Federated learning (FL) suffers from performance degradation due to the inevitable presence of noisy annotations in distributed scenarios. Existing approaches distinguish noisy samples from the rest of the dataset by their loss values and then correct their labels. However, noisy-sample recognition that relies on a scalar loss is unreliable for FL under heterogeneous scenarios. In this paper, we rethink this paradigm from a representation perspective and propose \method~(\textbf{Fed}erated under \textbf{R}epresentation \textbf{G}eometry), which follows \textbf{the principle of ``representation geometry priority''} to recognize noisy labels. First, \method~creates label-agnostic spherical representations using self-supervision. It then iteratively fits a spherical von Mises-Fisher (vMF) mixture model to this geometry using previously identified clean samples to capture semantic clusters. This geometric evidence is integrated with a semantic-label soft mapping mechanism to derive a distribution divergence between the label-free feature space and the annotated-label-conditioned feature space, which robustly identifies noisy samples; the vMF mixture model is then updated with the newly separated clean dataset. Finally, we apply an additional personalized noise absorption matrix to noisy labels to achieve robust optimization. Extensive experimental results demonstrate that \method~significantly outperforms state-of-the-art methods for FL with data heterogeneity under diverse noisy-client scenarios.
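To make the iterative vMF mixture fitting step more concrete, the following is a minimal sketch of expectation-maximization on the unit sphere. It is not the authors' implementation: it assumes a single concentration `kappa` shared by all components and held fixed (a simplification under which the vMF normalizing constant cancels in the posterior), and it omits the semantic-label soft mapping, the noise absorption matrix, and any federated aggregation. The function names `normalize`, `responsibilities`, `m_step`, and `fit` are hypothetical.

```python
import math

def normalize(v):
    """Project a vector onto the unit sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def responsibilities(x, means, weights, kappa):
    """E-step: posterior over mixture components for one unit vector x.

    Assumption: kappa is shared by all components, so the vMF normalizer
    is the same for each and cancels, leaving a softmax over
    kappa * <mu_k, x> + log(pi_k).
    """
    logits = [
        kappa * sum(m_i * x_i for m_i, x_i in zip(m, x)) + math.log(w)
        for m, w in zip(means, weights)
    ]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def m_step(points, resp):
    """M-step: update mixing weights and mean directions (kappa stays fixed)."""
    k, dim = len(resp[0]), len(points[0])
    means, weights = [], []
    for j in range(k):
        mass = sum(r[j] for r in resp)
        weights.append(mass / len(points))
        summed = [sum(r[j] * p[i] for r, p in zip(resp, points)) for i in range(dim)]
        means.append(normalize(summed))  # mean direction is the normalized weighted sum
    return means, weights

def fit(points, init_means, kappa=10.0, iters=20):
    """Alternate E- and M-steps for a fixed number of iterations."""
    points = [normalize(p) for p in points]
    means = [normalize(m) for m in init_means]
    weights = [1.0 / len(means)] * len(means)
    resp = []
    for _ in range(iters):
        resp = [responsibilities(p, means, weights, kappa) for p in points]
        means, weights = m_step(points, resp)
    return means, weights, resp

# Toy 2-D features forming two directional clusters.
points = [[1, 0.1], [1, -0.1], [0.9, 0.05], [0.1, 1], [-0.1, 1], [0.05, 0.9]]
means, weights, resp = fit(points, init_means=[[1, 0.5], [0.5, 1]])
```

In this toy setting each row of `resp` is a soft assignment of one feature to the semantic clusters. A full vMF fit would also estimate a per-component concentration, which requires the modified Bessel function in the normalizer and is left out here.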