Federated Active Learning Under Extreme Non-IID and Global Class Imbalance

📅 2026-03-11
🤖 AI Summary
This work addresses low labeling efficiency and degraded model performance in federated active learning under extreme non-IID settings with global class imbalance. The authors propose FairFAL, a framework built on their finding that the choice of query model determines how class-balanced the sampled data are, particularly for minority classes. FairFAL uses lightweight prediction discrepancy analysis to infer both global class imbalance and local-global distribution shift, and selects between the global and local query models accordingly. It combines prototype-guided pseudo-labeling with a two-stage uncertainty-diversity balanced sampling strategy refined by k-center selection, improving model accuracy and class coverage on long-tailed heterogeneous data. Experiments on five benchmark datasets show consistent gains over existing methods.
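
To make the two-stage uncertainty-diversity idea concrete, here is a minimal sketch and not the authors' implementation. It assumes entropy as the uncertainty score, squared Euclidean distance in feature space, and a hypothetical `pool_factor` parameter that sets the size of the candidate pool. It omits the class-balancing step that FairFAL adds. The function names are illustrative only.

```python
import math

def entropy(probs):
    """Shannon entropy of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def two_stage_select(probs, feats, budget, pool_factor=3):
    """Stage 1: keep the budget * pool_factor most uncertain samples.
    Stage 2: greedy k-center over that candidate pool for diversity.
    Returns indices into the unlabeled pool (pool_factor is an assumption)."""
    n = len(probs)
    if n == 0 or budget <= 0:
        return []
    pool_size = min(n, budget * pool_factor)
    ranked = sorted(range(n), key=lambda i: entropy(probs[i]), reverse=True)
    candidates = ranked[:pool_size]

    # Seed the k-center set with the most uncertain candidate.
    selected = [candidates[0]]
    min_d = {i: sq_dist(feats[i], feats[selected[0]]) for i in candidates}
    while len(selected) < min(budget, pool_size):
        remaining = [i for i in candidates if i not in selected]
        # Pick the candidate farthest from everything selected so far.
        nxt = max(remaining, key=lambda i: min_d[i])
        selected.append(nxt)
        for i in candidates:
            min_d[i] = min(min_d[i], sq_dist(feats[i], feats[nxt]))
    return selected
```

With `budget=2` and `pool_factor=2`, the four most uncertain samples form the candidate pool, and the second pick is whichever candidate lies farthest from the first. Restricting k-center to an uncertainty-filtered pool keeps the selection informative while avoiding near-duplicate queries.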

📝 Abstract
Federated active learning (FAL) seeks to reduce annotation cost under privacy constraints, yet its effectiveness degrades in realistic settings with severe global class imbalance and highly heterogeneous clients. We conduct a systematic study of query-model selection in FAL and uncover a central insight: the model that achieves more class-balanced sampling, especially for minority classes, consistently leads to better final performance. Moreover, global-model querying is beneficial only when the global distribution is highly imbalanced and client data are relatively homogeneous; otherwise, the local model is preferable. Based on these findings, we propose FairFAL, an adaptive class-fair FAL framework. FairFAL (1) infers global imbalance and local-global divergence via lightweight prediction discrepancy, enabling adaptive selection between global and local query models; (2) performs prototype-guided pseudo-labeling using global features to promote class-aware querying; and (3) applies a two-stage uncertainty-diversity balanced sampling strategy with k-center refinement. Experiments on five benchmarks show that FairFAL consistently outperforms state-of-the-art approaches under challenging long-tailed and non-IID settings. The code is available at https://github.com/chenchenzong/FairFAL.
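
As a rough illustration of component (2), prototype-guided pseudo-labeling can be sketched as follows. This is a sketch under stated assumptions, not the code from the linked repository: prototypes are taken to be per-class mean feature vectors of the labeled samples, similarity is assumed to be cosine, and the features are assumed to come from the global model's encoder. All function names are hypothetical.

```python
import math

def class_prototypes(feats, labels):
    """Mean feature vector per class, computed from labeled samples."""
    sums, counts = {}, {}
    for f, y in zip(feats, labels):
        if y not in sums:
            sums[y] = [0.0] * len(f)
            counts[y] = 0
        sums[y] = [s + x for s, x in zip(sums[y], f)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def pseudo_label(feats, prototypes):
    """Assign each unlabeled feature to its most similar class prototype."""
    return [max(prototypes, key=lambda c: cosine(f, prototypes[c]))
            for f in feats]

# Usage: two labeled classes, two unlabeled points.
labeled = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
protos = class_prototypes(labeled, [0, 0, 1, 1])
guesses = pseudo_label([[2.0, 0.1], [0.2, 3.0]], protos)
```

The resulting pseudo-labels give the sampler an estimate of which class each unlabeled point belongs to, which is the information needed to steer queries toward under-represented classes.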
Problem

Research questions and friction points this paper is trying to address.

Federated Active Learning
Non-IID
Class Imbalance
Annotation Cost
Heterogeneous Clients
Innovation

Methods, ideas, or system contributions that make the work stand out.

Federated Active Learning
Class Imbalance
Non-IID
Adaptive Querying
Prototype-guided Pseudo-labeling