🤖 AI Summary
Existing federated learning (FL) benchmarks inadequately support fair evaluation and cross-scenario validation of heterogeneous federated learning (HtFL), primarily because of severe data and model heterogeneity across clients and the absence of a unified evaluation framework. Method: The authors propose HtFLlib, the first open-source HtFL benchmark library, integrating 12 cross-domain datasets, 40 model architectures spanning three modalities, and 10 representative HtFL methods. It is a modular, easy-to-extend Python framework that supports collaboration among heterogeneous models (e.g., via knowledge distillation), multimodal data loading, and customizable federated scheduling, together with a standardized evaluation suite quantifying accuracy, convergence behavior, and computation/communication costs. Contribution/Results: Systematic evaluations across diverse scenarios, including medical imaging and sensor-signal data, highlight the effectiveness and robustness of state-of-the-art HtFL methods. The codebase is publicly released.
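The knowledge-distillation approach mentioned above can be illustrated in miniature: heterogeneous clients cannot average weights, but they can all emit predictions on shared public data and distill toward the consensus (FedMD-style). The sketch below uses toy linear "models" as stand-ins for heterogeneous architectures; all names are illustrative and are not HtFLlib's API.

```python
import numpy as np

rng = np.random.default_rng(0)

class ToyClient:
    """Stand-in for one client's private model. Real architectures differ
    per client, but every client can emit logits on shared public data."""
    def __init__(self, dim, classes):
        self.W = rng.normal(size=(dim, classes)) * 0.1

    def logits(self, X):
        return X @ self.W

    def distill(self, X, consensus, lr=0.1):
        # One gradient step on the MSE between own logits and the consensus.
        err = self.logits(X) - consensus          # shape (n, classes)
        self.W -= lr * (X.T @ err) / len(X)

def logit_spread(clients, X):
    # Std of logits across clients, averaged; lower means better aligned.
    return np.stack([c.logits(X) for c in clients]).std(axis=0).mean()

public_X = rng.normal(size=(32, 8))               # shared, unlabeled public data
clients = [ToyClient(dim=8, classes=4) for _ in range(3)]

before = logit_spread(clients, public_X)
for _ in range(20):                               # federated distillation rounds
    # Server aggregates knowledge as the mean of client logits...
    consensus = np.mean([c.logits(public_X) for c in clients], axis=0)
    # ...and each client distills toward it locally, keeping its own weights.
    for c in clients:
        c.distill(public_X, consensus)
after = logit_spread(clients, public_X)

print(f"logit spread before={before:.4f} after={after:.4f}")
```

Only logits on the public set travel between clients and server, which is why such methods sidestep architecture mismatch and can cut communication relative to exchanging full model weights.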
📝 Abstract
As AI evolves, collaboration among heterogeneous models helps overcome data scarcity by enabling knowledge transfer across institutions and devices. Traditional Federated Learning (FL) supports only homogeneous models, limiting collaboration among clients with heterogeneous model architectures. To address this, Heterogeneous Federated Learning (HtFL) methods have been developed to enable collaboration across diverse heterogeneous models while simultaneously tackling data heterogeneity. However, a comprehensive benchmark for the standardized evaluation and analysis of the rapidly growing body of HtFL methods is lacking. First, highly varied datasets, model heterogeneity scenarios, and method implementations hinder easy and fair comparisons among HtFL methods. Second, the effectiveness and robustness of HtFL methods are under-explored in various scenarios, such as the medical domain and the sensor-signal modality. To fill this gap, we introduce the first Heterogeneous Federated Learning Library (HtFLlib), an easy-to-use and extensible framework that integrates multiple datasets and model heterogeneity scenarios, offering a robust benchmark for research and practical applications. Specifically, HtFLlib integrates (1) 12 datasets spanning various domains, modalities, and data heterogeneity scenarios; (2) 40 model architectures, ranging from small to large, across three modalities; (3) a modularized and easy-to-extend HtFL codebase with implementations of 10 representative HtFL methods; and (4) systematic evaluations in terms of accuracy, convergence, computation costs, and communication costs. We highlight the advantages and potential of state-of-the-art HtFL methods and hope that HtFLlib will catalyze HtFL research and enable its broader application. The code is released at https://github.com/TsingZ0/HtFLlib.
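A "modularized and easy-to-extend" benchmark codebase typically decouples method implementations from the evaluation loop via a plugin registry, so a new HtFL method can be added without modifying existing code. The following is a generic sketch of that pattern, not HtFLlib's actual API; every name here (`register`, `BaseMethod`, `evaluate`) is hypothetical.

```python
# Hypothetical plugin-registry pattern for an extensible FL benchmark.
# NOT HtFLlib's real interface; a sketch of how such modularity can work.
METHOD_REGISTRY = {}

def register(name):
    """Class decorator: map a config-file string to a method class."""
    def wrap(cls):
        METHOD_REGISTRY[name] = cls
        return cls
    return wrap

class BaseMethod:
    """Common interface every benchmarked method must implement."""
    def run_round(self, clients):
        raise NotImplementedError

@register("local_only")
class LocalOnly(BaseMethod):
    """Baseline: each client trains alone; no knowledge exchange."""
    def run_round(self, clients):
        return {c: f"{c}:local_step" for c in clients}

def evaluate(method_name, clients):
    # The evaluation loop only knows the registry key, never the class,
    # so new methods plug in without touching this function.
    method = METHOD_REGISTRY[method_name]()
    return method.run_round(clients)

print(evaluate("local_only", ["clientA", "clientB"]))
```

Under this kind of design, adding an eleventh method is a new decorated class plus a config entry, which is what makes systematic, like-for-like comparison across many methods practical.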