FedEP: Tailoring Attention to Heterogeneous Data Distribution with Entropy Pooling for Decentralized Federated Learning

📅 2024-10-10
🏛️ arXiv.org
📈 Citations: 0
Influential: 0

🤖 AI Summary
To address the node drift, slow convergence, and low accuracy that non-IID data causes in decentralized federated learning (DFL), this paper proposes FedEP, an algorithm that brings entropy pooling, a technique originally from finance, into DFL for the first time. FedEP uses Gaussian Mixture Models (GMMs) to model local data distributions and aggregates knowledge via lightweight statistical parameters rather than raw gradients or model weights, so nodes can reach a distributed consensus without exposing their original data. The method keeps data strictly local, achieves ~40% lower communication overhead than state-of-the-art (SOTA) approaches, and preserves both privacy and global model consistency. Experiments show that FedEP converges faster and attains higher test accuracy than existing SOTA methods across diverse non-IID settings.

📝 Abstract
Non-Independent and Identically Distributed (non-IID) data in Federated Learning (FL) causes client drift, leading to slower convergence and reduced model performance. Existing approaches mitigate this issue in Centralized FL (CFL) by relying on a central server, but Decentralized FL (DFL) remains underexplored. In DFL, the absence of a central entity leaves nodes without a global view of the federation, which further intensifies the challenges of non-IID data. Drawing on the entropy pooling algorithm used in finance to synthesize diverse investment opinions, this work proposes the Federated Entropy Pooling (FedEP) algorithm to mitigate the non-IID challenge in DFL. FedEP fits Gaussian Mixture Models (GMMs) to local data distributions and shares the resulting statistical parameters among neighboring nodes to estimate the global distribution. Aggregation weights are then determined by applying the entropy pooling approach to the local and global distributions. By sharing only synthetic distribution information, FedEP preserves data privacy while minimizing communication overhead. Experimental results demonstrate that FedEP achieves faster convergence and outperforms state-of-the-art methods in various non-IID settings.
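
The pipeline described in the abstract (summarize local data, exchange statistics with neighbors, estimate a global distribution, derive aggregation weights) can be illustrated with a small sketch. This is not the paper's implementation. It rests on several assumptions made only for illustration: each node summarizes 1-D data with a single Gaussian (a one-component stand-in for a GMM), the global estimate is a moment-matched pool of the shared summaries, and the entropy pooling step is replaced by a simpler relative-entropy (KL) score passed through a softmax. The names `GaussianSummary`, `summarize`, `pool`, `kl_gauss`, `aggregation_weights`, and the `temperature` parameter are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class GaussianSummary:
    """Statistics a node would share instead of raw data (assumed 1-D)."""
    n: int        # number of local samples
    mean: float
    var: float


def summarize(samples):
    """Fit a single Gaussian to local samples (stand-in for a GMM fit)."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return GaussianSummary(n, mean, max(var, 1e-9))  # floor avoids zero variance


def pool(summaries):
    """Estimate the global distribution by moment-matching the
    sample-weighted mixture of the shared summaries."""
    total = sum(s.n for s in summaries)
    mean = sum(s.n * s.mean for s in summaries) / total
    second_moment = sum(s.n * (s.var + s.mean ** 2) for s in summaries) / total
    return GaussianSummary(total, mean, max(second_moment - mean ** 2, 1e-9))


def kl_gauss(p, q):
    """KL(p || q) between two 1-D Gaussians (closed form)."""
    return 0.5 * (math.log(q.var / p.var)
                  + (p.var + (p.mean - q.mean) ** 2) / q.var
                  - 1.0)


def aggregation_weights(summaries, temperature=1.0):
    """Assumed weighting rule: nodes whose local distribution is closer
    to the pooled global estimate receive a larger aggregation weight."""
    global_est = pool(summaries)
    scores = [math.exp(-kl_gauss(s, global_est) / temperature) for s in summaries]
    norm = sum(scores)
    return [score / norm for score in scores]
```

A node would call `summarize` on its own data, collect the summaries received from its neighbors, and pass the combined list to `aggregation_weights` before averaging models. The direction of the weighting (favoring nodes near the global estimate) is a design choice of this sketch, not a claim about how FedEP sets its weights.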
Problem

Research questions and friction points this paper is trying to address.

Decentralized Federated Learning
Non-IID Data
Model Training Efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

FedEP
Entropy Pooling
Non-IID Data