FedAVOT: Exact Distribution Alignment in Federated Learning via Masked Optimal Transport

📅 2025-09-17
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
In federated learning, partial client participation induces a mismatch between the availability distribution $q$ and the importance distribution $p$, causing bias and unstable convergence in FedAvg. To address this, we propose FedAVOT—the first framework to incorporate optimal transport (OT) into federated aggregation. FedAVOT employs masked OT to align $p$ and $q$, and leverages Sinkhorn scaling for efficient computation of transport weights. Theoretically, under nonsmooth convex settings, FedAVOT achieves a convergence rate independent of the number of participating clients, with provable guarantees even in extreme sparsity scenarios involving as few as two clients. Empirically, FedAVOT significantly outperforms FedAvg on highly heterogeneous, low-availability, and fairness-sensitive tasks—demonstrating markedly improved training stability and generalization performance.

📝 Abstract
Federated Learning (FL) allows distributed model training without sharing raw data, but suffers when client participation is partial. In practice, the distribution of available users (the *availability distribution* $q$) rarely aligns with the distribution defining the optimization objective (the *importance distribution* $p$), leading to biased and unstable updates under classical FedAvg. We propose **Federated AVerage with Optimal Transport (FedAVOT)**, which formulates aggregation as a masked optimal transport problem aligning $q$ and $p$. Using Sinkhorn scaling, **FedAVOT** computes transport-based aggregation weights with provable convergence guarantees. **FedAVOT** achieves a standard $\mathcal{O}(1/\sqrt{T})$ rate under a nonsmooth convex FL setting, independent of the number of participating users per round. Our experiments confirm drastically improved performance compared to FedAvg across heterogeneous, fairness-sensitive, and low-availability regimes, even when only two clients participate per round.
Problem

Research questions and friction points this paper is trying to address.

Aligns client and objective distributions in federated learning
Solves biased updates from partial client participation
Uses masked optimal transport for aggregation weights
Innovation

Methods, ideas, or system contributions that make the work stand out.

Masked optimal transport for distribution alignment
Sinkhorn scaling for aggregation weights
Convergence guarantees with provable rates
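The aggregation step implied by the bullets above can be sketched as a transport-weighted average: each available client's weight is its total transported mass, and the server averages client updates under those weights. The helper name and normalization below are hypothetical, chosen only to illustrate the idea.

```python
import numpy as np

def ot_aggregate(client_updates, plan, eps=1e-12):
    """Transport-weighted aggregation (illustrative sketch).

    plan[i, j] is the mass moved from available client i toward objective
    component j; summing over j gives client i's effective aggregation
    weight. Weights are normalized to sum to 1 before averaging.
    """
    weights = plan.sum(axis=1)                    # total mass per available client
    weights = weights / max(weights.sum(), eps)   # normalize to a convex combination
    # weighted average of the clients' parameter vectors
    return sum(w * u for w, u in zip(weights, client_updates))
```

For example, a plan that assigns three times as much mass to the second of two available clients yields an aggregate three-quarters of the way toward that client's update.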
Herlock (SeyedAbolfazl) Rahimi
Department of Electrical and Computer Engineering, Yale University, New Haven, USA
Dionysis Kalogerias
Assistant Professor of ECE, SEAS @ Yale University
Optimization, Risk, Statistical Learning, Decision under Uncertainty, Signal Processing