🤖 AI Summary
In federated learning, partial client participation induces a mismatch between the availability distribution $q$ and the importance distribution $p$, causing bias and unstable convergence in FedAvg. To address this, we propose FedAVOT—the first framework to incorporate optimal transport (OT) into federated aggregation. FedAVOT employs masked OT to align $p$ and $q$, and leverages Sinkhorn scaling for efficient computation of transport weights. Theoretically, under nonsmooth convex settings, FedAVOT achieves a convergence rate independent of the number of participating clients, with provable guarantees even in extreme sparsity scenarios involving as few as two clients. Empirically, FedAVOT significantly outperforms FedAvg on highly heterogeneous, low-availability, and fairness-sensitive tasks—demonstrating markedly improved training stability and generalization performance.
📝 Abstract
Federated Learning (FL) allows distributed model training without sharing raw data, but suffers when client participation is partial. In practice, the distribution of available users (the *availability distribution* $q$) rarely aligns with the distribution defining the optimization objective (the *importance distribution* $p$), leading to biased and unstable updates under classical FedAvg. We propose **Federated AVerage with Optimal Transport (FedAVOT)**, which formulates aggregation as a masked optimal transport problem aligning $q$ and $p$. Using Sinkhorn scaling, **FedAVOT** computes transport-based aggregation weights with provable convergence guarantees. **FedAVOT** achieves a standard $\mathcal{O}(1/\sqrt{T})$ rate in a nonsmooth convex FL setting, independent of the number of participating users per round. Our experiments confirm substantially improved performance compared to FedAvg across heterogeneous, fairness-sensitive, and low-availability regimes, even when only two clients participate per round.
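As a rough illustration of the Sinkhorn step mentioned above, the sketch below aligns a toy importance distribution $p$ with an availability distribution $q$ via entropic optimal transport. This is a generic (unmasked) Sinkhorn loop, not the paper's masked formulation; the cost matrix, regularization value, and iteration count are illustrative assumptions.

```python
import numpy as np

def sinkhorn(p, q, C, reg=0.5, n_iters=500):
    """Entropic OT between histograms p (n,) and q (m,) with cost C (n, m).

    Returns a transport plan P whose row sums approximate p and whose
    column sums approximate q. Plain Sinkhorn scaling; a masked variant
    would zero out kernel entries for unavailable (client, atom) pairs.
    """
    K = np.exp(-C / reg)           # Gibbs kernel
    u = np.ones_like(p)
    for _ in range(n_iters):
        v = q / (K.T @ u)          # match column marginals
        u = p / (K @ v)            # match row marginals
    return u[:, None] * K * v[None, :]

# Toy example: 3 importance atoms, 2 available clients (values illustrative).
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.6, 0.4])
C = np.abs(np.arange(3)[:, None] - np.arange(2)[None, :]).astype(float)
P = sinkhorn(p, q, C)
```

Transport-based aggregation weights for the available clients could then be read off the plan, e.g. by normalizing each column of `P`; that specific readout is our assumption, not a detail given in the abstract.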