🤖 AI Summary
In federated learning, the iterative exchange of model updates both accumulates privacy leakage over time and incurs communication latency, and the two jointly constrain model accuracy. To address this, we propose a privacy-aware active user selection mechanism that, for the first time, unifies differential-privacy budget constraints and low-latency objectives in a single multi-objective reward function. We design a privacy-aware multi-armed bandit (MAB) algorithm with provable convergence guarantees and employ simulated annealing for efficient approximate optimization. Evaluated across multiple benchmark scenarios, our method significantly reduces end-to-end latency and cumulative privacy loss while preserving or improving model accuracy. Theoretically, we show that the expected reward growth rate matches the best-known rate in the MAB literature, thereby enabling the joint optimization of privacy, latency, and accuracy.
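To make the multi-objective reward concrete, the sketch below shows one plausible way to score a candidate subset of users by combining an accuracy proxy, a straggler-dominated latency penalty, and a privacy-budget penalty. The weights, the max-latency term, and the exact penalty form are illustrative assumptions, not the paper's formulation.

```python
def reward(subset, acc_gain, latency, eps_spent, eps_budget,
           w_acc=1.0, w_lat=0.5, w_priv=0.5):
    """Toy multi-objective reward for a candidate user subset (illustrative only).

    acc_gain[i]  : estimated contribution of user i to model accuracy
    latency[i]   : upload/compute latency of user i (round latency = slowest selected user)
    eps_spent[i] : differential-privacy budget user i has already consumed
    eps_budget   : total per-user privacy budget
    """
    if not subset:
        return 0.0
    round_latency = max(latency[i] for i in subset)      # straggler dominates the round
    priv_penalty = sum(eps_spent[i] for i in subset) / eps_budget
    return (w_acc * sum(acc_gain[i] for i in subset)
            - w_lat * round_latency
            - w_priv * priv_penalty)

# Example: score the subset {0, 2} of three users
print(reward([0, 2],
             acc_gain=[0.8, 0.5, 0.9],
             latency=[1.2, 0.4, 2.0],
             eps_spent=[0.3, 0.1, 0.6],
             eps_budget=3.0))
```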
📝 Abstract
Federated learning (FL) enables multiple edge devices to collaboratively train a machine learning model without sharing potentially private data. FL proceeds through iterative exchanges of model updates, which pose two key challenges: the accumulation of privacy leakage over time, and communication latency. These limitations are typically addressed separately: the former via perturbed updates that enhance privacy, and the latter via user selection that mitigates latency, both at the expense of accuracy. In this work, we propose a method that jointly addresses the accumulation of privacy leakage and communication latency via active user selection, aiming to improve the trade-off among privacy, latency, and model performance. To this end, we construct a reward function that accounts for these three objectives. Building on this reward, we propose a multi-armed bandit (MAB)-based algorithm, termed Privacy-aware Active User SElection (PAUSE), which dynamically selects a subset of users in each round while ensuring bounded overall privacy leakage. We provide a theoretical analysis showing that the reward growth rate of PAUSE matches the best-known rate in the MAB literature. To address the computational overhead of active user selection, we propose a simulated annealing-based relaxation of PAUSE and analyze its ability to approximate the reward-maximizing policy at reduced complexity. We numerically validate the reduced privacy leakage, improved latency, and accuracy gains of our methods in various federated training scenarios.
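As a rough illustration of the kind of mechanism the abstract describes, the sketch below combines a UCB-style bandit score per user, a per-user differential-privacy budget that gates eligibility, and a simulated-annealing swap search over user subsets. The class name, the UCB form, the privacy accounting, and the annealing schedule are hypothetical assumptions for exposition; this is not the PAUSE algorithm itself.

```python
import math
import random
import numpy as np


class PrivacyAwareSelector:
    """Illustrative privacy-aware bandit selector with annealed subset search."""

    def __init__(self, n_users, subset_size, eps_budget, eps_per_round):
        self.n = n_users
        self.m = subset_size
        self.eps_budget = eps_budget          # per-user differential-privacy budget
        self.eps_per_round = eps_per_round    # privacy cost charged per participation
        self.eps_spent = np.zeros(n_users)
        self.counts = np.zeros(n_users)       # how often each user was selected
        self.means = np.zeros(n_users)        # running mean of observed per-user reward

    def _ucb(self, t):
        # Standard UCB1-style exploration bonus (assumed, not the paper's index)
        bonus = np.sqrt(2.0 * np.log(max(t, 2)) / np.maximum(self.counts, 1))
        return self.means + bonus

    def select(self, t, anneal_steps=200, temp0=1.0):
        """Pick ~m eligible users with high summed UCB score via simulated annealing."""
        eligible = [i for i in range(self.n)
                    if self.eps_spent[i] + self.eps_per_round <= self.eps_budget]
        if len(eligible) <= self.m:
            return eligible
        scores = self._ucb(t)
        current = random.sample(eligible, self.m)
        cur_val = sum(scores[i] for i in current)
        best, best_val = list(current), cur_val
        for k in range(anneal_steps):
            temp = temp0 / (1 + k)
            out_user = random.choice(current)
            in_user = random.choice([u for u in eligible if u not in current])
            candidate = [u for u in current if u != out_user] + [in_user]
            cand_val = sum(scores[i] for i in candidate)
            # Accept improvements always, worse moves with annealed probability
            if cand_val > cur_val or random.random() < math.exp((cand_val - cur_val) / temp):
                current, cur_val = candidate, cand_val
                if cur_val > best_val:
                    best, best_val = list(current), cur_val
        return best

    def update(self, selected, rewards):
        """Record observed per-user rewards and charge the privacy budget."""
        for i, r in zip(selected, rewards):
            self.counts[i] += 1
            self.means[i] += (r - self.means[i]) / self.counts[i]
            self.eps_spent[i] += self.eps_per_round
```

A typical round under these assumptions would call `select(t)` to choose participants, run local training and aggregation, then call `update(...)` with the observed per-user rewards so that both the bandit statistics and the privacy accounting stay current.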