🤖 AI Summary
Problem: Federated learning faces the dual challenges of statistical heterogeneity across sites and rigorous privacy protection. Method: This paper proposes Federated Differential Privacy (FDP), a privacy model positioned between local and central differential privacy. For decentralized, heterogeneous settings without a trusted server, we design a federated transfer learning framework that combines transfer learning with differential privacy to enable cross-domain knowledge transfer. Contribution/Results: Using minimax analysis, we derive optimal convergence rates under FDP for mean estimation and for low- and high-dimensional linear regression, giving the first quantification of the fundamental trade-off among data heterogeneity, privacy budget (ε), and transfer gain. Empirical results demonstrate that our framework significantly improves target-domain model performance while guaranteeing strict ε-differential privacy, achieving both theoretical optimality and practical feasibility.
📝 Abstract
Federated learning has emerged as a powerful framework for analysing distributed data, yet two challenges remain pivotal: heterogeneity across sites and privacy of local data. In this paper, we address both challenges within a federated transfer learning framework, aiming to enhance learning on a target data set by leveraging information from multiple heterogeneous source data sets while adhering to privacy constraints. We rigorously formulate the notion of federated differential privacy, which offers privacy guarantees for each data set without assuming a trusted central server. Under this privacy model, we study three classical statistical problems: univariate mean estimation, low-dimensional linear regression, and high-dimensional linear regression. By investigating the minimax rates and quantifying the cost of privacy in each problem, we show that federated differential privacy is an intermediate privacy model between the well-established local and central models of differential privacy. Our analyses account for data heterogeneity and privacy, highlighting the fundamental costs associated with each factor and the benefits of knowledge transfer in federated learning.
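To make the privacy model concrete, here is a minimal sketch of univariate mean estimation in which each site privatises its own summary before sharing it, so no trusted central server is needed. It is an illustration only and not the estimator analysed in the paper: it uses a plain Laplace mechanism, and it ignores heterogeneity and the target/source distinction. The function names, the clipping bound `bound`, and the assumption that site sample sizes are public are all hypothetical choices made for this example.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_site_mean(values, epsilon, bound, rng):
    """Release an epsilon-DP mean of one site's data (Laplace mechanism)."""
    # Clip to [-bound, bound] so a single record has bounded influence.
    clipped = [max(-bound, min(bound, v)) for v in values]
    mean = sum(clipped) / len(clipped)
    # Replacing one record changes the clipped mean by at most 2 * bound / n.
    sensitivity = 2.0 * bound / len(clipped)
    return mean + laplace_noise(sensitivity / epsilon, rng)


def federated_mean(sites, epsilon, bound, rng):
    """Combine noisy site means, weighting by (assumed public) sample size."""
    releases = [(len(s), private_site_mean(s, epsilon, bound, rng)) for s in sites]
    total = sum(n for n, _ in releases)
    return sum(n * m for n, m in releases) / total


# Usage: five sites, each holding 500 draws from N(1, 1).
rng = random.Random(0)
sites = [[rng.gauss(1.0, 1.0) for _ in range(500)] for _ in range(5)]
estimate = federated_mean(sites, epsilon=1.0, bound=5.0, rng=rng)
```

In this sketch the noise is added once per site rather than once per record (as in the local model) or once overall by a trusted curator (as in the central model), which is one intuitive reading of why a site-level guarantee sits between the two.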