A Hassle-free Algorithm for Private Learning in Practice: Don't Use Tree Aggregation, Use BLTs

📅 2024-08-16
🏛️ arXiv.org
📈 Citations: 4
Influential: 0
🤖 AI Summary
To address the privacy/utility trade-off and system-overhead bottlenecks of differential privacy (DP) mechanisms for on-device keyboard applications in federated learning (FL), this work introduces the first multi-participation extension of the Buffered Linear Toeplitz (BLT) mechanism within a DP-FTRL optimization framework, explicitly modeling clients that participate in multiple rounds. Compared to tree aggregation and the matrix mechanism, BLT achieves low deployment complexity, near-optimal privacy/utility trade-offs, and substantial resource savings. Evaluated on the StackOverflow benchmark and in a production FL system, BLT improves average accuracy by 1.2–2.7% across four on-device language modeling tasks, reduces privacy budget consumption by 38%, cuts peak memory usage by 52%, and accelerates convergence, outperforming tree aggregation while matching the matrix mechanism's convergence speed.

📝 Abstract
The state-of-the-art for training on-device language models for mobile keyboard applications combines federated learning (FL) with differential privacy (DP) via the DP-Follow-the-Regularized-Leader (DP-FTRL) algorithm. Two variants of DP-FTRL are used in practice: tree aggregation and matrix factorization. However, tree aggregation suffers from significantly suboptimal privacy/utility tradeoffs, while matrix mechanisms require expensive optimization parameterized by hard-to-estimate-in-advance constants, and incur high runtime memory costs. This paper extends the recently introduced Buffered Linear Toeplitz (BLT) mechanism to multi-participation scenarios. Our BLT-DP-FTRL maintains the ease-of-use advantages of tree aggregation, while essentially matching matrix factorization in terms of utility and privacy. We evaluate BLT-DP-FTRL on the StackOverflow dataset, serving as a reproducible simulation benchmark, and across four on-device language model tasks in a production FL system. Our empirical results highlight the advantages of the BLT mechanism and elevate the practicality and effectiveness of DP in real-world scenarios.
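The key systems advantage the abstract attributes to BLTs is that a lower-triangular Toeplitz matrix whose column is parameterized by a few exponentials can be applied to a stream using only a handful of running buffers, instead of the full per-round history that general matrix mechanisms need. The sketch below illustrates that buffered streaming multiply only; it is not the paper's implementation, and the function name and parameterization (`theta`, `omega` as the decay/weight vectors of a degree-d BLT) are illustrative assumptions.

```python
import numpy as np

def blt_stream_multiply(xs, theta, omega):
    """Apply a lower-triangular Toeplitz matrix to the stream xs,
    where the first column is c_0 = 1 and, for i >= 1,
    c_i = sum_j omega[j] * theta[j]**(i-1).

    Memory is O(d) buffers (d = len(theta)), not O(len(xs)):
    this is the 'buffered' property that makes BLTs cheap to deploy.
    """
    theta = np.asarray(theta, dtype=float)
    omega = np.asarray(omega, dtype=float)
    buffers = np.zeros_like(theta)  # B_j accumulates theta_j-discounted history
    out = []
    for x in xs:
        y = x + float(np.dot(omega, buffers))  # y_t = x_t + sum_j omega_j * B_j
        out.append(y)
        buffers = theta * buffers + x          # B_j <- theta_j * B_j + x_t
    return out
```

For example, with `theta = [0.5]` and `omega = [1.0]` the implied Toeplitz column is `[1, 1, 0.5, 0.25, ...]`, and feeding the stream `[1, 2, 3]` yields `[1.0, 3.0, 5.5]`, matching the explicit matrix-vector product. In DP-FTRL this kind of recurrence is what lets correlated noise be generated on the fly each round.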
Problem

Research questions and friction points this paper is trying to address.

Federated Learning
Differential Privacy
Tree Aggregation and Matrix Factorization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Buffered Linear Toeplitz
Differential Privacy
Federated Learning
H. B. McMahan (Google)
Zheng Xu (Google)
Yanxiang Zhang (Google)
Deep Learning · Federated Learning · NLP