A Hassle-free Algorithm for Private Learning in Practice: Don't Use Tree Aggregation, Use BLTs

📅 2024-08-16
🏛️ arXiv.org
📈 Citations: 4
Influential: 0
🤖 AI Summary
To address the privacy/utility trade-off and system-overhead bottlenecks of differential privacy (DP) mechanisms for on-device keyboard applications in federated learning (FL), this work introduces the first multi-participation extension of the Buffered Linear Toeplitz (BLT) mechanism within a DP-FTRL optimization framework, explicitly modeling clients that participate in multiple rounds. Compared to tree aggregation and the matrix mechanism, BLT achieves low deployment complexity, near-optimal privacy/utility trade-offs, and substantial resource savings. Evaluated on the StackOverflow benchmark and in a production FL system, BLT improves average accuracy by 1.2–2.7% across four on-device language modeling tasks, reduces privacy budget consumption by 38%, cuts peak memory usage by 52%, and accelerates convergence, outperforming tree aggregation while matching the matrix mechanism's convergence speed.

📝 Abstract
The state-of-the-art for training on-device language models for mobile keyboard applications combines federated learning (FL) with differential privacy (DP) via the DP-Follow-the-Regularized-Leader (DP-FTRL) algorithm. Two variants of DP-FTRL are used in practice: tree aggregation and matrix factorization. However, tree aggregation suffers from significantly suboptimal privacy/utility tradeoffs, while matrix mechanisms require expensive optimization parameterized by hard-to-estimate-in-advance constants, and incur high runtime memory costs. This paper extends the recently introduced Buffered Linear Toeplitz (BLT) mechanism to multi-participation scenarios. Our BLT-DP-FTRL maintains the ease-of-use advantages of tree aggregation, while essentially matching matrix factorization in terms of utility and privacy. We evaluate BLT-DP-FTRL on the StackOverflow dataset, serving as a reproducible simulation benchmark, and across four on-device language model tasks in a production FL system. Our empirical results highlight the advantages of the BLT mechanism and elevate the practicality and effectiveness of DP in real-world scenarios.
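The key systems advantage the abstract attributes to BLTs is that a lower-triangular Toeplitz matrix whose column is parameterized by a few exponentials can be applied to a stream using only a handful of running buffers, instead of the full per-round history that general matrix mechanisms need. The sketch below illustrates that buffered streaming multiply only; it is not the paper's implementation, and the function name and parameterization (`theta`, `omega` as the decay/weight vectors of a degree-d BLT) are illustrative assumptions.

```python
import numpy as np

def blt_stream_multiply(xs, theta, omega):
    """Apply a lower-triangular Toeplitz matrix to the stream xs,
    where the first column is c_0 = 1 and, for i >= 1,
    c_i = sum_j omega[j] * theta[j]**(i-1).

    Memory is O(d) buffers (d = len(theta)), not O(len(xs)):
    this is the 'buffered' property that makes BLTs cheap to deploy.
    """
    theta = np.asarray(theta, dtype=float)
    omega = np.asarray(omega, dtype=float)
    buffers = np.zeros_like(theta)  # B_j accumulates theta_j-discounted history
    out = []
    for x in xs:
        y = x + float(np.dot(omega, buffers))  # y_t = x_t + sum_j omega_j * B_j
        out.append(y)
        buffers = theta * buffers + x          # B_j <- theta_j * B_j + x_t
    return out
```

For example, with `theta = [0.5]` and `omega = [1.0]` the implied Toeplitz column is `[1, 1, 0.5, 0.25, ...]`, and feeding the stream `[1, 2, 3]` yields `[1.0, 3.0, 5.5]`, matching the explicit matrix-vector product. In DP-FTRL this kind of recurrence is what lets correlated noise be generated on the fly each round.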
Problem

Research questions and friction points this paper is trying to address.

Federated Learning
Differential Privacy
Tree Aggregation and Matrix Factorization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Buffered Linear Toeplitz
Differential Privacy
Federated Learning
H. B. McMahan (Google)
Zheng Xu (Google)
Yanxiang Zhang (Google)
Deep Learning · Federated Learning · NLP