Differentially Private Bilevel Optimization

📅 2024-09-29
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses nonconvex and strongly convex bilevel optimization under differential privacy (DP) constraints, proposing the first Hessian-free private hypergradient algorithm. Methodologically, it integrates stochastic gradient estimation, noise injection, and DP mechanisms within a unified framework supporting constrained/unconstrained, batch/stochastic, and empirical/population loss settings. Theoretical contributions include: (1) the first bilevel optimization algorithm provably satisfying $(\varepsilon,\delta)$-DP; (2) a hypergradient estimation error bound of $\tilde{\mathcal{O}}\big((\sqrt{d_{\text{up}}}/(\varepsilon n))^{1/2} + (\sqrt{d_{\text{low}}}/(\varepsilon n))^{1/3}\big)$, where $d_{\text{up}}$ and $d_{\text{low}}$ denote upper- and lower-level parameter dimensions, respectively; and (3) a provably private regularization hyperparameter tuning rule. This is the first work to establish efficient, scalable, and theoretically rigorous privacy guarantees for bilevel learning under standard DP.
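The core privatization step described above (noise injection into a gradient estimate) can be illustrated with the standard Gaussian mechanism. The sketch below is a generic illustration, not the paper's exact algorithm: the function name, the clipping rule, and the sensitivity bookkeeping are assumptions for this example.

```python
import numpy as np

def private_hypergradient(grad_estimate, clip_norm, epsilon, delta, n, rng):
    """Privatize a stochastic hypergradient estimate with the Gaussian mechanism.

    Illustrative sketch (hypothetical helper, not the paper's algorithm):
    clip the estimate to bound its L2 sensitivity, then add Gaussian noise
    calibrated to (epsilon, delta)-DP.
    """
    # Clip to bound the L2 norm of the contribution.
    norm = np.linalg.norm(grad_estimate)
    clipped = grad_estimate * min(1.0, clip_norm / max(norm, 1e-12))
    # L2 sensitivity of an n-sample average with per-example norm <= clip_norm.
    sensitivity = 2.0 * clip_norm / n
    # Standard Gaussian-mechanism noise scale for (epsilon, delta)-DP.
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return clipped + rng.normal(0.0, sigma, size=grad_estimate.shape)
```

Because the noise dimension matches the parameter dimension, its expected norm grows with $\sqrt{d}$, which is where the $\sqrt{d_{\text{up}}}$ and $\sqrt{d_{\text{low}}}$ factors in the error bound come from.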

📝 Abstract
We present differentially private (DP) algorithms for bilevel optimization, a problem class that has received significant attention lately in various machine learning applications. These are the first algorithms for such problems under standard DP constraints, and are also the first to avoid Hessian computations, which are prohibitive in large-scale settings. Under the well-studied setting in which the upper-level problem is not necessarily convex and the lower-level problem is strongly convex, our proposed gradient-based $(\epsilon,\delta)$-DP algorithm returns a point with hypergradient norm at most $\widetilde{\mathcal{O}}\left((\sqrt{d_\mathrm{up}}/\epsilon n)^{1/2}+(\sqrt{d_\mathrm{low}}/\epsilon n)^{1/3}\right)$, where $n$ is the dataset size, and $d_\mathrm{up}/d_\mathrm{low}$ are the upper/lower level dimensions. Our analysis covers constrained and unconstrained problems alike, accounts for mini-batch gradients, and applies to both empirical and population losses. As an application, we specialize our analysis to derive a simple private rule for tuning a regularization hyperparameter.
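Ignoring constants and logarithmic factors, the stated bound can be evaluated numerically to see how the guarantee scales with the dataset size and the two dimensions. The function name below is a hypothetical helper for this sketch.

```python
import math

def hypergradient_bound(d_up, d_low, epsilon, n):
    """Evaluate (sqrt(d_up)/(eps*n))^(1/2) + (sqrt(d_low)/(eps*n))^(1/3),
    the paper's hypergradient error bound with constants and log factors dropped."""
    return (math.sqrt(d_up) / (epsilon * n)) ** 0.5 \
         + (math.sqrt(d_low) / (epsilon * n)) ** (1.0 / 3.0)

# The bound shrinks as the dataset grows, with the lower-level term
# (exponent 1/3) decaying more slowly than the upper-level term (exponent 1/2).
for n in (10_000, 100_000, 1_000_000):
    print(n, hypergradient_bound(d_up=100, d_low=1000, epsilon=1.0, n=n))
```

Note the asymmetry: at fixed privacy budget, increasing the lower-level dimension is more costly per unit of data than increasing the upper-level dimension.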
Problem

Research questions and friction points this paper is trying to address.

Develop DP algorithms for bilevel optimization
Avoid Hessian computations in large-scale settings
Provide private hyperparameter tuning solutions
Innovation

Methods, ideas, or system contributions that make the work stand out.

First DP algorithms for bilevel optimization
Hessian-free hypergradient estimation, scalable to large models
Gradient-based (ε,δ)-DP algorithm with provable error bounds