🤖 AI Summary
This work addresses bilevel optimization, with a possibly nonconvex upper level and a strongly convex lower level, under differential privacy (DP) constraints, proposing the first Hessian-free private hypergradient algorithm. Methodologically, it integrates stochastic gradient estimation, noise injection, and DP mechanisms within a unified framework supporting constrained/unconstrained, batch/stochastic, and empirical/population loss settings. Theoretical contributions include: (1) the first bilevel optimization algorithm provably satisfying $(\varepsilon,\delta)$-DP; (2) a hypergradient estimation error bound of $\tilde{\mathcal{O}}\big((\sqrt{d_{\text{up}}}/(\varepsilon n))^{1/2} + (\sqrt{d_{\text{low}}}/(\varepsilon n))^{1/3}\big)$, where $d_{\text{up}}$ and $d_{\text{low}}$ denote the upper- and lower-level parameter dimensions, respectively; and (3) a provably private rule for tuning a regularization hyperparameter. This is the first work to establish efficient, scalable, and theoretically rigorous privacy guarantees for bilevel learning under standard DP.
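To make the "noise injection + DP mechanism" ingredient concrete, here is a minimal sketch of a standard Gaussian-mechanism gradient step (per-example clipping, averaging, calibrated Gaussian noise). This is a generic DP-SGD-style primitive, not the paper's Hessian-free hypergradient estimator; all function and parameter names are illustrative assumptions.

```python
import numpy as np

def noisy_clipped_grad_step(grad_fn, params, batch, clip_norm, noise_mult, lr, rng):
    """One Gaussian-mechanism gradient step (generic DP-SGD sketch, NOT the
    paper's bilevel algorithm).

    grad_fn(params, x) returns the per-example gradient as a 1-D array.
    Each gradient is clipped to L2 norm `clip_norm`, the clipped gradients
    are averaged, and Gaussian noise with std `noise_mult * clip_norm / B`
    is added before the descent step.
    """
    grads = np.stack([grad_fn(params, x) for x in batch])       # (B, d)
    norms = np.linalg.norm(grads, axis=1, keepdims=True)        # (B, 1)
    grads = grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    avg = grads.mean(axis=0)
    sigma = noise_mult * clip_norm / len(batch)                 # noise scale
    noisy = avg + rng.normal(0.0, sigma, size=avg.shape)
    return params - lr * noisy
```

With `noise_mult = 0` this reduces to a plain clipped mini-batch gradient step, which makes the privacy/utility trade-off controlled by the noise multiplier easy to see.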
📝 Abstract
We present differentially private (DP) algorithms for bilevel optimization, a problem class that has received significant attention recently in various machine learning applications. These are the first algorithms for such problems under standard DP constraints, and also the first to avoid Hessian computations, which are prohibitive in large-scale settings. Under the well-studied setting in which the upper level is not necessarily convex and the lower-level problem is strongly convex, our proposed gradient-based $(\epsilon,\delta)$-DP algorithm returns a point with hypergradient norm at most $\widetilde{\mathcal{O}}\left((\sqrt{d_\mathrm{up}}/\epsilon n)^{1/2}+(\sqrt{d_\mathrm{low}}/\epsilon n)^{1/3}\right)$, where $n$ is the dataset size and $d_\mathrm{up}/d_\mathrm{low}$ are the upper-/lower-level dimensions. Our analysis covers constrained and unconstrained problems alike, accounts for mini-batch gradients, and applies to both empirical and population losses. As an application, we specialize our analysis to derive a simple private rule for tuning a regularization hyperparameter.
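The abstract does not spell out the private tuning rule, but private selection among a finite grid of regularization candidates is commonly done with the exponential mechanism. The sketch below shows that standard primitive under assumed names; it is not the paper's specific rule.

```python
import numpy as np

def exponential_mechanism_select(scores, epsilon, sensitivity, rng):
    """Select an index i with probability proportional to
    exp(epsilon * scores[i] / (2 * sensitivity)).

    Generic epsilon-DP selection (exponential mechanism), shown for
    illustration only -- NOT the paper's hyperparameter-tuning rule.
    `scores` could be, e.g., negated validation losses for candidate
    regularization strengths; `sensitivity` bounds how much one record
    can change any score.
    """
    logits = epsilon * np.asarray(scores, dtype=float) / (2.0 * sensitivity)
    logits -= logits.max()            # shift for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))
```

As `epsilon` grows, the selection concentrates on the highest-scoring candidate; small `epsilon` yields a nearly uniform (and hence more private) choice.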