🤖 AI Summary
This paper investigates whether zeroth-order projected gradient descent (ZO-PGD) inherently satisfies differential privacy, that is, purely through its intrinsic randomness (e.g., random direction sampling), without requiring explicit noise injection.
Method: We develop a query-based optimization analysis framework to rigorously characterize the cumulative privacy loss of zeroth-order gradient estimation under strongly convex and general convex objectives.
Contribution/Results: We prove that privacy loss grows superlinearly with the iteration count, even under random initialization, and that, under fixed initialization, ZO-PGD fails to satisfy differential privacy for certain strongly convex functions. Crucially, the privacy risk persists even when intermediate iterates are hidden. This work provides the first systematic theoretical refutation of the widely held belief that zeroth-order methods carry universal inherent privacy guarantees, delivering critical theoretical guidance for algorithm selection in private fine-tuning.
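To make the object of study concrete, the following is a minimal NumPy sketch of ZO-PGD with a two-point zeroth-order gradient estimate. It is illustrative only: the smoothing parameter, the Gaussian direction distribution, and the unit-ball constraint set are placeholder choices, not the paper's exact oracle model. The point is that the only randomness in the loop is the direction sampling, which is precisely the randomness whose privacy properties the paper analyzes.

```python
import numpy as np

def zo_gradient_estimate(f, x, mu=1e-3, rng=None):
    """Two-point zeroth-order gradient estimate along a random Gaussian direction.

    Illustrative sketch; the paper's oracle may differ in smoothing parameter,
    direction distribution, or number of function queries.
    """
    if rng is None:
        rng = np.random.default_rng()
    u = rng.standard_normal(x.shape)           # random search direction
    return (f(x + mu * u) - f(x)) / mu * u     # directional finite difference

def project_l2_ball(x, radius=1.0):
    """Euclidean projection onto an L2 ball (one common choice of constraint set)."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def zo_pgd(f, x0, steps=100, lr=0.1, mu=1e-3, seed=0):
    """Projected zeroth-order gradient descent.

    Note: no noise is injected anywhere; the only randomness is the
    direction sampling inside zo_gradient_estimate.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        g = zo_gradient_estimate(f, x, mu=mu, rng=rng)
        x = project_l2_ball(x - lr * g)
    return x
```

On a strongly convex objective such as a quadratic, this loop converges to the constrained minimizer; the paper's result is that this convergence is exactly what makes the trajectory, and even the final iterate, leak information about the objective.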
📝 Abstract
Differentially private zeroth-order optimization methods have recently gained popularity in private fine-tuning of machine learning models due to their reduced memory requirements. Current approaches for privatizing zeroth-order methods rely on adding Gaussian noise to the estimated zeroth-order gradients. However, since the search direction in zeroth-order methods is inherently random, researchers including Tang et al. (2024) and Zhang et al. (2024a) have raised an important question: is the inherent noise in zeroth-order estimators sufficient to ensure the overall differential privacy of the algorithm? This work settles this question for a class of oracle-based optimization algorithms where the oracle returns zeroth-order gradient estimates. In particular, we show that for a fixed initialization, there exist strongly convex objective functions such that running (Projected) Zeroth-Order Gradient Descent (ZO-GD) is not differentially private. Furthermore, we show that even with random initialization and without revealing the initial or intermediate iterates, the privacy loss in ZO-GD can grow superlinearly with the number of iterations when minimizing convex objective functions.
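For contrast, the explicit-noise recipe the abstract refers to, privatizing the zeroth-order estimate via the Gaussian mechanism, can be sketched as clip-then-noise. This is a generic sketch under assumed placeholder values for the clipping threshold and noise multiplier; neither is calibrated here to any specific (epsilon, delta) budget, and the cited works' exact clipping and calibration may differ.

```python
import numpy as np

def privatized_zo_gradient(f, x, mu=1e-3, clip=1.0, sigma=1.0, rng=None):
    """Gaussian-mechanism privatization of a two-point ZO gradient estimate.

    Sketch of the standard clip-then-noise recipe; `clip` and `sigma` are
    placeholder values, not calibrated to a concrete privacy budget.
    """
    if rng is None:
        rng = np.random.default_rng()
    u = rng.standard_normal(x.shape)
    g = (f(x + mu * u) - f(x)) / mu * u          # two-point ZO estimate
    g = g / max(1.0, np.linalg.norm(g) / clip)   # clip to bound sensitivity
    return g + rng.normal(0.0, sigma * clip, size=g.shape)  # explicit Gaussian noise
```

The question the paper settles is whether the final noise-addition line is dispensable; its negative results show that, in general, it is not.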