🤖 AI Summary
In differentially private (DP) machine learning, publicly verifying whether a trained model satisfies formal privacy guarantees remains prohibitively expensive: existing verification costs are comparable to the cost of training itself, which severely hinders auditability. Method: We propose the first DP training algorithm that enables low-cost public verification. Building on stochastic convex optimization, we privately minimize a sequence of regularized objective functions and construct verifiable DP certificates using only the standard composition bound. Contribution/Results: Our approach achieves near-optimal privacy-utility trade-offs while reducing public verification complexity substantially below that of training, with tight, practical privacy-loss bounds on large datasets. This breaks the longstanding coupling between verification and training cost, a fundamental bottleneck in DP auditing.
📝 Abstract
Training with differential privacy (DP) guarantees to members of a dataset that they cannot be identified by users of the released model. However, those data providers, and the public in general, lack methods to efficiently verify that models trained on their data satisfy DP guarantees: for current algorithms, the compute needed to verify a DP guarantee scales with the compute required to train the model. In this paper we design the first DP algorithm with near-optimal privacy-utility trade-offs whose DP guarantees can be verified with less compute than training requires. We focus on DP stochastic convex optimization (DP-SCO), where optimal privacy-utility trade-offs are known. We show that tight privacy-utility trade-offs can be obtained by privately minimizing a sequence of regularized objectives, using only the standard DP composition bound. Crucially, this method can be verified with much less compute than training. This yields the first known DP-SCO algorithm with near-optimal privacy-utility trade-offs whose DP verification cost scales better than its training cost, significantly reducing verification costs on large datasets.
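To make the high-level recipe concrete, here is a minimal, hypothetical sketch of the general pattern the abstract describes: solve a sequence of regularized convex objectives, release each solution with Gaussian output perturbation, and account for the total privacy loss with the basic (sequential) composition bound. This is an illustration under placeholder choices, not the paper's actual algorithm; the regularization schedule, noise scale `sigma`, and the stand-in per-phase privacy cost are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def solve_regularized_erm(X, y, w_center, lam, steps=200, lr=0.1):
    """Gradient descent on least squares regularized toward w_center.

    Objective: (1/2n)||Xw - y||^2 + (lam/2)||w - w_center||^2
    """
    n, _ = X.shape
    w = w_center.copy()
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / n + lam * (w - w_center)
        w -= lr * grad
    return w

def private_phased_erm(X, y, num_phases=4, sigma=0.5):
    """Phased regularized minimization with Gaussian output perturbation.

    Each phase re-centers the regularizer at the previous noisy solution;
    total privacy cost is the sum over phases (basic composition).
    """
    _, d = X.shape
    w = np.zeros(d)
    per_phase_eps = []
    for k in range(num_phases):
        lam = 2.0 ** k  # placeholder schedule: regularization grows each phase
        w = solve_regularized_erm(X, y, w_center=w, lam=lam)
        w = w + sigma * rng.standard_normal(d)  # Gaussian output perturbation
        per_phase_eps.append(1.0 / sigma)       # stand-in per-phase privacy cost
    total_eps = sum(per_phase_eps)              # basic composition: just a sum
    return w, total_eps
```

The point relevant to verification is that the privacy certificate here is a plain sum of per-phase costs: an auditor can check it by verifying each phase's noise addition and adding the costs, without re-running the optimization itself.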