AI Summary
Quantization in differentially private stochastic gradient descent (DP-SGD) amplifies quantization variance due to noise injection, causing significant accuracy degradation. To address this, we propose DPQuant, the first noise-aware dynamic quantization scheduling framework. DPQuant guides low-precision quantization via loss sensitivity estimation and integrates probabilistic layer sampling with gradient-sensitive layer prioritization to enable adaptive per-iteration layer selection. By suppressing variance amplification under extremely tight privacy budgets, DPQuant achieves Pareto-optimal trade-offs between accuracy and computational efficiency while guaranteeing rigorous differential privacy. Experiments on ResNet and DenseNet demonstrate that DPQuant attains up to a 2.21x theoretical throughput gain over static quantization, with accuracy loss bounded within 2%, substantially outperforming existing baselines.
Abstract
Differentially Private SGD (DP-SGD) is a powerful technique to protect user privacy when using sensitive data to train neural networks. During training, converting model weights and activations into low-precision formats, i.e., quantization, can drastically reduce training time, energy consumption, and cost, and is thus a widely used technique. In this work, we demonstrate that quantization causes significantly higher accuracy degradation in DP-SGD than in regular SGD. We observe that this is caused by noise injection in DP-SGD, which amplifies quantization variance, leading to disproportionately large accuracy degradation. To address this challenge, we present DPQuant, a dynamic quantization framework that adaptively selects a changing subset of layers to quantize at each epoch. Our method combines two key ideas that effectively reduce quantization variance: (i) probabilistic sampling of the layers, which rotates which layers are quantized every epoch, and (ii) loss-aware layer prioritization, which uses a differentially private loss sensitivity estimator to identify layers that can be quantized with minimal impact on model quality. This estimator consumes a negligible fraction of the overall privacy budget, preserving DP guarantees. Empirical evaluations on ResNet18, ResNet50, and DenseNet121 across a range of datasets demonstrate that DPQuant consistently outperforms static quantization baselines, achieving near Pareto-optimal accuracy-compute trade-offs and up to 2.21x theoretical throughput improvements on low-precision hardware, with less than 2% drop in validation accuracy.
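To make the layer-selection idea concrete, the following is a minimal sketch of sensitivity-weighted probabilistic layer sampling: layers with lower estimated loss sensitivity are more likely to be picked for quantization, and because the draw is random, the chosen subset rotates across epochs. The function name, the layer names, and the sensitivity values are hypothetical illustrations, not the paper's actual implementation; in DPQuant the sensitivities would come from the differentially private estimator described above.

```python
import random

def select_layers_to_quantize(sensitivities, k, rng=None):
    """Pick k layers to quantize, favoring layers with LOW loss sensitivity.

    sensitivities: dict mapping layer name -> estimated loss sensitivity
    (hypothetical plain numbers here; DPQuant would use DP estimates).
    k: number of layers to quantize this epoch.
    """
    rng = rng or random.Random()
    names = list(sensitivities)
    # Lower sensitivity -> higher probability of being quantized.
    inv = [1.0 / (sensitivities[n] + 1e-8) for n in names]
    total = sum(inv)
    pool = [(n, w / total) for n, w in zip(names, inv)]
    # Weighted sampling without replacement; a fresh draw each epoch
    # rotates which layers end up in the quantized subset.
    chosen = []
    for _ in range(min(k, len(pool))):
        r = rng.random() * sum(p for _, p in pool)
        acc = 0.0
        for i, (name, p) in enumerate(pool):
            acc += p
            if r <= acc:
                chosen.append(name)
                pool.pop(i)
                break
    return chosen
```

Calling this once per epoch yields a different quantized subset over time, which is what spreads (and thereby reduces) the accumulated quantization variance across layers.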