Empirical Calibration and Metric Differential Privacy in Language Models

📅 2025-03-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Differential privacy (DP) budget ε in NLP lacks interpretability and cross-framework comparability. Method: This paper introduces an empirical privacy calibration paradigm grounded in reconstruction attacks—departing from conventional membership inference attacks (MIAs) to directly quantify actual privacy leakage via text reconstruction. We propose a directional DP mechanism built upon the von Mises–Fisher (vMF) distribution, overcoming the limitations of isotropic Gaussian noise by enabling controlled perturbation of gradient directions; this mechanism is integrated into a DP-SGD variant and systematically evaluated on RoBERTa fine-tuning tasks. Results: Experiments demonstrate that the vMF mechanism achieves stronger targeted privacy protection with lower utility loss than baselines across text classification and generation tasks. Moreover, distinct DP mechanisms exhibit complementary strengths, establishing a reproducible, comparable empirical benchmark for privacy–utility trade-offs in NLP.
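The mechanism described above replaces isotropic Gaussian noise with a perturbation of the gradient's direction drawn from a von Mises–Fisher distribution. As a rough illustration only (not the paper's implementation), the sketch below samples from a vMF distribution using Wood's (1994) rejection algorithm and applies it DP-SGD-style: clip the gradient, then resample its direction around the true direction while keeping its magnitude. The function names, the clipping scheme, and the choice of concentration `kappa` are all illustrative assumptions.

```python
import numpy as np

def sample_vmf(mu, kappa, rng):
    """Draw one unit vector from vMF(mu, kappa) via Wood's (1994) rejection sampler."""
    d = mu.shape[0]
    mu = mu / np.linalg.norm(mu)
    # Rejection-sample w, the cosine of the angle between the sample and mu.
    b = (-2.0 * kappa + np.sqrt(4.0 * kappa**2 + (d - 1) ** 2)) / (d - 1)
    x0 = (1.0 - b) / (1.0 + b)
    c = kappa * x0 + (d - 1) * np.log(1.0 - x0**2)
    while True:
        z = rng.beta((d - 1) / 2.0, (d - 1) / 2.0)
        w = (1.0 - (1.0 + b) * z) / (1.0 - (1.0 - b) * z)
        if kappa * w + (d - 1) * np.log(1.0 - x0 * w) - c >= np.log(rng.uniform()):
            break
    # Uniform direction in the hyperplane orthogonal to e1, combined with w along e1.
    v = rng.standard_normal(d - 1)
    v /= np.linalg.norm(v)
    x = np.concatenate(([w], np.sqrt(max(1.0 - w**2, 0.0)) * v))
    # Householder reflection mapping e1 onto mu (norm-preserving rotation).
    e1 = np.zeros(d)
    e1[0] = 1.0
    u = e1 - mu
    n = np.linalg.norm(u)
    if n < 1e-12:
        return x
    u /= n
    return x - 2.0 * np.dot(u, x) * u

def perturb_gradient_direction(grad, kappa, clip=1.0, rng=None):
    """Clip the gradient, then replace its direction with a vMF sample centred on it,
    preserving the (clipped) magnitude -- a directional analogue of Gaussian DP-SGD noise."""
    rng = np.random.default_rng() if rng is None else rng
    norm = np.linalg.norm(grad)
    g = grad * min(1.0, clip / (norm + 1e-12))  # standard DP-SGD-style norm clipping
    return np.linalg.norm(g) * sample_vmf(g / np.linalg.norm(g), kappa, rng)
```

Here larger `kappa` concentrates samples around the true gradient direction (weaker privacy, better utility), playing the inverse role of the Gaussian mechanism's noise scale sigma.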

📝 Abstract
NLP models trained with differential privacy (DP) usually adopt the DP-SGD framework, and privacy guarantees are often reported in terms of the privacy budget $\epsilon$. However, $\epsilon$ does not have any intrinsic meaning, and it is generally not possible to compare across variants of the framework. Work in image processing has therefore explored how to empirically calibrate noise across frameworks using Membership Inference Attacks (MIAs). However, this kind of calibration has not been established for NLP. In this paper, we show that MIAs offer little help in calibrating privacy, whereas reconstruction attacks are more useful. As a use case, we define a novel kind of directional privacy based on the von Mises–Fisher (vMF) distribution, a metric DP mechanism that perturbs angular distance rather than adding (isotropic) Gaussian noise, and apply this to NLP architectures. We show that, even though formal guarantees are incomparable, empirical privacy calibration reveals that each mechanism has different areas of strength with respect to utility-privacy trade-offs.
Problem

Research questions and friction points this paper is trying to address.

How to empirically calibrate privacy in NLP models, given that the budget $\epsilon$ is not comparable across DP-SGD variants
Whether reconstruction attacks, rather than membership inference attacks, can serve as a common yardstick for comparing DP mechanisms
How to define directional privacy via the vMF distribution and apply it to NLP architectures
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses reconstruction attacks for privacy calibration
Introduces directional privacy via VMF distribution
Applies metric DP to NLP architectures