Empirical Calibration and Metric Differential Privacy in Language Models

📅 2025-03-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Differential privacy (DP) budget ε in NLP lacks interpretability and cross-framework comparability. Method: This paper introduces an empirical privacy calibration paradigm grounded in reconstruction attacks—departing from conventional membership inference attacks (MIAs) to directly quantify actual privacy leakage via text reconstruction. We propose a directional DP mechanism built upon the von Mises–Fisher (vMF) distribution, overcoming the limitations of isotropic Gaussian noise by enabling controlled perturbation of gradient directions; this mechanism is integrated into a DP-SGD variant and systematically evaluated on RoBERTa fine-tuning tasks. Results: Experiments demonstrate that the vMF mechanism achieves stronger targeted privacy protection with lower utility loss than baselines across text classification and generation tasks. Moreover, distinct DP mechanisms exhibit complementary strengths, establishing a reproducible, comparable empirical benchmark for privacy–utility trade-offs in NLP.
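The mechanism described above replaces isotropic Gaussian noise with a perturbation of the gradient's direction drawn from a von Mises–Fisher distribution. As a rough illustration only (not the paper's implementation), the sketch below samples from a vMF distribution using Wood's (1994) rejection algorithm and applies it DP-SGD-style: clip the gradient, then resample its direction around the true direction while keeping its magnitude. The function names, the clipping scheme, and the choice of concentration `kappa` are all illustrative assumptions.

```python
import numpy as np

def sample_vmf(mu, kappa, rng):
    """Draw one unit vector from vMF(mu, kappa) via Wood's (1994) rejection sampler."""
    d = mu.shape[0]
    mu = mu / np.linalg.norm(mu)
    # Rejection-sample w, the cosine of the angle between the sample and mu.
    b = (-2.0 * kappa + np.sqrt(4.0 * kappa**2 + (d - 1) ** 2)) / (d - 1)
    x0 = (1.0 - b) / (1.0 + b)
    c = kappa * x0 + (d - 1) * np.log(1.0 - x0**2)
    while True:
        z = rng.beta((d - 1) / 2.0, (d - 1) / 2.0)
        w = (1.0 - (1.0 + b) * z) / (1.0 - (1.0 - b) * z)
        if kappa * w + (d - 1) * np.log(1.0 - x0 * w) - c >= np.log(rng.uniform()):
            break
    # Uniform direction in the hyperplane orthogonal to e1, combined with w along e1.
    v = rng.standard_normal(d - 1)
    v /= np.linalg.norm(v)
    x = np.concatenate(([w], np.sqrt(max(1.0 - w**2, 0.0)) * v))
    # Householder reflection mapping e1 onto mu (norm-preserving rotation).
    e1 = np.zeros(d)
    e1[0] = 1.0
    u = e1 - mu
    n = np.linalg.norm(u)
    if n < 1e-12:
        return x
    u /= n
    return x - 2.0 * np.dot(u, x) * u

def perturb_gradient_direction(grad, kappa, clip=1.0, rng=None):
    """Clip the gradient, then replace its direction with a vMF sample centred on it,
    preserving the (clipped) magnitude -- a directional analogue of Gaussian DP-SGD noise."""
    rng = np.random.default_rng() if rng is None else rng
    norm = np.linalg.norm(grad)
    g = grad * min(1.0, clip / (norm + 1e-12))  # standard DP-SGD-style norm clipping
    return np.linalg.norm(g) * sample_vmf(g / np.linalg.norm(g), kappa, rng)
```

Here larger `kappa` concentrates samples around the true gradient direction (weaker privacy, better utility), playing the inverse role of the Gaussian mechanism's noise scale sigma.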

📝 Abstract
NLP models trained with differential privacy (DP) usually adopt the DP-SGD framework, and privacy guarantees are often reported in terms of the privacy budget $\epsilon$. However, $\epsilon$ does not have any intrinsic meaning, and it is generally not possible to compare across variants of the framework. Work in image processing has therefore explored how to empirically calibrate noise across frameworks using Membership Inference Attacks (MIAs). However, this kind of calibration has not been established for NLP. In this paper, we show that MIAs offer little help in calibrating privacy, whereas reconstruction attacks are more useful. As a use case, we define a novel kind of directional privacy based on the von Mises–Fisher (vMF) distribution, a metric DP mechanism that perturbs angular distance rather than adding (isotropic) Gaussian noise, and apply this to NLP architectures. We show that, even though formal guarantees are incomparable, empirical privacy calibration reveals that each mechanism has different areas of strength with respect to utility-privacy trade-offs.
Problem

Research questions and friction points this paper is trying to address.

How to empirically calibrate privacy in NLP models, given that the budget $\epsilon$ is not comparable across DP-SGD variants
Whether reconstruction attacks, rather than membership inference attacks, can serve as a common yardstick for comparing DP mechanisms
How to define directional privacy via the vMF distribution and apply it to NLP architectures
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses reconstruction attacks for privacy calibration
Introduces directional privacy via VMF distribution
Applies metric DP to NLP architectures