Differential Privacy for Transformer Embeddings of Text with Nonparametric Variational Information Bottleneck

📅 2026-01-05
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Transformer embedding vectors may inadvertently leak sensitive information, posing significant privacy risks. To address this issue, this work proposes a Nonparametric Variational Differential Privacy (NVDP) framework that integrates a nonparametric variational information bottleneck layer into multi-vector Transformer embeddings. By calibrating the injected noise, NVDP enforces rigorous differential privacy guarantees. The method combines Rényi divergence with Bayesian differential privacy to dynamically balance privacy preservation and model utility. Experimental evaluation on the GLUE benchmark demonstrates that NVDP trades off privacy strength against task accuracy across varying noise levels, achieving both high performance and strong privacy guarantees even at low noise levels.

📝 Abstract
We propose a privacy-preserving method for sharing text data by sharing noisy versions of their transformer embeddings. It has been shown that hidden representations learned by deep models can encode sensitive information from the input, making it possible for adversaries to recover the input data with considerable accuracy. This problem is exacerbated in transformer embeddings because they consist of multiple vectors, one per token. To mitigate this risk, we propose Nonparametric Variational Differential Privacy (NVDP), which ensures both useful data sharing and strong privacy protection. We take a differential privacy (DP) approach, integrating a nonparametric variational information bottleneck (NVIB) layer into the transformer architecture to inject noise into its multi-vector embeddings and thereby hide information, and measuring privacy protection with Rényi Divergence (RD) and its corresponding Bayesian Differential Privacy (BDP) guarantee. Training the NVIB layer calibrates the noise level according to the utility of the downstream task. We test NVDP on the General Language Understanding Evaluation (GLUE) benchmark and show that varying the noise level gives us a useful trade-off between privacy and accuracy. With lower noise levels, our model maintains high accuracy while offering strong privacy guarantees, effectively balancing privacy and utility.
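The core mechanism the abstract describes — injecting calibrated Gaussian noise into per-token embedding vectors and measuring leakage with a Rényi divergence, which underpins the Bayesian DP guarantee — can be sketched as follows. This is a minimal illustration, not the paper's actual NVIB parameterization: the isotropic-Gaussian noise model, the closed-form divergence formula, and all function names are assumptions for the sake of the example.

```python
import numpy as np

def noisy_embeddings(embeddings, sigma, seed=None):
    """Add isotropic Gaussian noise to each token embedding vector
    (a stand-in for the noise an NVIB-style layer would inject)."""
    rng = np.random.default_rng(seed)
    return embeddings + rng.normal(scale=sigma, size=embeddings.shape)

def gaussian_renyi_divergence(mu1, mu2, sigma, alpha=2.0):
    """Closed-form Renyi divergence of order alpha between two isotropic
    Gaussians N(mu1, sigma^2 I) and N(mu2, sigma^2 I):
    alpha * ||mu1 - mu2||^2 / (2 * sigma^2)."""
    return alpha * np.sum((mu1 - mu2) ** 2) / (2 * sigma ** 2)

# Toy example: 4 tokens with 8-dimensional embeddings, and a
# "neighboring" input whose first token embedding differs slightly.
emb = np.ones((4, 8))
neighbor = emb.copy()
neighbor[0] += 0.5

# Higher noise -> smaller divergence between the two noisy output
# distributions -> stronger privacy (at some cost in task accuracy).
div_high_noise = gaussian_renyi_divergence(emb.ravel(), neighbor.ravel(), sigma=2.0)
div_low_noise = gaussian_renyi_divergence(emb.ravel(), neighbor.ravel(), sigma=0.5)
assert div_high_noise < div_low_noise
```

The trade-off the paper evaluates on GLUE corresponds to sweeping `sigma`: larger noise tightens the divergence bound (more privacy) while degrading downstream accuracy.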
Problem

Research questions and friction points this paper is trying to address.

Differential Privacy
Transformer Embeddings
Privacy Leakage
Text Data
Sensitive Information
Innovation

Methods, ideas, or system contributions that make the work stand out.

Differential Privacy
Transformer Embeddings
Variational Information Bottleneck
Bayesian Differential Privacy
Privacy-Utility Tradeoff
Dina El Zein
Idiap Research Institute, Martigny, Switzerland; EPFL, Lausanne, Switzerland
James Henderson
Senior Researcher, Idiap Research Institute
Computational Linguistics
Machine Learning
Natural Language Processing