Investigating Privacy Bias in Training Data of Language Models

📅 2024-09-05

📈 Citations: 1

✨ Influential: 0

career value

207K/year

🤖 AI Summary

This study investigates implicit privacy bias in large language model (LLM) training data—systematic deviations in models’ judgments of information flow appropriateness across social contexts. We propose the first theoretically grounded evaluation framework for privacy bias, built upon Contextual Integrity theory. Our method employs cross-model response comparison, prompt robustness control, and statistical bias detection to isolate privacy-specific biases while mitigating confounding effects from prompt variation. Experiments across mainstream LLMs reveal significant and inconsistent privacy judgment biases, indicating a fundamental lack of systematic modeling of privacy norms in training data. Key contributions include: (1) the first theory-driven privacy bias assessment paradigm; (2) empirical evidence that improved model capability and optimization may exacerbate—not alleviate—privacy judgment inaccuracies; and (3) an interpretable, empirically validated evaluation tool to advance trustworthy AI governance.

Technology Category

Application Category

📝 Abstract

As LLMs are integrated into sociotechnical systems, it is crucial to examine the privacy biases they exhibit. A privacy bias refers to the skew in the appropriateness of information flows within a given context that LLMs acquire from large amounts of non-publicly available training data. This skew may either align with existing expectations or signal a symptom of systemic issues reflected in the training datasets. We formulate a novel research question: how can we examine privacy biases in the training data of LLMs? We present a novel approach to assess the privacy biases using a contextual integrity-based methodology to evaluate the responses from different LLMs. Our approach accounts for the sensitivity of responses across prompt variations, which hinders the evaluation of privacy biases. We investigate how privacy biases are affected by model capacities and optimizations.

Problem

Research questions and friction points this paper is trying to address.

Examine privacy biases in LLM training data

Assess biases using contextual integrity methodology

Investigate impact of model capacities on biases

Innovation

Methods, ideas, or system contributions that make the work stand out.

Contextual integrity-based methodology

Assess privacy biases

Model capacities and optimizations

🔎 Similar Papers

Preserving Privacy in Large Language Models: A Survey on Current Threats and Solutions