Investigating Privacy Bias in Training Data of Language Models

📅 2024-09-05
📈 Citations: 1
Influential: 0
📄 PDF
🤖 AI Summary
This study investigates implicit privacy bias in large language model (LLM) training data—systematic deviations in models’ judgments of information flow appropriateness across social contexts. We propose the first theoretically grounded evaluation framework for privacy bias, built upon Contextual Integrity theory. Our method employs cross-model response comparison, prompt robustness control, and statistical bias detection to isolate privacy-specific biases while mitigating confounding effects from prompt variation. Experiments across mainstream LLMs reveal significant and inconsistent privacy judgment biases, indicating a fundamental lack of systematic modeling of privacy norms in training data. Key contributions include: (1) the first theory-driven privacy bias assessment paradigm; (2) empirical evidence that improved model capability and optimization may exacerbate—not alleviate—privacy judgment inaccuracies; and (3) an interpretable, empirically validated evaluation tool to advance trustworthy AI governance.

Technology Category

Application Category

📝 Abstract
As LLMs are integrated into sociotechnical systems, it is crucial to examine the privacy biases they exhibit. A privacy bias refers to the skew in the appropriateness of information flows within a given context that LLMs acquire from large amounts of non-publicly available training data. This skew may either align with existing expectations or signal a symptom of systemic issues reflected in the training datasets. We formulate a novel research question: how can we examine privacy biases in the training data of LLMs? We present a novel approach to assess the privacy biases using a contextual integrity-based methodology to evaluate the responses from different LLMs. Our approach accounts for the sensitivity of responses across prompt variations, which hinders the evaluation of privacy biases. We investigate how privacy biases are affected by model capacities and optimizations.
Problem

Research questions and friction points this paper is trying to address.

Examine privacy biases in LLM training data
Assess biases using contextual integrity methodology
Investigate impact of model capacities on biases
Innovation

Methods, ideas, or system contributions that make the work stand out.

Contextual integrity-based methodology
Assess privacy biases
Model capacities and optimizations
Y
Yan Shvartzshnaider
York University
Vasisht Duddu
Vasisht Duddu
University of Waterloo
Trustworthy AIAI SecurityData PrivacyAI Governance
J
John Lacalamita
York University