🤖 AI Summary
This work addresses the privacy risk of personally identifiable information (PII) leaking through user prompts sent to large language models (LLMs). We propose the first query-aware evaluation framework for PII privacy protection. Methodologically, we design a query-unrelated PII masking strategy and construct a fine-grained (55 categories), multi-scenario (single- and multi-subject interaction) standardized benchmark built from 2,842 manually curated samples; each sample integrates contextual modeling, query intent alignment, and ground-truth answer annotation to enable end-to-end assessment. Our key contribution is the first deep coupling of PII masking with query-relevance judgment, which reveals that mainstream LLMs have severe deficiencies in identifying relevance within multi-subject interactions and identifies intelligent selective masking as a critical bottleneck in practical PII protection.
📝 Abstract
The widespread adoption of Large Language Models (LLMs) has raised significant privacy concerns regarding the exposure of personally identifiable information (PII) in user prompts. To address this challenge, we propose a query-unrelated PII masking strategy and introduce PII-Bench, the first comprehensive evaluation framework for assessing privacy protection systems. PII-Bench comprises 2,842 test samples across 55 fine-grained PII categories, featuring diverse scenarios from single-subject descriptions to complex multi-party interactions. Each sample is carefully crafted with a user query, context description, and standard answer indicating query-relevant PII. Our empirical evaluation reveals that while current models perform adequately in basic PII detection, they show significant limitations in determining PII query relevance. Even state-of-the-art LLMs struggle with this task, particularly in handling complex multi-subject scenarios, indicating substantial room for improvement in achieving intelligent PII masking.
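The sample structure described above (user query, context description, and a standard answer marking query-relevant PII) lends itself to a simple illustration. The sketch below shows what a query-unrelated masking step might look like over such a sample; the field names, PII categories, and `mask_query_unrelated` helper are illustrative assumptions, not the benchmark's actual schema.

```python
# Hypothetical sketch of a PII-Bench-style sample and a query-aware masking
# step. Field names and category labels are assumptions for illustration.
from dataclasses import dataclass

@dataclass
class PIISpan:
    text: str       # surface form appearing in the context
    category: str   # a fine-grained PII category label, e.g. "EMAIL"

@dataclass
class Sample:
    context: str        # scenario description containing PII
    query: str          # user question sent alongside the context
    relevant: set[str]  # ground truth: PII spans the query actually needs

def mask_query_unrelated(sample: Sample, detected: list[PIISpan]) -> str:
    """Mask every detected PII span that is not needed to answer the query."""
    masked = sample.context
    for span in detected:
        if span.text not in sample.relevant:
            masked = masked.replace(span.text, f"[{span.category}]")
    return masked

sample = Sample(
    context="Alice (alice@example.com) asked Bob (555-0100) about her invoice.",
    query="What is Alice's email address?",
    relevant={"alice@example.com"},  # Bob's phone number is query-unrelated
)
detected = [PIISpan("alice@example.com", "EMAIL"), PIISpan("555-0100", "PHONE")]
print(mask_query_unrelated(sample, detected))
# → Alice (alice@example.com) asked Bob ([PHONE]) about her invoice.
```

The difficulty the paper highlights sits in the `relevant` set: in multi-subject scenarios, deciding which subject's PII the query actually needs is where current LLMs fall short.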