🤖 AI Summary
To address the risk of sensitive personally identifiable information (PII) leaking in LLM-based dialogues, this paper proposes LOPSIDED, a semantics-aware local privacy proxy. The framework introduces a novel semantic-consistency pseudonymization mechanism that dynamically identifies and replaces only context-critical identifiable entities: prompts are pseudonymized locally in real time before they reach the remote model, and responses are automatically depseudonymized on return, preserving both privacy and semantic integrity. Technically, it combines named entity recognition, semantic role labeling, and context-aware relevance analysis to avoid the semantic distortion and information loss inherent in conventional generalization-based anonymization. Experiments on real-world ShareGPT dialogue data show that LOPSIDED reduces semantic utility errors by a factor of 5 compared to baseline methods while achieving zero sensitive-information leakage, markedly improving the privacy–utility trade-off.
📝 Abstract
With the increasing use of conversational AI systems, there is growing concern over privacy leaks, especially when users share sensitive personal data in interactions with Large Language Models (LLMs). Conversations shared with these models may contain Personally Identifiable Information (PII), which, if exposed, could lead to security breaches or identity theft. To address this challenge, we present the Local Optimizations for Pseudonymization with Semantic Integrity Directed Entity Detection (LOPSIDED) framework, a semantically aware privacy agent designed to safeguard sensitive PII when using remote LLMs. Unlike prior work, which often degrades response quality, our approach dynamically replaces sensitive PII entities in user prompts with semantically consistent pseudonyms, preserving the contextual integrity of conversations. Once the model generates its response, the pseudonyms are automatically depseudonymized, ensuring the user receives an accurate, privacy-preserving output. We evaluate our approach using real-world conversations sourced from ShareGPT, which we further augment and annotate to assess whether named entities are contextually relevant to the model's response. Our results show that LOPSIDED reduces semantic utility errors by a factor of 5 compared to baseline techniques, all while enhancing privacy.
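The pseudonymize-then-depseudonymize round trip described above can be sketched as a small local proxy. This is a minimal illustration, not the paper's implementation: the `PseudonymProxy` class, the fixed pseudonym pool, and the explicit entity list passed to `pseudonymize` are all hypothetical stand-ins for LOPSIDED's NER, semantic role labeling, and context-aware relevance analysis.

```python
class PseudonymProxy:
    """Hypothetical sketch of a local privacy proxy: swap context-critical
    PII entities for consistent pseudonyms before a prompt leaves the
    device, then restore the originals in the remote model's response."""

    def __init__(self, pseudonym_pool):
        self.pool = list(pseudonym_pool)  # semantically plausible stand-ins
        self.mapping = {}                 # real entity -> pseudonym

    def pseudonymize(self, prompt, entities):
        # In LOPSIDED the entity list would come from NER + relevance
        # analysis; here it is supplied by the caller for illustration.
        for ent in entities:
            if ent not in self.mapping:
                self.mapping[ent] = self.pool.pop(0)
            prompt = prompt.replace(ent, self.mapping[ent])
        return prompt

    def depseudonymize(self, response):
        # Reverse the mapping so the user sees the original entities.
        for real, fake in self.mapping.items():
            response = response.replace(fake, real)
        return response


proxy = PseudonymProxy(["Jordan Lee", "Acme Corp"])
safe_prompt = proxy.pseudonymize(
    "Write a short bio for Alice Smith, an engineer at OpenWidget.",
    ["Alice Smith", "OpenWidget"],
)
# safe_prompt contains no real PII and can be sent to the remote LLM.
simulated_reply = f"Here is a bio for {proxy.mapping['Alice Smith']}."
restored = proxy.depseudonymize(simulated_reply)
```

The key property the sketch preserves is consistency: the same entity always maps to the same pseudonym within a conversation, so coreference in the remote model's response survives the reverse substitution.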