Adaptive Text Anonymization: Learning Privacy-Utility Trade-offs via Prompt Optimization

📅 2026-02-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing text anonymization methods rely predominantly on static strategies: they struggle to balance privacy preservation against data utility dynamically, and they fail to accommodate the diverse requirements of different domains, privacy objectives, and downstream tasks. This work proposes the first framework to formulate anonymization as a learnable, adaptive task, using task-specific prompt optimization to automatically generate context-aware anonymization instructions that guide large language models in flexibly adjusting their anonymization strategies. Built on open-source language models, the approach is both computationally efficient and highly scalable. Evaluated on a benchmark spanning five distinct domains, the method consistently outperforms existing baselines, matching the performance of large closed-source models while significantly improving computational efficiency.

📝 Abstract
Anonymizing textual documents is a highly context-sensitive problem: the appropriate balance between privacy protection and utility preservation varies with the data domain, privacy objectives, and downstream application. However, existing anonymization methods rely on static, manually designed strategies that lack the flexibility to adjust to diverse requirements and often fail to generalize across domains. We introduce adaptive text anonymization, a new task formulation in which anonymization strategies are automatically adapted to specific privacy-utility requirements. We propose a framework for task-specific prompt optimization that automatically constructs anonymization instructions for language models, enabling adaptation to different privacy goals, domains, and downstream usage patterns. To evaluate our approach, we present a benchmark spanning five datasets with diverse domains, privacy constraints, and utility objectives. Across all evaluated settings, our framework consistently achieves a better privacy-utility trade-off than existing baselines, while remaining computationally efficient and effective on open-source language models, with performance comparable to larger closed-source models. Additionally, we show that our method can discover novel anonymization strategies that explore different points along the privacy-utility trade-off frontier.
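To make the abstract's idea of selecting an anonymization strategy by its privacy-utility trade-off concrete, here is a minimal, self-contained sketch. It is not the paper's actual method: the candidate prompts, the stubbed "LLM" rewriter, the identifier pattern, and both scoring functions are illustrative assumptions standing in for real model calls and real privacy/utility metrics.

```python
# Hypothetical sketch: choosing among candidate anonymization prompts by a
# weighted privacy-utility score. All names, prompts, and scoring rules are
# illustrative assumptions, not the framework described in the paper.
import re

CANDIDATE_PROMPTS = [
    "Replace every person name and location with a generic placeholder.",
    "Rewrite the text so no individual can be identified, keeping key facts.",
    "Remove only direct identifiers; keep dates and professions intact.",
]

# Toy stand-in for a PII detector: a fixed list of known identifiers.
PII_PATTERN = re.compile(r"\b(Alice|Bob|Berlin)\b")


def anonymize(text: str, prompt: str) -> str:
    """Stub standing in for an LLM call guided by `prompt`."""
    if "placeholder" in prompt or "identified" in prompt:
        return PII_PATTERN.sub("[REDACTED]", text)
    return text  # the weakest candidate strategy leaves identifiers in


def privacy_score(text: str) -> float:
    """1.0 when no known identifier remains, else 0.0."""
    return 0.0 if PII_PATTERN.search(text) else 1.0


def utility_score(original: str, anonymized: str) -> float:
    """Crude utility proxy: fraction of original tokens preserved."""
    orig_tokens = set(original.split())
    kept = set(anonymized.split()) & orig_tokens
    return len(kept) / len(orig_tokens)


def select_prompt(text: str, privacy_weight: float = 0.7) -> str:
    """Pick the candidate prompt with the best weighted trade-off."""
    def score(prompt: str) -> float:
        out = anonymize(text, prompt)
        return (privacy_weight * privacy_score(out)
                + (1 - privacy_weight) * utility_score(text, out))
    return max(CANDIDATE_PROMPTS, key=score)


doc = "Alice met Bob in Berlin to discuss her diagnosis."
best = select_prompt(doc)
print(anonymize(doc, best))
# e.g. "[REDACTED] met [REDACTED] in [REDACTED] to discuss her diagnosis."
```

Shifting `privacy_weight` moves the selected prompt along the trade-off frontier: a low weight favors the minimally invasive third prompt, while a high weight favors aggressive redaction, mirroring (in toy form) the adaptation to privacy goals that the paper automates via prompt optimization.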
Problem

Research questions and friction points this paper is trying to address.

text anonymization
privacy-utility trade-off
adaptive anonymization
context-sensitive privacy
domain generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

adaptive text anonymization
prompt optimization
privacy-utility trade-off
language models
automatic anonymization