🤖 AI Summary
Addressing the challenge of balancing privacy protection strength and semantic fidelity in text sanitization, this paper proposes an adaptive redaction method inspired by differential privacy and grounded in semantic sensitivity modeling. The method achieves the first Pareto-optimal trade-off between privacy gain and redaction extent, leveraging a context-aware masking generation mechanism to dynamically identify and minimize deletion of semantically critical information. Experimental evaluation across multiple benchmark datasets demonstrates that, under identical privacy budgets, the proposed approach improves privacy protection strength by 32% and reduces average redaction volume by 41% compared to current state-of-the-art methods. These gains significantly enhance both information utility and security compliance in regulated text processing scenarios.
📝 Abstract
We propose a novel redaction methodology that can be used to sanitize natural text data. Our new technique provides better privacy benefits than other state of the art techniques while maintaining lower redaction levels.