🤖 AI Summary
To address privacy leakage risks of sensitive user information in large language model (LLM) applications deployed on cloud services, this paper proposes a zero-shot, training-data-free iterative tree search framework for text rewriting. The method dynamically identifies and replaces or removes privacy-bearing segments via structured tree search, integrating sentence-level rewriting operations with reward-model-guided progressive optimization to balance privacy and utility without labeled data. Its core innovations are: (i) the first integration of zero-shot learning with tree search for privacy-aware text rewriting; and (ii) a joint evaluation mechanism that simultaneously assesses privacy removal, semantic consistency, and linguistic naturalness. Extensive experiments across multiple privacy-sensitive benchmarks demonstrate that the proposed approach significantly outperforms existing baselines in privacy elimination rate, semantic fidelity, and fluency, effectively balancing stringent privacy protection with high textual utility.
📝 Abstract
The increasing adoption of large language models (LLMs) in cloud-based services has raised significant privacy concerns, as user inputs may inadvertently expose sensitive information. Existing text anonymization and de-identification techniques, such as rule-based redaction and scrubbing, often struggle to balance privacy preservation with text naturalness and utility. In this work, we propose a zero-shot, tree-search-based iterative sentence rewriting algorithm that systematically obfuscates or deletes private information while preserving coherence, relevance, and naturalness. Our method incrementally rewrites privacy-sensitive segments through a structured search guided by a reward model, enabling dynamic exploration of the rewriting space. Experiments on privacy-sensitive datasets show that our approach significantly outperforms existing baselines, achieving a superior balance between privacy protection and utility preservation.
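The reward-guided tree search described above can be sketched as a beam-limited search over candidate rewrites, where each node is a sentence and children are rewrites that redact or delete a private span. The sketch below is illustrative only: `propose_rewrites` and `reward` are toy stand-ins (treating ALL-CAPS words as private spans), whereas the paper's method uses an LLM rewriter and a reward model scoring privacy removal, semantic consistency, and naturalness.

```python
import heapq

def is_private(tok):
    """Toy privacy detector: pretend ALL-CAPS alphabetic tokens are sensitive."""
    return tok.isalpha() and tok.isupper()

def propose_rewrites(sentence):
    """Toy rewriter: for each private token, propose redaction and deletion."""
    tokens = sentence.split()
    candidates = []
    for i, tok in enumerate(tokens):
        if is_private(tok):
            candidates.append(" ".join(tokens[:i] + ["[REDACTED]"] + tokens[i + 1:]))
            candidates.append(" ".join(tokens[:i] + tokens[i + 1:]))
    return candidates

def reward(sentence):
    """Toy reward: penalize remaining private tokens, lightly penalize redaction marks."""
    leaks = sum(is_private(t) for t in sentence.split())
    return -leaks - 0.01 * sentence.count("[REDACTED]")

def tree_search_rewrite(sentence, beam_width=3, max_depth=4):
    """Iteratively expand the highest-reward candidates, keeping the best rewrite seen."""
    frontier = [(-reward(sentence), sentence)]
    best_score, best = reward(sentence), sentence
    for _ in range(max_depth):
        next_frontier = []
        for _neg, cur in frontier:
            for cand in propose_rewrites(cur):
                r = reward(cand)
                if r > best_score:
                    best_score, best = r, cand
                heapq.heappush(next_frontier, (-r, cand))
        if not next_frontier:
            break
        frontier = heapq.nsmallest(beam_width, next_frontier)
    return best

print(tree_search_rewrite("Contact ALICE at BANK about the loan"))
# → "Contact at about the loan"
```

In this toy setting the reward prefers deletion over redaction (no `[REDACTED]` penalty), so the search converges on removing both private tokens; with a learned reward model, the same loop would instead trade off privacy removal against semantic fidelity and fluency.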