🤖 AI Summary
This work addresses the privacy risks posed by sensitive content in large-scale image datasets by proposing a two-stage automated anonymization framework. First, a vision-language model identifies privacy-sensitive regions and generates paired private/public textual descriptions along with structured editing instructions. Second, an instruction-driven diffusion editor rewrites only the sensitive visual content, removing private information while preserving semantic integrity. This study is the first to integrate multimodal guidance with structured instructions for controllable anonymization, and it introduces a unified evaluation framework covering privacy preservation, fidelity, and downstream task utility. Experimental results demonstrate that the method substantially reduces facial similarity, textual identifiability, and demographic predictability, while maintaining downstream task performance comparable to that of the original data.
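The two-stage flow above can be sketched as a small data pipeline. This is an illustrative sketch only: the dataclass, function names, and stubbed captions are assumptions for showing the data flow (Stage 1 produces a caption/instruction triplet, Stage 2 consumes the instruction), not the paper's actual implementation, which would call a real vision-language model and diffusion editor.

```python
from dataclasses import dataclass


@dataclass
class EditTriplet:
    """Hypothetical container for the Stage-1 outputs described in the summary."""
    private_caption: str   # describes the scene including sensitive attributes
    public_caption: str    # same scene with sensitive attributes omitted
    edit_instruction: str  # structured, identity-neutral edit command


def stage1_analyze(image) -> EditTriplet:
    # Stand-in for the VLM inspection + LLM instruction-generation step;
    # the returned strings are placeholder examples, not model output.
    return EditTriplet(
        private_caption="A person whose name badge is legible in the photo",
        public_caption="A person wearing a plain badge",
        edit_instruction="Replace the legible name badge with a blank badge",
    )


def stage2_edit(image, triplet: EditTriplet):
    # Stand-in for the instruction-driven diffusion editor. A real editor
    # would be conditioned on the edit instruction (and public caption);
    # here we only tag the input to make the data flow visible.
    return {"source": image, "applied_instruction": triplet.edit_instruction}


def anonymize(image):
    """Run both stages on one image and return the result plus the triplet."""
    triplet = stage1_analyze(image)
    return stage2_edit(image, triplet), triplet
```

The triplet structure also mirrors the (private caption, public caption, edit instruction) training data the abstract mentions for fine-tuning the diffusion editors.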
📝 Abstract
Large-scale image datasets frequently contain identifiable or sensitive content, raising privacy risks when training models that may memorize and leak such information. We present Unsafe2Safe, a fully automated pipeline that detects privacy-prone images and rewrites only their sensitive regions using multimodally guided diffusion editing. Unsafe2Safe operates in two stages. Stage 1 uses a vision-language model to (i) inspect images for privacy risks, (ii) generate paired private and public captions that respectively include and omit sensitive attributes, and (iii) prompt a large language model to produce structured, identity-neutral edit instructions conditioned on the public caption. Stage 2 employs instruction-driven diffusion editors conditioned on the paired captions and edit instructions, producing privacy-safe images that preserve global structure and task-relevant semantics while neutralizing private content. To measure anonymization quality, we introduce a unified evaluation suite covering Quality, Cheating, Privacy, and Utility dimensions. Across MS-COCO, Caltech101, and MIT Indoor67, Unsafe2Safe reduces face similarity, text similarity, and demographic predictability by large margins, while maintaining downstream model accuracy comparable to training on raw data. Fine-tuning diffusion editors on our automatically generated triplets (private caption, public caption, edit instruction) further improves both privacy protection and semantic fidelity. Unsafe2Safe provides a scalable, principled solution for constructing large, privacy-safe datasets without sacrificing visual consistency or downstream utility.
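The Privacy dimension of the evaluation suite reports face similarity between original and anonymized images. The abstract does not fix the exact protocol, but a common choice (an assumption here) is the mean cosine similarity between face-recognition embeddings, e.g. from a model such as ArcFace, where a lower mean indicates stronger anonymization. A minimal sketch with plain NumPy vectors standing in for real face embeddings:

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def mean_face_similarity(original_embs, anonymized_embs):
    """Mean cosine similarity over paired (original, anonymized) face
    embeddings; lower values mean the identity is harder to recover.
    The embeddings themselves would come from a face-recognition model,
    which is assumed, not shown, here."""
    sims = [cosine_similarity(o, a)
            for o, a in zip(original_embs, anonymized_embs)]
    return sum(sims) / len(sims)
```

Analogous metrics would compare text embeddings of captions (text similarity) and the accuracy of a demographic-attribute classifier on anonymized images (demographic predictability).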