🤖 AI Summary
This paper identifies a fundamental privacy vulnerability in Conditional Latent Diffusion Models (CLDMs) used for privacy-preserving data augmentation: their reliance on structured conditional signals—such as edges or depth maps—for image synthesis inadvertently leaks identity information. To expose this flaw, we propose a contrastive learning–based identification framework and a black-box inversion attack, enabling the first systematic demonstration that CLDM-augmented images violate basic privacy guarantees, including *k*-anonymity. Experiments show that adversaries can achieve high-accuracy cross-image identity re-identification on standard face recognition benchmarks using *only* the augmented images—without access to the originals or to model internals. Our core contributions are threefold: (1) establishing conditional signals as the primary source of identity leakage; (2) revealing CLDMs’ inherent susceptibility to black-box inversion attacks; and (3) providing both a theoretical caution and practical design boundaries for developing privacy-enhancing generative models.
📝 Abstract
Latent diffusion models can be used as a powerful augmentation method to artificially extend datasets for enhanced training. To the human eye, these augmented images look very different from the originals. Previous work has suggested using this data augmentation technique for data anonymization. However, we show that latent diffusion models conditioned on features like depth maps or edges to guide the diffusion process are not suitable as a privacy-preserving method. We use a contrastive learning approach to train a model that can correctly identify individuals from a pool of candidates. Moreover, we demonstrate that anonymization using conditioned diffusion models is susceptible to black-box attacks. We attribute the success of the described methods to the conditioning of the latent diffusion model in the anonymization process: the diffusion model is instructed to produce similar edges for the anonymized images, so a model can learn to recognize these patterns for identification.
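To make the leakage mechanism concrete, here is a minimal, self-contained sketch of edge-based re-identification. It is *not* the paper's contrastive framework: the gradient-magnitude edge proxy (standing in for Canny or depth-map conditioning signals), the additive-noise "augmentation" (standing in for diffusion-based anonymization that preserves edge structure), and all function names are illustrative assumptions. The point it demonstrates is the abstract's: if the augmented image retains the original's edge structure, a simple matcher can pick the right identity out of a candidate pool.

```python
import numpy as np

def edge_features(img: np.ndarray) -> np.ndarray:
    # Crude edge map via gradient magnitude -- a stand-in for the
    # structured conditioning signals (Canny edges, depth maps) the
    # paper discusses. Normalized so a dot product is cosine similarity.
    gy, gx = np.gradient(img.astype(float))
    e = np.hypot(gx, gy).ravel()
    return e / (np.linalg.norm(e) + 1e-12)

def identify(query: np.ndarray, pool: list[np.ndarray]) -> int:
    # Return the index of the pool candidate whose edge features
    # best match the query's (nearest neighbor in cosine similarity).
    q = edge_features(query)
    scores = [float(q @ edge_features(c)) for c in pool]
    return int(np.argmax(scores))

# Toy demo: each "identity" is an image with distinct structure; the
# "anonymized" version perturbs texture but keeps edges intact.
rng = np.random.default_rng(0)
originals = [rng.random((32, 32)) for _ in range(5)]
for i, img in enumerate(originals):
    img[4 + 3 * i : 10 + 3 * i, 8:24] += 2.0  # identity-specific structure
anonymized = [img + 0.1 * rng.random((32, 32)) for img in originals]

# Every anonymized image is re-identified from edge structure alone.
assert all(identify(anonymized[i], originals) == i for i in range(5))
```

The real attack trains a contrastive encoder rather than using raw gradients, but the failure mode is the same: conditioning forces the generator to reproduce the original's edges, so any representation sensitive to edge layout suffices for re-identification.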