LeakyCLIP: Extracting Training Data from CLIP

📅 2025-08-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work exposes privacy leakage risks in CLIP models stemming from training data memorization, and introduces LeakyCLIP—the first systematic attack framework for reconstructing training images from text prompts. To address three key challenges in CLIP inversion—non-robust features, semantic incompleteness, and low reconstruction fidelity—the framework integrates adversarial fine-tuning, linear embedding alignment, and Stable Diffusion–driven refinement. Experiments on ViT-B-16 show a 358% improvement in SSIM over baseline methods. Crucially, even low-fidelity reconstructions enable effective training-data membership inference. Large-scale evaluation on a LAION-2B subset confirms pervasive data extraction vulnerabilities across mainstream CLIP models. This study establishes a new paradigm and empirical foundation for privacy research in multimodal foundation models.

📝 Abstract
Understanding the memorization and privacy leakage risks in Contrastive Language-Image Pretraining (CLIP) is critical for ensuring the security of multimodal models. Recent studies have demonstrated the feasibility of extracting sensitive training examples from diffusion models, with conditional diffusion models exhibiting a stronger tendency to memorize and leak information. In this work, we investigate data memorization and extraction risks in CLIP through the lens of CLIP inversion, a process that aims to reconstruct training images from text prompts. To this end, we introduce LeakyCLIP, a novel attack framework designed to achieve high-quality, semantically accurate image reconstruction from CLIP embeddings. We identify three key challenges in CLIP inversion: 1) non-robust features, 2) limited visual semantics in text embeddings, and 3) low reconstruction fidelity. To address these challenges, LeakyCLIP employs 1) adversarial fine-tuning to enhance optimization smoothness, 2) linear transformation-based embedding alignment, and 3) Stable Diffusion-based refinement to improve fidelity. Empirical results demonstrate the superiority of LeakyCLIP, achieving over 358% improvement in Structural Similarity Index Measure (SSIM) for ViT-B-16 compared to baseline methods on a LAION-2B subset. Furthermore, we uncover a pervasive leakage risk, showing that training data membership can be successfully inferred even from the metrics of low-fidelity reconstructions. Our work introduces a practical method for CLIP inversion while offering novel insights into the nature and scope of privacy risks in multimodal models.
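The headline metric is SSIM, which compares luminance, contrast, and structure between two images. As a point of reference, here is a minimal global (single-window) SSIM computation in numpy; real evaluations use local sliding windows, and this sketch is illustrative only, not the paper's evaluation code:

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    # Global (single-window) SSIM with the standard stabilizing constants
    # C1 = (0.01 L)^2, C2 = (0.03 L)^2, where L is the data range.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )

x = np.linspace(0.0, 1.0, 64).reshape(8, 8)  # toy "image"
print(global_ssim(x, x))        # identical images give SSIM = 1
print(global_ssim(x, 1.0 - x))  # inverted image scores well below 1
```

A 358% improvement on such a metric means the reconstructions recover substantially more of the original image structure than baseline inversions.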
Problem

Research questions and friction points this paper is trying to address.

Investigates data memorization risks in CLIP models
Addresses challenges in reconstructing images from CLIP embeddings
Proposes LeakyCLIP to improve reconstruction fidelity and expose privacy leakage
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adversarial fine-tuning enhances optimization smoothness
Linear transformation aligns text embeddings with image embeddings
Stable Diffusion refines images to improve fidelity
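The linear embedding alignment above can be sketched as a least-squares fit mapping text embeddings toward image embeddings. The following is a hypothetical illustration on synthetic paired embeddings (the dimensions, noise level, and fitting procedure are assumptions for the sketch, not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d_text, d_image, n_pairs = 16, 16, 200

# Synthetic stand-ins for paired CLIP text/image embeddings: a hidden
# linear relation plus small noise.
W_true = rng.normal(size=(d_text, d_image))
txt = rng.normal(size=(n_pairs, d_text))
img = txt @ W_true + 0.01 * rng.normal(size=(n_pairs, d_image))

# Fit the alignment map W minimizing ||txt @ W - img||_F by least squares.
W, *_ = np.linalg.lstsq(txt, img, rcond=None)

# Aligned text embeddings now approximate the image embeddings, giving
# the inversion a target that carries more visual semantics.
err = np.linalg.norm(txt @ W - img) / np.linalg.norm(img)
print(f"relative alignment error: {err:.4f}")
```

The design intuition is that text embeddings alone carry limited visual detail, so mapping them into the image-embedding space gives the reconstruction objective a richer target.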