CLIP-Flow: A Universal Discriminator for AI-Generated Images Inspired by Anomaly Detection

📅 2025-08-13
🤖 AI Summary
Problem: Existing AI-generated image (AII) detectors generalize poorly and often fail to identify images produced by unseen generative models. Method: This paper proposes an unsupervised anomaly detection framework for universal AII discrimination. It uses a pre-trained CLIP model to extract image features, employs a normalizing flow-like model to capture the distribution of natural images, and synthesizes surrogate (proxy) images via spectral modification, which enables training without access to real AI-generated samples. Contribution/Results: To our knowledge, this is the first work to formulate AII detection as an anomaly detection task, achieving cross-model generalization using only natural and surrogate data. Extensive experiments demonstrate that the method significantly outperforms existing universal detectors on images from diverse generative models, including GANs and diffusion models, while exhibiting strong robustness and practical deployability.
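
To make the density-modeling idea concrete, here is a minimal NumPy sketch of how a flow assigns a log-likelihood to a feature vector. It is an illustration under stated assumptions, not the paper's architecture: the single affine coupling layer, the fixed random weights `w_s` and `w_t`, and the 8-dimensional random vectors standing in for CLIP features are all hypothetical.

```python
import numpy as np

def affine_coupling_forward(z, w_s, w_t):
    """One affine coupling layer (hypothetical stand-in for the paper's flow).

    The first half of each feature vector passes through unchanged and
    parameterizes a scale and shift applied to the second half.
    """
    d = z.shape[1] // 2
    z_a, z_b = z[:, :d], z[:, d:]
    s = np.tanh(z_a @ w_s)          # log-scale, bounded for numerical stability
    t = z_a @ w_t                   # shift
    u = np.concatenate([z_a, z_b * np.exp(s) + t], axis=1)
    log_det = s.sum(axis=1)         # log |det Jacobian| of the transform
    return u, log_det

def log_likelihood(z, w_s, w_t):
    """Change of variables: log p(z) = log N(u; 0, I) + log |det J|."""
    u, log_det = affine_coupling_forward(z, w_s, w_t)
    dim = z.shape[1]
    log_base = -0.5 * (u ** 2).sum(axis=1) - 0.5 * dim * np.log(2 * np.pi)
    return log_base + log_det

rng = np.random.default_rng(0)
dim = 8                                         # assumed (even) feature size
w_s = rng.normal(scale=0.1, size=(dim // 2, dim // 2))
w_t = rng.normal(scale=0.1, size=(dim // 2, dim // 2))
features = rng.normal(size=(4, dim))            # stand-in for CLIP features
scores = log_likelihood(features, w_s, w_t)     # one log-likelihood per image
```

In an anomaly-detection setting, a low log-likelihood would be read as evidence that an image lies outside the natural-image distribution; how the decision threshold is chosen is not specified here.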

📝 Abstract
With the rapid advancement of AI generative models, the visual quality of AI-generated images (AIIs) has become increasingly close to that of natural images, which inevitably raises security concerns. Most AII detectors employ the conventional image classification pipeline with natural images and AIIs (generated by a given generative model), which can limit detection performance on AIIs from unseen generative models. To address this, we propose a universal AI-generated image detector from the perspective of anomaly detection. Our discriminator does not need access to any AIIs and learns a generalizable representation through unsupervised learning. Specifically, we use the pre-trained CLIP encoder as the feature extractor and design a normalizing flow-like unsupervised model. Instead of AIIs, proxy images, e.g., obtained by applying a spectral modification operation to natural images, are used for training. Our models are trained by minimizing the likelihood of proxy images, optionally combined with maximizing the likelihood of natural images. Extensive experiments demonstrate the effectiveness of our method on AIIs produced by various image generators.
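
One way to write down the training objective described above is sketched below. The notation is assumed for illustration and does not come from the paper: $E$ is the frozen CLIP encoder, $f_\theta$ an invertible flow with a standard normal base density, $\tilde{x}$ a proxy image, and $\lambda \ge 0$ a hypothetical weight on the optional natural-image term. Because the model is described as "normalizing flow-like", the exact change-of-variables form may differ in the actual method.

```latex
% Log-likelihood of a feature z = E(x) under the flow (change of variables)
\log p_\theta(z) = \log \mathcal{N}\!\big(f_\theta(z);\, 0,\, I\big)
                 + \log \left| \det \frac{\partial f_\theta(z)}{\partial z} \right|

% Loss to minimize: push proxy likelihood down, optionally pull natural likelihood up
\mathcal{L}(\theta) = \mathbb{E}_{\tilde{x} \sim \text{proxy}}\big[\log p_\theta(E(\tilde{x}))\big]
                    - \lambda \, \mathbb{E}_{x \sim \text{natural}}\big[\log p_\theta(E(x))\big]
```

Setting $\lambda = 0$ recovers the proxy-only variant; $\lambda > 0$ adds the optional term that maximizes the likelihood of natural images.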

Problem

Research questions and friction points this paper is trying to address.

Detecting AI-generated images from unseen models
Improving detection without access to AI-generated images
Using anomaly detection for universal image discrimination

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses CLIP encoder for feature extraction
Employs normalizing flow-like unsupervised model
Trains with proxy images via spectral modification
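
The summary does not say which spectral modification is applied, so the following is only a sketch of one plausible operation: attenuating Fourier coefficients beyond a radial frequency cutoff. The function name `spectral_proxy`, the parameters `cutoff` and `gain`, and the random array standing in for a grayscale natural image are all hypothetical.

```python
import numpy as np

def spectral_proxy(image, cutoff=0.25, gain=0.5):
    """Build a proxy image by rescaling high-frequency Fourier coefficients.

    Hypothetical operation: coefficients whose radial frequency exceeds
    `cutoff` (in cycles per pixel, at most about 0.71) are multiplied by `gain`.
    """
    h, w = image.shape
    fy = np.fft.fftfreq(h)[:, None]             # vertical frequencies
    fx = np.fft.fftfreq(w)[None, :]             # horizontal frequencies
    radius = np.sqrt(fx ** 2 + fy ** 2)
    mask = np.where(radius > cutoff, gain, 1.0) # symmetric, so output stays real
    spectrum = np.fft.fft2(image)
    return np.fft.ifft2(spectrum * mask).real

rng = np.random.default_rng(0)
natural = rng.random((32, 32))                  # stand-in for a natural image
proxy = spectral_proxy(natural)                 # same shape, altered spectrum
```

Because the mask leaves the zero-frequency coefficient untouched, the proxy keeps the mean intensity of the original while its fine-scale statistics change. Such proxy images would then play the role that real AIIs play in a supervised detector.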