🤖 AI Summary
AI-generated image detection generalizes poorly to images from unseen generative models. To address this, we propose Image-Adaptive Prompt Learning (IAPL), a framework that generates input-dependent prompts through conditional information learning and confidence-driven adaptive prediction, in contrast to conventional prompts that stay fixed after training. Our method combines CNN-based feature extractors, a gated mechanism, and learnable tokens; at test time it optimizes only the shallowest learnable tokens on a single test sample and selects the cropped view with the highest prediction confidence. Evaluated on the UniversalFakeDetect and GenImage benchmarks, our approach achieves state-of-the-art mean accuracies of 95.61% and 96.7%, respectively. Because the prompts fed into the foundation model are adjusted per input image rather than fixed after training, the detector adapts better to forged images from generators it was not trained on.
📝 Abstract
A major challenge in AI-generated image detection is identifying fake images from unseen generators. Existing cutting-edge methods typically adapt pre-trained foundation models to this task via partial-parameter fine-tuning. However, parameters trained on a narrow range of generators may fail to generalize to unknown sources. In light of this, we propose a novel framework named Image-Adaptive Prompt Learning (IAPL), which enhances flexibility in processing diverse test images. It consists of two adaptive modules: the Conditional Information Learner and Confidence-Driven Adaptive Prediction. The former employs CNN-based feature extractors to learn forgery-specific and image-specific conditions, which are then propagated to learnable tokens via a gated mechanism. The latter optimizes the shallowest learnable tokens on a single test sample and selects the cropped view with the highest prediction confidence for final detection. Together, these two modules allow the prompts fed into the foundation model to be adjusted automatically based on the input image, rather than being fixed after training, thereby enhancing the model's adaptability to various forged images. Extensive experiments show that IAPL achieves state-of-the-art performance, with 95.61% and 96.7% mean accuracy on two widely used datasets, UniversalFakeDetect and GenImage, respectively.
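The two adaptive steps described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not specify the form of the gate or the confidence measure, so the additive sigmoid gate, the "probability of the predicted class" confidence, and all function names here (`gated_prompt`, `confidence`, `select_most_confident_view`) are assumptions made for illustration. The test-time optimization of the shallowest tokens is omitted.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_prompt(token: Sequence[float],
                 condition: Sequence[float],
                 gate_logit: float) -> List[float]:
    """Inject a condition vector into a learnable token through a gate.

    Assumed form (not from the paper): token + sigmoid(gate_logit) * condition.
    """
    g = sigmoid(gate_logit)
    return [t + g * c for t, c in zip(token, condition)]


def confidence(p_fake: float) -> float:
    """Assumed confidence measure: probability of the predicted class."""
    return max(p_fake, 1.0 - p_fake)


def select_most_confident_view(view_probs: Sequence[float]) -> int:
    """Return the index of the cropped view with the most confident prediction."""
    return max(range(len(view_probs)), key=lambda i: confidence(view_probs[i]))


# Hypothetical values: one 2-d token, one condition vector, three cropped views.
prompt = gated_prompt([0.5, -0.2], [1.0, 1.0], gate_logit=0.0)  # gate = 0.5
view_probs = [0.55, 0.10, 0.80]  # P(fake) predicted for each cropped view
best = select_most_confident_view(view_probs)
is_fake = view_probs[best] >= 0.5
```

In this toy run the second view is selected even though it predicts "real", because under the assumed measure confidence depends on how far the probability is from 0.5, not on which class is predicted.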