🤖 AI Summary
This work addresses the growing challenge of detecting increasingly photorealistic AI-generated images, particularly in zero-shot settings that require no training and must generalize to unseen data. The authors propose a novel detection method based on structured perturbations in the frequency domain, which uniquely integrates single-step Fourier-domain perturbation with representational sensitivity analysis. By quantifying the differential response of vision models to such perturbations, the approach effectively captures subtle discrepancies between real and synthetic images. The resulting zero-shot detection framework incurs minimal computational overhead and achieves state-of-the-art performance on benchmarks such as OpenFake, improving AUC by nearly 10% while accelerating inference by one to two orders of magnitude compared to existing methods.
📝 Abstract
The rapid progress of text-to-image models has made AI-generated images increasingly realistic, posing significant challenges for accurate detection of generated content. While training-based detectors often suffer from limited generalization to unseen images, training-free approaches offer better robustness, yet struggle to capture subtle discrepancies between real and synthetic images. In this work, we propose a training-free AI-generated image detection method that measures representation sensitivity to structured frequency perturbations, enabling detection of minute manipulations. The proposed method is computationally lightweight, as perturbation generation requires only a single Fourier transform for an input image. As a result, it achieves one to two orders of magnitude faster inference than most training-free detectors. Extensive experiments on challenging benchmarks demonstrate the efficacy of our method over state-of-the-art (SoTA) approaches. In particular, on the OpenFake benchmark, our method improves AUC by nearly $10\%$ compared to SoTA, while maintaining substantially lower computational cost.
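The abstract's pipeline — perturb an image with a single Fourier transform, then score it by how much a vision model's representation shifts — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the high-frequency band mask, noise scale, and the toy feature extractor (standing in for a real vision backbone such as CLIP) are all assumptions made for the sketch.

```python
import numpy as np

def fourier_perturb(image, radius_frac=0.25, eps=0.05, seed=0):
    """Structured perturbation in the frequency domain (one FFT per image).

    Noise is injected only into frequencies outside a low-frequency
    radius; the exact band and scale used by the paper are not given
    in the abstract, so these are illustrative choices.
    """
    rng = np.random.default_rng(seed)
    F = np.fft.fftshift(np.fft.fft2(image))          # single Fourier transform
    h, w = image.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot(yy - h / 2, xx - w / 2)
    mask = r > radius_frac * min(h, w)               # high-frequency band
    noise = rng.standard_normal(F.shape)
    F_pert = F + eps * np.abs(F).mean() * mask * noise
    return np.real(np.fft.ifft2(np.fft.ifftshift(F_pert)))

def toy_features(image):
    """Hypothetical stand-in for a vision model's embedding (normalized)."""
    v = np.array([
        image.mean(),
        image.std(),
        np.abs(np.diff(image, axis=0)).mean(),       # vertical gradient energy
        np.abs(np.diff(image, axis=1)).mean(),       # horizontal gradient energy
    ])
    return v / (np.linalg.norm(v) + 1e-12)

def sensitivity_score(image):
    """Representation sensitivity: cosine distance between embeddings
    of the image and its frequency-perturbed copy. Higher scores would
    flag a candidate synthetic image under the paper's premise."""
    f0 = toy_features(image)
    f1 = toy_features(fourier_perturb(image))
    return 1.0 - float(f0 @ f1)
```

Because the perturbation needs only one forward FFT per image (plus two embedding passes), the per-image cost stays close to that of plain feature extraction, which is consistent with the claimed one-to-two-orders-of-magnitude speedup over heavier training-free detectors.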