🤖 AI Summary
Zero- and few-shot detection of AI-generated images lacks theoretical grounding and leaves considerable room for accuracy gains, while supervised detectors depend on generated training data that quickly becomes outdated. Method: This paper proposes an approach whose zero-shot criteria require no generated training images. It models the biases inherent in generated content by characterizing the curvature and gradient structure of the implicit probability manifold captured by a pre-trained diffusion model. Three components build on that model: (i) score-based curvature approximation, (ii) gradient bias quantification, and (iii) a mixture-of-experts (MoE) mechanism that extends the method to the few-shot setting. Results: Across 20 generative models, the method outperforms prior zero-shot and few-shot detectors. It generalizes across generators without generator-specific training data, and the manifold analysis gives the detection criteria a theoretical interpretation.
📝 Abstract
Distinguishing between real and AI-generated images, commonly referred to as 'image detection', presents a timely and significant challenge. Despite extensive research in the (semi-)supervised regime, zero-shot and few-shot solutions have only recently emerged as promising alternatives. Their main advantage is that they reduce the need for ongoing maintenance of training data, which quickly becomes outdated as generative technologies advance. We identify two main gaps: (1) a lack of theoretical grounding for the methods, and (2) significant room for performance improvement in the zero-shot and few-shot regimes. Our approach quantifies the biases inherent in generated content and uses these quantities as criteria for characterizing generated images. Specifically, we explore the biases of the implicit probability manifold captured by a pre-trained diffusion model. Through score-function analysis, we approximate the curvature, gradient, and bias towards points on the probability manifold, establishing criteria for detection in the zero-shot regime. We further extend our contribution to the few-shot setting by employing a mixture-of-experts methodology. Empirical results across 20 generative models demonstrate that our method outperforms current approaches in both zero-shot and few-shot settings. This work advances the theoretical understanding and practical usage of generated content biases through the lens of manifold analysis.
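To make the score-function idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes a toy Gaussian density whose score (the gradient of the log-density) is known in closed form, standing in for the score a pre-trained diffusion model would supply. The function names (`gaussian_score`, `gradient_criterion`, `curvature_criterion`) are hypothetical, and the Hutchinson trace estimator with central finite differences is one possible way to approximate curvature from a score function; the abstract does not say which estimator the authors use.

```python
import numpy as np


def gaussian_score(x, mean, cov_inv):
    """Score of N(mean, cov): grad_x log p(x) = -cov_inv (x - mean).

    Toy stand-in for a diffusion model's score (assumption, not the paper's model).
    x has shape (n, d); cov_inv is a symmetric (d, d) matrix.
    """
    return -(x - mean) @ cov_inv


def gradient_criterion(score_fn, x):
    """Norm of the score at each point (hypothetical 'gradient' criterion)."""
    return np.linalg.norm(score_fn(x), axis=-1)


def curvature_criterion(score_fn, x, n_probes=64, eps=1e-3, seed=0):
    """Hutchinson estimate of the divergence of the score.

    The divergence of the score equals the trace of the Hessian of log p,
    used here as a simple curvature proxy (assumption). Each probe v is a
    Rademacher vector; v^T J v is approximated by a central finite difference.
    """
    rng = np.random.default_rng(seed)
    total = np.zeros(x.shape[0])
    for _ in range(n_probes):
        v = rng.choice([-1.0, 1.0], size=x.shape)
        jvp = (score_fn(x + eps * v) - score_fn(x - eps * v)) / (2.0 * eps)
        total += np.sum(v * jvp, axis=-1)
    return total / n_probes


# Usage on a 4-dimensional Gaussian with diagonal precision matrix.
mean = np.zeros(4)
cov_inv = np.diag([1.0, 2.0, 3.0, 4.0])
score_fn = lambda pts: gaussian_score(pts, mean, cov_inv)

x = np.ones((3, 4))                      # three query points
grad = gradient_criterion(score_fn, x)   # one value per point
curv = curvature_criterion(score_fn, x)  # one value per point
print(grad, curv)
```

For this Gaussian the divergence of the score is the negative trace of the precision matrix, which gives a closed-form value to check the estimator against. With a real diffusion model, `score_fn` would instead wrap the network's noise prediction at a chosen noise level, and how the resulting quantities are thresholded or combined into a detection decision is left open here.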