🤖 AI Summary
Existing benchmarks for AI-generated image detection suffer from narrow model coverage, limited image diversity (notably a lack of artistic imagery), and inadequate support for end-to-end editing scenarios. To address these limitations, we introduce UniAIDet, a unified, general-purpose benchmark for both detection and localization of AI-generated images. The benchmark covers the major generative paradigms (text-to-image synthesis, image-to-image generation, image editing, inpainting, and deepfakes) and spans both photographic and artistic image domains. We propose a multi-scale evaluation framework that jointly measures detection accuracy and localization fidelity, and use it to assess state-of-the-art methods in cross-model, cross-category, and end-to-end editing settings. Results show significant performance degradation on artistic images and end-to-end editing tasks, exposing critical gaps in current approaches. This work establishes a standardized, reproducible benchmark, provides an open analysis framework, and identifies key challenges to guide future research.
📝 Abstract
With the rapid proliferation of image generative models, the authenticity of digital images has become a significant concern. While existing studies have proposed various methods for detecting AI-generated content, current benchmarks are limited in their coverage of diverse generative models and image categories, often overlooking end-to-end image editing and artistic images. To address these limitations, we introduce UniAIDet, a unified and comprehensive benchmark that includes both photographic and artistic images. UniAIDet covers a wide range of generative models, including text-to-image, image-to-image, image inpainting, image editing, and deepfake models. Using UniAIDet, we conduct a comprehensive evaluation of various detection methods and answer three key research questions regarding generalization capability and the relation between detection and localization. Our benchmark and analysis provide a robust foundation for future research.