🤖 AI Summary
Current AI-generated image detection methods generalize poorly to unseen generators, are biased by image content, and are not robust to JPEG compression and other degradations. Moreover, prevailing benchmarks exhibit low image quality, poor content consistency, and insufficient class diversity. To address these issues, we propose SFLD, a detector that applies PatchShuffle at multiple levels to fuse high-level semantic and low-level textural representations and thereby improve robustness. We further introduce TwinSynths, a high-fidelity synthetic benchmark generation framework whose twin-structured design ensures strict content- and pixel-level alignment between real–synthetic image pairs. Extensive experiments demonstrate that our method achieves state-of-the-art performance across images generated by GANs, diffusion models, and TwinSynths. Notably, it improves cross-generator generalization by 12.6% and JPEG compression robustness by 23.4% compared to prior approaches.
📝 Abstract
Identifying AI-generated content is critical for the safe and ethical use of generative AI. Recent research has focused on developing detectors that generalize to unknown generators, with popular methods relying either on high-level features or low-level fingerprints. However, these methods have clear limitations: they are either biased when faced with unseen content or vulnerable to common image degradations such as JPEG compression. To address these issues, we propose a novel approach, SFLD, which incorporates PatchShuffle to integrate high-level semantic and low-level textural information. SFLD applies PatchShuffle at multiple levels, improving robustness and generalization across various generative models. Additionally, current benchmarks suffer from low image quality, insufficient content preservation, and limited class diversity. In response, we introduce TwinSynths, a new benchmark generation methodology that constructs visually near-identical pairs of real and synthetic images to ensure high quality and content preservation. Our extensive experiments and analysis show that SFLD outperforms existing methods at detecting a wide variety of fake images sourced from GANs, diffusion models, and TwinSynths, demonstrating state-of-the-art performance and generalization to novel generative models.
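
The abstract does not spell out how PatchShuffle operates, so the following is only a minimal sketch under a stated assumption: that PatchShuffle splits an image into non-overlapping square patches and randomly permutes their positions, and that "multiple levels" means repeating this at several patch sizes. The function names `patch_shuffle` and `multi_level_views`, and the patch sizes used, are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np


def patch_shuffle(image: np.ndarray, patch_size: int,
                  rng: np.random.Generator) -> np.ndarray:
    """Randomly permute the non-overlapping square patches of an (H, W, C) image.

    Assumption: this is one plausible reading of "PatchShuffle", not the
    authors' reference implementation.
    """
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("height and width must be divisible by patch_size")
    gh, gw = h // patch_size, w // patch_size

    # (H, W, C) -> (gh, p, gw, p, C) -> (gh, gw, p, p, C) -> (gh*gw, p, p, C)
    patches = (image.reshape(gh, patch_size, gw, patch_size, c)
                    .transpose(0, 2, 1, 3, 4)
                    .reshape(gh * gw, patch_size, patch_size, c))

    # Shuffle patch positions; pixels inside each patch are left untouched.
    patches = patches[rng.permutation(gh * gw)]

    # Invert the reshaping to reassemble an image of the original size.
    return (patches.reshape(gh, gw, patch_size, patch_size, c)
                   .transpose(0, 2, 1, 3, 4)
                   .reshape(h, w, c))


def multi_level_views(image: np.ndarray, patch_sizes=(28, 56, 224), seed=0):
    """Return one shuffled view per patch size (hypothetical default sizes)."""
    rng = np.random.default_rng(seed)
    return {p: patch_shuffle(image, p, rng) for p in patch_sizes}
```

Under this reading, small patches destroy the global layout and leave mostly local texture, while a patch as large as the image leaves it unchanged, so a detector fed several such views would see both low-level and high-level cues. How SFLD actually combines the views is not described in the abstract and is not modeled here.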