Beyond Semantic Features: Pixel-level Mapping for Generalized AI-Generated Image Detection

📅 2025-12-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
AI-generated image detectors suffer from poor generalization, primarily because they over-rely on model-specific semantic cues rather than universal generative artifacts. Method: We propose an unsupervised pixel-wise nonlinear mapping, applied as a preprocessing step, that actively perturbs the pixel value distribution to eliminate semantic shortcuts, thereby forcing detectors to attend to high-frequency artifacts common across diverse generators. The approach explicitly integrates semantic shortcut suppression into the robustness design and combines frequency-domain analysis with fine-tuning of ResNet- and ViT-based detectors. Contribution/Results: To our knowledge, this is the first work to explicitly incorporate semantic shortcut mitigation into detection robustness design. Our method enables cross-generator evaluation between GANs and diffusion models, and experiments demonstrate a 12.7% average improvement in zero-shot detection accuracy on unseen generators, significantly enhancing the generalization of state-of-the-art detectors without increasing inference overhead.

📝 Abstract
The rapid evolution of generative technologies necessitates reliable methods for detecting AI-generated images. A critical limitation of current detectors is their failure to generalize to images from unseen generative models, as they often overfit to source-specific semantic cues rather than learning universal generative artifacts. To overcome this, we introduce a simple yet remarkably effective pixel-level mapping pre-processing step to disrupt the pixel value distribution of images and break the fragile, non-essential semantic patterns that detectors commonly exploit as shortcuts. This forces the detector to focus on more fundamental and generalizable high-frequency traces inherent to the image generation process. Through comprehensive experiments on GAN and diffusion-based generators, we show that our approach significantly boosts the cross-generator performance of state-of-the-art detectors. Extensive analysis further verifies our hypothesis that the disruption of semantic cues is the key to generalization.
Problem

Research questions and friction points this paper is trying to address.

Current detectors fail to generalize to images from unseen generative models.
Detectors overfit to source-specific semantic cues that act as shortcuts.
Universal high-frequency generative artifacts are underexploited for detection.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pixel-level mapping disrupts image distribution
Forces focus on high-frequency generative traces
Enhances cross-generator detection generalization
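The page does not reproduce the paper's exact mapping, but the core idea — an unsupervised, pixel-wise nonlinear mapping that perturbs the pixel value distribution before the image reaches the detector — can be sketched as below. The gamma-style curve, the parameter range, and the function name `pixel_level_mapping` are illustrative assumptions, not the authors' published implementation.

```python
import numpy as np

def pixel_level_mapping(image, gamma_range=(0.5, 2.0), rng=None):
    """Apply a random monotonic nonlinear mapping to pixel values.

    `image` is a float array scaled to [0, 1]. A randomly drawn
    gamma curve perturbs the pixel value distribution (disrupting
    semantic shortcut cues) while remaining monotonic, so relative
    pixel structure is preserved. NOTE: the gamma curve is a
    stand-in; the paper's exact mapping is not given here.
    """
    rng = np.random.default_rng() if rng is None else rng
    gamma = rng.uniform(*gamma_range)            # random exponent per image
    mapped = np.clip(image, 0.0, 1.0) ** gamma   # monotonic nonlinear remap
    return mapped.astype(np.float32)

# Usage: apply as preprocessing before a detector backbone.
img = np.random.default_rng(0).random((224, 224, 3))
out = pixel_level_mapping(img, rng=np.random.default_rng(1))
```

Because the mapping is monotonic, it reshapes the intensity histogram without destroying the high-frequency residuals that generative pipelines leave behind, which is the property the abstract attributes to the preprocessing step.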
Chenming Zhou
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Jiaan Wang
WeChat AI, Tencent
Natural Language Processing · Machine Translation · Information Systems
Yu Li
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Lei Li
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Juan Cao
Professor of Mathematics, Xiamen University
Computer Aided Geometric Design · Computer Graphics
Sheng Tang
Institute of Computing Technology, Chinese Academy of Sciences
computer vision · pattern recognition · machine learning · image/video processing