OmniAID: Decoupling Semantic and Artifacts for Universal AI-Generated Image Detection in the Wild

📅 2025-11-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current AI-generated image (AIGI) detectors suffer from poor generalization and conflate content-related artifacts (e.g., semantic inconsistencies) with content-agnostic distortions (e.g., high-frequency noise). To address this, we propose a decoupled universal detection framework. Methodologically, we design a Mixture-of-Experts (MoE) architecture: a fixed universal artifact expert captures cross-model, content-agnostic distortions, while routable domain-specific experts model semantic-aware generation errors; training employs hard sampling and a lightweight gating network in two stages. Our key contribution is the first explicit disentanglement of these two defect categories, significantly improving robustness across diverse generative models and semantic content. Evaluated on standard benchmarks and the newly constructed large-scale photorealistic dataset Mirage, our method achieves state-of-the-art performance, reducing false positive rates by 12.7% in complex scenarios.

📝 Abstract
A truly universal AI-Generated Image (AIGI) detector must simultaneously generalize across diverse generative models and varied semantic content. Current state-of-the-art methods learn a single, entangled forgery representation, conflating content-dependent flaws with content-agnostic artifacts, and are further constrained by outdated benchmarks. To overcome these limitations, we propose OmniAID, a novel framework centered on a decoupled Mixture-of-Experts (MoE) architecture. The core of our method is a hybrid expert system engineered to decouple: (1) semantic flaws across distinct content domains, and (2) these content-dependent flaws from content-agnostic universal artifacts. This system employs a set of Routable Specialized Semantic Experts, each for a distinct domain (e.g., human, animal), complemented by a Fixed Universal Artifact Expert. This architecture is trained using a bespoke two-stage strategy: we first train the experts independently with domain-specific hard-sampling to ensure specialization, and subsequently train a lightweight gating network for effective input routing. By explicitly decoupling "what is generated" (content-specific flaws) from "how it is generated" (universal artifacts), OmniAID achieves robust generalization. To address outdated benchmarks and validate real-world applicability, we introduce Mirage, a new large-scale, contemporary dataset. Extensive experiments, using both traditional benchmarks and our Mirage dataset, demonstrate our model surpasses existing monolithic detectors, establishing a new, robust standard for AIGI authentication against modern, in-the-wild threats.
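The routing idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the expert functions, the gate, and the equal-weight fusion below are hypothetical stand-ins chosen only to show how a fixed universal artifact expert combines with softly-routed domain-specific semantic experts.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# Hypothetical universal artifact expert: content-agnostic, always active.
# Here it simply reads a stand-in "high-frequency energy" feature, feat[0].
def universal_artifact_expert(feat):
    return feat[0]

# Hypothetical routable semantic experts, one per content domain.
semantic_experts = {
    "human":  lambda feat: feat[1],
    "animal": lambda feat: feat[2],
}

def gate(feat):
    """Lightweight gating network (trained in stage 2 in the paper);
    here a trivial stand-in that scores each domain from one feature."""
    return softmax([feat[1], feat[2]])

def omniaid_score(feat):
    """Combine the fixed universal expert with gate-weighted semantic experts."""
    weights = gate(feat)
    semantic = sum(w * expert(feat)
                   for w, expert in zip(weights, semantic_experts.values()))
    # Equal-weight fusion is an arbitrary choice for this sketch.
    return 0.5 * universal_artifact_expert(feat) + 0.5 * semantic
```

With only the artifact feature set (`feat = [1.0, 0.0, 0.0]`), the semantic branch contributes nothing and the score comes entirely from the universal expert; with only a domain feature set, the gate routes mass to the matching semantic expert.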
Problem

Research questions and friction points this paper is trying to address.

Existing detectors entangle content-dependent semantic flaws with content-agnostic artifacts
Poor generalization across diverse generative models and varied semantic content
Evaluation constrained by outdated benchmarks that miss modern in-the-wild threats
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decouples semantic flaws from universal artifacts
Employs routable specialized and fixed universal experts
Uses two-stage training with domain-specific hard-sampling
Yuncheng Guo
Shanghai Artificial Intelligence Laboratory
Junyan Ye
SYSU
Computer Vision and Deep Learning
Chenjue Zhang
Tsinghua University
Hengrui Kang
Shanghai Jiao Tong University
Haohuan Fu
Tsinghua University
Conghui He
Shanghai AI Laboratory
Data-centric AI, LLM, Document Intelligence
Weijia Li
Sun Yat-Sen University