HiMix: Hierarchical Artifact-aware Mixup for Generalized Synthetic Image Detection

📅 2026-04-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limited generalizability of existing synthetic image detection methods, which stems from narrow and biased training data distributions. The authors propose HiMix, a unified framework with two modules. Mixup-driven Distributional Augmentation (MDA) expands the training distribution by applying pixel-level Mixup to real and fake images, producing smoothly interpolated samples between the two classes. Hierarchical Artifact-aware Representation (HAR) fuses multi-level features through cross-layer integration and coarse-to-fine aggregation, consolidating artifact information from global to local scales and heightening sensitivity to low-level forgery traces. Experiments across multiple benchmarks show that HiMix achieves state-of-the-art performance, producing well-separated logits and improving detection of images generated by unseen models.
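To make the coarse-to-fine idea concrete, here is a minimal sketch of one way multi-level feature maps could be fused. It is not the HAR module itself: the summary does not specify the fusion operator, so nearest-neighbour upsampling and element-wise addition are assumptions borrowed from common feature-pyramid designs, and the names `upsample2x` and `coarse_to_fine_fuse` are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat each row vertically
    return out


def coarse_to_fine_fuse(pyramid):
    """Fuse feature maps ordered from coarse to fine.

    Assumption: each map has exactly twice the height and width of the
    previous one, and fusion is a plain element-wise sum.
    """
    fused = pyramid[0]
    for finer in pyramid[1:]:
        up = upsample2x(fused)
        fused = [
            [u + f for u, f in zip(up_row, fine_row)]
            for up_row, fine_row in zip(up, finer)
        ]
    return fused


coarse = [[1.0]]                      # 1x1 global feature
fine = [[0.0, 1.0], [2.0, 3.0]]       # 2x2 local feature
fused = coarse_to_fine_fuse([coarse, fine])
```

In this toy case the single coarse value is broadcast over the finer grid and added to it, so global context reaches every local position. A real detector would operate on multi-channel tensors and would likely use learned projections rather than a raw sum.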

📝 Abstract

The rapid evolution of generative models has enabled the creation of highly realistic and diverse synthetic images, posing significant challenges to reliable and generalizable Synthetic Image Detection (SID). However, existing detectors are typically trained on limited and biased datasets, resulting in poor generalization to unseen generators. To address this issue, we propose HiMix, a unified framework that enhances generalization by expanding the training distribution and promoting artifact-aware representations. Specifically, the Mixup-driven Distributional Augmentation (MDA) module constructs continuous transitional samples between real and fake images, improving coverage of low-confidence regions and exposing the model to more challenging samples, while the pixel-wise mixup operation smoothly perturbs semantics to enhance sensitivity to low-level artifacts. Moreover, the Hierarchical Artifact-aware Representation (HAR) module aggregates artifact information from both global and local levels through cross-layer integration and coarse-to-fine feature fusion, enabling the extraction of discriminative forgery representations under diverse distributions. Extensive experiments across multiple benchmarks demonstrate that HiMix achieves state-of-the-art performance, establishing well-separated logits for improved generalization to unseen forgeries.
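The transitional samples described for MDA can be illustrated with a minimal pixel-wise mixup sketch. It follows the standard mixup recipe (a mixing coefficient drawn from a Beta(alpha, alpha) distribution and a matching soft label); the abstract does not state how HiMix samples the coefficient or assigns labels, so both choices, along with the function name `mixup_pair` and the 0 = real / 1 = fake label convention, are assumptions.

```python
import random


def mixup_pair(real_img, fake_img, alpha=1.0, rng=None):
    """Blend one real and one fake image pixel by pixel.

    Images are nested lists of floats with identical shape (H x W).
    Returns the mixed image, a soft label (weight of the fake class),
    and the mixing coefficient lam.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # mixing coefficient in [0, 1]
    mixed = [
        [lam * r + (1.0 - lam) * f for r, f in zip(real_row, fake_row)]
        for real_row, fake_row in zip(real_img, fake_img)
    ]
    soft_label = 1.0 - lam  # assumed convention: 0 = real, 1 = fake
    return mixed, soft_label, lam


real = [[0.0, 0.0], [0.0, 0.0]]
fake = [[1.0, 1.0], [1.0, 1.0]]
mixed, label, lam = mixup_pair(real, fake, alpha=1.0, rng=random.Random(0))
```

As `lam` moves from 1 to 0, the blended image slides continuously from the real image to the fake one, which is the sense in which such samples fill the region between the two classes.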

Problem

Research questions and friction points this paper is trying to address.

Synthetic Image Detection

Generalization

Generative Models

Artifact-aware

Unseen Generators

Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixup-driven Distributional Augmentation

Hierarchical Artifact-aware Representation

Synthetic Image Detection