Zooming In on Fakes: A Novel Dataset for Localized AI-Generated Image Detection with Forgery Amplification Approach

📅 2025-04-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing datasets for localized AI-generated image detection focus predominantly on object-level manipulations and neglect scene-level edits, such as sky or ground modifications, which limits how well detectors trained on them generalize. To address this, the authors introduce BR-Gen, a large-scale (150K samples), scene-region-oriented dataset for localized AI forgery detection. They further propose NFA-ViT, a Noise-guided Forgery Amplification Vision Transformer. The key contributions are: (1) scene-aware annotations refined by semantic calibration; (2) a fully automated Perception-Creation-Evaluation pipeline for building the dataset; and (3) a two-part detection mechanism that localizes suspect regions by their noise fingerprints and then uses attention-based feature interaction to propagate forensic cues across the whole image. Experiments show that NFA-ViT outperforms existing methods on BR-Gen and generalizes well across current benchmarks.

📝 Abstract
The rise of AI-generated image editing tools has made localized forgeries increasingly realistic, posing challenges for visual content integrity. Although recent efforts have explored localized AIGC detection, existing datasets predominantly focus on object-level forgeries while overlooking broader scene edits in regions such as sky or ground. To address these limitations, we introduce BR-Gen, a large-scale dataset of 150,000 locally forged images with diverse scene-aware annotations, which are refined through semantic calibration to ensure high-quality samples. BR-Gen is constructed through a fully automated Perception-Creation-Evaluation pipeline to ensure semantic coherence and visual realism. We further propose NFA-ViT, a Noise-guided Forgery Amplification Vision Transformer that enhances the detection of localized forgeries by amplifying forgery-related features across the entire image. NFA-ViT uses noise fingerprints to mine heterogeneous regions in images, i.e., potentially edited areas. An attention mechanism then compels interaction between normal and abnormal features, propagating forgery traces throughout the entire image so that subtle forgeries influence a broader context, which improves overall detection robustness. Extensive experiments demonstrate that BR-Gen introduces entirely new scenarios that existing methods do not cover. Furthermore, NFA-ViT outperforms existing methods on BR-Gen and generalizes well across current benchmarks. All data and code are available at https://github.com/clpbc/BR-Gen.
Problem

Research questions and friction points this paper is trying to address.

Detecting localized AI-generated image forgeries in diverse scenes
Addressing lack of datasets for scene-level edits like sky or ground
Improving detection robustness with noise-guided forgery amplification
Innovation

Methods, ideas, or system contributions that make the work stand out.

BR-Gen dataset with 150,000 locally forged images
Automated Perception-Creation-Evaluation pipeline for realism
NFA-ViT amplifies forgery-related features across the image, guided by noise fingerprints
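
The noise-guided amplification idea described in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' NFA-ViT implementation: the high-pass residual, the median/MAD outlier rule, the additive attention bias, and every function name below (`noise_residual`, `patch_noise_energy`, `heterogeneous_mask`, `noise_biased_attention`) are assumptions chosen only to show the general pattern of (1) flagging patches whose noise statistics differ from the rest of the image and (2) steering attention toward those patches so their features reach every other patch.

```python
import math
from statistics import median


def noise_residual(img):
    """High-pass residual: each pixel minus the mean of its 4 neighbours (edges clamped)."""
    h, w = len(img), len(img[0])
    res = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            nbrs = [img[max(y - 1, 0)][x], img[min(y + 1, h - 1)][x],
                    img[y][max(x - 1, 0)], img[y][min(x + 1, w - 1)]]
            res[y][x] = img[y][x] - sum(nbrs) / 4.0
    return res


def patch_noise_energy(res, patch):
    """Mean absolute residual per non-overlapping patch, in row-major patch order."""
    h, w = len(res), len(res[0])
    energies = []
    for py in range(0, h, patch):
        for px in range(0, w, patch):
            vals = [abs(res[y][x])
                    for y in range(py, py + patch)
                    for x in range(px, px + patch)]
            energies.append(sum(vals) / len(vals))
    return energies


def heterogeneous_mask(energies, k=3.0):
    """Flag patches whose energy deviates from the median by more than k * MAD (assumed rule)."""
    med = median(energies)
    mad = median([abs(e - med) for e in energies])
    return [abs(e - med) > k * mad + 1e-12 for e in energies]


def noise_biased_attention(feats, mask, bias=2.0):
    """Single-head dot-product attention with an additive bias on flagged key patches."""
    d = len(feats[0])
    outputs, weights = [], []
    for q in feats:
        scores = [sum(a * b for a, b in zip(q, kv)) / math.sqrt(d) + (bias if m else 0.0)
                  for kv, m in zip(feats, mask)]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        w = [e / total for e in exps]
        weights.append(w)
        outputs.append([sum(wj * kv[c] for wj, kv in zip(w, feats)) for c in range(d)])
    return outputs, weights


def demo_image(size=8, patch=4):
    """Flat grey image whose top-left patch carries synthetic checkerboard noise."""
    img = [[0.5] * size for _ in range(size)]
    for y in range(patch):
        for x in range(patch):
            img[y][x] += 0.2 if (x + y) % 2 == 0 else -0.2
    return img


img = demo_image()
energies = patch_noise_energy(noise_residual(img), patch=4)
mask = heterogeneous_mask(energies)
feats = [[e, 1.0] for e in energies]  # placeholder per-patch features, not learned ones
outputs, weights = noise_biased_attention(feats, mask)
print(mask)
```

In this toy setup the bias raises the attention weight that every patch, including clean ones, places on the flagged patch, which is one simple way to let a small edited region influence the representation of the whole image. A real model would use learned noise extractors and learned projections rather than these hand-written rules.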
Lvpan Cai
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, 361005, P.R. China
Haowei Wang
Youtu Lab, Tencent, Shanghai, P.R. China
Jiayi Ji
Rutgers University
YanShu ZhouMen
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, 361005, P.R. China
Yiwei Ma
Stevens Institute of Technology
Xiaoshuai Sun
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, 361005, P.R. China
Liujuan Cao
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, 361005, P.R. China
Rongrong Ji
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, 361005, P.R. China