NeXT-IMDL: Build Benchmark for NeXT-Generation Image Manipulation Detection & Localization

📅 2025-12-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing image manipulation detection and localization (IMDL) methods exhibit overestimated generalization under cross-dataset evaluation and fail to robustly detect diverse AIGC-generated forgeries encountered in real-world scenarios. Method: We introduce the first diagnostic IMDL benchmark, featuring a novel four-dimensional taxonomy—editing model, manipulation type, semantic content, and forgery granularity—and five cross-dimensional evaluation protocols. Built upon a large-scale, multi-source, structurally annotated AIGC manipulation dataset, it establishes a rigorously controlled cross-evaluation framework. Contribution/Results: Evaluating 11 state-of-the-art models reveals severe robustness degradation: average detection F1 drops by 32.7% and localization IoU by 41.5%. This benchmark dispels performance illusions, providing a reproducible, attributable, and mechanistically insightful evaluation paradigm for advancing IMDL generalization research.
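The summary reports drops in average detection F1 and localization IoU. As a minimal sketch (not the benchmark's actual evaluation code), these two metrics are conventionally computed as image-level F1 over manipulated/authentic labels and pixel-level IoU over binary forgery masks; all function names here are illustrative:

```python
import numpy as np

def detection_f1(y_true, y_pred):
    """Image-level F1: y_true/y_pred are 0/1 arrays (1 = manipulated)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))  # correctly flagged forgeries
    fp = np.sum((y_true == 0) & (y_pred == 1))  # authentic images flagged
    fn = np.sum((y_true == 1) & (y_pred == 0))  # forgeries missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def localization_iou(mask_true, mask_pred):
    """Pixel-level IoU between binary forgery masks."""
    mask_true = np.asarray(mask_true, dtype=bool)
    mask_pred = np.asarray(mask_pred, dtype=bool)
    inter = np.logical_and(mask_true, mask_pred).sum()
    union = np.logical_or(mask_true, mask_pred).sum()
    # Empty union means both masks are empty: treat as a perfect match.
    return inter / union if union else 1.0
```

A "32.7% drop in F1" under cross-dimension evaluation then simply means this score, averaged over the test set, falls by that much relative to the in-distribution setting.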

📝 Abstract
The surge in accessibility of user-friendly image editing models, and the accompanying risks of abuse, have created an urgent need for generalizable, up-to-date methods for Image Manipulation Detection and Localization (IMDL). Current IMDL research typically relies on cross-dataset evaluation, where models trained on one benchmark are tested on others. However, this simplified evaluation approach conceals the fragility of existing methods when handling diverse AI-generated content, creating a misleading impression of progress. This paper challenges that illusion by proposing NeXT-IMDL, a large-scale diagnostic benchmark designed not merely to collect data, but to systematically probe the generalization boundaries of current detectors. Specifically, NeXT-IMDL categorizes AIGC-based manipulations along four fundamental axes: editing models, manipulation types, content semantics, and forgery granularity. Building on this taxonomy, NeXT-IMDL implements five rigorous cross-dimension evaluation protocols. Our extensive experiments on 11 representative models reveal a critical insight: while these models perform well in their original settings, they exhibit systemic failures and significant performance degradation under protocols that simulate the varied generalization scenarios encountered in the real world. By providing this diagnostic toolkit and the accompanying findings, we aim to advance progress toward truly robust, next-generation IMDL models.
Problem

Research questions and friction points this paper is trying to address.

Proposes a diagnostic benchmark for image manipulation detection
Challenges existing methods' generalization on diverse AI-generated content
Aims to advance robust next-generation detection models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale diagnostic benchmark for image manipulation detection
Categorizes manipulations along four fundamental axes
Implements five rigorous cross-dimension evaluation protocols
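The cross-dimension protocols pair the four-axis taxonomy with held-out evaluation: train on samples covering all but one value along a chosen axis (say, one editing model), then test only on the held-out value. A minimal sketch of such a split, with illustrative field names rather than the benchmark's actual schema:

```python
def cross_dimension_split(samples, axis, held_out):
    """Partition annotated samples (dicts) by one taxonomy axis.

    Samples whose value on `axis` equals `held_out` go to the test set;
    everything else goes to the training set.
    """
    train = [s for s in samples if s[axis] != held_out]
    test = [s for s in samples if s[axis] == held_out]
    return train, test

# Toy records tagged along two of the four axes (illustrative values).
samples = [
    {"id": 0, "editing_model": "model_A", "manipulation": "inpainting"},
    {"id": 1, "editing_model": "model_B", "manipulation": "splicing"},
    {"id": 2, "editing_model": "model_A", "manipulation": "splicing"},
]

# Hold out editing model B: the detector never sees its artifacts in training.
train, test = cross_dimension_split(samples, "editing_model", "model_B")
```

Repeating this split for each axis (editing model, manipulation type, semantic content, forgery granularity) yields the controlled cross-evaluations that expose the generalization gaps the paper reports.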
Yifei Li
Department of Automation, Tsinghua University
Haoyuan He
Department of Automation, Tsinghua University
Yu Zheng
Department of Automation, Tsinghua University
Bingyao Yu
Department of Automation, Tsinghua University
Wenzhao Zheng
EECS, University of California, Berkeley
Large Models · Embodied Agents · Autonomous Driving
Lei Chen
Department of Automation, Tsinghua University
Jie Zhou
Department of Automation, Tsinghua University
Jiwen Lu
Department of Automation, Tsinghua University