🤖 AI Summary
The field of fake image detection and localization (FIDL) suffers from severe fragmentation: deepfake detection, image manipulation detection, AI-generated content (AIGC) detection, and document forgery localization each rely on disjoint datasets, models, and evaluation protocols, which hinders interoperability, cross-domain comparison, and reproducibility. To address this, the authors introduce ForensicHub, the first unified benchmark and open-source codebase covering all four domains. The approach features: (1) a modular, configuration-driven forensic pipeline architecture; (2) an adapter mechanism unifying heterogeneous data formats and evaluation interfaces; (3) two newly curated benchmarks (AIGC and Document) integrating 12 datasets across the four domains; and (4) support for six backbone architectures, ten baseline models, and standardized cross-domain evaluation protocols. Comprehensive benchmarking yields eight key insights intended to improve reproducibility, comparability, and collaboration in FIDL research.
📝 Abstract
The field of Fake Image Detection and Localization (FIDL) is highly fragmented, encompassing four domains: deepfake detection (Deepfake), image manipulation detection and localization (IMDL), artificial intelligence-generated image detection (AIGC), and document image manipulation localization (Doc). Although individual benchmarks exist in some domains, a unified benchmark covering all FIDL domains is still missing. This absence results in significant domain silos, where each domain independently constructs its datasets, models, and evaluation protocols without interoperability, preventing cross-domain comparisons and hindering the development of the entire FIDL field. To break down these domain silos, we propose ForensicHub, the first unified benchmark and codebase for all-domain fake image detection and localization. Considering the drastic variations in dataset, model, and evaluation configurations across domains, as well as the scarcity of open-sourced baseline models and the lack of individual benchmarks in some domains, ForensicHub: i) proposes a modular and configuration-driven architecture that decomposes forensic pipelines into interchangeable components (datasets, transforms, models, and evaluators), allowing flexible composition across all domains; ii) fully implements 10 baseline models, 6 backbones, and 2 new benchmarks for AIGC and Doc, and integrates 2 existing benchmarks, DeepfakeBench and IMDLBenCo, through an adapter-based design; iii) conducts in-depth analysis based on ForensicHub, offering 8 key actionable insights into FIDL model architecture, dataset characteristics, and evaluation standards. ForensicHub represents a significant step toward breaking the domain silos in the FIDL field and inspiring future breakthroughs.
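To make the idea of a configuration-driven pipeline with adapters more concrete, the following is a minimal sketch in plain Python. It is not ForensicHub's actual API: the registry, the `register` decorator, the `run` function, and every component name (`toy_doc`, `mean_threshold`, `external_adapter`, and so on) are hypothetical and chosen only for illustration. The sketch assumes that each of the four component kinds named in the abstract (datasets, transforms, models, evaluators) is looked up by name from a config, and that a model with a foreign interface is wrapped by an adapter so it exposes the same `predict` method as native models.

```python
from typing import Callable, Dict

# Hypothetical registry: component kind -> name -> factory or callable.
REGISTRY: Dict[str, Dict[str, Callable]] = {
    "dataset": {}, "transform": {}, "model": {}, "evaluator": {},
}

def register(kind: str, name: str):
    """Decorator that stores a component under REGISTRY[kind][name]."""
    def deco(obj):
        REGISTRY[kind][name] = obj
        return obj
    return deco

@register("dataset", "toy_doc")
def toy_doc_dataset():
    # Toy samples: (list of pixel scores, label), where label 1 means forged.
    return [([0.9, 0.8], 1), ([0.1, 0.2], 0), ([0.7, 0.6], 1), ([0.3, 0.1], 0)]

@register("transform", "identity")
def identity(x):
    return x

@register("model", "mean_threshold")
class MeanThreshold:
    """Native toy model: flags a sample if its mean score exceeds a threshold."""
    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold

    def predict(self, x):
        return int(sum(x) / len(x) > self.threshold)

class ExternalModel:
    """Stand-in for a model from another benchmark with a different interface."""
    def forward(self, batch):
        return {"prob": max(batch["image"])}

@register("model", "external_adapter")
class ExternalAdapter:
    """Adapter: wraps ExternalModel so it exposes the common predict() method."""
    def __init__(self, threshold: float = 0.5):
        self.inner = ExternalModel()
        self.threshold = threshold

    def predict(self, x):
        out = self.inner.forward({"image": x})
        return int(out["prob"] > self.threshold)

@register("evaluator", "accuracy")
def accuracy(preds, labels):
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)

def run(config: dict) -> float:
    """Build a pipeline from a config dict and return the evaluator's score."""
    data = REGISTRY["dataset"][config["dataset"]]()
    transform = REGISTRY["transform"][config["transform"]]
    model_cfg = config["model"]
    model = REGISTRY["model"][model_cfg["name"]](**model_cfg.get("args", {}))
    evaluator = REGISTRY["evaluator"][config["evaluator"]]
    preds = [model.predict(transform(x)) for x, _ in data]
    labels = [y for _, y in data]
    return evaluator(preds, labels)
```

In this sketch, switching from the native model to the wrapped external one only requires changing `config["model"]["name"]` from `"mean_threshold"` to `"external_adapter"`; the dataset, transform, and evaluator entries stay the same, which is the kind of cross-domain interchangeability the abstract describes.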