ForensicHub: A Unified Benchmark&Codebase for All-Domain Fake Image Detection and Localization

📅 2025-05-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
The field of fake image detection and localization (FIDL) suffers from severe fragmentation: deepfakes, image manipulation, AI-generated content (AIGC), and document forgery each employ disjoint datasets, models, and evaluation protocols, which hinders interoperability, cross-domain comparison, and reproducibility. To address this, we introduce ForensicHub, the first unified benchmark and open-source codebase covering all four tasks. Our approach features: (1) a modular, configuration-driven forensic pipeline architecture; (2) an adapter mechanism unifying heterogeneous data formats and evaluation interfaces; (3) two newly curated benchmarks (AIGC and Document) integrating 12 datasets across the four domains; and (4) support for six backbone architectures, ten baseline models, and standardized cross-domain evaluation protocols. Comprehensive benchmarking yields eight key insights, substantially enhancing reproducibility, comparability, and collaborative advancement in FIDL research.

📝 Abstract
The field of Fake Image Detection and Localization (FIDL) is highly fragmented, encompassing four domains: deepfake detection (Deepfake), image manipulation detection and localization (IMDL), artificial intelligence-generated image detection (AIGC), and document image manipulation localization (Doc). Although individual benchmarks exist in some domains, no unified benchmark yet covers all FIDL domains. This absence results in significant domain silos, where each domain independently constructs its datasets, models, and evaluation protocols without interoperability, preventing cross-domain comparisons and hindering the development of the entire FIDL field. To break down these domain silos, we propose ForensicHub, the first unified benchmark and codebase for all-domain fake image detection and localization. Given the drastic variations in dataset, model, and evaluation configurations across domains, as well as the scarcity of open-source baseline models and the lack of individual benchmarks in some domains, ForensicHub: i) proposes a modular and configuration-driven architecture that decomposes forensic pipelines into interchangeable components across datasets, transforms, models, and evaluators, allowing flexible composition across all domains; ii) fully implements 10 baseline models, 6 backbones, and 2 new benchmarks for AIGC and Doc, and integrates 2 existing benchmarks, DeepfakeBench and IMDLBenCo, through an adapter-based design; iii) conducts in-depth analysis based on ForensicHub, offering 8 key actionable insights into FIDL model architecture, dataset characteristics, and evaluation standards. ForensicHub represents a significant leap forward in breaking the domain silos in the FIDL field and inspiring future breakthroughs.

Problem

Research questions and friction points this paper is trying to address.

Lack of a unified benchmark for all-domain fake image detection and localization
Domain silos hinder cross-domain comparisons in FIDL
Absence of open-sourced baselines and benchmarks in some domains

Innovation

Methods, ideas, or system contributions that make the work stand out.

Modular architecture for flexible forensic pipelines
Implements 10 baseline models and 6 backbones
Adapter-based design integrates existing benchmarks
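
To make the modular, configuration-driven idea more concrete, here is a minimal Python sketch of how a pipeline could be assembled from interchangeable datasets, transforms, models, and evaluators, with an adapter wrapping a component that has a different interface. This is an illustration only, not ForensicHub's actual API: `REGISTRY`, `register`, `run_pipeline`, `LegacyScorer`, `LegacyScorerAdapter`, and the toy data are all hypothetical names invented for this example.

```python
from typing import Callable, Dict

# Hypothetical registry: component kind -> name -> factory function.
REGISTRY: Dict[str, Dict[str, Callable]] = {
    "dataset": {}, "transform": {}, "model": {}, "evaluator": {},
}


def register(kind: str, name: str):
    """Decorator that stores a factory under REGISTRY[kind][name]."""
    def decorator(factory):
        REGISTRY[kind][name] = factory
        return factory
    return decorator


@register("dataset", "toy")
def toy_dataset():
    # Toy (feature, label) pairs; label 1 = fake, 0 = real.
    return [(0.9, 1), (0.2, 0), (0.7, 1), (0.4, 0)]


@register("transform", "identity")
def identity_transform():
    return lambda x: x


@register("model", "threshold")
def threshold_model(threshold: float = 0.5):
    # A model here is any callable mapping a feature to a 0/1 prediction.
    return lambda x: 1 if x >= threshold else 0


@register("evaluator", "accuracy")
def accuracy_evaluator():
    def evaluate(preds, labels):
        correct = sum(p == y for p, y in zip(preds, labels))
        return correct / len(labels)
    return evaluate


class LegacyScorer:
    """Stand-in for an external model with a different interface (.score)."""
    def score(self, x: float) -> float:
        return x


class LegacyScorerAdapter:
    """Adapter: exposes a LegacyScorer through the callable model interface."""
    def __init__(self, legacy: LegacyScorer, threshold: float = 0.5):
        self.legacy = legacy
        self.threshold = threshold

    def __call__(self, x: float) -> int:
        return 1 if self.legacy.score(x) >= self.threshold else 0


@register("model", "legacy")
def legacy_model(threshold: float = 0.5):
    return LegacyScorerAdapter(LegacyScorer(), threshold)


def run_pipeline(config: dict) -> float:
    """Build each component by name from the config, then evaluate."""
    data = REGISTRY["dataset"][config["dataset"]]()
    transform = REGISTRY["transform"][config["transform"]]()
    model = REGISTRY["model"][config["model"]](**config.get("model_args", {}))
    evaluate = REGISTRY["evaluator"][config["evaluator"]]()
    preds = [model(transform(x)) for x, _ in data]
    labels = [y for _, y in data]
    return evaluate(preds, labels)


config = {"dataset": "toy", "transform": "identity",
          "model": "threshold", "evaluator": "accuracy"}
print(run_pipeline(config))
print(run_pipeline({**config, "model": "legacy",
                    "model_args": {"threshold": 0.8}}))
```

In this sketch, swapping the model only requires changing the `model` entry in the config; the adapter lets the differently shaped `LegacyScorer` take part without modifying the pipeline code.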