AI Summary
To address the growing threat of maliciously manipulated images on social media platforms for opinion manipulation, this paper introduces DF2023, the first large-scale, open-source benchmark dataset featuring fine-grained annotations across four major image forgery types: splicing, copy-move, enhancement, and object removal. Comprising over one million samples derived from real-world social media propagation scenarios, DF2023 is constructed via multi-source acquisition and rigorous human annotation, enabling comprehensive evaluation of both forgery localization and classification. It establishes the first systematic unification of these four forgery categories, significantly lowering data barriers for algorithm development and facilitating fair, cross-model benchmarking. Third-party reproductions based on DF2023 demonstrate consistent improvements of 12–18% in cross-category generalization performance across multiple detection models. Thus, DF2023 provides a reproducible, highly compatible, and strongly generalizable benchmark for digital image forensics research.
Abstract
The deliberate manipulation of public opinion, especially through altered images frequently disseminated via online social networks, poses a significant danger to society. To address this problem on a technical level, we support the research community by releasing the Digital Forensics 2023 (DF2023) training and validation dataset, comprising one million images from four major forgery categories: splicing, copy-move, enhancement, and removal. This dataset enables an objective comparison of network architectures and can significantly reduce the time and effort researchers spend preparing datasets.