🤖 AI Summary
Generative AI has intensified the threat of digital identity document (ID) forgery, yet existing benchmarks lack realism and legal safety for KYC scenarios. Method: We introduce FantasyID, a public benchmark dataset for ID forgery detection tailored to KYC. It combines real human faces with multilingual, multi-style ID templates; simulates realistic acquisition distortions via physical printing followed by re-capture with multiple devices; and applies digital attacks—including content injection and localized tampering—using mainstream generative models. All samples exclude synthetic faces and specimen watermarks, making the dataset usable commercially and without tampering with legal documents. Contribution/Results: Extensive evaluation reveals severe limitations of state-of-the-art detectors: TruFor, MMFusion, UniFD, and FatFormer reach a 48.7% false-negative rate under a 10% false-positive constraint. FantasyID thus establishes a more realistic, challenging, and practically meaningful benchmark for advancing ID forgery detection research.
📝 Abstract
Advancements in image generation have led to easy-to-use tools that malicious actors can exploit to create forged images. These tools pose a serious threat to widespread Know Your Customer (KYC) applications, which require robust systems for detecting forged identity documents (IDs). To facilitate the development of detection algorithms, we propose FantasyID, a novel publicly available dataset (including for commercial use) that mimics real-world IDs without tampering with legal documents and, unlike previous public datasets, contains neither generated faces nor specimen watermarks. FantasyID comprises ID cards with diverse design styles, languages, and faces of real people. To simulate a realistic KYC scenario, the cards were printed and captured with three different devices, constituting the bonafide class. We then emulated digital forgery/injection attacks that a malicious actor could perform to tamper with the IDs using existing generative tools. Current state-of-the-art forgery detection algorithms, such as TruFor, MMFusion, UniFD, and FatFormer, are challenged by the FantasyID dataset. This is especially evident under near-operational evaluation conditions: with the decision threshold set on the validation set to yield a 10% false positive rate, all detectors exhibit false negative rates close to 50% on the test set. These experiments demonstrate that FantasyID is complex enough to serve as an evaluation benchmark for detection algorithms.
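The near-operational protocol described above — fixing the decision threshold on validation bonafide scores to yield a 10% false positive rate, then measuring the false negative rate on test-set attacks — can be sketched as follows. This is a minimal illustration, not the paper's code; the score convention (higher score = more likely forged) and all names are assumptions:

```python
import numpy as np

def threshold_at_fpr(bonafide_scores, target_fpr=0.10):
    """Choose the threshold on validation bonafide scores so that roughly
    `target_fpr` of bonafide samples score above it (would be flagged).
    Assumed convention: higher score = more likely forged."""
    return float(np.quantile(np.asarray(bonafide_scores), 1.0 - target_fpr))

def fnr_at_threshold(attack_scores, threshold):
    """False negative rate: fraction of forged samples scoring below the
    threshold, i.e. attacks the detector accepts as bonafide."""
    return float(np.mean(np.asarray(attack_scores) < threshold))

# Illustrative synthetic scores, not real detector outputs:
rng = np.random.default_rng(0)
val_bonafide = rng.beta(2, 5, size=1000)   # validation bonafide scores
test_attacks = rng.beta(3, 3, size=1000)   # test-set forgery scores

thr = threshold_at_fpr(val_bonafide, target_fpr=0.10)
print(f"threshold @ 10% FPR: {thr:.3f}")
print(f"test FNR: {fnr_at_threshold(test_attacks, thr):.1%}")
```

A detector that looks strong at its equal-error point can still miss half of the attacks once the threshold is pinned to a fixed, KYC-realistic false positive budget, which is why the paper reports FNR at 10% FPR rather than a single accuracy number.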