Privacy-Aware Detection of Fake Identity Documents: Methodology, Benchmark, and Improved Detection Methods (FakeIDet2)

📅 2025-08-14
🤖 AI Summary
To address the challenges of detecting high-fidelity, AI-generated counterfeit identity documents and the scarcity of real sensitive data in remote identity verification, this paper proposes a privacy-aware approach to fake ID detection. The authors (1) propose a patch-based methodology that avoids sharing complete original documents, thereby preserving data privacy; (2) introduce FakeIDet2-db, a publicly available database of over 900,000 real and fake ID patches extracted from 2,000 ID images, captured with different smartphone sensors under varying illumination and height conditions and covering print, screen, and composite physical attacks; (3) present FakeIDet2, a privacy-aware fake ID detection method; and (4) release a standard, reproducible benchmark that covers physical and synthetic attacks from popular databases in the literature.

📝 Abstract
Remote user verification in Internet-based applications is becoming increasingly important. A popular scenario consists of submitting a picture of the user's Identity Document (ID) to a service platform, authenticating its veracity, and then granting access to the requested digital service. An ID is well suited to verifying the identity of an individual, since it is government-issued, unique, and nontransferable. However, with recent advances in Artificial Intelligence (AI), attackers can circumvent the security measures in IDs and create very realistic physical and synthetic fake IDs. Researchers are now trying to develop methods to detect an ever-growing number of these AI-based fakes, which are almost indistinguishable from authentic (bona fide) IDs. In this counterattack effort, researchers face an important challenge: the difficulty of using real data to train fake ID detectors. This scarcity of real data for research and development stems from the sensitive nature of these documents, which are usually kept private by the ID owners (the users) and the ID holders (e.g., government, police, bank, etc.). The main contributions of our study are: 1) We propose and discuss a patch-based methodology to preserve privacy in fake ID detection research. 2) We provide a new public database, FakeIDet2-db, comprising over 900K real/fake ID patches extracted from 2,000 ID images, acquired using different smartphone sensors, illumination and height conditions, etc. In addition, three physical attacks are considered: print, screen, and composite. 3) We present a new privacy-aware fake ID detection method, FakeIDet2. 4) We release a standard reproducible benchmark that considers physical and synthetic attacks from popular databases in the literature.
Problem

Research questions and friction points this paper is trying to address.

Detecting AI-generated fake identity documents remotely
Overcoming scarcity of real ID data for research
Developing privacy-preserving fake ID detection methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Patch-based methodology for privacy preservation
Public database (FakeIDet2-db) with over 900K real/fake ID patches
Privacy-aware fake ID detection method FakeIDet2
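
The patch-based idea listed above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `extract_patches`, the grayscale list-of-lists image representation, the patch size, and the choice of non-overlapping patches are all assumptions made for illustration. The page does not state the actual patch size, overlap, or any additional anonymization steps used for FakeIDet2-db.

```python
from typing import List

# Hypothetical representation: a grayscale image as a list of pixel rows.
Image = List[List[int]]


def extract_patches(image: Image, patch_size: int) -> List[Image]:
    """Split an image into non-overlapping square patches.

    Border pixels that do not fill a whole patch are dropped, so no
    returned patch contains the complete document.
    """
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    height = len(image)
    width = len(image[0]) if height else 0
    patches: List[Image] = []
    # Step through the top-left corner of every full patch.
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches.append(patch)
    return patches


# Toy 5 x 7 "image" whose pixel value encodes its position (row * 10 + col).
demo = [[r * 10 + c for c in range(7)] for r in range(5)]
patches = extract_patches(demo, patch_size=2)
print(len(patches))  # number of full 2 x 2 patches in the toy image
```

In a setting like the one described, only such patches (rather than the full ID image) would be stored or shared, which is what limits exposure of the document as a whole.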
Javier Muñoz-Haro
Biometrics and Data Pattern Analytics Lab, Universidad Autónoma de Madrid, Ciudad Universitaria de Cantoblanco, Madrid, 28049, Madrid, Spain
Ruben Tolosana
Associate Professor, Universidad Autonoma de Madrid
Machine Learning, Pattern Recognition, DeepFakes, Biometrics, Human-Computer Interaction
Ruben Vera-Rodriguez
Associate Professor, Universidad Autonoma de Madrid
Biometrics, Machine Learning, Human-Computer Interaction, Behavioral Biometrics, Soft Biometrics
Aythami Morales
Biometrics and Data Pattern Analytics Lab, Universidad Autónoma de Madrid, Ciudad Universitaria de Cantoblanco, Madrid, 28049, Madrid, Spain
Julian Fierrez
Biometrics and Data Pattern Analytics Lab, Universidad Autónoma de Madrid, Ciudad Universitaria de Cantoblanco, Madrid, 28049, Madrid, Spain