🤖 AI Summary
To address the severe scarcity of bona fide samples in Presentation Attack Detection (PAD), which critically undermines model robustness, this paper is among the first to use Stable Diffusion to generate high-fidelity synthetic ID card images, systematically augmenting the bona fide training data. Unlike prior work that focuses on generating attack samples, the method leverages controllable text guidance and the structural priors of identity documents to synthesize diverse images with consistent texture, illumination, and geometric fidelity. The generated samples are reliably classified as bona fide by mainstream PAD models. Under limited real-data regimes, the approach significantly improves detection performance, reducing the Equal Error Rate (EER) by an average of 32.7%, and generalizes effectively to unseen attack types. Experiments demonstrate that this diffusion-based data augmentation paradigm alleviates the bona fide data bottleneck, offering a novel solution for low-resource PAD tasks.
📝 Abstract
Developing a Presentation Attack Detection (PAD) system for ID cards remains challenging due to the scarcity of images available to train a robust detector and the growing diversity of presentation attack instrument species. Most current algorithms focus on generating attack samples and do not account for the limited number of bona fide images. This work is one of the first to propose mimicking bona fide images by generating synthetic versions of them with Stable Diffusion, which may help improve the generalisation capabilities of the detector. Furthermore, the newly generated images are evaluated both in a system trained from scratch and in a commercial solution. The PAD systems yield an interesting result: they identify our synthetic images as bona fide, which has a positive impact on detection performance and eases data restrictions.