🤖 AI Summary
This work addresses the threat that highly realistic disguise makeup attacks, which rely on advanced cosmetics, prosthetics, and artificial materials, pose to face recognition systems. The authors propose a two-stage detection framework. First, a style-invariant full-face model, trained with metric learning and enhanced by a whitening transformation, produces region-wise attention scores via Grad-CAM. These scores then guide the extraction of local image patches, which region-specific subnetworks analyze for fine-grained attack discrimination. The study also contributes a new, diverse dataset of live and disguise makeup faces collected under real-world conditions. On the authors' collected dataset, the method achieves 8.97% ACER and 9.76% EER; on SIW-Mv2, it attains 0% ACER for Obfuscation and Impersonation attacks and 1.34% ACER for Cosmetics attacks, outperforming prior approaches.
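For readers unfamiliar with the reported metrics, the following is a minimal sketch of how ACER and EER are commonly computed in presentation attack detection. It assumes a single attack type, a score where higher means "more likely an attack", and labels of 1 for attack and 0 for live; the function names and the threshold scan are illustrative choices, not the authors' evaluation code.

```python
def apcer_bpcer(scores, labels, threshold):
    """Return (APCER, BPCER) at a given threshold.

    Assumption: labels use 1 = attack, 0 = live (bona fide), and a sample
    is classified as an attack when score >= threshold.
    """
    attacks = [s for s, y in zip(scores, labels) if y == 1]
    live = [s for s, y in zip(scores, labels) if y == 0]
    # APCER: fraction of attack samples wrongly accepted as live.
    apcer = sum(s < threshold for s in attacks) / len(attacks)
    # BPCER: fraction of live samples wrongly rejected as attacks.
    bpcer = sum(s >= threshold for s in live) / len(live)
    return apcer, bpcer


def acer(scores, labels, threshold):
    """ACER is the mean of APCER and BPCER at the chosen threshold."""
    apcer, bpcer = apcer_bpcer(scores, labels, threshold)
    return (apcer + bpcer) / 2


def eer(scores, labels):
    """Approximate EER by scanning candidate thresholds.

    Picks the threshold where APCER and BPCER are closest and returns
    their mean at that point.
    """
    candidates = sorted(set(scores)) + [max(scores) + 1.0]

    def gap(t):
        a, b = apcer_bpcer(scores, labels, t)
        return abs(a - b)

    best = min(candidates, key=gap)
    a, b = apcer_bpcer(scores, labels, best)
    return (a + b) / 2


if __name__ == "__main__":
    demo_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6]
    demo_labels = [1, 1, 1, 0, 0, 0]
    print("ACER @ 0.5:", acer(demo_scores, demo_labels, 0.5))
    print("EER:", eer(demo_scores, demo_labels))
```

Benchmarks differ in how the operating threshold is chosen (for example, fixed on a validation set) and in how several attack types are aggregated, so absolute numbers are only comparable under the same protocol.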
📝 Abstract
Despite significant advances, facial recognition systems remain vulnerable to face presentation attacks. Among these, disguise makeup attacks are particularly challenging because they use advanced cosmetics, prosthetic components, and artificial materials to alter facial appearance realistically, often making detection difficult even for humans. Although important, this problem remains underexplored, and publicly available datasets are limited. To address this, we propose a generalized disguise makeup presentation attack detection framework. The method adopts a two-phase design in which a style-invariant full-face model, trained with metric learning and enhanced by a whitening transformation, extracts region attention scores via Grad-CAM. These scores guide a patch-based phase that performs localized analysis using region-specific subnetworks trained with metric learning for fine-grained discrimination. We also construct a new, diverse dataset of live and disguise makeup faces collected under real-world conditions, covering variations in subjects, environments, and disguise materials. Experimental results demonstrate strong generalization on both the collected dataset and SIW-Mv2: the method achieves 8.97% ACER and 9.76% EER on the collected dataset, and 0% ACER on Obfuscation and Impersonation attacks and 1.34% ACER on Cosmetics attacks of SIW-Mv2. The proposed method consistently outperforms prior work while maintaining robust performance across other spoof types.
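The hand-off between the two phases can be illustrated with a small sketch. The abstract does not specify how region attention scores are derived from the Grad-CAM map or how many regions are passed on, so the code below makes two explicit assumptions: a region's score is the mean heatmap value inside a fixed bounding box, and the `top_k` highest-scoring regions are cropped for the patch phase. The region names, box layout, and function names are hypothetical.

```python
def region_attention_scores(heatmap, regions):
    """Mean heatmap activation per region.

    heatmap: 2D list (H x W) of non-negative values, e.g. a Grad-CAM map
             resized to the input resolution (assumed to be given).
    regions: dict mapping a region name to (top, left, bottom, right),
             with bottom/right exclusive. Boxes here are hypothetical.
    """
    scores = {}
    for name, (top, left, bottom, right) in regions.items():
        values = [heatmap[r][c]
                  for r in range(top, bottom)
                  for c in range(left, right)]
        scores[name] = sum(values) / len(values)
    return scores


def select_patches(image, heatmap, regions, top_k=2):
    """Crop the top_k regions ranked by attention score.

    Returns (patches, scores); patches maps region name -> cropped 2D list,
    ordered from highest to lowest score. Each patch would then be passed
    to the subnetwork for its region (not modeled in this sketch).
    """
    scores = region_attention_scores(heatmap, regions)
    ranked = sorted(scores, key=scores.get, reverse=True)[:top_k]
    patches = {}
    for name in ranked:
        top, left, bottom, right = regions[name]
        patches[name] = [row[left:right] for row in image[top:bottom]]
    return patches, scores


if __name__ == "__main__":
    # Toy 4x4 "image" and heatmap; a real input would be a face crop.
    image = [[r * 4 + c for c in range(4)] for r in range(4)]
    heatmap = [
        [0.1, 0.1, 0.1, 0.1],
        [0.1, 0.1, 0.1, 0.1],
        [0.9, 0.9, 0.2, 0.2],
        [0.9, 0.9, 0.2, 0.2],
    ]
    regions = {
        "eyes": (0, 0, 2, 4),
        "nose": (2, 0, 4, 2),
        "mouth": (2, 2, 4, 4),
    }
    patches, scores = select_patches(image, heatmap, regions, top_k=2)
    print(list(patches), scores)
```

In practice the scores might instead weight the outputs of the region subnetworks rather than gate which patches are analyzed; the abstract leaves that choice open.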