🤖 AI Summary
To address the trade-off between accuracy and efficiency in single-image morphing attack detection (S-MAD), this paper proposes a lightweight detection framework driven by knowledge distillation. An EfficientNet-based teacher model guides a Vision Transformer (ViT) student model in learning discriminative features, and Low-Rank Adaptation (LoRA) is introduced into S-MAD for the first time to enable parameter-efficient fine-tuning of the ViT. This design substantially reduces computational overhead while improving generalization and robustness. In experiments on a morphing dataset built from three publicly available face datasets and ten morphing generation algorithms, the proposed method outperforms six SOTA S-MAD approaches in detection accuracy, achieves a 37% inference speedup, and reduces model parameters by 62%. The framework thus combines high accuracy and high efficiency with strong generalization.
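To make the LoRA idea above concrete, here is a minimal pure-Python sketch of a low-rank weight update. It is not the paper's implementation: the function names, the rank `r`, the scaling factor `alpha`, and the 768-dimensional layer size are illustrative assumptions. The frozen weight `W` is left untouched, and only the two small factors `A` and `B` would be trained.

```python
# Illustrative LoRA sketch (assumed names and sizes, not the paper's code).

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_weight(W, A, B, alpha, r):
    """Return the effective weight W + (alpha / r) * B @ A.

    W: frozen weight, shape (d_out, d_in)
    A: trainable factor, shape (r, d_in)
    B: trainable factor, shape (d_out, r)
    """
    delta = matmul(B, A)   # low-rank update, shape (d_out, d_in)
    scale = alpha / r
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

def lora_param_count(d_in, d_out, r):
    """Number of trainable parameters in the factors A and B."""
    return r * (d_in + d_out)

if __name__ == "__main__":
    # Assumed example: one 768 x 768 projection adapted with rank 8.
    full = 768 * 768
    low_rank = lora_param_count(768, 768, 8)
    print(full, low_rank)
```

With `B` initialized to zeros, the effective weight equals `W`, so fine-tuning starts from the pretrained model's behavior. In this assumed configuration the adapter trains 12,288 values per layer instead of 589,824.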
📝 Abstract
Face Recognition Systems (FRS) are critical for security but remain vulnerable to morphing attacks, where synthetic images blend biometric features from multiple individuals. We propose a novel Single-Image Morphing Attack Detection (S-MAD) approach using a teacher-student framework, where a CNN-based teacher model refines a ViT-based student model. To improve efficiency, we integrate Low-Rank Adaptation (LoRA) for fine-tuning, reducing computational costs while maintaining high detection accuracy. Extensive experiments are conducted on a morphing dataset built from three publicly available face datasets, incorporating ten different morphing generation algorithms to assess robustness. The proposed method is benchmarked against six state-of-the-art S-MAD techniques, demonstrating superior detection performance and computational efficiency.
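The abstract does not state the exact training objective, so the following is only a sketch of a commonly used teacher-student distillation loss: a cross-entropy term on the ground-truth label combined with a temperature-scaled KL divergence between teacher and student outputs. The function names, the two-class setup (bona fide vs. morph), the temperature, and the weighting `alpha` are all assumptions.

```python
import math

# Generic knowledge-distillation loss sketch (assumed formulation,
# not necessarily the objective used in the paper).

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """alpha * CE(student, label) + (1 - alpha) * T^2 * KL(teacher || student)."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    # KL divergence between the softened teacher and student distributions.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student_soft) if pt > 0)
    # Standard cross-entropy of the student against the hard label.
    ce = -math.log(softmax(student_logits)[label])
    return alpha * ce + (1 - alpha) * temperature ** 2 * kl

if __name__ == "__main__":
    # Hypothetical logits for classes [bona fide, morph]; label 1 = morph.
    print(distillation_loss([0.2, 0.9], [-1.0, 2.5], label=1))
```

The KL term vanishes when the student reproduces the teacher's softened distribution, so in that case only the supervised cross-entropy term contributes to the loss.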