AI Summary
Existing vision foundation models generalize poorly in open-set iris presentation attack detection (PAD): performance degrades significantly under unknown attack types, cross-sensor conditions, and cross-spectral shifts (e.g., from near-infrared to visible light). This work presents a systematic evaluation of five general-purpose vision foundation models under three open-set protocols covering unseen attack instruments, cross-dataset transfer, and cross-spectral generalization, comparing frozen feature extraction, LoRA fine-tuning, and full backbone fine-tuning. The study shows that performance in closed-set or cross-dataset settings does not reflect true security robustness. While LoRA helps in certain cross-dataset scenarios, it frequently exacerbates failure under attack-type and spectral distribution shifts, highlighting current models' vulnerability to attack diversity and spectral variation.
Abstract
Vision foundation models have demonstrated strong transferability across diverse visual recognition tasks and are increasingly considered for biometric applications. Their suitability for iris Presentation Attack Detection (PAD), particularly under realistic open-set operating conditions, remains insufficiently examined. This work presents a systematic failure analysis of general-purpose vision foundation models for open-set iris PAD using periocular imagery. Five representative foundation models are evaluated under three open-set protocols that explicitly separate different sources of distribution shift: unseen Presentation Attack Instruments (PAIs), unseen datasets captured with different sensors, and cross-spectral transfer from near-infrared (NIR) to visible spectrum (VIS) imagery. Both frozen feature representations and parameter-efficient task adaptation using Low-Rank Adaptation (LoRA) are assessed within a unified experimental framework. The results indicate that foundation models can transfer across datasets with similar sensing characteristics, but fail to generalise reliably to unseen attack instruments and degrade sharply under cross-spectral evaluation. While LoRA improves performance in certain cross-dataset settings, it frequently amplifies failure under attack-level and spectral shifts. Additional validation experiments using segmented iris inputs, full backbone fine-tuning, joint cross-dataset and cross-PAI shifts, and reverse VIS-to-NIR transfer further confirm that these failures are not simply artefacts of periocular input, weak adaptation, or one-directional spectral evaluation. These findings show that strong closed-set or cross-dataset performance should not be treated as evidence of robust open-set security, and highlight the need for PAD representations that maintain sensitivity to presentation artefacts while remaining stable under realistic deployment variation.
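The three open-set protocols differ only in which property of a sample is withheld from training. The following is a minimal sketch of that idea, not the paper's actual data pipeline: the `Sample` record, the field names, the PAI labels, and the rule that sends every k-th bona fide image to the test side are all assumptions made for illustration.

```python
from dataclasses import dataclass

BONA_FIDE = "bona_fide"  # hypothetical label for genuine (non-attack) images


@dataclass(frozen=True)
class Sample:
    """Hypothetical record for one periocular image."""
    path: str
    dataset: str   # name of the source dataset / sensor
    spectrum: str  # "NIR" or "VIS"
    pai: str       # BONA_FIDE or an attack instrument label


def split_unseen_pai(samples, held_out_pai, bona_fide_test_every=5):
    """Unseen-PAI protocol: one attack instrument appears only at test time."""
    train, test = [], []
    bona_fide_seen = 0
    for s in samples:
        if s.pai == BONA_FIDE:
            # Bona fide images are needed on both sides; as an assumed,
            # deterministic rule, every k-th one goes to the test set.
            bona_fide_seen += 1
            if bona_fide_seen % bona_fide_test_every == 0:
                test.append(s)
            else:
                train.append(s)
        elif s.pai == held_out_pai:
            test.append(s)   # attack type never shown during training
        else:
            train.append(s)  # all other attack types are used for training
    return train, test


def split_cross_dataset(samples, test_dataset):
    """Cross-dataset protocol: one dataset (sensor) is held out entirely."""
    train = [s for s in samples if s.dataset != test_dataset]
    test = [s for s in samples if s.dataset == test_dataset]
    return train, test


def split_cross_spectral(samples, train_spectrum="NIR", test_spectrum="VIS"):
    """Cross-spectral protocol: train on one spectrum, test on the other."""
    train = [s for s in samples if s.spectrum == train_spectrum]
    test = [s for s in samples if s.spectrum == test_spectrum]
    return train, test
```

Swapping the two spectrum arguments of `split_cross_spectral` gives the reverse VIS-to-NIR direction mentioned among the validation experiments. Keeping each shift in its own function mirrors the abstract's point that the protocols separate the sources of distribution shift rather than mixing them.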