Steering Vision-Language Pre-trained Models for Incremental Face Presentation Attack Detection

📅 2025-12-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
In privacy-sensitive scenarios, face presentation attack detection (PAD) must continually adapt to emerging spoofing strategies and cross-domain distribution shifts, yet cannot retain historical data due to privacy constraints. To address this challenge, we propose SVLP-IL—a replay-free incremental learning framework built upon vision-language pre-trained (VLP) models. SVLP-IL introduces a novel multi-aspect prompting (MAP) mechanism to achieve task-adaptive semantic alignment across incremental PAD tasks, and integrates selective elastic weight consolidation (SEWC) to dynamically safeguard critical parameters, thereby balancing knowledge stability and plasticity. Evaluated on multiple PAD benchmarks, SVLP-IL significantly mitigates catastrophic forgetting and achieves an average 8.3% improvement in cross-domain detection accuracy. The framework enables long-term, privacy-compliant, and robust liveness-detection deployment without requiring access to past training data.

📝 Abstract
Face Presentation Attack Detection (PAD) demands incremental learning (IL) to combat evolving spoofing tactics and domains. Privacy regulations, however, forbid retaining past data, necessitating rehearsal-free IL (RF-IL). Vision-Language Pre-trained (VLP) models, with their prompt-tunable cross-modal representations, enable efficient adaptation to new spoofing styles and domains. Capitalizing on this strength, we propose SVLP-IL, a VLP-based RF-IL framework that balances stability and plasticity via Multi-Aspect Prompting (MAP) and Selective Elastic Weight Consolidation (SEWC). MAP isolates domain dependencies, enhances distribution-shift sensitivity, and mitigates forgetting by jointly exploiting universal and domain-specific cues. SEWC selectively preserves critical weights from previous tasks, retaining essential knowledge while allowing flexibility for new adaptations. Comprehensive experiments across multiple PAD benchmarks show that SVLP-IL significantly reduces catastrophic forgetting and enhances performance on unseen domains. SVLP-IL offers a privacy-compliant, practical solution for robust lifelong PAD deployment in RF-IL settings.
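The SEWC component described in the abstract builds on elastic weight consolidation, which adds a quadratic penalty on drift away from parameters that were important for earlier tasks; "selective" suggests applying that penalty only to the most important parameters rather than all of them. A minimal sketch of that idea follows — the top-fraction ranking rule, function name, and hyperparameters are illustrative assumptions, not the paper's actual procedure:

```python
import numpy as np

def selective_ewc_penalty(theta, theta_old, fisher, keep_frac=0.2, lam=100.0):
    """Hypothetical selective-EWC penalty sketch.

    Penalizes drift (theta - theta_old) only on the top `keep_frac`
    fraction of parameters ranked by their Fisher importance, leaving
    the remaining parameters free to adapt to the new task.
    """
    k = max(1, int(keep_frac * fisher.size))
    important = np.argsort(fisher)[-k:]      # indices of most important params
    mask = np.zeros_like(fisher)
    mask[important] = 1.0
    drift = theta - theta_old
    # Standard EWC quadratic penalty, restricted to the selected subset.
    return 0.5 * lam * np.sum(mask * fisher * drift ** 2)
```

Under this sketch, drift on low-importance parameters is cost-free (plasticity), while drift on high-importance parameters is heavily penalized (stability), which matches the stability–plasticity trade-off the abstract describes.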
Problem

Research questions and friction points this paper is trying to address.

Incremental learning for evolving face spoofing detection
Privacy-compliant rehearsal-free adaptation to new domains
Balancing stability and plasticity in vision-language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Aspect Prompting isolates domain dependencies and cues
Selective Elastic Weight Consolidation preserves critical previous task weights
Vision-Language Pre-trained models enable adaptation via prompt tuning
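The prompt-tuning idea in the bullets above — combining universal and domain-specific cues — can be illustrated with a minimal CoOp-style sketch. All names, shapes, and the composition rule below are hypothetical assumptions for illustration, not the paper's implementation:

```python
import numpy as np

# Hypothetical multi-aspect prompt composition: a shared "universal"
# prompt captures task-agnostic liveness cues, and a small per-task
# prompt captures domain-specific cues; both are prepended to the
# frozen encoder's input token embeddings.
rng = np.random.default_rng(0)
embed_dim = 8

universal_prompt = rng.normal(size=(4, embed_dim))          # shared across tasks
domain_prompts = {t: rng.normal(size=(2, embed_dim))        # one per incremental task
                  for t in range(3)}

def compose_prompts(task_id, token_embeddings):
    """Prepend universal and task-specific learnable prompt vectors
    to the input token embeddings (the backbone stays frozen)."""
    return np.concatenate(
        [universal_prompt, domain_prompts[task_id], token_embeddings], axis=0
    )
```

Only the small prompt matrices would be trained per task in such a scheme, which is why prompt tuning is attractive for rehearsal-free adaptation: new domains add a few vectors instead of rewriting the backbone.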
Haoze Li
School of Computer Science, China University of Geosciences, Wuhan 430074, China
Jie Zhang
State Key Laboratory of AI Safety, Institute of Computing Technology, Chinese Academy of Sciences (CAS), Beijing 100190, China, and also with the University of Chinese Academy of Sciences, Beijing 100049, China
Guoying Zhao
Academy Professor, IEEE Fellow, Professor of Computer Science and Engineering, University of Oulu
Affective Computing, Artificial Intelligence, Computer Vision, Pattern Recognition
Stephen Lin
Microsoft Research Asia
Computer Vision
Shiguang Shan
Professor, Institute of Computing Technology, Chinese Academy of Sciences
Computer Vision, Pattern Recognition, Machine Learning, Face Recognition