Selfie-Capture Dynamics as an Auxiliary Signal Against Deepfakes and Injection Attacks for Mobile Identity Verification

📅 2026-04-30
📈 Citations: 0
Influential: 0

🤖 AI Summary
This study addresses the vulnerability of mobile remote identity verification to presentation attacks, real-time deepfakes, and video injection. The authors propose using multi-sensor motion traces, recorded passively during selfie capture, as an auxiliary evidence channel, and benchmark multivariate time-series classifiers (such as QUANT+3-NN and WEASEL+MUSE) alongside whole-series anomaly detectors for both spoof screening and user verification. The work introduces the CanSelfie dataset, evaluates against stationary, handheld, and temporally shifted attack proxies rather than real injection attacks, and shows that closed-set classification accuracy does not by itself imply good verification performance. For spoof screening, accelerometer-only ROCKAD reaches a 0.00% false rejection rate at a 43.8% false acceptance rate; for same-device, same-session user verification, WEASEL+MUSE attains a 1.07% equal error rate with 9 sensor channels.
📝 Abstract
Mobile remote identity verification (RIdV) systems are exposed to attacks that manipulate or replace the facial video stream, including presentation attacks, real-time deepfakes, and video injection. Recent European requirements, including ETSI TS 119 461 and CEN/TS 18099, motivate complementary evidence channels beyond camera-based presentation-attack detection. This paper investigates whether passive motion traces recorded during selfie capture provide auxiliary evidence for spoof screening and user verification. We introduce CanSelfie, a dataset of 375 bona fide multi-sensor sequences collected at 50 Hz from 30 participants using a commercial mobile RIdV application, together with stationary, handheld, and temporally shifted attack-proxy scenarios. We benchmark 7 multivariate time-series classifiers and 8 whole-series anomaly detectors across sensor configurations and temporal windows. For spoof screening, accelerometer-only ROCKAD obtains 0.00% false rejection rate (FRR) and 43.8% false acceptance rate (FAR), while QUANT+3-NN obtains the lowest overall FAR of 32.0% at 2.37% FRR; both reject all stationary attack proxies. For same-device and same-session user verification, WEASEL+MUSE reaches 1.07% equal error rate (EER) using 9 sensor channels. The analysis shows that raw accelerometer data, preserving gravity and orientation cues, is the most informative modality, and that closed-set classification accuracy alone does not imply good verification performance because threshold calibration depends on score distributions. The findings suggest that short selfie-capture motion traces contain measurable spoof-related and identity-related information, supporting their use as a low-friction auxiliary signal while also identifying the need for cross-device, cross-session, and real injection-attack evaluation.
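The FRR, FAR, and EER figures quoted in the abstract all depend on where the decision threshold sits relative to the genuine and impostor score distributions. The following is a minimal sketch of how these rates are conventionally computed, not the authors' evaluation code. The function names, the "higher score means accept" convention, and the toy scores are all assumptions made for illustration.

```python
from typing import List, Tuple


def far_frr(genuine: List[float], impostor: List[float],
            threshold: float) -> Tuple[float, float]:
    """Return (FAR, FRR) when scores >= threshold are accepted."""
    # FAR: fraction of impostor/attack samples wrongly accepted.
    far = sum(s >= threshold for s in impostor) / len(impostor)
    # FRR: fraction of genuine/bona fide samples wrongly rejected.
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine: List[float],
                     impostor: List[float]) -> Tuple[float, float]:
    """Approximate the EER by trying every observed score as a threshold.

    Returns (eer, threshold) for the threshold where |FAR - FRR| is
    smallest; the EER is reported as the mean of FAR and FRR there.
    """
    best_gap, best_eer, best_thr = float("inf"), 1.0, 0.0
    for thr in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, thr)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, best_eer, best_thr = gap, (far + frr) / 2, thr
    return best_eer, best_thr


# Toy, made-up scores (assumption: higher = more likely bona fide).
genuine_scores = [0.9, 0.8, 0.75, 0.6, 0.55]
impostor_scores = [0.7, 0.5, 0.4, 0.3, 0.2]

eer, eer_threshold = equal_error_rate(genuine_scores, impostor_scores)

# A zero-FRR operating point: accept everything at or above the lowest
# genuine score, then read off the FAR that this choice implies.
zero_frr_far, zero_frr = far_frr(
    genuine_scores, impostor_scores, min(genuine_scores)
)
```

Because both rates are read off the score distributions, two models with the same closed-set accuracy can still yield different EERs, which is consistent with the abstract's remark about threshold calibration.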
Problem

Research questions and friction points this paper is trying to address.

deepfakes
injection attacks
mobile identity verification
presentation attacks
spoof detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

selfie-capture dynamics
deepfake detection
mobile identity verification
multivariate time-series classification
accelerometer-based spoof screening