Ocular Authentication: Fusion of Gaze and Periocular Modalities

📅 2025-05-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses calibration-free identity authentication for VR. It proposes a multimodal framework that jointly models eye movements and periocular images within a unified gaze-estimation pipeline, a combination that had not been thoroughly explored at scale. The system captures both temporal gaze dynamics and static periocular appearance. It is evaluated on a large-scale in-house dataset of 9,202 subjects whose eye-tracking signal quality is equivalent to that of a consumer-facing VR device. The multimodal approach consistently outperforms both unimodal baselines across all evaluated scenarios and surpasses the FIDO benchmark. The core contribution is the calibration-free fusion of two distinct ocular biometric modalities, gaze behavior and periocular appearance, with the gains attributed to a state-of-the-art machine learning architecture and the complementary discriminative characteristics of the two modalities.

📝 Abstract

This paper investigates the feasibility of fusing two eye-centric authentication modalities, eye movements and periocular images, within a calibration-free authentication system. While each modality has independently shown promise for user authentication, their combination within a unified gaze-estimation pipeline has not been thoroughly explored at scale. In this report, we propose a multimodal authentication system and evaluate it using a large-scale in-house dataset comprising 9202 subjects with an eye tracking (ET) signal quality equivalent to a consumer-facing virtual reality (VR) device. Our results show that the multimodal approach consistently outperforms both unimodal systems across all scenarios, surpassing the FIDO benchmark. The integration of a state-of-the-art machine learning architecture contributed significantly to the overall authentication performance at scale, driven by the model's ability to capture authentication representations and the complementary discriminative characteristics of the fused modalities.
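
The abstract does not say how the two modalities are combined or how the benchmark comparison is computed. As a rough illustration only, the sketch below assumes score-level fusion by weighted sum and reports the false rejection rate (FRR) at a fixed false acceptance rate (FAR), a common way of stating a biometric operating point. The function names, the fusion weight, the toy scores, and the target FAR are all hypothetical; none is taken from the paper or from the FIDO requirements.

```python
def fuse_scores(gaze_score, periocular_score, gaze_weight=0.5):
    """Weighted-sum fusion of two similarity scores (higher means more similar).

    gaze_weight is a hypothetical tuning parameter, not a value from the paper.
    """
    if not 0.0 <= gaze_weight <= 1.0:
        raise ValueError("gaze_weight must lie in [0, 1]")
    return gaze_weight * gaze_score + (1.0 - gaze_weight) * periocular_score


def frr_at_far(genuine_scores, impostor_scores, target_far):
    """False rejection rate at a threshold chosen so that FAR <= target_far.

    A comparison is accepted when its score is strictly above the threshold.
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")
    ranked = sorted(impostor_scores, reverse=True)
    # Number of impostor comparisons we may accept without exceeding target_far.
    allowed = int(target_far * len(ranked))
    if allowed >= len(ranked):
        return 0.0  # every comparison may be accepted, so nothing is rejected
    threshold = ranked[allowed]
    rejected = sum(1 for s in genuine_scores if s <= threshold)
    return rejected / len(genuine_scores)


# Toy (gaze, periocular) scores per comparison; made-up numbers for illustration.
genuine = [(0.9, 0.7), (0.4, 0.9), (0.8, 0.3)]
impostor = [(0.6, 0.1), (0.2, 0.5), (0.1, 0.2), (0.3, 0.3)]

fused_genuine = [fuse_scores(g, p) for g, p in genuine]
fused_impostor = [fuse_scores(g, p) for g, p in impostor]

# target_far=0.25 is a placeholder, far looser than any real operating point.
print("fused FRR:", frr_at_far(fused_genuine, fused_impostor, target_far=0.25))
```

Running the same `frr_at_far` call on the gaze-only, periocular-only, and fused scores would give a like-for-like comparison of unimodal and multimodal systems at one operating point. A learned fusion inside the model, which the abstract may well be describing, would replace `fuse_scores` while leaving the evaluation step unchanged.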

Problem

Research questions and friction points this paper is trying to address.

Fusing eye-movement and periocular-image modalities in a calibration-free authentication system

Evaluating the multimodal system on a large-scale in-house dataset of 9202 subjects

Improving authentication performance beyond the FIDO benchmark

Innovation

Methods, ideas, or system contributions that make the work stand out.

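Fuses eye-movement and periocular-image modalities within a unified gaze-estimation pipeline

Uses large-scale eye tracking data with signal quality equivalent to a consumer-facing VR device

Integrates a state-of-the-art machine learning architecture to improve authentication performance at scale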

🔎 Similar Papers

EyeTrAES: Fine-grained, Low-Latency Eye Tracking via Adaptive Event Slicing