MambaGaze: Bidirectional Mamba with Explicit Missing Data Modeling for Cognitive Load Assessment from Eye-Gaze Tracking Data

2026-05-21

AI Summary
This work addresses inaccurate cognitive load estimation in real-time eye tracking, which stems primarily from frequent missing data (such as blinks and tracking failures) and from inefficient modeling of long-range temporal dependencies. To overcome these limitations, we propose MambaGaze, a framework that explicitly models the uncertainty inherent in missing observations and irregular temporal intervals through XMD encoding, while using a linear-complexity bidirectional Mamba-2 architecture to capture long-range dependencies efficiently. Evaluated on the CLARE and CL-Drive datasets, MambaGaze achieves accuracies of 76.8% and 73.1%, respectively, outperforming CNN- and Transformer-based baselines by 4-12 percentage points. Furthermore, the model supports edge deployment, delivering real-time inference at 43-68 FPS on Jetson platforms with power consumption below 7.5 W.
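To make the "linear-complexity bidirectional" idea concrete, here is a minimal sketch of a bidirectional scan over a 1-D sequence. It is not Mamba-2: it assumes a fixed scalar recurrence h_t = decay * h_(t-1) + gain * x_t, whereas Mamba-style models use learned, input-dependent parameters. The names `linear_scan`, `bidirectional_scan`, `decay`, and `gain` are hypothetical and do not come from the paper. The sketch only shows how a forward and a backward pass each cost one step per sample, and how the two are paired so every position sees both past and future context.

```python
def linear_scan(xs, decay=0.5, gain=1.0):
    """Run h_t = decay * h_(t-1) + gain * x_t over xs in one pass (O(T))."""
    h = 0.0
    states = []
    for x in xs:
        h = decay * h + gain * x
        states.append(h)
    return states


def bidirectional_scan(xs, decay=0.5, gain=1.0):
    """Pair a left-to-right scan with a right-to-left scan at each position."""
    forward = linear_scan(xs, decay, gain)
    # Scan the reversed sequence, then flip back so indices line up with xs.
    backward = linear_scan(xs[::-1], decay, gain)[::-1]
    return list(zip(forward, backward))


if __name__ == "__main__":
    for t, (f, b) in enumerate(bidirectional_scan([1.0, 0.0, 0.0, 2.0])):
        print(t, f, b)
```

Each output pair holds a summary of the samples up to position t and a summary of the samples from position t onward. Total work grows linearly with sequence length, in contrast to the quadratic cost of full self-attention.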
Abstract
Real-time cognitive load assessment from eye-tracking signals could enable adaptive human-centered AI in safety-critical applications such as driver vigilance monitoring or automated flight deck assistance, yet two challenges persist: handling frequent data missingness from blinks and tracking failures, and efficiently modeling long-range temporal dependencies. We propose MambaGaze, a framework that addresses these challenges through 1) XMD encoding, which augments raw features with observation masks and time-deltas to explicitly model data uncertainty, and 2) bidirectional Mamba-2, which captures temporal dependencies with linear computational complexity. Experiments on the CLARE and CL-Drive datasets under leave-one-subject-out evaluation show that MambaGaze achieves 76.8% and 73.1% accuracy, respectively, outperforming CNN, Transformer, ResNet, and VGG baselines by 4-12 percentage points. Edge deployment benchmarks on NVIDIA Jetson platforms demonstrate real-time inference at 43-68 FPS with power consumption below 7.5 W, confirming feasibility for wearable cognitive load monitoring.
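The abstract describes XMD encoding only as augmenting raw features with observation masks and time-deltas, so the following is a sketch under assumptions rather than the authors' implementation. It assumes missing samples are marked as NaN, fills them with 0.0, and computes the time-delta as the time elapsed since each feature was last observed (a convention common in the missing-data time-series literature). The function name `xmd_encode` and the output layout are hypothetical.

```python
import math


def xmd_encode(values, timestamps):
    """Encode each time step as [filled values | mask | time-delta].

    values: list of rows, one per time step, with float('nan') for missing.
    timestamps: list of sample times in seconds, same length as values.
    Assumption: delta is the time since the feature was last observed.
    """
    n_features = len(values[0])
    encoded = []
    prev_mask = None
    prev_delta = [0.0] * n_features
    for t, row in enumerate(values):
        # mask is 1.0 where the feature was observed, 0.0 where it is missing
        mask = [0.0 if math.isnan(v) else 1.0 for v in row]
        # missing entries are zero-filled; the mask tells the model to distrust them
        filled = [v if m else 0.0 for v, m in zip(row, mask)]
        if t == 0:
            delta = [0.0] * n_features
        else:
            dt = timestamps[t] - timestamps[t - 1]
            # reset the gap after an observed sample, otherwise keep accumulating
            delta = [dt if pm else pd + dt
                     for pm, pd in zip(prev_mask, prev_delta)]
        encoded.append(filled + mask + delta)
        prev_mask, prev_delta = mask, delta
    return encoded


if __name__ == "__main__":
    nan = float("nan")
    gaze = [[1.0, 2.0], [nan, 2.5], [nan, nan], [4.0, 3.0]]
    for row in xmd_encode(gaze, [0.0, 0.25, 0.5, 1.0]):
        print(row)
```

With F raw features, each encoded step has 3F values, so a downstream sequence model can tell a zero that was measured apart from a zero that stands in for a blink or tracking failure, and can see how stale the last valid sample is.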
Problem

Research questions and friction points this paper is trying to address.

cognitive load assessment
eye-gaze tracking
missing data
temporal dependencies
real-time inference
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mamba
missing data modeling
eye-gaze tracking
cognitive load assessment
bidirectional state space model
Amir Mousavi
Department of Computer Science, College of AI, Cyber and Computing, The University of Texas at San Antonio
Mohammad Sadegh Sirjani
Department of Computer Science, College of AI, Cyber and Computing, The University of Texas at San Antonio
Erfan Nourbakhsh
Ph.D. Student at Department of Computer Science, University of Texas at San Antonio
Machine Learning, Natural Language Processing, Software Engineering
Mimi Xie
University of Texas at San Antonio
Embedded System, Tiny AI, Intermittent System, Energy Harvesting, IoT
Rocky Slavin
Assistant Professor, University of Texas at San Antonio
Secure Software Engineering, Privacy, Natural Language Processing, Program Analysis
Leslie Neely
Department of Neuroscience, Developmental and Regenerative Biology, College of Sciences, The University of Texas at San Antonio
John Davis
Department of Educational Psychology, College of Education and Human Development, The University of Texas at San Antonio
John Quarles
Professor of Computer Science, University of Texas at San Antonio
Virtual Reality, Augmented Reality, Games, Human-Computer Interaction