MambaGaze: Bidirectional Mamba with Explicit Missing Data Modeling for Cognitive Load Assessment from Eye-Gaze Tracking Data

📅 2026-05-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses inaccurate cognitive load estimation in real-time eye tracking, which stems mainly from frequent missing data (such as blinks and tracking failures) and from inefficient modeling of long-range temporal dependencies. To overcome these limitations, we propose MambaGaze, a framework that explicitly models the uncertainty in missing observations and irregular temporal intervals through XMD encoding, and uses a linear-complexity bidirectional Mamba-2 architecture to capture long-range dependencies efficiently. Evaluated on the CLARE and CL-Drive datasets, MambaGaze achieves accuracies of 76.8% and 73.1%, respectively, outperforming CNN- and Transformer-based baselines by 4–12 percentage points. The model also supports edge deployment, delivering real-time inference at 43–68 FPS on Jetson platforms with power consumption below 7.5 W.
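
To make the "linear-complexity bidirectional" idea concrete, here is a toy sketch in Python. It is not Mamba-2: it uses a fixed scalar decay instead of input-dependent (selective) state-space parameters, and the names `linear_scan` and `bidirectional_scan` are hypothetical. Concatenating the two directions is also an assumption, since the summary does not say how the paper merges them. The sketch only shows that a recurrent scan visits each time step once per direction, so cost grows linearly with sequence length.

```python
import numpy as np

def linear_scan(x, decay):
    """Toy recurrence h_t = decay * h_{t-1} + x_t, one pass over T steps.

    x: array of shape (T, D). Returns the hidden state at every step, (T, D).
    """
    h = np.zeros(x.shape[1])
    out = np.empty_like(x)
    for t in range(x.shape[0]):
        h = decay * h + x[t]
        out[t] = h
    return out

def bidirectional_scan(x, decay=0.9):
    """Run the toy scan forward and backward in time, then concatenate.

    Assumption: the two directions are merged by concatenation along the
    feature axis, giving shape (T, 2 * D).
    """
    x = np.asarray(x, dtype=float)
    fwd = linear_scan(x, decay)
    # Reverse time, scan, then reverse back so row t summarizes steps t..T-1.
    bwd = linear_scan(x[::-1], decay)[::-1]
    return np.concatenate([fwd, bwd], axis=1)
```

In this sketch, the forward half of each output row summarizes the past and the backward half summarizes the future, which is the general motivation for scanning a fixed-length gaze window in both directions.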

📝 Abstract

Real-time cognitive load assessment from eye-tracking signals could enable adaptive human-centered AI in safety-critical applications such as driver vigilance monitoring or automated flight deck assistance, yet two challenges persist: handling frequent data missingness from blinks and tracking failures, and efficiently modeling long-range temporal dependencies. We propose MambaGaze, a framework that addresses these challenges through 1) XMD encoding, which augments raw features with observation masks and time-deltas to explicitly model data uncertainty, and 2) bidirectional Mamba-2, which captures temporal dependencies with linear computational complexity. Experiments on the CLARE and CL-Drive datasets under leave-one-subject-out evaluation show that MambaGaze achieves 76.8% and 73.1% accuracy, respectively, outperforming CNN, Transformer, ResNet, and VGG baselines by 4–12 percentage points. Edge deployment benchmarks on NVIDIA Jetson platforms demonstrate real-time inference at 43–68 FPS with power consumption below 7.5 W, confirming feasibility for wearable cognitive load monitoring.
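
The abstract describes XMD encoding only as augmenting raw features with observation masks and time-deltas. A minimal Python sketch of that idea follows, under several assumptions that are not taken from the paper: missing samples are marked as NaN, missing values are zero-filled, each time-delta is the time elapsed since that channel was last observed, and the three parts are concatenated along the feature axis. The function name `xmd_encode` is hypothetical.

```python
import numpy as np

def xmd_encode(features, timestamps):
    """Augment gaze features with observation masks and time-deltas.

    features:   (T, D) array, NaN where a sample is missing (assumption).
    timestamps: (T,) array of sample times in seconds.
    Returns a (T, 3 * D) array: [zero-filled values | mask | time-delta].
    """
    features = np.asarray(features, dtype=float)
    timestamps = np.asarray(timestamps, dtype=float)

    mask = ~np.isnan(features)              # True where the value was observed
    values = np.where(mask, features, 0.0)  # zero-fill gaps (assumption)

    # Time since each channel was last observed (assumed delta definition).
    delta = np.zeros_like(values)
    last_seen = np.full(features.shape[1], timestamps[0])
    for t in range(features.shape[0]):
        delta[t] = timestamps[t] - last_seen
        last_seen = np.where(mask[t], timestamps[t], last_seen)

    return np.concatenate([values, mask.astype(float), delta], axis=1)
```

With this layout a downstream model can tell a true zero from a gap (via the mask) and can see how stale the last valid reading is (via the delta), for example during a blink that spans several samples.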

Problem

Research questions and friction points this paper is trying to address.

cognitive load assessment

eye-gaze tracking

missing data

temporal dependencies

real-time inference

Innovation

Methods, ideas, or system contributions that make the work stand out.

Mamba

missing data modeling

eye-gaze tracking