🤖 AI Summary
This study addresses the challenge of dynamic real-time cognitive load (CL) assessment. To this end, we introduce the first multimodal dataset specifically designed for real-time CL evaluation, comprising synchronized ECG, EDA, EEG, and eye-tracking data from 24 participants performing the MATB-II task, with dynamic self-reported CL labels annotated at 10-second intervals. We propose a leave-one-subject-out (LOSO) evaluation protocol to rigorously assess cross-subject generalizability. A custom CNN architecture is developed to jointly model temporal physiological and oculomotor signals. Under 10-fold cross-validation, the ECG+EDA+Gaze modality combination achieves the highest accuracy; under LOSO evaluation, ECG+EDA+EEG yields optimal performance. Our results demonstrate, for the first time, the feasibility of high-temporal-resolution (10-second) cross-subject real-time CL recognition. This work establishes both a benchmark multimodal dataset and a methodological framework for data-driven, real-time CL modeling.
📝 Abstract
We present a novel multimodal dataset for Cognitive Load Assessment in REaltime (CLARE). The dataset contains physiological and gaze data from 24 participants, with self-reported cognitive load scores as ground-truth labels. It comprises four modalities: Electrocardiography (ECG), Electrodermal Activity (EDA), Electroencephalography (EEG), and Gaze tracking. To elicit a range of mental load levels, each participant completed four nine-minute sessions of a computer-based operator performance and mental workload task (the MATB-II software), with task complexity varied in one-minute segments. During the experiment, participants reported their cognitive load every 10 seconds. We also provide benchmark binary classification results for machine learning and deep learning models under two evaluation schemes: 10-fold and leave-one-subject-out (LOSO) cross-validation. For 10-fold evaluation, the convolutional neural network (CNN) based deep learning model achieves the best classification performance with ECG, EDA, and Gaze. For LOSO, in contrast, the best performance is achieved by the deep learning model with ECG, EDA, and EEG.
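The difference between the two evaluation schemes can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`build_window_index`, `loso_splits`, `kfold_splits`) are hypothetical, the window count is the nominal one implied by the abstract (four nine-minute sessions labelled every 10 seconds, assuming no dropped windows), and the 10-fold split is assumed to be a plain shuffled split over all windows.

```python
import random
from collections import defaultdict

# Nominal figures taken from the abstract; the released data may contain
# fewer windows if any recordings or labels are missing (assumption).
N_PARTICIPANTS = 24
SESSIONS_PER_PARTICIPANT = 4
SESSION_SECONDS = 9 * 60
LABEL_INTERVAL_SECONDS = 10


def build_window_index():
    """Return one (participant, session, window) tuple per 10-second label."""
    windows_per_session = SESSION_SECONDS // LABEL_INTERVAL_SECONDS
    return [
        (p, s, w)
        for p in range(N_PARTICIPANTS)
        for s in range(SESSIONS_PER_PARTICIPANT)
        for w in range(windows_per_session)
    ]


def loso_splits(index):
    """Yield (held_out, train_ids, test_ids), holding out one participant per fold."""
    by_participant = defaultdict(list)
    for i, (p, _, _) in enumerate(index):
        by_participant[p].append(i)
    for held_out in sorted(by_participant):
        test_ids = by_participant[held_out]
        train_ids = [
            i for p, ids in by_participant.items() if p != held_out for i in ids
        ]
        yield held_out, train_ids, test_ids


def kfold_splits(index, k=10, seed=0):
    """Yield (train_ids, test_ids) for a shuffled k-fold split over all windows.

    Windows from the same participant can land in both train and test,
    which is why k-fold scores are usually more optimistic than LOSO scores.
    """
    ids = list(range(len(index)))
    random.Random(seed).shuffle(ids)
    folds = [ids[i::k] for i in range(k)]
    for i in range(k):
        test_ids = folds[i]
        train_ids = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_ids, test_ids


if __name__ == "__main__":
    index = build_window_index()
    print("nominal windows:", len(index))
    held_out, train_ids, test_ids = next(loso_splits(index))
    print("LOSO fold", held_out, "train:", len(train_ids), "test:", len(test_ids))
```

Under these assumptions each LOSO fold tests on windows from a participant the model has never seen, whereas a shuffled 10-fold split mixes every participant into both sides. That gap is one plausible reason the best modality combination differs between the two schemes.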