๐ค AI Summary
To address the challenges of scarce labeled data and limited deep learning performance under few-shot conditions in predicting prosocial behavioral intentions (e.g., road assistance) in mobile settings, this paper proposes the first multimodal self-supervised pretraining framework tailored to this task. Methodologically, it jointly models physiological signals (EEG, ECG) and behavioral data (eye movements, driving maneuvers), introducing a dual pretraining strategy comprising cross-task contrastive learning and masked signal reconstruction to enable unsupervised representation learning. During fine-tuning, only a small number of labeled samples are required for downstream intention classification. Experiments demonstrate that the method significantly improves prediction accuracy by +12.3% under extremely limited labeling (<500 samples), establishing a deployable paradigm for behavioral intention recognition in vehicular intelligent systems and humanโmachine collaboration.
๐ Abstract
Human state detection and behavior prediction have seen significant advancements with the rise of machine learning and multimodal sensing technologies. However, predicting prosocial behavior intentions in mobility scenarios, such as helping others on the road, is an underexplored area. Current research faces a major limitation. There are no large, labeled datasets available for prosocial behavior, and small-scale datasets make it difficult to train deep-learning models effectively. To overcome this, we propose a self-supervised learning approach that harnesses multi-modal data from existing physiological and behavioral datasets. By pre-training our model on diverse tasks and fine-tuning it with a smaller, manually labeled prosocial behavior dataset, we significantly enhance its performance. This method addresses the data scarcity issue, providing a more effective benchmark for prosocial behavior prediction, and offering valuable insights for improving intelligent vehicle systems and human-machine interaction.