🤖 AI Summary
Existing research on sign language emotion recognition relies primarily on isolated sentences, which lack conversational context, so the resulting models struggle to generalize to real-world interactive scenarios. This work proposes the first dialogue-level emotion recognition task for sign language and introduces eJSL Dialog, a novel dataset comprising 480 dialogues and 1,920 video samples. It systematically evaluates a range of approaches, from isolated visual models to multimodal dialogue architectures. Experiments reveal a significant performance gap when general-purpose multimodal models are applied to sign language, underscoring the need for sign-language-specific, context-aware visual feature extractors. The proposed dataset and task formulation are intended to support further research on emotion understanding in sign language communication.
📝 Abstract
Emotion Recognition in Conversation (ERC) is a core component of affective computing, yet existing sign language emotion datasets focus primarily on isolated sentences and lack conversational context. Models trained exclusively on these isolated utterances show degraded performance in real-world scenarios because they cannot use the preceding dialogue flow. To address this structural limitation, we introduce the ERC task to sign language video analysis and propose the eJSL Dialog dataset. Constructed using the scripts from the STUDIES corpus, the dataset contains 1,920 video samples organized into 480 unique dialogues. We conduct systematic benchmarking on this dataset using models ranging from isolated visual networks to multimodal conversational architectures. The results reveal a domain gap when applying generic multimodal conversational emotion recognition models to sign language. These findings demonstrate the need for context-aware visual extractors specific to sign language and indicate that expanding the scale of conversational datasets to support large-scale pre-training is a necessary next step for future research.