🤖 AI Summary
This study addresses a limitation of conventional real-time captions: they omit emotional and nonverbal cues, which can increase cognitive load and impair comprehension for deaf or hard-of-hearing (DHH) and neurodivergent learners (such as those with ADHD) in STEM education. It explores integrating customizable affective and multimodal cues, including facial expressions, body gestures, keyword highlighting, and emojis, into captions. The authors designed four prototype interfaces and evaluated them in a pilot and a main study with 24 participants. Certain prototypes reduced self-reported cognitive load and improved comprehension scores relative to traditional captions, and qualitative feedback points to the importance of customizable caption features for inclusive educational technologies.
📝 Abstract
Real-time captioning is vital for Deaf and Hard of Hearing (DHH) and neurodivergent learners (e.g., those with ADHD), yet it often omits emotional and non-verbal cues essential for comprehension. This omission is particularly consequential in STEM education, where cognitively demanding material can exacerbate the challenges faced by caption users across diverse ability profiles. In this paper, we present a design-oriented exploration of four captioning prototypes that embed emotional and multimodal cues, including facial expressions, body gestures, keyword highlighting, and emojis. Across a pilot and a main study with 24 participants, we found that certain prototypes reduced self-reported cognitive load and improved comprehension scores compared to traditional captions. Qualitative feedback reveals the importance of customizable caption features that accommodate users' differing needs and preferences (e.g., those of users with ADHD, or varying levels of comfort with emojis). Our findings contribute to ongoing conversations in accessible technology research about how best to integrate emotional cues into captions in a way that is both usable and beneficial for a wide range of learners.