Predicting States of Understanding in Explanatory Interactions Using Cognitive Load-Related Linguistic Cues

📅 2026-03-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the real-time prediction of listeners’ comprehension states—namely, understanding, partial understanding, non-understanding, and misunderstanding—in explanatory dialogues. The authors propose a cognitive modeling approach that integrates linguistic and non-linguistic cues, uniquely combining three types of cognitive load–related features: information value, syntactic complexity, and interactive gaze behavior. Leveraging the MUNDEX corpus, they employ both statistical analysis and machine learning methods, including general-purpose classifiers and a fine-tuned German BERT-based multimodal model. Experimental results demonstrate that all three feature categories are significantly associated with comprehension states, and that their integration consistently improves prediction accuracy on the four-state classification task. These findings validate the efficacy of multidimensional cognitive load indicators for fine-grained modeling of listener understanding in dialogue.
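The core idea of the summary—concatenating textual features with the three cognitive-load cues before classifying into the four understanding states—can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation; all names and numbers are invented.

```python
from dataclasses import dataclass

# The four listener states annotated in the MUNDEX corpus.
STATES = ["understanding", "partial", "non-understanding", "misunderstanding"]

@dataclass
class Segment:
    text_feats: list[float]      # e.g. a pooled text embedding (truncated here)
    surprisal: float             # information value of the speaker's utterance
    syntactic_complexity: float  # e.g. some parse-based complexity measure
    gaze_variation: float        # variability of the listener's gaze behaviour

def fused_vector(seg: Segment) -> list[float]:
    # Feature fusion: append the three cognitive-load cues to the
    # textual feature vector, yielding one input for a classifier.
    return seg.text_feats + [seg.surprisal,
                             seg.syntactic_complexity,
                             seg.gaze_variation]

seg = Segment(text_feats=[0.1, -0.3], surprisal=7.2,
              syntactic_complexity=4.0, gaze_variation=0.8)
vec = fused_vector(seg)
assert len(vec) == len(seg.text_feats) + 3
```

Any of the classifiers mentioned above (off-the-shelf or BERT-based) would then map such fused vectors to one of the four states.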

📝 Abstract
We investigate how verbal and nonverbal linguistic features, exhibited by speakers and listeners in dialogue, can contribute to predicting the listener's state of understanding in explanatory interactions on a moment-by-moment basis. Specifically, we examine three linguistic cues related to cognitive load and hypothesised to correlate with listener understanding: the information value (operationalised with surprisal) and syntactic complexity of the speaker's utterances, and the variation in the listener's interactive gaze behaviour. Based on statistical analyses of the MUNDEX corpus of face-to-face dialogic board game explanations, we find that individual cues vary with the listener's level of understanding. Listener states ('Understanding', 'Partial Understanding', 'Non-Understanding' and 'Misunderstanding') were self-annotated by the listeners using a retrospective video-recall method. The results of a subsequent classification experiment, involving two off-the-shelf classifiers and a fine-tuned German BERT-based multimodal classifier, demonstrate that prediction of these four states of understanding is generally possible and improves when the three linguistic cues are considered alongside textual features.
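The abstract operationalises information value as surprisal, i.e. the negative log-probability of a word given its context: less predictable words carry more information and are hypothesised to increase cognitive load. A minimal toy illustration (the bigram probabilities below are invented, not from any model used in the paper):

```python
import math

# Invented bigram probabilities P(word | context) for illustration only.
bigram_prob = {
    ("the", "game"): 0.20,
    ("the", "board"): 0.05,
    ("the", "zyzzyva"): 0.0001,
}

def surprisal(context: str, word: str) -> float:
    """Surprisal in bits of `word` given `context`: -log2 P(word | context)."""
    return -math.log2(bigram_prob[(context, word)])

# Less predictable continuations have higher surprisal (more information):
assert surprisal("the", "game") < surprisal("the", "board") < surprisal("the", "zyzzyva")
```

In practice, the probabilities would come from a language model rather than a hand-written table; the cue itself is just this per-word quantity.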
Problem

Research questions and friction points this paper is trying to address.

state of understanding
explanatory interactions
cognitive load
linguistic cues
dialogue
Innovation

Methods, ideas, or system contributions that make the work stand out.

cognitive load
linguistic cues
state of understanding
multimodal classification
dialogue interaction
Yu Wang
Faculty of Linguistics and Literary Studies, Bielefeld University, Bielefeld, Germany
Olcay Türk
Faculty of Linguistics and Literary Studies, Bielefeld University, Bielefeld, Germany
Angela Grimminger
Faculty of Arts and Humanities, Paderborn University, Paderborn, Germany
Hendrik Buschmeier
Digital Linguistics Lab, Faculty of Linguistics and Literary Studies, Bielefeld University
Dialogue · Interaction · Conversational Agents · Natural Language Generation · Computational Linguistics